mcculley@alien.enet.dec.com (05/26/90)
In article <9005241526.AA12356@fsucs.cs.fsu.edu>, groh@fsucs.cs.fsu.edu (Jim Groh) writes...
>In article <4228@hcx1.SSD.CSD.HARRIS.COM>, steve@SSD.CSD.HARRIS.COM (Steve Brosky) writes:
>>
>>* contiguous disk files --
>>  allows faster disc I/O because seeks are eliminated
>
> Isn't that also dependent on file size and disk configuration?
>

Not only that, it makes a lot of assumptions about the nature of the disk I/O (IMHO questionable ones at that).

Basically, seeks can be eliminated only for the file blocks contained in the track(s)/cylinder(s) presently under the heads (and maybe the next in sequence).  So the elimination of seeks holds only for one limited portion of one (possibly large?) file.  Access to other files, and to other parts of that file, will still require seeks.  The overhead involved in accomplishing those seeks may or may not be increased by the simplistic contiguous file structure (in my experience Murphy will insist that it is).

There is a related assumption that there will be only one access stream to a disk at any time, or else seeks will still be required.  So that multiprocessor configuration either needs a disk per processor, or some processors will have to wait for the single access stream to become available in order to hit the disk without causing those dreaded seeks.  And I won't ask about extending files...

Contiguous files are a very simple structure; as such they minimize overhead at the expense of capabilities.  That might be the best thing for a system requiring only one disk access stream, but if you have multiple processors, wouldn't multiple disk access streams be nice too?  Like any tradeoff, you pays your money, you takes your choice.

- Bruce McCulley
  RSX-11 Software Development
  Digital Equipment Corp.
  Nashua, NH USA
sccowan@watmsg.uwaterloo.ca (S. Crispin Cowan) (05/26/90)
In article <4228@hcx1.SSD.CSD.HARRIS.COM> steve@SSD.CSD.HARRIS.COM (Steve Brosky) writes:
>CX/RT, from Harris Computer Systems, is a Unix compatible operating system that
>runs on the Night Hawk platform: a tightly-coupled, shared-bus symmetric
>multiprocessor containing up to 8 MC68030 CPUs, per-processor external caches
>and local memories, and a large global memory.  All times listed below
>are for a 20MHz 68030.
[stuff]
>* fast context switch times -- 40-60 microseconds

This is obviously crucial to real-time response, and seems much faster than conventional Unix context switching time.  How did you get Unix to context switch so quickly?

Crispin
----------------------------------------------------------------------
(S.) Crispin Cowan, CS grad student, University of Waterloo
Office: DC3548  x3934        Home phone: 570-2517
Post Awful: 60 Overlea Drive, Kitchener, N2M 1T1
UUCP: watmath!watmsg!sccowan
Domain: sccowan@watmsg.waterloo.edu

"You can have peace.  Or you can have freedom.  Don't ever count on having both at once."  -Lazarus Long
"You can't separate peace from freedom because no one can be at peace unless he has his freedom."  -Malcolm X
uemura@isvax.isl.melco.co.jp (Joe Uemura) (05/29/90)
In article <4228@hcx1.SSD.CSD.HARRIS.COM>, steve@SSD.CSD.HARRIS.COM (Steve Brosky) writes:
<lots of stuff deleted>
> The operating system supports real-time features like:
> * real-time process synchronization --
>   very fast synchronization primitives to coordinate access to shared data
>   between cooperating processes (this is not AT&T system V IPC semaphores,
>   they are too slow!).  The synchronization primitives which block will
>   also place a bound on priority inversion.

Could you please elaborate more on what you mean by placing "a bound on priority inversion"?  Is this some form of avoidance protocol?

I would also like to hear if anyone knows of other realtime Unix adaptations which deal with priority inversion.

Joe Uemura
Mitsubishi Electric Co
ISED Lab Parallel Computing Group
Kamakura, Japan
buck@siswat.UUCP (A. Lester Buck) (05/30/90)
In article <17536@isvax.isl.melco.co.jp>, uemura@isvax.isl.melco.co.jp (Joe Uemura) writes:
> > * real-time process synchronization --
> >   The synchronization primitives which block will
> >   also place a bound on priority inversion.
>
> Could you please elaborate more on what you mean by placing "a bound on
> priority inversion"?  Is this some form of avoidance protocol?
>
> I would also like to hear if anyone knows of other
> realtime Unix adaptations which deal with priority inversion.

I am not sure this is what you want, but an excellent tutorial on realtime scheduling theory and priority inversion problems in Ada is the following article:

    "Real-Time Scheduling Theory and Ada", by Lui Sha and John B. Goodenough,
    IEEE Computer, April 1990.

This is a very readable tutorial on the rate-monotonic theory of realtime scheduling.  The rate-monotonic algorithm is simple: it assigns the higher priority to tasks with the shorter periods.  It is analogous to linear system theory - most everything is exactly solvable, but the achievable processor utilization is somewhat lower than with a scheduler that does not follow the rate-monotonic algorithm.  References in the article extend the theory to multiprocessors.

The authors then show how the Ada tasking model matches the rate-monotonic algorithm well, in that it allows designers to ignore time-line scheduling and depend on the Ada runtime to schedule tasks successfully according to the algorithm.  But they also point out how realtime programming is somewhat of a joke in Ada:

    "Of course, telling programmers to assign 'Rate_Monotonic_Priorities' to
    tasks but not to use pragma PRIORITY surely says we are fighting the
    language rather than taking advantage of it.  But, the important point
    is that no official revisions to the language are needed."

And all this from a language _designed_ to be used for realtime, embedded systems!  That's what happens when the language is designed by a committee of academics and finalized by a commercial company that didn't do its homework, years before a working compiler existed and before the people who would be using the language got a chance to code anything.  "Priority inversion, what's that?"

Various Ada runtime environments are adding rate-monotonic support, and from a phone conversation with Lui Sha I learned that Lynx Realtime Systems wants to add the rate-monotonic algorithm to its scheduler.  It is no small coincidence that Lynx OS was chosen for the space station, since the rate-monotonic algorithm is an excellent fit to the open realtime system planned for that project.  As long as all vendors follow the rate-monotonic discipline with their realtime tasks, they can code and combine their tasks independently and know that the tasks will meet their deadlines.

As an aside, the space station presents some unique realtime problems, with its open environment and the requirement that once it starts up, it should never shut down, even during software upgrades.  By contrast, the shuttle on-board software was written by exactly one company (IBM) and can be upgraded between missions.  The one advantage the space station has is that if a system fails for a few minutes, (usually) nothing really bad happens.  Life support, ventilation, experiments, telemetry - all can stand a few minutes while systems reboot or whatever (unless the system has done something _really_ stupid, of course...).

--
A. Lester Buck     buck@siswat.lonestar.org  ...!texbell!moray!siswat!buck
monty@SSD.CSD.HARRIS.COM (Monty Norwood) (05/30/90)
The benefit of contiguous files is indeed the elimination of seek times.  As noted by many posters to this group, how well this works depends on many things, not just the contiguous nature of the file.

The primary benefit is in cases where a disk, in a real time application, is dedicated as an archive - essentially a large (possibly circular) buffer saving the last x seconds, minutes, or hours of the raw data gathered.  Telemetry applications often need this.  Missile range work, where raw data comes in fast and furious but only for a short duration, can benefit from this use.  On the Night Hawk system, contiguous disk files are typically used with the direct disk I/O feature (bypassing the buffer cache) so that large chunks of data can be written to the disk on a periodic basis as quickly as possible.

These features are intended to be used in stringent real time environments where the environment is controlled.  Clearly, if there is other activity on the same disk, or the accesses to the file are random (as opposed to sequential), then all the problems noted in other postings make this feature a moot point.  Expanding the file is indeed a problem.  It is not perfect, but it is very useful in some applications, particularly real time.
steve@SSD.CSD.HARRIS.COM (Steve Brosky) (05/30/90)
> What are spin locks??
> (hope this isn't a frequently asked question!! )

Spin locks, or busy-waiting locks, are locks on which a user never blocks.  When the user tries to lock a spin lock which is already locked, he spins in a tight loop, waiting for the lock to be freed.  Spin locks are only used to protect critical sections which hold the lock for a very short period of time, for which it is therefore not worth the overhead of blocking the process.

Steve Brosky                      sabrosky@ssd.csd.harris.com
Harris Computer Systems Division
2101 W Cypress Creek Rd
Fort Lauderdale, Fla 333122
steve@SSD.CSD.HARRIS.COM (Steve Brosky) (05/30/90)
>>* fast context switch times -- 40-60 microseconds
>This is obviously crucial to real-time response, and seems much faster
>than conventional Unix context switching time.  How did you get Unix to
>context switch so quickly?

A lot of effort was put into optimizing context switches.  Sorry, I can't tell you about some of the more subtle tricks we've used.  However, I should explain what is meant by this ambiguous term "context switch time".  The 40-60 microseconds is the time to:

 - save the state (registers) of the old process (note that this machine has
   an enhanced floating point unit, so this also includes 8 extra floating
   point registers)
 - select the new process
 - restore the state (VM, registers) of the new process
 - invalidate the caches

This context switch time does not include the time used to decide that a context switch is required.  For example, in the case where a process expires its quantum (determined in a clock interrupt routine), the processing time consumed by the clock interrupt is not included in the quoted context switch time.

Certain people are more interested in the time it takes to transition from process A to process B when process A takes some action that causes process B to run.  There are a number of different primitives available for this on our system that take the form of a system call.  The "real-time" primitives take 170-200 microseconds for this to happen.  This time is measured from immediately before process A makes the "wake-up" system call (actually the call to the library interface to the syscall) to immediately after process B returns from the "blocking" system call.  This is actual wall time from the last useful instruction in user process A to the first useful instruction in user process B.  The 40-60 microsecond context switch is part of this time.

Depending on the synchronization method used, these times vary considerably.  Signals, for example, are far less efficient than the mechanisms alluded to above.  There are other cases where the times are better and approach 100 microseconds for a transition from the last user mode instruction to the first user mode instruction.

Steve Brosky                      sabrosky@ssd.csd.harris.com
Harris Computer Systems Division
2101 W Cypress Creek Rd
Fort Lauderdale, Fla 333122
drk@athena.mit.edu (David R Kohr) (06/04/90)
Monty Norwood's discussion of the contiguous disk file facility in Harris' Nighthawk multiprocessor system got me wondering about how other people out there handle the recording of massive amounts of instrumentation data which occurs in short but dense bursts.

The project I'm on here at Lincoln involves recording radar data at a rate of 13 Mb./sec. in the current version of the system, for a period of a minute or two; we may ultimately produce a system which records up to 40 Mb./sec.  We had to buy a very expensive digital tape recorder from AMPEX Corporation to handle this rate, and have lots of specialized hardware built and tied together with a custom high-speed data bus to feed the AMPEX recorder quickly enough.  But this system was essentially designed three years ago, and I have been wondering if there are other alternatives available nowadays for handling this kind of data rate.

--
David R. Kohr     M.I.T. Lincoln Laboratory     Group 45 ("Radars 'R' Us")
email:  DRK@ATHENA.MIT.EDU  (preferred)  or  KOHR@LL.LL.MIT.EDU
phone:  (617)981-0775 (work)  or  (617)527-3908 (home)
steve@SSD.CSD.HARRIS.COM (Steve Brosky) (06/05/90)
>> * real-time process synchronization --
>>   very fast synchronization primitives to coordinate access to shared data
>>   between cooperating processes (this is not AT&T system V IPC semaphores,
>>   they are too slow!).  The synchronization primitives which block will
>>   also place a bound on priority inversion.
>>
> Could you please elaborate more on what you mean by placing "a bound on
> priority inversion"?  Is this some form of avoidance protocol?

These blocking primitives implement basic priority inheritance.  This means that when process A attempts to lock one of these locks and it is already locked by the lower-priority process B, we guarantee that process B will get an effective priority at least as high as process A's.  This allows process B to run until he releases the lock, at which point he loses his higher effective priority and the highest priority process that was waiting for the lock is run.

This scheme does not eliminate priority inversion; note that while process B was running inside the critical section above, a priority inversion was in effect.  However, the length of the priority inversion is bounded by the longest hold time of the lock.

Basic priority inheritance is also not the only approach to priority inversion.  The priority ceiling protocol provides better worst-case bounds but is more difficult to implement.  For more information see the upcoming summer USENIX proceedings for the article "An Implementation of Real-Time Thread Synchronization" by Mark Heuser.

Steve Brosky                      sabrosky@ssd.csd.harris.com
Harris Computer Systems Division
Fort Lauderdale, Fla.
geoff@modcomp.UUCP (06/07/90)
The discussion on contiguous disk files has concentrated on the elimination of seek times as the major benefit.  Another important benefit is the ability to pass more data in each disk I/O operation, reducing the total number of such operations.  An additional benefit on reads is that modern disks often have considerable amounts of RAM and autonomously read ahead into this local cache; contiguous files benefit from this feature.

Here at Modcomp we have produced REAL/IX, a Unix System V based realtime operating system, which I'll use as the basis for discussion.  In addition to the standard System V file system, we have yet another fast file system.  The allocator on the latter is designed to place file blocks contiguously wherever possible, making a "mostly contiguous file system".

Files are normally accessed via the buffer cache.  If a sequential read is detected, REAL/IX takes advantage of any contiguity by reading several blocks from disk into several buffers in the cache in a single I/O operation.  Similarly, writes to contiguous disk blocks can be combined into a single operation.  We find that for applications that make intensive use of sequential I/O to large files, throughput can be approximately tripled.  In a development environment, the number of disk I/O operations is reduced by roughly 30%.  Your mileage may vary.

The trade-off for using mostly contiguous files is the increased overhead of the block allocator.  For very small files, or for temporary files that will never touch the disk, the standard allocator will be marginally better.  The use of larger block sizes is an alternative method of reducing the number of I/O operations, involving a different set of trade-offs.

There is nothing revolutionary about these techniques, variants of which have been adopted by some other vendors.  Their absence from the Unix porting base has led to their absence from most Unix implementations, leading to the urban myth that Unix filesystems have intrinsically poor performance.

Explicitly preallocated contiguous disk blocks are the current empirical method for obtaining maximum disk I/O throughput.  If the disk and controller are used for no other purpose, guaranteed and small access times result (with the caveat that file accesses cannot be made via the buffer cache).  REAL/IX also provides prioritisation of disk I/O requests from realtime processes, to allow shared use while still providing access time guarantees.  The use of contiguous files with asynchronous I/O operations is a natural combination when attempting to obtain the maximum possible throughput; here we find another benefit of contiguity, simpler asynchronous file I/O.

As has been noted, there are problems with explicitly preallocated contiguous files, particularly what to do when you come to the end.  There is probably no single right way to deal with this, so we allow the user to specify what action to take.

Finally, I'd like to follow up on Steve Brosky's interesting figures for the Harris Night Hawk system.  I appreciated the care used in explaining just what the numbers referred to.  One of the benefits of the Posix 1003.4 standard, when it is eventually finished, should be that a set of common performance metrics will come into use, allowing users to compare systems directly.

Geoff Hall                              uunet!modcomp!geoff
Modcomp                                 modcomp!geoffi@uunet.UU.NET
1650 West McNab Rd
Ft Lauderdale, FL 33340