lee@iris.ucdavis.edu (Peng Lee) (09/08/89)
I notice the X server for HP-UX 6.5 doesn't have the rtprio (real-time priority) option that the earlier X server had. Would someone tell me why? Since I can't su on this system and my system administrator doesn't want to add rtprio to my group privileges, I have no choice but to use the slowwww server without rtprio if I want to use some new features (such as disabling the reset key in the xlock program).

Thanks,
-Peng (lee@iris.ucdavis.edu)
jack@hpindda.HP.COM (Jack Repenning) (09/14/89)
> I notice X server for HPUX 6.5 doesn't have the rtprio (real time
> priority) option as the earlier X server. Would someone tell me
> why?

I wasn't on the development team for X11, but I am a notes loud-mouth (inside the company as well as outside), and I was involved in the discussions at the time. I also worked in HP's Real Time Executive (RTE) environment for a number of years, where half the programs in the system run as real-time, so I have some experience in real-time programming. Here's a totally unofficial summary of why the X server business went the way it did:

Writing a program to work reliably as a real-time process is complex, arcane, and costly, and when an rtprio program doesn't "work reliably," what that generally means is your system is hung and has to be power-cycled to reboot (dirty file systems, work destroyed, that sort of thing). The team didn't have the resources to do a good job of it at the time, and a poor job would not have been a good idea at all.

Furthermore, making a whole system work reliably when it includes real-time processes is an almost equally black art, even when all the processes in the system work well by themselves. The work consists "merely" of picking the right ordering of the priorities for all programs that interact in any way. But there are an endless number of ways for programs to "interact," mistakes are costly, and problems can be hard to test and reproduce. Here, the problem would have to be handled by explaining the details to the end users (you and your SysAdmin, for example), and getting that much explanation across is even harder than doing it oneself.

You may ask, "but if it worked with the earlier one, why wouldn't it work with the newer one?" It's a good question; answering it is part of the complex, arcane, costly process of making it work. But it most definitely did *not* work with the new one.
As a matter of fact, even the earlier servers weren't supposed to have the rtprio feature: it was yanked early on from the design and the documents. Someone just forgot to yank it from the code.

It's all rather a shame. I used to use it, too (that's how I, along with many others, discovered that it was *not* working with the new one), and I miss it.

You can achieve some of the same effect by nice(1)ing most everything but the server. nice(1) is considerably safer than rtprio(1), because priorities are still allowed to drift, so the dispatcher can cover for your mistakes. I've turned my .x11start file into a /bin/ksh file, and made sure that the ksh has "bgnice" set. That way, everything started from .x11start is niced a bit. Try that out.

-------------------------------------------------------------
Jack Repenning - Information Networks Division, Hewlett Packard Corporation
uucp: ... {allegra,decvax,ihnp4,ucbvax} !hplabs!hpda!jack
  or: ... jack@hpda.hp.com
HPDesk: Jack REPENNING /HP6600/UX
USMail: 43LN; 19420 Homestead Ave; Cupertino, CA 95014
Phone: 408/447-3380   HPTelnet: 1-447-3380
-------------------------------------------------------------
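[A minimal sketch of the arrangement Jack describes -- .x11start rewritten as a ksh script with bgnice set. The client names (xclock, xterm, mwm) are illustrative, not from the post:]

```shell
#!/bin/ksh
# .x11start as a ksh script. With bgnice set, ksh runs every
# background job at reduced priority, so all the clients started
# here end up niced relative to the server that launched us.
set -o bgnice
xclock &        # niced, because it is a background job under bgnice
xterm  &        # likewise
exec mwm        # the window manager keeps the un-niced priority
```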
grzm@zyx.SE (Gunnar Blomberg) (09/14/89)
In article <4310057@hpindda.HP.COM> jack@hpindda.HP.COM (Jack Repenning) writes:
>> I notice X server for HPUX 6.5 doesn't have the rtprio (real time
>> priority) option as the earlier X server. Would someone tell me
>> why?
>[...]
>You can achieve some of the same effect, by nice(1)ing most everything
>but the server - nice(1) is considerably safer than rtprio(1), because
>priorities are still allowed to drift, so the dispatcher can cover for
>your mistakes. I've turned my .x11start file into a /bin/ksh file,
>and made sure that the ksh has "bgnice" set. That way, everything
>started from .x11start is niced a bit. Try that out.

We used to run our X servers at real-time priority too (though we did it ourselves -- nobody had any idea that there was any support for it in the X server), but after some troubles with crashes (which, by the way, did not seem to be caused by that anyway), we changed to using nice(1).

The way we do it is by running the X server (as well as the window manager) "unniced". This gives the server higher priority than practically everything (and not only the stuff explicitly given lower priority), which is definitely what I want. The way I do it is by running x11start at a negative niceness of 20, and then in my .x11start I run .realx11start at a (relative) positive niceness of 20. This runs the X server at high priority and anything I start in .realx11start at normal priority. I guess you could get the same effect in a more centralized way, but this works for me...
-- 
"The CPU that has most influenced   | Gunnar Blomberg
the Unix system is unquestionably   | ZYX Sweden AB, Bangardsg 13,
the Intel 80386."                   | S-753 20 Uppsala, Sweden
  --David Fiedler, BYTE, May 1989   | email: grzm@zyx.SE
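[Gunnar's layering relies on nice(1) offsets being relative to the caller's current niceness: x11start starts 20 steps below normal, and .realx11start adds 20 back, returning clients to normal priority. A runnable non-root demonstration of the stacking, using positive offsets (negative ones require root):]

```shell
# "nice" with no operands prints the invoker's current niceness.
base=$(nice)                                # our own niceness
outer=$(nice -n 5 sh -c 'nice')             # one layer down: base+5
inner=$(nice -n 5 sh -c 'nice -n 5 nice')   # two layers: base+10
echo "base=$base outer=$outer inner=$inner"
```

The two offsets compound, just as Gunnar's -20 and +20 cancel.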
raveling@isi.edu (Paul Raveling) (09/15/89)
In article <4310057@hpindda.HP.COM>, jack@hpindda.HP.COM (Jack Repenning) writes:
> 
> Writing a program to work reliably as a real-time process is complex,
> arcane, and costly, and when an rtprio program doesn't "work
> reliably," what that generally means is your system is hung, and has
> to be power-cycled to reboot (dirty file systems, work destroyed, that
> sort of thing). ...
> 
> Furthermore, making a whole system work reliably when it includes
> real-time processes is an almost equally black art, even when all the
> processes in the system work well by themselves. ...

Sounds like we come from real-time system environments that must have different architectures. Ever since beginning to refine my favorite architecture (in 1972), my experience has been the opposite. Given a good kernel architecture, it's been easy to get real-time applications to work, even running ALL processes with preemptive scheduling.

Of the family of systems I usually mention in this context, EPOS (vintage 1975) is the one I refer to most often. The last bug report I heard on it before we shut down the last PDP-11/45's running it was that the timer keeping track of how long it had been up overflowed at somewhere around 7 months.

I keep wishing UNIX had the sort of facilities EPOS and others had to handle multi-process applications. It's MURDER to try to do the same things on UNIX.

----------------
Paul Raveling
Raveling@isi.edu
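[A back-of-the-envelope check on the uptime-timer story -- the counter width and tick rate here are guesses, not from the post: a signed 32-bit counter ticking at 100 Hz would wrap after roughly eight months, in the neighborhood of the reported seven:]

```shell
# Hypothetical: signed 32-bit uptime counter, 100 ticks per second.
secs=$(( (1 << 31) / 100 ))     # seconds until the sign bit flips
days=$(( secs / 86400 ))
echo "overflow after about $days days"   # 248 days, a bit over 8 months
```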
jack@hpindda.HP.COM (Jack Repenning) (09/19/89)
> Given a good kernel architecture, it's been easy to get real time
> applications to work, even running ALL processes with preemptive
> scheduling. Of the family of systems I usually mention in this

That's interesting. Although I can point to a number of details in the kernel (UNIX or otherwise) that can make things easier or harder, I've always found that the most intransigent problems were resource deadlocks between the individual programs. How can EPOS deal with that?

j
cjames@hpldsla.HP.COM (Craig James) (09/21/89)
> Sounds like we come from real time system environments that
> must have different architectures. ...
> 
> Given a good kernel architecture, it's been easy to get real time
> applications to work, even running ALL processes with preemptive
> scheduling.
> 
> Paul Raveling
> Raveling@isi.edu

It's hard to see how that is possible with real-time capabilities. After all, if I start up a process that runs until it explicitly gives up the CPU, and that process goes into an infinite loop, there is no escape but to reboot. Perhaps the system you described didn't provide such abilities?

Craig James, HP Labs
"These are my opinions, not HP's"
raveling@isi.edu (Paul Raveling) (09/21/89)
In article <4310059@hpindda.HP.COM>, jack@hpindda.HP.COM (Jack Repenning) writes:
> > Given a good kernel architecture, it's been easy to get real time
> > applications to work, even running ALL processes with preemptive
> > scheduling. Of the family of systems I usually mention in this
> 
> That's interesting. Although I can point to a number of details in
> the kernel (UNIX or otherwise) that can make things easier or harder,
> I've always found that the most intransigent problems were resource
> deadlocks between the individual programs. How can EPOS deal with
> that?

EPOS didn't have a sure-fire cure for a classic deadlock, but its architecture made it easy for multiple processes to use cooperative rather than competitive techniques. Most multiprocess systems in the '70s tended to use some variant of semaphores for interprocess synchronization and "communication". EPOS used a signal/wait facility that was based on message transmission and queueing rather than on semaphores. When there's a bit more time I may be able to dig up some prose describing this signal/wait facility.

EPOS also had lock/unlock system calls to handle the sort of resource locking that semaphores are natural for, but it wasn't necessary to use them very often. Most of the lock/unlock usage was for things like storage allocation, where a call to allocate or release memory locks the control blocks for a particular storage pool while it works.

At the kernel level there were a number of spots where EPOS used an even simpler lock when possible: disabling interrupts. Critical regions doing things like manipulating queues of control blocks simply disabled interrupts for the minimum possible time, often as little as two machine instructions.

So we didn't attempt deadlock resolution, because we never got one. (A simple deadlock would be something like: process A locks resource 1, process B locks resource 2, then A tries to lock 2 and B tries to lock 1.)
I did have some ideas for untangling deadlocks that involve signal/wait and the resource-wait processes used by the lock/unlock mechanism, but we just didn't need to work on the problem.

BTW, the resource locks were implemented with signal/wait. The conceptually simplest way to do this is about like this (guess I'd better mention that event data (from signals) was queued if the receiving process wasn't waiting for the event in question, so that multiple requests for a resource go onto the lock process' event queue):

    ProcessId resource_process;

    lock(resource_process)            /* Used to lock resource */
    {
        signal (LOCK_EVENT, resource_process);
        waits  (LOCK_EVENT, resource_process, NULL);
    }

    unlock(resource_process)          /* Used to unlock resource */
    {
        signal (UNLOCK_EVENT, resource_process);
    }

    /* Housekeeping calls: */
    Resource *create_lock()          { return create_process(lock_process); }
    delete_lock(resource_process)    { delete_process(resource_process); }

    lock_process()                    /* Process to control resource */
    {
        ProcessId locker;

        while (TRUE)
        {
            waits  (LOCK_EVENT, NULL, &locker);   /* next lock request   */
            signal (LOCK_EVENT, locker);          /* grant the lock      */
            waits  (UNLOCK_EVENT, locker, NULL);  /* wait for its unlock */
        }
    }

In truth we used some extra logic to avoid context switches in the most common cases -- locking an unlocked resource and unlocking a resource no one was waiting for. Also, the signals were slightly more complicated, because resource-wait processes belonged to the kernel job, while the lock/unlock calls often came from a user job. (Jobs were isolated from each other, including having separate process name spaces, except for the cross-job interface between the kernel and other jobs. Cross-job signalling could be done only from code executing in kernel mode.)

The deadlock-untangling scheme I was thinking about would add some different signal/wait interaction to lock processes, and would probably add a system process (daemon if you prefer) to supervise deadlock resolution.

----------------
Paul Raveling
Raveling@isi.edu
raveling@isi.edu (Paul Raveling) (09/23/89)
In article <3140004@hpldsla.HP.COM>, cjames@hpldsla.HP.COM (Craig James) writes:
> 
> It's hard to see how that is possible with real-time capabilities.
> After all, if I start up a process that runs until it explicitly gives up
> the CPU, and that process goes into an infinite loop, there is no escape but
> to reboot.
> 
> Perhaps the system you described didn't provide such abilities?

In such a case the machine would go CPU-bound, but nothing drastic would happen unless the looping process was a crucial system process, such as a device driver. Processes in loaded applications were offspring of an Exec process, which was the first process created in each job except the kernel job. The Exec ran as a VERY high-priority process, and its offspring were restricted to priorities below those of the Exec and all system processes other than the Idle process.

The normal way out of the loop was for the user to hit control-C on the keyboard. The terminal I/O process for the associated terminal signalled the Exec, which then preempted the looping process. For ^C the Exec froze its offspring, including the looper, and resumed accepting commands from the terminal. The usual next command was "MEND", which stood for Multi-Environment Native Debugger. This allowed the usual sorts of debugging operations on the job's processes, including single-stepping the loop or patching data up and resuming execution.

For a different answer, a prior incarnation of the same kernel architecture also implemented time slicing. This proved unnecessary in EPOS' event-driven real-time environment, but would be appropriate in a system with more diversified use.

----------------
Paul Raveling
Raveling@isi.edu
raveling@isi.edu (Paul Raveling) (09/23/89)
(About what happens if a process goes into a loop under EPOS)...

On looking at my last response, it may not have mentioned clearly enough that all activity necessary for the system to run and stay healthy occurs in processes with very high priorities. A "normal" application process, even at a priority that allows real-time response and bursts of high CPU use, doesn't interfere with system activity in device-driver processes, Execs, etc.

Also, none of the high-priority processes did anything that monopolized the CPU for long. Context switching was fast (45 times faster than UNIX), so the device drivers could quickly get control, respond to an event such as an interrupt or I/O request, and relinquish control back to the previously active process. This quick-response, event-handling type of operation is well suited to preemptive scheduling.

BTW, a "normal" application on this system handled real-time speech. In speech conferencing it used both LPC and CVSD vocoders, with CVSD demanding the higher bandwidth. My memory's unclear now, but I think the meter we put on the PDP-11/45's bus showed about 33% CPU utilization for a single CVSD channel at 10 kHz bandwidth (or maybe it was 16 kHz).

----------------
Paul Raveling
Raveling@isi.edu
jack@hpindda.HP.COM (Jack Repenning) (09/30/89)
EPOS does seem to have provided a lot of good stuff for developing real-time, event-driven, light-weight processes. Unfortunately, we were trying to achieve the same effect with preexisting, multi-user, time-sliced UNIX processes. And in the case raised by the basenote, we were trying to take a small piece of this emulation of a pure real-time system and apply it freely, from the outside, to programs never developed for that environment. The latter problem is the one I meant was "hard".

j
raveling@isi.edu (Paul Raveling) (10/04/89)
In article <4310061@hpindda.HP.COM>, jack@hpindda.HP.COM (Jack Repenning) writes:
> EPOS does seem to have provided a lot of good stuff for developing
> real-time, event-driven, light-weight processes.

BTW, EPOS processes weren't light-weight. LWPs are becoming popular mainly because current OS's (at least UNIXes) lack adequate facilities for multi-process applications using "normal" processes.

----------------
Paul Raveling
Raveling@isi.edu