marg@cunixf.cc.columbia.edu (Margarita Suarez) (07/12/90)
We upgraded our 3 Sun-4/280's to SunOS 4.1 in early June. We have also compiled ksh88d using the patches posted here by Doug Kingston (dpk@morgan.com).

We noticed problems, however: once in a while, programs like "more" and "less" at the end of a pipeline would appear to hang the terminal. Symptoms: all input to the terminal was ignored, although the process could be killed from another terminal.

We finally traced the problem to some questionable code in xec.c. The shell forks a kid for each member of the pipeline and tries to assign each one to the same process group as the "leader" of the pipeline. However, if by chance the tail process of the pipeline called setpgid() with the pid of the pipeline "leader" *before* the pipeline "leader" had established itself as the process group leader, the setpgid() would return with an ENOPERM error.

Problem was, in the event of an error, the process group was set to the current process's own pid, and setpgid() was called again with this new pgrp. This confuses ioctl() terribly, because now we have a piece of a pipeline which is not in the same process group as the rest of the pipeline. That explains why all keyboard input was ignored, etc.

To fix, check for the case where setpgid() returns ENOPERM, and if so, sleep a bit and try again until the process group leader has been properly established.

By the way, does anyone know where ksh bug reports should be sent? Also, out of curiosity, how many people are actually running ksh88d under 4.1, and have you had many problems?

Margarita M. Suarez
Unix Systems Group
VOICE: w:212-854-5434 h:212-932-3023
INTERNET: marg@cunixf.cc.columbia.edu
BITNET: marg@cunixf.bitnet
UUCP: !rutgers!columbia!cunixf!marg
chet@po.cwru.edu (07/17/90)
In article <9833@brazos.Rice.edu> marg@cunixf.cc.columbia.edu (Margarita Suarez) writes:

[An explanation of why the last process in a pipeline would sometimes hang ksh88d, traced to a setpgid() race condition in which a process in the pipeline would try to set its pgrp before the process group `leader' had established the process group. This resulted in a pipeline with pieces in different process groups.]

>To fix, check for the case where setpgid returns ENOPERM, and if so, sleep
>a bit and try again until the process group leader has been properly
>established.

A better fix is to do the setpgid() in both the parent and child, so the setpgid() succeeds before anything else happens. It doesn't matter which succeeds, as long as the pgrp is set before anything else depends on it being so.

Chet Ramey                          "...but worst of all, young man,
Network Services Group              you've got Industrial Disease!"
Case Western Reserve University
chet@ins.CWRU.Edu