tml@hemuli.atk.vtt.fi (Tor Lillqvist) (07/06/89)
We are experiencing lots of zombie processes on an HP9000 Series 800
running HP-UX 3.1 (the same also occurred in 2.1 and 3.01). They are
all children of remshd (rshd in BSD) processes (which have no other
children). All these remshds are sleeping on selwait.

We have a configuration with a bunch of Series 300 workstations
running X clients on the 840. Right now, for instance, there are 25
of these zombies after the system has been up for three days, with
perhaps ten active workstation users.

What could be the problem? Is there any cure, other than writing a
perl script that scans ps now and then and kills off remshd processes
with a zombie child?
--
Tor Lillqvist
Technical Research Centre of Finland, Computing Services (VTT/ATK)
tml@hemuli.atk.vtt.fi [130.188.52.2]
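The ps-scanning workaround mentioned above could look roughly like the following. This is only a sketch, written in shell and awk rather than perl. It assumes the ps output has already been reduced to four columns (PID, PPID, state, command name); the real column layout of ps differs between systems, and the function name zombie_parents is hypothetical.

```shell
#!/bin/sh
# zombie_parents (hypothetical helper): read "PID PPID STATE COMMAND"
# lines on stdin and print, in numeric order, the PID of every remshd
# that is the parent of at least one zombie (STATE beginning with Z).
# Assumption: the caller has already cut ps output down to these four
# columns; this is not the native ps format of any particular system.
zombie_parents() {
    awk '
        { cmd[$1] = $4 }                  # remember each command name by PID
        $3 ~ /^Z/ { zparent[$2] = 1 }     # record the parent of every zombie
        END {
            for (p in zparent)
                if (cmd[p] == "remshd")   # report only remshd parents
                    print p
        }
    ' | sort -n
}
```

The function only reports PIDs. Whether to pass them on to kill is a separate decision, since killing the remshd treats the symptom and not the cause.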
jack@hpindda.HP.COM (Jack Repenning) (07/11/89)
I've seen persistent zombie children of remshd when people remsh
processes that they put into background, e.g.:

    remsh m840 hpterm -display $DISPLAY \& &

If this is happening, you should find active processes belonging to
the same users (the hpterm, in the example). These will be the
program started by the zombie (which was a shell). Unfortunately,
they'll be a little hard to track down from ps: although they'll be
in the same process group as the zombie, that's not visible, and
they'll show PPID of "1".

If this is your problem, then to make these zombies go away (and the
remshd, and the remsh back on the workstation as well), you need to
arrange to close stdin, stdout, and stderr on the "real" process
(again, the hpterm in the example).

Jack Repenning
jack@hpda.hp.com
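One way to arrange what is described above is to redirect all three descriptors to /dev/null in the command line that the remote shell runs, so the backgrounded program no longer holds the connection open. A minimal sketch follows, assuming the remote login shell is Bourne-compatible (csh needs a different redirection syntax). The function name remote_start and the REMSH override variable are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# remote_start HOST COMMAND [ARGS...]   (hypothetical helper)
# Start COMMAND in the background on HOST with stdin, stdout and stderr
# all redirected to /dev/null, so it keeps no descriptor on the remshd
# connection.
# Assumptions: the remote login shell understands Bourne-style
# redirections; REMSH is a made-up override for the client name
# (for example rsh on BSD systems).
# Note: "$*" joins the arguments with spaces, so arguments that need
# quoting on the remote side must be quoted by the caller.
remote_start() {
    host=$1
    shift
    ${REMSH:-remsh} "$host" "$* </dev/null >/dev/null 2>&1 &"
}
```

With this, the earlier example would become something like `remote_start m840 hpterm -display $DISPLAY`. Any output or error messages from the remote program are discarded.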
scott@grlab.UUCP (Scott Blachowicz) (07/15/89)
/ grlab:comp.sys.hp / jack@hpindda.HP.COM (Jack Repenning) / 1:19 pm Jul 10, 1989 /
> I've seen persistent zombie children of remshd when people remsh
> processes that they put into background, e.g.:
>
>     remsh m840 hpterm -display $DISPLAY \& &
>
> If this is happening, you should find active processes belonging to
> the same users (the hpterm, in the example). These will be the
> program started by the zombie (which was a shell).

This isn't exactly the same problem you're seeing, but... I got tired
of seeing remshd hanging out in memory waiting on stuff, so I wrote a
remsh replacement (I call it qremsh) that just does the remote
schedule without waiting & dies. It closes off stdin/out/err before
doing the remote schedule (uses inetd). It was both an exercise in
network programming and in conserving memory.

I'm not sure if I'm missing something, but it seems to work fine for
what I want to do (hpterm, xload, etc.) since none of them care about
the stdin/out/err that they inherit. Let me know if you want it & I'll
try to package it up.

Scott Blachowicz
USPS: Graphicus               UUCP:    ...!hpubvwa!grlab!scott
      150 Lake Str S, #206    VoicePh: 206/828-4691
      Kirkland, WA 98033      FAX:     206/828-4236