ast@cs.vu.nl (Andy Tanenbaum) (06/01/88)
Sort of by accident I just discovered a fatal error in MINIX. Try this:

    sync
    cp /usr/bin/sleep x
    chmem =60000 x
    for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
    do
        x 10 &
    done

The shell keeps forking off 'x' until it can fork no more, and then stops.
Within 10 seconds all the children exit, but because the last command given
to the shell ended in &, the shell is reading from the tty, not doing a wait.
As a result, all the children have become zombies and are still tying up
resources (memory and proc table slots). Since the kernel is told that
processes are gone at the moment they become zombies (line 5802 in the book),
the F1 key does not show them any more.

Now type

    sync

You will see that the shell can't fork and the system is totally hung.
To unhang the shell, type

    exec sync

This causes the shell to exec instead of fork. The exec succeeds, and the
sync succeeds and exits. At this point the zombies are orphans, and are
inherited by the shell's parent, init, which is doing a wait, and which
cleans them all up. This gets the system back to normal.

There is one fix that helps with part of the problem, namely, having zombies
release their memory when they exit, not at cleanup time. This can be
accomplished by moving lines 5874 - 5878 to just after line 5802. You also
have to change 'child' to 'rmp' and declare 's' at the top of mm_exit.

Now you won't have the situation where the shell fails to fork due to lack
of memory. But it still fails due to lack of proc table slots. This bug is
hard to fix, but in practice it doesn't happen except in test programs.
You can always make NR_PROCS larger if you want.

Andy Tanenbaum (ast@cs.vu.nl)
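
The zombie behavior described above can be illustrated on a present-day POSIX
system with a minimal C sketch. This is not MINIX source and not the mm_exit
change described in the post. The function names spawn_children and
reap_children are hypothetical, chosen only for illustration. The sketch
assumes a POSIX system with fork() and wait(). Between the two calls the
parent plays the role of the shell reading from the tty: its exited children
stay zombies and keep their process table slots until something waits for them.

```c
#include <assert.h>     /* for callers that want to check the return values */
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork up to n children that exit immediately.
 * Returns how many forks succeeded. Until the parent waits, each exited
 * child remains a zombie and still occupies a process table slot. */
static int spawn_children(int n)
{
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;      /* fork failed, e.g. no free process slots */
        if (pid == 0)
            _exit(0);   /* child: exit right away */
        started++;
    }
    return started;
}

/* Hypothetical helper: wait until no children remain.
 * Returns the number of children reaped, which frees their slots. */
static int reap_children(void)
{
    int reaped = 0;
    for (;;) {
        pid_t pid = wait(NULL);
        if (pid > 0)
            reaped++;
        else if (errno != EINTR)
            break;      /* ECHILD: nothing left to wait for */
    }
    return reaped;
}
```

Calling spawn_children(20) and then reap_children() from main mirrors what
init does for the orphaned zombies after 'exec sync': once the parent finally
does a wait, the slots are released.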