dannyb@kulcs.UUCP (Danny Backx) (01/20/88)
I have a few questions concerning XENIX. We are running XENIX System V on a genuine IBM AT, equipped with two 30-Mbyte fixed disks and 2.5 Mbytes of RAM.

My first question is about that second disk, which was only recently installed. When XENIX boots, at some point the following is displayed:

| The IBM Personal Computer XENIX
| Version 2.00
| (c) Copyright IBM Corp. 1984, 1985
| (c) Copyright Microsoft Corp. 1983, 1984, 1985
|
| Reserved Memory = 2K
| Kernel Memory   = 176K
| Buffers         = 100K
| User Memory     = 2282K
| bad signature (B66D) on drive 1
|
| Type Ctrl-d to proceed with normal startup,
| (or give root password for system maintenance):_

My question is now: what does the BAD SIGNATURE message really mean, and how do I get this fixed? I must say that the second disk is not currently in use for XENIX (or anything else), so we just put one large DOS partition on it for testing. It seems to work just fine. Also, the diagnostics program on the diagnostics diskette delivered with the AT shows no errors.

My second question concerns the configuration of the XENIX kernel. We are using the XENIX system as a development system for network drivers. This means we are adding several device drivers to it (for PCnet, Ethernet, and in the near future Token Ring), and we are currently adding gateway software to the system. We also use some XENIX System III (which is XENIX 1.x) systems, on which the same software is added.

What I'd like to know more about is the kernel memory assignment. In the configuration files, a lot of parameters are set concerning things such as kernel buffers for IPC and for disk access (if I recall correctly). Does anybody know what these parameters exactly mean? We don't use the IPC facilities such as messages or shared memory. Do you know a way to get the buffers for these things out of our new kernels? Basically: is it safe to put a zero value for some of these "tunable parameters"?

Please mail your answers directly to me.
I will summarize on the net. Thanks everybody.

	Danny
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Danny Backx                     | mail: Katholieke Universiteit Leuven
 Tel: +32 16 200656 x 3537       |       Dept. Computer Science
 E-mail: dannyb@kulcs.UUCP       |       Celestijnenlaan 200 A
   ... mcvax!prlb2!kulcs!dannyb  |       B-3030 Leuven
   dannyb@kulcs.BITNET           |       Belgium
ct@tcom.stc.co.uk (Clive Thomson) (06/28/90)
Hello, I have recently started the long journey that will hopefully lead to UN*X enlightenment, and have some questions I hope somebody will answer for me.

1) The documentation for the dup call says that it will return the lowest-numbered file descriptor not used by the process. With the exception of one line in "The Design and Implementation of the 4.3BSD UNIX Operating System" (Leffler et al.) I have seen no documentation saying that open, creat and socket will do the same. Observation of open seems to suggest that the lowest fd is used, but I would like to be sure.

2) When I am doing socket programming (ULTRIX 3.0 and SunOS 4), and I do a bind, if the program terminates abnormally, I find that when I re-run the program the bind will fail with an "in use" error. Is there any way to convince the system that it is no longer "in use" (assuming of course uid, gid etc. are the same)?

3) I am a little confused by the "death of child" signal. Is the following correct? If the parent ignores this signal, the kernel will release entries for zombie processes automatically. If the parent uses the default handler, it must wait() for the death of each child, or the child will become a zombie. If the parent invokes its own handler, a wait should be invoked in this handler, otherwise the child will become a zombie. If the parent dies before the children, all children are adopted by the init process, and the programmer need no longer worry about zombie processes.

Thanks for your time.
--
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ Clive Thomson                 ...!mcvax!ukc!stc!ct                         +
+                               ct@tcom.stc.co.uk                            +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
jik@athena.mit.edu (Jonathan I. Kamens) (06/29/90)
In article <1716@jura.tcom.stc.co.uk>, ct@tcom.stc.co.uk (Clive Thomson) writes:
|> 1) The document for the dup call says that it will return the lowest numbered
|>    file descriptor not used by the process. With the exception of one line
|>    in "The design and implementation of the BSD 4.3 UNIX operating system"
|>    (Leffler et al) I have seen no documentation to say open, creat and socket
|>    will do the same. Observation of open seems to suggest that the lowest fd
|>    is used, but I would like to be sure.

All file descriptor allocation in the kernel works on the "use the lowest available fd" principle.

|> 2) When I am doing socket programming (ULTRIX 3.0 and SunOS4), and I do a
|>    bind, if the program terminates abnormally, I find that when I re-run the
|>    program the bind will fail with an "in use" error. Is there any way to
|>    convince the system that it is no longer "in use" (assuming of course
|>    uid, gid etc are the same).

I've noticed that the kernel sometimes gets confused about the state of a socket which isn't being used anymore; a program exiting abnormally is one way to cause this to happen (although it doesn't always occur). What ends up happening is that the socket stays around in CLOSE_WAIT (or TIME_WAIT) state, so that no new socket can be bound to that address. Occasionally that state eventually goes away and it's once again possible to bind to the address. However, if you don't want to wait and see if that'll happen, and you don't want to have to reboot the system in order to get the socket to go away, there is a way to force the bind to succeed. What you need to do (at least in BSD; I don't know how things like this work in SysV) is to use the setsockopt() call to set the SO_REUSEADDR option on your new socket, before you attempt to bind to the address which is busy.
Keep in mind that this option works for all socket bindings, not just the ones that are in CLOSE_WAIT, so if another program really is using the address and you bind to it again with SO_REUSEADDR set, you'll get the binding and the other program could very well lose.

|> 3) I am a little confused by the "death of child signal". Is the following
|>    correct. If the parent ignores this signal, the kernel will release
|>    entries for zombie processes automatically. If the parent uses the default
|>    handler, it must wait() for the death of each child, or the child will
|>    become a zombie. If the parent invokes its own handler, in this handler
|>    a wait should be invoked, otherwise the child will become a zombie. If
|>    the parent dies before the children, all children are adopted by the init
|>    process, and the programmer need no longer worry about zombie processes.

Unfortunately, it's impossible to generalize about how the death of child processes should behave, because the exact mechanism varies over the various flavors of Unix. Perhaps someone who's "in the know" (or at least more so than I am) about POSIX can tell us what the POSIX standard behavior (if there is any) for this is.

First of all, by default, you have to do a wait() for child processes under ALL flavors of Unix. That is, there is no flavor of Unix that I know of that will automatically flush child processes that exit if you don't do anything to tell it to do so.

Second, allegedly, under some SysV-derived systems, if you do "signal(SIGCHLD, SIG_IGN)", then child processes will be cleaned up automatically, with no further effort on your part. However, people have told me that they've never seen this actually work; the best way to find out whether it works at your site is to try it, although if you are trying to write portable code, it's a bad idea to rely on this in any case. If you can't use SIG_IGN to force automatic clean-up, then you've got to write a signal handler to do it.
It isn't easy at all to write a signal handler that does things right on all flavors of Unix, because of the following inconsistencies:

On some flavors of Unix, the SIGCHLD signal handler is called if one *or more* children have died. This means that if your signal handler only does one wait() call, it won't clean up all of the children. Fortunately, I believe that all Unix flavors for which this is the case make the wait3() call available to the programmer; its WNOHANG option lets you check whether or not there are any children waiting to be cleaned up. Therefore, on any system that has wait3(), your signal handler should call wait3() over and over again with the WNOHANG option until there are no children left to clean up.

On SysV-derived systems, SIGCHLD signals are regenerated if there are child processes still waiting to be cleaned up after you exit the SIGCHLD signal handler. Therefore, it's safe on most SysV systems to assume, when the signal handler gets called, that you only have to clean up one child, and to assume that the handler will get called again if there are more to clean up after it exits.

On older systems, signal handlers are automatically reset to SIG_DFL when the signal handler gets called. On such systems, you have to put "signal(SIGCHLD, catcher_func)" (where "catcher_func" is the name of the handler function) as the first thing in the signal handler, so that the handler gets re-installed. Unfortunately, there is a race condition which may cause you to get a SIGCHLD signal and have it ignored between the time your handler gets called and the time you reset the signal. Fortunately, newer implementations of signal() don't reset the handler to SIG_DFL when the handler function is called.

The summary of all this is that on systems that have wait3(), you should use it and your signal handler should loop, and on systems that don't, you should have one call to wait() per invocation of the signal handler.
Also, if you want to be 100% safe, the first thing your handler should do is reset the handler for SIGCHLD, even though it isn't necessary to do this on most systems nowadays.

Jonathan Kamens                       USnail:
MIT Project Athena                    11 Ashford Terrace
jik@Athena.MIT.EDU                    Allston, MA  02134
Office: 617-253-8495                  Home: 617-782-0710
cpcahil@virtech.uucp (Conor P. Cahill) (06/29/90)
In article <1716@jura.tcom.stc.co.uk> ct@tcom.stc.co.uk (Clive Thomson) writes:
>1) The document for the dup call says that it will return the lowest numbered
>   file descriptor not used by the process. With the exception of one line
>   in "The design and implementation of the BSD 4.3 UNIX operating system"
>   (Leffler et al) I have seen no documentation to say open, creat and socket
>   will do the same. Observation of open seems to suggest that the lowest fd
>   is used, but I would like to be sure.

Zillions of unix commands would be broken if open/creat did not return the lowest file descriptor available. All of the std{in|out|err} redirection code requires this behavior.

>3) I am a little confused by the "death of child signal". Is the following
>   correct. If the parent ignores this signal, the kernel will release
>   entries for zombie processes automatically. If the parent uses the default
>   handler, it must wait() for the death of each child, or the child will
>   become a zombie. If the parent invokes its own handler, in this handler
>   a wait should be invoked, otherwise the child will become a zombie. If
>   the parent dies before the children, all children are adopted by the init
>   process, and the programmer need no longer worry about zombie processes.

If you ignore the signal, a child just goes away, never becoming a zombie. If you default the signal, a child becomes a zombie until you wait for it. If you catch the signal, a child becomes a zombie until you wait for it, and you are sent a signal when the child exits (and then should wait on the child).

For most cases, a programmer does not need to worry about zombie processes unless she is writing a program that will spawn many children, or will be a long-running program that spawns children every once in a while. The key is the number of children that can reasonably be expected during a single iteration of the parent, and the life of the parent itself.
Low numbers of children (< 5 or so) and/or short-lived parents (< 1 hr or so) normally do not have to worry about zombies.

The safest thing to do is determine whether the child actually has to run asynchronously or whether the parent can just wait for it to finish. If the parent can't wait, use a signal handler; otherwise use the wait following the fork and exec, or wherever it is appropriate.

In general, zombies are not a problem for a system (your system will probably handle zillions of processes that exit over the next week or so, and 99.99999% of them will become zombies (at least momentarily)).
--
Conor P. Cahill            (703)430-9247        Virtual Technologies, Inc.,
uunet!virtech!cpcahil      46030 Manekin Plaza, Suite 160   Sterling, VA 22170
jik@pit-manager.mit.edu (Jonathan I. Kamens) (07/02/90)
One final note about SIGCHLD signal handlers.... Don Libes has informed me in E-mail that SunOS 4 is one of the operating systems under which signal(SIGCHLD, SIG_IGN) will cause dying child processes to be cleaned up automatically.

The people with whom I've discussed this in the past have implied that this feature is a "SysV-derived" feature, but since SunOS is BSD-derived (or, at least, I *thought* it was), perhaps it's no longer safe to make that generalization. I guess the only way to generalize is to say that vendors which have decided to put this feature in have done so, and those which haven't, haven't -- check your manual for more information, or write a program to test it :-).

However, it's probably still not a good idea to rely on this if you're trying to write portable code -- you should install a signal handler of your own, using wait3 if it's available, or wait if it isn't (or wait4, I guess :-).

Jonathan Kamens                       USnail:
MIT Project Athena                    11 Ashford Terrace
jik@Athena.MIT.EDU                    Allston, MA  02134
Office: 617-253-8495                  Home: 617-782-0710
libes@cme.nist.gov (Don Libes) (07/03/90)
In article <1990Jul1.213022.26393@athena.mit.edu> jik@pit-manager.mit.edu (Jonathan I. Kamens) writes:
>  One final note about SIGCHLD signal handlers.... Don Libes has
>informed me in E-mail that SunOS 4 is one of the operating systems
>under which signal(SIGCHLD, SIG_IGN) will cause dying child processes
>to be cleaned up automatically.
>
>  The people with whom I've discussed this in the past have implied
>that this feature is a "SysV-derived" feature, but since SunOS is
>BSD-derived (or, at least, I *thought* it was), perhaps it's no longer
>safe to make that generalization. I guess the only way to generalize
>is to say that vendors which have decided to put this feature in have
>done so, and those which haven't, haven't -- check your manual for
>more information, or write a program to test it :-).

Guy Harris has informed me that I was wrong about signal(SIGCHLD, SIG_IGN) reaping child processes automatically on SunOS. He is correct. I had the unfortunate luck to be calling a subroutine that someone else wrote which did the old wait-in-a-loop trick, making me believe that this long-standing behavior had been changed on my system.

Jonathan was right, originally. SunOS 4.1 works the way that BSD systems have worked all along. Sorry, Jonathan.

Don Libes
libes@cme.nist.gov
...!uunet!cme-durer!libes
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/04/90)
In article <4919@muffin.cme.nist.gov> libes@cme.nist.gov (Don Libes) writes:
>Guy Harris has informed me that I was wrong about signal(SIGCHLD,SIG_IGN)
>reaping child processes automatically on SunOS. He is correct.
>I had the unfortunate luck to be calling a subroutine that someone
>else wrote which did the old wait in a loop trick, making me believe
>that this long-standing behavior had been changed on my system.

I can't tell from the sparse information, but if you were using the System V environment it is possible you were getting an emulation of the System V SIGCLD behavior. I know it was present in my original System V emulation for 4BSD, even though I personally think it is a horrible design and that the originator of that kludge in UNIX System V should be hanged by its thumbs.
guy@auspex.auspex.com (Guy Harris) (07/05/90)
>I can't tell from the sparse information, but if you were using the
>System V environment it is possible you were getting an emulation of
>the System V SIGCLD behavior.

He wasn't; we didn't put that into the S5 environment in SunOS. The problem lay elsewhere....