jdudeck@polyslo.CalPoly.EDU (John R. Dudeck) (04/26/91)
I am porting an application (the PP X.400 system) from BSD to HP/UX, on a system running HP/UX version 6.5B. The code includes three daemons. I am having problems that seem to be related to the wait(2) call.

The original code defines a wait structure:

    struct wait w;

then later on does a fork and a wait, etc., in the normal fashion. Then there is a line such as

    if (WIFEXITED(w)) return w.w_retcode;

This does not compile under HP/UX; it fails with an "operands of cast have incompatible types" error. The sys/wait.h file defines WIFEXITED using a cast of struct to int, whereas the BSD version does not.

So I changed the above line to

    if (WIFEXITED(w.w_status)) return w.w_retcode;

which compiles OK.

This same scenario exists in 5 places in the package. Now I am at a point where different parts of the package crash with a Bus Error at certain points in the code. It doesn't crash in this code, but since this is the part I changed, I tend to suspect a problem here. I suspect that if the code returns the wrong status here, it could provoke a crash elsewhere in the system.

Furthermore, the daemon isn't cleaning up its zombies like it should.

There are also a couple of wait3() calls in the system, which I didn't change.

In the man page for wait(2), there is a line which says:
"The third parameter to wait3 is currently unused, and must always be a null pointer".
In one place this is not null in my code.

My questions are these:

1. Did I do something wrong in the changes I made?
2. Is there a difference in the way wait() works on HP/UX?
3. What happens if the third parameter to wait3 isn't a null pointer?

--
John Dudeck                        "You can only push
jdudeck@Polyslo.CalPoly.Edu         simplicity so far."
ESL: 62013975 Tel: 805-545-9549      -- AT&T promotional brochure
decot@hpcupt1.cup.hp.com (Dave Decot) (04/30/91)
See the "Notes" section on the HP-UX wait(2) man page (I hope it was in 6.5, but it is certainly there in 7.0 and 8.0). You may want to consider updating to HP-UX 7.0 and/or HP-UX 8.0.

> 2. Is there a difference in the way wait() works on HP/UX?

Yes, these are differences between POSIX and BSD. In particular, the value returned in the variable to which wait's argument points can no longer be decoded using the WIF* macros as used in BSD. Unfortunately, POSIX chose to use those macro names for a different interface for decoding the value, and HP-UX followed POSIX.

However, the BSD macros are still available by defining the _BSD symbol (using the -D_BSD option on the cc command line). I don't know if this worked in 6.5; it's been quite a while since that release was current.

> 1. Did I do something wrong in the changes I made?

Yes. Either change the code back to the BSD way and compile it with the -D_BSD option (this is best if you want the code to still port to BSD 4.3 or earlier), or convert the code to use the POSIX version of these macros as described on HP-UX's wait(2) man page (this will work with BSD 4.4 or later).

> 3. What happens if the third parameter to wait3 isn't a null pointer?

"The third parameter to wait3 is currently unused and must always be a null pointer."

If it isn't, no warranty is expressed or implied, since you have violated the requirements of the documentation. Among the possible results are a memory fault or mysterious changes to unrelated variables.

Dave
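The POSIX-style conversion mentioned above might look roughly like the following sketch. It assumes a system that provides waitpid() and the POSIX WIFEXITED/WEXITSTATUS macros; the helper name run_child_and_get_exit_code is hypothetical and does not come from the PP sources. The status is a plain int rather than the BSD wait structure, and the exit code is extracted with WEXITSTATUS instead of the w_retcode member.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the PP sources): fork a child that
 * exits with `code`, wait for it, and return its exit code decoded
 * the POSIX way. Returns -1 on failure or abnormal termination.
 */
int run_child_and_get_exit_code(int code)
{
    int status;                 /* POSIX: a plain int, not a BSD wait structure */
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        _exit(code);            /* child: exit immediately */

    if (waitpid(pid, &status, 0) != pid)
        return -1;              /* wait failed or was interrupted */

    if (WIFEXITED(status))      /* POSIX macro takes the int status */
        return WEXITSTATUS(status);

    return -1;                  /* killed by a signal, or stopped */
}
```

With the BSD form, the equivalent test would read WIFEXITED(w) and w.w_retcode on the wait structure; per the post above, that form is what the -D_BSD option is meant to keep working.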
rml@hpfcdc.HP.COM (Bob Lenk) (05/08/91)
> Then there is a line such as
>     if (WIFEXITED(w)) return w.w_retcode;
>
> This does not compile under HP/UX, with an "operands of cast have
> incompatible types" error. The sys/wait.h file defines WIFEXITED using
> a cast of struct to int, as opposed to bsd which does not.
>
> So I changed the above line to
>     if (WIFEXITED(w.w_status)) return w.w_retcode;
> which compiles ok.

This should work fine. The HP-UX macros are compatible with POSIX rather than BSD. In newer versions of HP-UX, <sys/wait.h> has a BSD-compatible version of these macros within #ifdef _BSD.

> This same scenario exists 5 places in the package. Now I am at a point
> where different parts of the package crash with a Bus Error at certain
> points in the code. It doesn't crash in this code, but since this is
> the part I changed, I tend to suspect a problem here. I suspect that
> if the code returns the wrong status here, it could provoke a crash
> elsewhere in the system.

I don't think there's any relationship.

> Furthermore, the daemon isn't cleaning up its zombies like it should.
>
> There also are a couple of wait3() calls in the system, which I didn't
> make any changes to.
>
> In the man page for wait(2), there is a line which says:
> "The third parameter to wait3 is currently unused, and must always be
> a null pointer".
> In one place this is not null in my code.

I believe wait3() will return an EINVAL error. (That's what the manual says. I don't have a 6.5 system or source handy to check, but you should be able to verify with a small program if you like.) This could easily be causing the unreaped zombies. It's possible that some code isn't detecting the error, is expecting some returned values to have useful data, and is causing the core dumps.

There is no supported way to get the functionality of the third parameter to wait3(). In order to port this, you need to check how the code uses the rusage structure. You can get the same information as in the CPU time fields (ru_utime and ru_stime) with times(2): call it before and after wait/wait3/waitpid, and the difference in the child times is the time for the newly reported child. The information in the other rusage fields is not available.

> My questions are these:
>
> 1. Did I do something wrong in the changes I made?

Only in supplying the non-NULL third parameter to wait3().

> 2. Is there a difference in the way wait() works on HP/UX?

Only in (a) not supporting the third parameter to wait3() and (b) differing from BSD in the type of the status argument, and thus in the type of the argument to the WIF*() macros.

> 3. What happens if the third parameter to wait3 isn't a null pointer?

See above (EINVAL error, I think).

Bob Lenk
rml@fc.hp.com
{uunet,hplabs}!fc.hp.com!rml

Normal disclaimer - not an official response from HP.
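The times(2) technique described above could be sketched as follows. This is only an illustration under stated assumptions: the struct child_cpu type and the wait_with_cpu_time helper are hypothetical names, and the code assumes the usual struct tms fields tms_cutime and tms_cstime, which accumulate the CPU time of children that have already been waited for.

```c
#include <sys/types.h>
#include <sys/times.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result type: what wait() reported, plus CPU ticks used. */
struct child_cpu {
    pid_t   pid;      /* pid returned by wait() */
    int     status;   /* POSIX-style int status */
    clock_t utime;    /* user CPU ticks of the reaped child */
    clock_t stime;    /* system CPU ticks of the reaped child */
};

/*
 * Hypothetical helper: wait for any child and estimate its CPU time
 * as the difference in the accumulated child times reported by
 * times(2) before and after the wait. Returns 0 on success, -1 on error.
 */
int wait_with_cpu_time(struct child_cpu *out)
{
    struct tms before, after;

    if (times(&before) == (clock_t)-1)
        return -1;

    out->pid = wait(&out->status);
    if (out->pid < 0)
        return -1;

    if (times(&after) == (clock_t)-1)
        return -1;

    /* Only the newly reaped child contributes to the difference. */
    out->utime = after.tms_cutime - before.tms_cutime;
    out->stime = after.tms_cstime - before.tms_cstime;
    return 0;
}
```

The values are in clock ticks. On systems that provide it, sysconf(_SC_CLK_TCK) gives the number of ticks per second for converting them; whether that call is available on a given HP-UX release is not something this sketch verifies. Note that the difference also includes the child's own waited-for descendants, and, as stated above, the non-CPU rusage fields cannot be recovered this way.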
carllp@diku.dk (Carl-Lykke Pedersen) (05/10/91)
rml@hpfcdc.HP.COM (Bob Lenk) writes:

>> So I changed the above line to
>>     if (WIFEXITED(w.w_status)) return w.w_retcode;
>> which compiles ok.
>
>This should work fine. The HP-UX macros are compatible with POSIX rather
>than BSD. In newer versions of HP-UX, <sys/wait.h> has a BSD compatible
>version of these macros within #ifdef _BSD.

But WTERMSIG and WSTOPSIG still seem to be defined the POSIX way (in HP-UX 7.0).

Regards
Carl-Lykke

--
Carl-Lykke Pedersen (System Administrator)
Email: carllp@diku.dk
DIKU (Dept. Comp. Sci. Univ. Copenhagen)
Fax: +45 31 39 02 21
Universitetsparken 1
DK-2100 Copenhagen, Denmark
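For reference, POSIX-style use of WTERMSIG takes the int status directly, as in the sketch below. The helper name child_term_signal is hypothetical, and the code assumes waitpid() and the POSIX WIFSIGNALED/WTERMSIG macros are available. It does not show how the HP-UX 7.0 header behaves under -D_BSD.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that sends itself `sig`, then
 * decode the status the POSIX way. Returns the terminating signal
 * number, 0 if the child exited normally, or -1 on error.
 */
int child_term_signal(int sig)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        kill(getpid(), sig);
        _exit(0);               /* reached only if the signal did not kill us */
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;

    if (WIFSIGNALED(status))    /* POSIX macros operate on the int status */
        return WTERMSIG(status);

    return 0;
}
```

Code that passes a BSD wait structure to these macros would need the same kind of change as the WIFEXITED call quoted above.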