[comp.unix.programmer] question about dbx and fork

jrb@jove.cs.pdx.edu (James Binkley) (04/12/91)

Can somebody explain why the following happened?

Take a look at the dbx session below, which includes the code.
This was done on a Sun 4.1 system; the machine is a 68020.
I set a breakpoint on main and "run" to it, then single-step
past the fork(). The child (the inferior) is supposed to spin
in an infinite loop while the parent waits. Instead, wait()
returns a status of 0x85 (the program prints it with %x),
meaning the child was killed by a SIGTRAP and a core file
was generated.
However, the breakpoint was on "main" in the parent.
I could understand this if the breakpoint had been
set in the text of the child. Apparently the child
has somehow been sent the signal anyway. Is it something
to do with virtual memory, or ???
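
For readers checking the arithmetic: the status is printed with %x, so
"85" is hexadecimal. A minimal sketch of how it decodes follows. It
assumes the traditional wait() status layout (signal number in the low
7 bits, core flag in bit 7, exit code in the next byte); the helper
names are made up for illustration, and portable code would more likely
use WIFSIGNALED() and WTERMSIG() from <sys/wait.h>.

```c
#include <signal.h>
#include <stdio.h>

/* Traditional wait() status layout (an assumption, see above):
 *   bits 0-6   terminating signal (0 means the process called exit())
 *   bit  7     set if a core file was written
 *   bits 8-15  exit code, meaningful when the signal field is 0
 * Stopped (traced) processes use a different encoding and are ignored here.
 */
int status_signal(int status) { return status & 0x7f; }
int status_core(int status)   { return (status & 0x80) != 0; }
int status_exit(int status)   { return (status >> 8) & 0xff; }

/* Print a one-line description of a raw status word. */
void print_status(int status)
{
    if (status_signal(status) == 0)
        printf("status %#x: exit code %d\n", status, status_exit(status));
    else
        printf("status %#x: killed by signal %d%s\n", status,
               status_signal(status),
               status_core(status) ? ", core dumped" : "");
}
```

Calling print_status(0x85) reports signal 5 with the core flag set;
signal 5 is SIGTRAP on the common Unix variants.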

gdb 3.5 on the same machine does not exhibit the
same behavior; i.e., wait() blocks as one would expect.
However, gdb does exhibit the same behavior as dbx on a
Sequent running Dynix.

				Jim Binkley
				jrb@jove.cs.pdx.edu

script of dbx session follows:
------------------------------------------
Script started on Thu Apr 11 20:22:48 1991
jove% dbx fork
Reading symbolic information...
Read 51 symbols
(dbx) list 1,35
    1   /*
    2    * fork.c
    3    */
    4   main()
    5   {
    6
    7           int x;
    8           int pid, status;
    9
   10           x = 1;
   11           /* parent forks child here
   12           */
   13           if ( fork() == 0) {
   14                   /* set a breakpoint here
   15                   */
   16                   x = 2;
   17                   printf("child pid is %d\n", getpid());
   18                   printf("child %d\n", x);
   19                   while (x == 2)
   20                           ;
   21                   x = x + 1;
   22                   exit(1);
   23           }
   24
   25           /* parent process waits for child to exit
   26           */
   27           pid = wait(&status);
   28
   29           /* pay attention to what is printed out here
   30           */
   31           printf("status %x\n", status);
   32           x = x + 1;
   33           printf("x %d\n", x);
   34           exit(0);
   35   }
(dbx) stop in main
(2) stop in main
(dbx) status
(2) stop in main
(dbx) run
Running: fork
stopped in main at line 10 in file "fork.c"
   10           x = 1;
(dbx) s
stopped in main at line 13 in file "fork.c"
   13           if ( fork() == 0) {
(dbx) s
stopped in main at line 27 in file "fork.c"
   27           pid = wait(&status);
(dbx) s
stopped in main at line 31 in file "fork.c"
   31           printf("status %x\n", status);
(dbx) s
status 85
(dbx) quit

torek@elf.ee.lbl.gov (Chris Torek) (04/13/91)

In article <2340@pdxgate.UUCP> jrb@jove.cs.pdx.edu (James Binkley) writes:
>This was done on a Sun 4.1 system.  The machine is a 68020. ...
>I then singlestep past the fork ... [and find that] the child died courtesy
>of a SIGTRAP plus a core file was generated.

If the machine were a SPARC you would have to put up with this.  It is
not, so you do not, but to fix it you will need sources.

fork() copies the memory image of the parent process, which means it
copies any breakpoint instructions that dbx or gdb has installed.  If
there is such a breakpoint just `after' the fork, the child process
will hit the breakpoint and die with a SIGTRAP.

>gdb 3.5 on the same machine does not exhibit the
>same behavior; i.e., wait() blocks as one would expect.

Most peculiar.

Single stepping, on the 68020, is done by setting the trace (T) bit in
the status register.  If the fork code fails to clear the T bit in the
child's copy of that register, any parent that is being stepped would
cause the child to step and thus trap.  Since gdb does not cause a trap,
we can presume that the kernel got this right.  Chances are, then, that
dbx left a breakpoint somewhere.  (Examine the PC in the child's core
dump to verify this.)

Single stepping on the SPARC is more interesting, as there is no T bit.
Debuggers must plant breakpoints at all of the possible `endpoints'
of each instruction, then run the process for that single instruction
(plus the breakpoint).  This means it is impossible to single step
across fork() on a SPARC.

In general, debugging processes that fork is tricky.  The usual approach
is to replace (temporarily) the fork with `pid = 0' or `pid = 12345' so
as to run just the parent or child code.
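
One way to make that substitution without editing the call site each
time is a small wrapper. This is a sketch only: debug_fork() and the
FORK_DEBUG environment variable are invented for illustration and are
not part of dbx, gdb, or any system library.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for fork() while debugging.
 * FORK_DEBUG=child   behave as if we were the child  (return 0)
 * FORK_DEBUG=parent  behave as if we were the parent (return 12345)
 * anything else      really fork
 */
pid_t debug_fork(void)
{
    const char *mode = getenv("FORK_DEBUG");

    if (mode != NULL && strcmp(mode, "child") == 0)
        return 0;           /* run just the child code */
    if (mode != NULL && strcmp(mode, "parent") == 0)
        return 12345;       /* run just the parent code; no child exists */
    return fork();
}
```

In "parent" mode there is no real child, so a later wait() would be
expected to fail with ECHILD instead of blocking; the parent code may
need to tolerate that while being debugged.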
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov