[comp.unix.questions] A debugging problem

makani@marque.mu.edu (Gautam Makani) (02/22/89)

I wasn't sure whether this was a UNIX or a C problem, hence the
cross-posting.
I have a piece of code to read in line entries from an object file
and filter out some information. I get an EMT (core dump) and cannot
figure out why.

The problem happens when I access a particular part of a struct, *but*
 i) I can print out the constituent data in that area. 
 ii) Using "sdb" (UNIX debugger), I can access (look at) the variable.
 iii) Further, what puzzles me is that this happens the *second* time through a
      loop. I know that seems stupid ...

 I would appreciate it if someone tells me what's going on. 
 The struct I'm looking at is
struct lineno
{
	union
	{
		long	l_symndx ;
		long	l_paddr ;
	}		l_addr ;
	unsigned short	l_lnno ;	
} ;
#define LINENO struct lineno

The problem comes when I access l_symndx as
       tmplptr->l_addr.l_symndx in any form (in an assignment, printf, etc.).
Further, on this machine (3b15), sizeof(int) = sizeof(long) (if that helps).

Code segment follows:
----------------------
    for (i = 0, lindex = 0; i <= nlentries ;i++ )
    {
       tmplptr = (LINENO *)lineptr;
       tmp = (char * ) tmplptr;
       printf("tmplptr.l_lnno = %lx\n", (long )tmplptr[0].l_lnno);
       for (ret = 0; ret < 6; ret ++)
       {
         printf("%x ", (char )*tmp);
         tmp++;
       }
       printf("\n");
       printf("symndx = %lx\n",tmplptr->l_addr.l_symndx );
       lineptr = (char *)((long) lineptr + LINESZ);
     }
Program run follows: 
--------------------
c$ tline tline
tmplptr.l_lnno = 0
0 0 0 b7 0 0
symndx = b7
tmplptr.l_lnno = 1
80 80 1	1d 0 1
EMT trap - core	dumped

Sdb session follows:
---------------------
c$ sdb tline
0x8080030b in line:88:	      printf("symndx = %lx\n",tmplptr->l_addr.l_symndx );
*b
line:88	b
*R tline
tmplptr.l_lnno = 0
0 0 0 b7 0 0
Breakpoint at
line:88:	printf("symndx = %lx\n",tmplptr->l_addr.l_symndx );
*c
symndx = b7
tmplptr.l_lnno = 1
80 80 1	1d 0 1
Breakpoint at
line:88:	printf("symndx = %lx\n",tmplptr->l_addr.l_symndx );
*tmplptr->l_addr.l_symndx/lx
0x8080011d
*s
EMT (7)	(sig 7)
0x8080030b in line:88:	      printf("symndx = %lx\n",tmplptr->l_addr.l_symndx );
*q
-------------
I realize this is utilizing precious bandwidth. Please reply by e-mail
and thanks in advance.

----------------------------------------------------------------------
Makani Gautam
Department of MSCS           email : makani@marque.mu.edu 
Marquette University               { uwvax!uwmcsd1 | uunet }!marque!makani
----------------------------------------------------------------------

crossgl@ingr.com (Gordon Cross) (02/23/89)

In article <382@marque.mu.edu> makani@marque.UUCP (Gautam Makani) writes:
>
>I have a piece of code to read in line entries from an object file
>and filter out some information. I get an EMT (core dump) and cannot
>figure out why.
>
>The problem happens when I access a particular part of a struct, *but*
> i) I can print out the constituent data in that area. 
> ii) Using "sdb" (UNIX debugger), I can access (look at) the variable.
> iii) Further, what puzzles me is that this happens the *second* time through
>      a loop. I know that seems stupid ...
>
> I would appreciate it if someone tells me what's going on. 

This is a common problem on many architectures.  In all likelihood
you are suffering from an alignment problem (notice this line from your code):

>       lineptr = (char *)((long) lineptr + LINESZ);

It would appear that you have read all of the line number entries into a
buffer.  Then you are incrementing a pointer in the (apparently) logical
way to access each line entry in turn.  Note that the macro LINESZ is defined
to be 6 (check /usr/include/linenum.h)!!  But if you try this:

printf ("Size of line entry is %d.\n", (int) sizeof (struct lineno));

I will bet that (I am not familiar with the 3b15 in particular) the answer will
be 8!!  Have you figured it out yet?? If so hit 'n'...

The object file is written out without regard to the alignment requirements of
the struct lineno.  They pack them as tightly as possible to save on file size.
The reason why you are failing on the SECOND loop pass is that the field
'l_symndx' will be correctly aligned only for every other struct lineno.  For
example (assuming 4-byte alignment on longs) if your buffer begins at 0x10000
you have:

         lineno #1 ([0]) at address 0x10000 (correctly aligned)
         lineno #2 ([1]) at address 0x10006 (incorrectly aligned)
         lineno #3 ([2]) at address 0x1000c (correctly aligned)
                               .
                               .
                               .
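
The address pattern above can be checked with a short C sketch.  It is only
an illustration: LONG_ALIGN (the assumed 4-byte alignment of a long) and the
two function names are made up for this example, and LINESZ is taken to be 6
as stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define LINESZ     6   /* packed on-disk size of one entry, per the text above */
#define LONG_ALIGN 4   /* ASSUMPTION: a long must sit on a 4-byte boundary */

/* Returns 1 if packed entry number idx, in a table starting at address
   base, begins on a LONG_ALIGN boundary, and 0 otherwise. */
int entry_is_aligned(uintptr_t base, size_t idx)
{
    assert(base % LONG_ALIGN == 0);   /* the buffer itself starts aligned */
    return (base + idx * LINESZ) % LONG_ALIGN == 0;
}

/* Prints one line per entry in the same layout as the list above. */
void print_entry_alignment(uintptr_t base, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        printf("lineno #%lu ([%lu]) at address 0x%lx (%s aligned)\n",
               (unsigned long)(i + 1), (unsigned long)i,
               (unsigned long)(base + i * LINESZ),
               entry_is_aligned(base, i) ? "correctly" : "incorrectly");
}
```

Calling print_entry_alignment(0x10000, 4) walks the first four entries.
Because 6 is not a multiple of 4, the entries alternate between aligned and
misaligned starting addresses.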

You are going to have to do one of several things to correct this problem:

1) copy each misaligned struct lineno (every other one) into a static location
   (with proper alignment of course - simply declare a struct lineno variable
   and you've got it) as you peruse your buffer.

2) read the struct lineno's from the file one at a time into an array using
   ptr++ instead of ptr = (struct lineno *)((char *)ptr + LINESZ).  You can
   do this yourself or use the 'ld' routines described in section 3X of the
   AT&T manuals.

I like solution (1) because it will usually run faster than solution (2) but
solution (2) has the advantage of simplicity...
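
A minimal sketch of the copying idea behind solution (1) follows.  It is not
the poster's code: struct lineno32 is a hypothetical stand-in for struct
lineno that uses fixed-width fields to mimic a machine where long is 4 bytes,
load_lineno is an invented helper name, and the buffer is assumed to hold
entries in the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINESZ 6   /* packed on-disk size of one entry, per the text above */

/* ASSUMPTION: stand-in for struct lineno with a 4-byte "long". */
struct lineno32 {
    union {
        int32_t l_symndx;
        int32_t l_paddr;
    } l_addr;
    uint16_t l_lnno;
};

/* Copies packed entry number idx from buf into the properly aligned
   struct *out.  memcpy has no alignment requirement on its source, so
   the entry may start at any address inside buf. */
void load_lineno(const unsigned char *buf, size_t idx, struct lineno32 *out)
{
    const unsigned char *p;

    assert(buf != NULL && out != NULL);
    p = buf + idx * LINESZ;

    /* First 4 bytes: the l_addr union.  Next 2 bytes: l_lnno. */
    memcpy(&out->l_addr.l_symndx, p, sizeof out->l_addr.l_symndx);
    memcpy(&out->l_lnno, p + 4, sizeof out->l_lnno);
}
```

Inside the loop, one would call load_lineno(buf, i, &entry) and then read
entry.l_addr.l_symndx from the aligned copy instead of dereferencing a
pointer into the packed buffer.  Copying field by field avoids relying on the
compiler's padding of the struct.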

As regards sdb??  It seemingly works because of the nature of ptrace (which
sdb uses to copy data from your process for examination).  If the symbol
table says something is a long, sdb gets the data realigned as a side effect
of copying it!!

>I realize this is utilizing precious bandwidth. Please reply by e-mail
>and thanks in advance.

Ooops!  Well I guess this is of interest to everyone...
-- 

Gordon Cross             UUCP:      uunet!ingr!crossgl     "all opinions are
111 Westminister Way     INTERNET:  crossgl@ingr.com        mine and not those
Madison, AL 35758        MA BELL:   (205) 772-7842          of my employer."