[comp.lang.c] Bus Error

Lynn.Lively@p4694.f506.n106.z1.fidonet.org (Lynn Lively) (03/10/90)

Does anyone know what a "Bus Error" is? I've got a very complicated program
running that monitors and schedules the activities of about 10 child processes.
It uses a lot of dynamic memory and an IPC message queue, and I'm also catching
some signals (SIGALRM, SIGCLD). The program dies in various spots, so I'm
obviously dealing with a dynamic condition of sorts, but I don't have a clear
idea where to look in the program for the problem. Any help would be appreciated.
 Your Servant,
      Lynn
 

henry@utzoo.uucp (Henry Spencer) (03/12/90)

In article <16139.25F89344@urchin.fidonet.org> Lynn.Lively@p4694.f506.n106.z1.fidonet.org (Lynn Lively) writes:
>Does anyone know what a "Bus Error" is? ...

You don't say what machine this is on, and that's important, because the
definition of "bus error" is system-specific.  Typical cause is accessing
a multi-byte value at an improper alignment or accessing a part of your
address space which has not been allocated.  The usual reason is trashed
pointers; common underlying problems are running off the end of an array
and trashing other variables, forgetting to initialize a pointer before
using it, or continuing to use a memory area after handing it to free().
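
A minimal sketch of habits that guard against the three underlying problems
listed above, assuming an ANSI/C99 compiler. The names store_checked,
make_ints and FREE_AND_CLEAR are hypothetical helpers invented for this
illustration; they are not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Store value at arr[i] only if i is inside the array.
   Returns 0 on success, -1 if the index would run off the end. */
int store_checked(int *arr, size_t len, size_t i, int value)
{
    assert(arr != NULL);
    if (i >= len)
        return -1;
    arr[i] = value;
    return 0;
}

/* Free a block and reset the pointer, so that a later accidental use
   goes through NULL (usually an immediate, easy-to-find crash) rather
   than through memory that has been handed out again. */
#define FREE_AND_CLEAR(p) do { free(p); (p) = NULL; } while (0)

/* Allocate len zero-initialised ints. Returns NULL on failure or when
   len is 0, so the caller never receives an uninitialised pointer. */
int *make_ints(size_t len)
{
    int *p = NULL;              /* start from NULL, not stack garbage */
    if (len > 0)
        p = calloc(len, sizeof *p);
    return p;
}
```

None of this finds an existing bug, but it tends to turn a crash in a random
spot into a failure close to the line that caused it.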
-- 
MSDOS, abbrev:  Maybe SomeDay |     Henry Spencer at U of Toronto Zoology
an Operating System.          | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

ping@cubmol.bio.columbia.edu (Shiping Zhang) (03/13/90)

In article <16139.25F89344@urchin.fidonet.org> Lynn.Lively@p4694.f506.n106.z1.fidonet.org (Lynn Lively) writes:
>
>Does anyone know what a "Bus Error" is? I've got a very complicated program

From my own experience, "Bus Error" results from array indices
going beyond their ranges. But I'm not sure if it's the only
cause of this kind of error.


-ping

zech@leadsv.UUCP (Bill Zech) (03/15/90)

In article <1990Mar13.042241.17357@cubmol.bio.columbia.edu>, ping@cubmol.bio.columbia.edu (Shiping Zhang) writes:
> In article <16139.25F89344@urchin.fidonet.org> Lynn.Lively@p4694.f506.n106.z1.fidonet.org (Lynn Lively) writes:
> >
> >Does anyone know what a "Bus Error" is? I've got a very complicated program  
> From my own experiences, "Bus Error" results from index of arrays
> reaching beyond their ranges. But I'm not sure if it's the only
> cause for this kind of errors.
> 

In the Motorola 68xxx series chips, a Bus Error is the result of some
peripheral (I/O, memory, etc.), or possibly the cpu board itself, 
asserting the BERR line to the 68xxx chip.

For C programs, the most common bus error is one asserted by the
cpu board after a timeout while waiting for DTACK (or DSACK
on 68020s) from the memory or I/O system.  What this boils down to
is that you tried to read or write nonexistent memory or I/O space,
and since nobody responded to the 68xxx's request, the cpu board
timed out and asserted BERR to the 68xxx to make it stop waiting.
BERR causes an exception and your program blows up.  The stack
at that point contains the address in question and your program's
last instruction address.  Note that because of prefetch, your
program counter could be several bytes ahead of the actual instruction
that failed.

- Bill

jensting@skinfaxe.diku.dk (Jens Tingleff) (03/16/90)

zech@leadsv.UUCP (Bill Zech) writes:

>In article <1990Mar13.042241.17357@cubmol.bio.columbia.edu>, ping@cubmol.bio.columbia.edu (Shiping Zhang) writes:
[..]

>In the Motorola 68xxx series chips, a Bus Error is the result of some
[....]

Or, of course, trying to fetch a long value from an address that's odd.

Dereferencing a char pointer to get an `int' or `long' should thus get you
in real trouble every second time.

	jens
jensting@diku.dk is
Jens Tingleff MSc EE, Research Assistent at DIKU
	Institute of Computer Science, Copenhagen University
Snail mail: DIKU Universitetsparken 1 DK2100 KBH O

dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) (03/16/90)

In article <1990Mar16.083052.20554@diku.dk> jensting@skinfaxe.diku.dk (Jens Tingleff) writes:
}zech@leadsv.UUCP (Bill Zech) writes (at least I think he wrote it, ed. Dolf):
}
}>In the Motorola 68xxx series chips, a Bus Error is the result of some
}
}Or, of course, trying to fetch a long value from an address thats odd.. .
}

No, that's an Address Error. The stack frame looks like that of a Bus Error,
but it is a different exception. Note that the MC680[234]0 allow (long) word
data accesses on odd byte boundaries. Instructions and the stack pointer must
be on word boundaries.
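
Whichever exception a given chip raises, portable C can avoid the misaligned
access altogether. A minimal sketch, assuming a C99 compiler that provides
uintptr_t and uint32_t in <stdint.h>; is_aligned and load_u32 are hypothetical
names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if the address p is a multiple of align.
   align must be a power of two. */
int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Read a 32-bit value (in host byte order) from any byte address.
   memcpy has no alignment requirement on its arguments, so this is
   safe where *(uint32_t *)p might trap on a strict machine. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

The design choice is to copy the bytes into a properly aligned object instead
of casting the char pointer and dereferencing it.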
-- 
Dolf Grunbauer          Tel: +31 55 433233  Internet dolf@idca.tds.philips.nl
Philips Telecommunication and Data Systems  UUCP ....!mcsun!philapd!dolf
Dept. SSP, P.O. Box 245, 7300 AE Apeldoorn, The Netherlands         n   n   n
It's a pity my .signature is too small to show you my solution of  a + b = c

zech@leadsv.UUCP (Bill Zech) (03/20/90)

In article <690@ssp11.idca.tds.philips.nl>, dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) writes:
> In article <1990Mar16.083052.20554@diku.dk> jensting@skinfaxe.diku.dk (Jens Tingleff) writes:
> }zech@leadsv.UUCP (Bill Zech) writes (at least I think he wrote it, ed. Dolf)
		    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	
> }
> }>In the Motorola 68xxx series chips, a Bus Error is the result of some
> }
> }Or, of course, trying to fetch a long value from an address thats odd.. .
> }
> 

Actually, no, I didn't write that....

I wrote the long response that described the bus error from a 
hardware perspective.  

And just in case people are interested, a Spurious Interrupt is
a bus error during an interrupt acknowledge cycle.

No, I guess you weren't interested.  Oh well.

- Bill Zech

gday@digigw.lab.digital.co.jp (Gordon Day) (03/20/90)

> In article <16139.25F89344@urchin.fidonet.org> Lynn.Lively@p4694.f506.n106.z1.fidonet.org (Lynn Lively) writes:
>
>Does anyone know what a "Bus Error" is? I've got a very complicated program  

Look, you people have been going on about the hardware architecture of the
68000 for some time now, but is that really the point?  IMHO Ms. Lively, while
perhaps fascinated by the intricacies of memory management and fetch/decode
cycles, really wanted a bit of help on where to look for the problem in her
code =:) (If I'm wrong on this, I DO apologise!)

I expect that you are either using C or doing pointer manipulation in some
other language.  What any system trap like "Segmentation Fault", "Bus Error",
etc, boils down to is that your program has clobbered some part of memory that
it had no right to.  In C, the way this happens is you are forgetting to 
allocate space, or assuming it is allocated when it's not, then stuffing data
into said space.  The "solution":

- use lint.

- if you define a pointer somewhere, check very carefully that space has been
  allocated.  In the C world VERY FEW library routines allocate space for the
  pointers they are passed.

- remember: a pointer fault can show up in mysterious ways if it is overwriting
  your stack space (arguments get changed on their way to functions, Bus Errors
  occur, etc).

- practice.
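
The point about allocating space before a library routine writes through a
pointer can be sketched as follows. This is only an illustration, assuming a
hosted ANSI C library; copy_string is a hypothetical helper, not something
from the original program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of s, or NULL if malloc fails.
   strcpy allocates nothing: its destination must already point at
   strlen(s) + 1 writable bytes, so we allocate them first. */
char *copy_string(const char *s)
{
    char *dst;

    assert(s != NULL);
    dst = malloc(strlen(s) + 1);    /* +1 for the terminating '\0' */
    if (dst == NULL)
        return NULL;
    strcpy(dst, s);
    return dst;                     /* caller must free() the result */
}
```

The broken variant declares `char *dst;` and calls strcpy(dst, s) straight
away, which writes through whatever garbage happens to be in dst.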

I know the above comment is bloody simplistic, but I felt that was what was
wanted.  Sorry again if I'm wrong.

=:! gday@digital.co.jp%uunet.uu.net

jensting@skinfaxe.diku.dk (Jens Tingleff) (03/20/90)

dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) writes:

>In article <1990Mar16.083052.20554@diku.dk> jensting@skinfaxe.diku.dk (Jens Tingleff) writes:
>}zech@leadsv.UUCP (Bill Zech) writes (at least I think he wrote it, ed. Dolf):
>}
>}>In the Motorola 68xxx series chips, a Bus Error is the result of some
>}
>}Or, of course, trying to fetch a long value from an address thats odd.. .
>}

>No, that's an Address Error.
[....]

Oooooops, quite so. 

Thanks to the nice people who managed to show me the error of my ways
without flaming me. What a nice change in comp.lang.c (1/2 ;^) ).

My only excuse is the pressure of age (I turned 24 recently..).

	Jens


ok@goanna.oz.au (Richard O'keefe) (03/22/90)

In article <16139.25F89344@urchin.fidonet.org> Lynn.Lively@p4694.f506.n106.z1.fidonet.org (Lynn Lively) writes:
>Does anyone know what a "Bus Error" is? 
In article <307@digigw.lab.digital.co.jp>,
gday@digigw.lab.digital.co.jp (Gordon Day) writes:
> IMHO Ms. Lively, ...
> really wanted a bit of help on where to look for the problem in her code.

Just so.  Unfortunately, what Gordon Day goes on to describe are the likely
causes of a Segmentation Violation.  To a *very* rough approximation:

	Segmentation Violation =	pointer into non-existent memory
					(dereferencing NULL is a common cause)

	Bus Error =			malformed or misaligned pointer
					(e.g. short *p = (short *)1; ... *p ...
					on a PDP-11)

Watch out for integers cast to pointers, for unions that overlay pointers
and integers, for missing arguments in function calls (so that a numeric
argument has been mistaken for an accidentally omitted pointer).  Use lint.

Basically, the best way of looking for the cause of either signal is to use
sdb or dbx or dbxtool or whatever debugger you have and see where the
program stopped.  It should be quite clear which pointer variable was being
dereferenced.  Then try to figure out how it got that value.  With a Bus
Error the value can *never* have been a valid pointer.