bogatko@lzga.ATT.COM (George Bogatko) (02/09/90)
Help. We have a program that we do not have source for that is dumping core with Bus Error. Does anybody have, or can point me to, a list of what causes the major core dumps, I.E. Bus, EMT, Memory Fault, etc. Specifically, what kind of C errors will cause a Bus error as opposed to an EMT trap, or a Memory Fault. I don't read this group often, so email will be faster. Please, no flames. Yours, in anticipation George Bogatko.
aryeh@eddie.mit.edu (Aryeh M. Weiss) (02/11/90)
In article <1810@lzga.ATT.COM> bogatko@lzga.ATT.COM (George Bogatko) writes:
>Help. We have a program that we do not have source for that is dumping
>core with Bus Error. Does anybody have, or can point me to, a list of
>what causes the major core dumps, I.E. Bus, EMT, Memory Fault, etc.

A program dumps core when it receives a signal that is not currently being caught or ignored and whose default action is a core dump. (The signals that cause core dumps are SIGQUIT, SIGILL, SIGTRAP, SIGIOT, SIGEMT, SIGFPE, SIGBUS, SIGSEGV, and SIGSYS.) Any of these signals can also be sent to a process via kill(2S).

SIGQUIT is usually caused by the quit key (^\). SIGILL is caused by execution of an illegal instruction (this may be indicative of a trashed stack causing a procedure to return to a random location in the code).

SIGTRAP, SIGIOT, and SIGEMT are caused by executing special processor machine instructions. The names are throwbacks to the PDP-11 days and come from instructions in the PDP-11 instruction set. These are obviously machine dependent, but seem to have equivalents on various popular hardware platforms (Vaxes, 68000, 80x86). Trap instructions are used by debuggers to set breakpoints in the code of a process being traced, but I don't know how they interact with SIGTRAP when being used for this purpose.

SIGFPE is caused by floating point errors, such as divide by 0, overflow, and (on Intel x86/x87 systems) FPU stack overflows (Xenix 386 users may be familiar with this last one).

Now the tricky ones: SIGSEGV is caused when a process addresses a location outside of its (code or data) address space. This is typically caused by overrunning an array, incrementing (and dereferencing) a pointer beyond the end of process memory, and, most familiar to all programmers of non-Vax Unix systems, dereferencing the dreaded NULL pointer.

SIGBUS errors are quite machine dependent, but in my experience can be caused by two circumstances: (1) reference to an impossible machine address (this would occur on 68000 systems if you went beyond address 2^24 and may occur on 386/286 systems if you load a segment register with an absurd segment number) and (2) reference to an odd address with a word oriented instruction (this is a no-no on Vaxes and 68000's, but 80x86 systems don't mind).

SIGSYS is for bad arguments to a system call, but this has never happened to me and I do not know how bad the argument has to be. Illegal addresses passed to system calls generally get returned to the calling process with an error code, so I don't know exactly how to get one of those (this may be another throwback to the olden days of yore).

>Please, no flames.

This question certainly comes under the heading of things your mother (and the manuals) never told you.
--
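For a program without source, the first step is usually just to confirm which signal killed it. The following is a minimal sketch of how a parent process can see that, assuming a POSIX system with fork(2) and waitpid(2). The helper name `death_signal` is made up for illustration, and the child sends itself the signal with kill(2) rather than committing a real fault, which looks the same from the parent's side.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): fork a child, have it send
 * itself `sig` with the default action in place, and return the number of
 * the signal that terminated it, or -1 if it did not die from a signal. */
int death_signal(int sig)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed */
    if (pid == 0) {
        signal(sig, SIG_DFL);      /* neither caught nor ignored */
        kill(getpid(), sig);       /* stand-in for a real fault */
        _exit(0);                  /* reached only if the signal did not kill us */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling `death_signal(SIGBUS)` should hand back SIGBUS, and likewise for the other signals in the list. Whether a core file actually appears also depends on local settings such as the core size limit, which this sketch does not examine.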
hue@netcom.UUCP (Jonathan Hue) (02/11/90)
In article <1990Feb10.192028.16025@eddie.mit.edu> aryeh@eddie.MIT.EDU (Aryeh M. Weiss) writes:
>Unix systems, dereferencing the dreaded NULL pointer. SIGBUS errors are
>quite machine dependent, but in my experience can be caused by two
>circumstances

Another thing that can cause bus errors is accessing an address that is valid within your process, but where the thing at that location doesn't respond to a read or write. An example of this would be a frame buffer that you mapped into your process's address space, but that was flaky and for some reason didn't generate DTACKs.

-Jonathan
meissner@osf.org (Michael Meissner) (02/13/90)
In article <1990Feb10.192028.16025@eddie.mit.edu> aryeh@eddie.mit.edu (Aryeh M. Weiss) writes:
...
| SIGSYS is for bad arguments to a system call, but this has never happened
| to me and I do not know how bad the argument has to be. Illegal addresses
| passed to system calls generally get returned to the calling process with
| an error code, so I don't know how exactly to get one of those (this may be
| another throwback to the olden days of yore).

When I was at Data General, we once grep'ed the current version of System V that we had at the time (probably V.1), and the only place that ever generated SIGSYS was if you passed something other than 0, 1, or 2 as the whence argument to lseek. Given that Version 6 PDP-11 UNIX only had a 'seek' call, which accepted 3, 4, or 5 in addition to lseek's values (to multiply the offset by 512), it may be that SIGSYS was a portability guide that has long since been unneeded.

--
Michael Meissner email: meissner@osf.org phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA
Catproof is an oxymoron, Childproof is nearly so
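For comparison, POSIX specifies that lseek() with an invalid whence fails with EINVAL rather than raising a signal. The sketch below checks what a given system does. It assumes /dev/null exists and can be opened, and the helper name `bad_whence_errno` and the value 99 are arbitrary choices for illustration. On a system that behaves the way the old System V source did, the process would presumably die of SIGSYS instead of returning.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: call lseek() with a nonsense whence value and return
 * the resulting errno, 0 if the call unexpectedly succeeded, or -1 if the
 * scratch file could not be opened. */
int bad_whence_errno(void)
{
    int fd = open("/dev/null", O_RDONLY);
    int err;

    if (fd < 0)
        return -1;
    errno = 0;
    if (lseek(fd, (off_t)0, 99) == (off_t)-1)   /* 99 is not SEEK_SET/CUR/END */
        err = errno;
    else
        err = 0;
    close(fd);
    return err;
}
```

A return value equal to EINVAL means the kernel reported the bad argument through the ordinary error path.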
dyer@spdcc.COM (Steve Dyer) (02/13/90)
In article <MEISSNER.90Feb12175641@curley.osf.org> meissner@osf.org (Michael Meissner) writes:
>When I was at Data General, we once grep'ed the current version of
>System V that we had at the time (probably V.1), and the only place
>that ever generated SIGSYS was if you passed something other than 0,
>1, or 2 as the whence argument to lseek. Given that the Version 6
>PDP-11 UNIX only had a 'seek' call which took 3, 4, or 5 in addition
>to lseek's value, to multiply the offset by 512, it may be SIGSYS was
>a portability guide that has long since been unneeded.

Another instance in which SIGSYS was raised was the INDIR system call on PDP-11s. The read and write system calls had an inline calling sequence like this:

    mov   fd, r0        / fildes in R0
    sys   READ          / sys is the PDP-11 trap instruction, READ the syscall index
    .word bufaddr
    .word count
    / next instruction...

and similarly for write. You can see that this doesn't lend itself easily to C language calls like read(fd, bufaddr, count), especially for pure text programs. INDIR was used to implement the system call libraries, accommodating "pure text" programs, which could not modify inline system call arguments.

    .text
    read:
        ...                 / get fd from stack, place in R0
                            / move bufaddr and count to dataarea[1] and dataarea[2]
        sys   INDIR         / INDIR == 0
        .word dataarea
        / next instruction

    .data
    dataarea:
        sys   READ
        .word 0
        .word 0

If dataarea[0] wasn't a trap instruction, you'd get a SIGSYS.

--
Steve Dyer
dyer@ursa-major.spdcc.com aka {ima,harvard,rayssd,linus,m2c}!spdcc!dyer
dyer@arktouros.mit.edu, dyer@hstbme.mit.edu
bogatko@lzga.ATT.COM (George Bogatko) (02/14/90)
Now, SIGSYS I know about. I got it when I tried to execute code that used message queues on a 3B400 on which the IPC package had not been loaded. Horrible death. GB
tim@ohday.sybase.com (Tim Wood) (02/15/90)
In article <1990Feb10.192028.16025@eddie.mit.edu> aryeh@eddie.MIT.EDU (Aryeh M. Weiss) writes:
>SIGBUS errors are
>quite machine dependent, but in my experience can be caused by ...
>[referencing] an odd address with a word oriented instruction
>(this is a no-no on Vaxes and 68000's, but 80x86 systems don't mind).
                     ^^^^^

The VAX does not have alignment restrictions; that is, one may read or write a multi-byte operand at any byte-boundary address. Doing this incurs some performance penalty on the VAX, and it makes your program less portable. The trend these days, especially with RISC, seems to be toward alignment restrictions.

Nice explanation of coredump signals, BTW.

-TW
---
Sybase, Inc. / 6475 Christie Ave. / Emeryville, CA / 94608 415-596-3500
tim@sybase.com {pacbell,pyramid,sun,{uunet,ucbvax}!mtxinu}!sybase!tim
This message is solely my personal opinion. It is not a representation of Sybase, Inc. OK.
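On the portability point, one common way to keep C code off the alignment trap is to copy bytes instead of casting a pointer. This is a minimal sketch assuming a hosted C99 compiler; `load_u32` and `store_u32` are names invented here, not anything from the thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: read or write a 32-bit value at any byte address.
 * memcpy() makes no alignment assumption, so these work even on machines
 * that fault on misaligned word accesses. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

void store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

The risky form is something like `*(uint32_t *)(buf + 1)`, which may work on a machine without alignment restrictions and draw a bus error on a stricter one. The helpers use the host's byte order, so they address alignment only, not endianness.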
chris@mimsy.umd.edu (Chris Torek) (02/18/90)
In article <1990Feb10.192028.16025@eddie.mit.edu> aryeh@eddie.mit.edu (Aryeh M. Weiss) writes:
[lots of good stuff]
>SIGBUS errors are quite machine dependent ... [but include, e.g.,]
>reference an odd address with a word oriented instruction ... on Vaxes
>and 68000's ....

As someone else already mentioned, VAXen do not care about address alignment except for speed (aligned operands are somewhat faster). 68000 and 68010 CPUs do; 68020 and 68030 CPUs do not; many RISC chips do.

On the VAX, a SIGBUS (bus error) is caused by exactly one condition: an address in the range 0x80000000..0xffffffff. (Half of these are legal kernel space addresses; from the kernel, the fault occurs for addresses in 0xc0000000..0xffffffff. Bus timeouts appear as machine checks rather than faults.)

--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
aryeh@eddie.mit.edu (Aryeh M. Weiss) (02/19/90)
In article <22598@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>In article <1990Feb10.192028.16025@eddie.mit.edu> aryeh@eddie.mit.edu
>(Aryeh M. Weiss) writes:
>>reference an odd address with a word oriented instruction ... on Vaxes
>>and 68000's ....
>
>As someone else already mentioned, VAXen do not care about address
>alignment except for speed (aligned operands are somewhat faster).

Sorry, my mistake. I was recalling some of this from old experience. I have always seen word-aligned Vax code, so I just assumed ...
--
jmm@eci386.uucp (John Macdonald) (02/20/90)
Wow, a chance to pick nits on Chris Torek...

In article <22598@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
| In article <1990Feb10.192028.16025@eddie.mit.edu> aryeh@eddie.mit.edu
| (Aryeh M. Weiss) writes:
| [lots of good stuff]
| >SIGBUS errors are quite machine dependent ... [but include, e.g.,]
| >reference an odd address with a word oriented instruction ... on Vaxes
| >and 68000's ....
|
| As someone else already mentioned, VAXen do not care about address
| alignment except for speed (aligned operands are somewhat faster).
| 68000 and 68010 CPUs do; 68020 and 68030 CPUs do not; [ ... ]
                           ^^^^^

On the 68020, only data references can be unaligned (and slow); code words must have even alignment or the fetch fails. I would guess that the 68030 is the same, but I've never checked.

--
Algol 60 was an improvement on most  | John Macdonald
of its successors - C.A.R. Hoare     | jmm@eci386