[comp.unix.internals] How do you make your UNIX crash ???

berny@tndsyd.oz.au (Berny Goodheart) (03/10/91)

I am interested in finding out known ways to make your version of UNIX
crash. If you know of a particular application, program bug or programming
algorithm that is guaranteed to crash your UNIX operating system,
I would like to hear about it. Tell me as much as possible, e.g., what
the machine is, what version of the OS it runs, how the problem manifests
itself, etc.  I am particularly interested in generic bugs.

.===========================================================================.
|   ACSnet: berny@tndsyd.oz       UUCP: uunet!munnari.oz!tndsyd.oz.au!berny |
| INTERNET: berny@tndsyd.oz.au  DOMAIN: goodheart_berny@tandem.com          |
|   PSMAIL: smtpgate @comm(berny@tndsyd.oz@munnari.oz.au)                   |
TANDEM Computers Incorporated 76 Berry St, North Sydney, NSW, 2060, Australia

gill@boris.mscs.mu.edu (VAXDEATH (Ender) Gill) (03/11/91)

In article <690@tndsyd.oz.au> berny@tndsyd.oz.au (Berny Goodheart) writes:
>
>I am interested in finding out known ways to make your version of UNIX
>crash. If you know of a particular application, program bug or programming


main()
{
    fork ();
    main ();
}

This brought the system crashing to a halt, before they set up a per-user
process quota.  The system was SysV 3.0 on a 3B5.
    Also, running two copies of Franz Lisp simultaneously would cause
the system to crash.  This bug has since been fixed.
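
The per-user process quota mentioned above has a rough modern counterpart
in the RLIMIT_NPROC resource limit. What follows is only a minimal sketch,
assuming a system that provides RLIMIT_NPROC (Linux and the BSDs do; it is
not part of POSIX). The helper name cap_process_count is made up for
illustration and is not from the original post.

```c
#include <sys/types.h>
#include <sys/resource.h>

/* Hypothetical helper: lower the soft limit on the number of processes
 * this user may have to at most `max_procs`. It never raises the limit.
 * Returns 0 on success, -1 on failure (errno set by the failing call). */
int cap_process_count(rlim_t max_procs)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NPROC, &rl) == -1)
        return -1;

    /* Only ever tighten: keep the current soft limit if it is lower. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > max_procs)
        rl.rlim_cur = max_procs;

    /* The soft limit may not exceed the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && rl.rlim_cur > rl.rlim_max)
        rl.rlim_cur = rl.rlim_max;

    return setrlimit(RLIMIT_NPROC, &rl);
}
```

With such a limit in place before the program starts, fork() begins to
fail once the limit is reached instead of the process table filling up.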


cheers
-dicky gill (Violator)
-gill@boris.mscs.mu.edu

mike (03/11/91)

In an article, tndsyd.oz.au!berny (Berny Goodheart) writes:
>I am interested in finding out known ways to make your version of UNIX
>crash.

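#include <sys/types.h>
#include <fcntl.h>	/* open(), O_RDWR */
#include <time.h>	/* time() */
#include <unistd.h>	/* write(), close() */

/* Overwrite kernel memory with the current time until a write fails. */
int main(void)
{
int fd;
time_t now;

	time(&now);
	/* Needs write permission on /dev/kmem, i.e. normally root. */
	if ( (fd = open("/dev/kmem",O_RDWR)) == -1 )
		return(1);
	while ( write(fd,&now,sizeof(now)) == sizeof(now) )
		;
	close(fd);
	return(0);
}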

-- 
Michael Stefanik, MGI Inc., Los Angeles| Opinions stated are not even my own.
Title of the week: Systems Engineer    | UUCP: ...!uunet!bria!mike
-------------------------------------------------------------------------------
Remember folks: If you can't flame MS-DOS, then what _can_ you flame?

sow@cad.luth.se (Sven-Ove Westberg) (03/12/91)

In article <690@tndsyd.oz.au> berny@tndsyd.oz.au (Berny Goodheart) writes:
|

My favorite on SunOS 4.0x is df /dev/*b   (the swap partition)
It does not crash, but the computer hangs forever.


Sven-Ove Westberg, CAD, University of Lulea, S-951 87 Lulea, Sweden.

scs@iti.org (Steve Simmons) (03/13/91)

In an article, tndsyd.oz.au!berny (Berny Goodheart) writes:
>I am interested in finding out known ways to make your version of UNIX
>crash.

In article <513@bria> Michael Stefanik replies:

>main()
>{
>int fd;
>long now;

>	time(&now);
>	if ( (fd = open("/dev/kmem",O_RDWR)) == -1 )
>		return(1);
>	while ( write(fd,&now,sizeof(long)) == sizeof(long) )
>		;
>	close(fd);
>	return(0);
>}

Too verbose.  On a Sun, try 'b etc/dump' at the '>' prompt.

Do I need a smiley?
-- 
"Our informal mission is to improve the love life of operators worldwide."
  Peter Behrendt, president of Exabyte.  Quoted in Digital Review, Feb 4, 1991.

tar@math.ksu.edu (Tim Ramsey) (03/13/91)

[ diverging a bit, so I added alt.folklore.computers to the Newsgroups: and
  Followup-To: ]

In article <513@bria>:

>In an article, tndsyd.oz.au!berny (Berny Goodheart) writes:
>>I am interested in finding out known ways to make your version of UNIX
>>crash.

>	time(&now);
>	if ( (fd = open("/dev/kmem",O_RDWR)) == -1 )
>		return(1);
>	while ( write(fd,&now,sizeof(long)) == sizeof(long) )
>		;

A couple of years ago in an extreme fit of boredom, on an unused AT&T 3B2/400
running SysV 3.1, I did the following:

  hack# yes > /dev/kmem

(where yes just writes a stream of "y\n" to stdout)

Nothing happened.  I let it run for a couple of minutes with no apparent
effect.  Perhaps the writes weren't moving the seek pointer in kmem, so it
wasn't writing over anything interesting.  I wasn't energetic enough to
find out.  All in all, it didn't help my boredom. :)
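
Whether writes are moving the seek pointer can be checked directly with
lseek(fd, 0, SEEK_CUR), which reports the current offset without changing
it. A small sketch, assuming a POSIX system; it is shown against an
ordinary file rather than /dev/kmem, and the helper name
offset_after_write is invented here. It says nothing about how any
particular kmem driver behaves.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes from `buf` to `path` (created or
 * truncated) and return the file offset afterwards, or -1 on any error. */
off_t offset_after_write(const char *path, const void *buf, size_t len)
{
    off_t pos;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd == -1)
        return -1;

    if (write(fd, buf, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* Seeking 0 bytes from the current position just reports the offset. */
    pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return pos;
}
```

On a regular file the returned offset equals the number of bytes written;
an offset that stays at 0 on a device would suggest the writes are not
advancing.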

--
Tim Ramsey (tar@math.ksu.edu)  (913) 532-6750 (voice)  (913) 532-7004 (FAX)
Department of Mathematics, Kansas State University, Manhattan KS 66506-2602

daveh@marob.uucp (Dave Hammond) (03/15/91)

In article <690@tndsyd.oz.au> berny@tndsyd.oz.au (Berny Goodheart) writes:
>
>I am interested in finding out known ways to make your version of UNIX
>crash.

System:  Altos System V
uname -a: unix ces 5.3.1 d 386/2000 (empty) (empty) (empty) 10

$ stty line 1	# the only installed line discipline is 0.

--
Dave Hammond
daveh@marob.uucp
uunet!rutgers!phri!marob!daveh

lance@motcsd.csd.mot.com (lance.norskog) (03/22/91)

A while back, someone found that executing random data made quite
a few RISC chips seize up.

torek@elf.ee.lbl.gov (Chris Torek) (03/22/91)

In article <3442@engadm3.csd.mot.com> lance@motcsd.csd.mot.com
(lance.norskog) writes:
>A while back, someone found that executing random data made quite
>a few RISC chips seize up.

False.

In fact, the program caused a number of *operating systems* to crash.
As I recall, the only chip bug it exposed (one that was already known
anyway) was in a ``CISC'' (namely, one of the older 80386 implementations).

The ability to crash any particular operating system is not a useful
indicator of the reliability of any particular microprocessor.

That said, one should note that the more complex something is, the
easier it is to make a mistake when implementing it.  Bugs are often
found in complex-addressing-mode CPUs (until fixed in a later revision)
when:

  - an access-probe instruction crosses a page boundary
	(example: VAX 11/780)

  - a read-modify-write instruction crosses a page boundary
	(example: NS32016)

  - an exception occurs while recovering from an exception, e.g.,
	a parity error during a register state retract during a
	page fault.
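
The first two cases both hinge on a single access straddling two pages.
A rough sketch of that test, assuming the caller supplies the page size
(512 bytes on a VAX, for instance); the function name crosses_page is
made up for illustration, and address wraparound is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the `len`-byte access starting at
 * `addr` touches more than one page of `page_size` bytes, else 0.
 * Zero-length accesses and a zero page size return 0. */
int crosses_page(uintptr_t addr, size_t len, size_t page_size)
{
    if (len == 0 || page_size == 0)
        return 0;

    /* Compare the page number of the first byte with that of the last. */
    return (addr / page_size) != ((addr + (len - 1)) / page_size);
}
```

A 4-byte access at address 510 with 512-byte pages covers bytes 510
through 513 and so spans pages 0 and 1; the same access at 508 stays
within page 0.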

In general, ``RISC'' chips avoid all these bugs by outlawing the
situations under which they occur: their instructions never cross page
boundaries, they do not provide r/m/w instructions, and so forth.
There is, of course, a wide spectrum of actual implementations, with
varying degrees of ``RISCness'', and simply not providing an operation
does not obviate any existing need for something equivalent.  In other
words, this sometimes merely moves complexity from point A (the
hardware) to point B (the software).

Finding the correct balance of simplicity and functionality is what
machine architecture is all about.  If you think you can do better than
``RISC'' (whatever that is, besides a marketing word---like obscenity,
we all know what RISC is when we see it, but no one can define it), by
all means, gather some venture capital and start a new company.

(There, have I managed to avoid pushing anyone's buttons? :-) )
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

guy@auspex.auspex.com (Guy Harris) (03/26/91)

>A while back, someone found that executing random data made quite
>a few RISC chips seize up.

No, a while back, someone found that executing random data made some
operating systems running on machines with RISC chips crash, and
immediately charged ahead and, on the basis of that small bit of partial
data and a lot of assumptions about RISC chip design methodology,
speculated that this was caused by RISC-chip designers not being as
rigorous as CISC-chip designers in testing their chips.

Subsequent to this:

	1) the same program was found to crash the OS on at least one
	   machine running a *CISC* chip (an 80386 machine, I think);

	2) at least in one case, the problem was a bug in the *OS*'s
	   code to deal with, as I remember, illegal instructions in the
	   delay slots of illegal branches, or something like that (this
	   was in the MIPS version).

At some point, I may dive in and see what caused SunOS to barf; I
suspect it's a bug in the floating-point simulation code (which may get
invoked even on SPARCs with an FPU, as the FPU may not implement every
single floating-point instruction in the architecture).

If one wishes to consider the code that broke on the RISC machines to be
low-enough-level support code that it "should" be considered as much a
part of the architecture's implementation as would the chip itself, you
could, I guess, flame RISC - or, at least, the folks doing the software
part of the implementation.

Of course, given that, you can probably find plenty of microcode bugs to
damn CISC as well, if your goal is to bash some particular architecture
style (the posting in which the person revealed the results of his test
had a bit of a RISC-bashing flavor to it).

(Followups directed to "alt.religion.computers", if you really feel you
*MUST* make your 2 small monetary units worth known on the RISC vs. CISC
topic.)

mouse@thunder.mcrcim.mcgill.edu (der Mouse) (03/26/91)

In article <11313@dog.ee.lbl.gov>, torek@elf.ee.lbl.gov (Chris Torek) writes:
> [...], one should note that the more complex something is, the easier
> it is to make a mistake when implementing it.  Bugs are often found
> in complex-addressing-mode CPUs (until fixed in a later revision)
> when:

> [...]

We came across a most mysterious bug once: our MicroVAX suddenly
started coredumping in test(1), but only for this one user.  After much
disbelief and head-scratching, we eventually determined that the
coredump was due to something going wrong when the pushes done while
taking the trap for one of the emulated instructions crossed a page
boundary.  I think it might have been only when it took a
pagefault on the new page.  The user in question had just the right
size environment to tickle this.  I'm still not sure exactly *what* was
going wrong when this happened....

About then, we realized that the coredumps had started just about the
time the service guys had done some sort of maintenance operation that
involved replacing the CPU...we called them back and said "we can't
really describe how it's broken; we're not entirely certain we can
state it to ourselves.  But it's definitely broken; please replace it".
They did.  (Our sysadmin at the time was good at persuasion.)

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu