[comp.sys.sun] How to crash a Sun4/280 Server...again and again and again.

gold@beareq.UUCP (Dan Gold) (12/21/89)

The following scenario frequently results in the unwanted rebooting of a
Sun4/280 server:

(1) I rlogin to the server (perhaps not incidentally my boot server) from
a SparcStation1 to run a program which has substantial memory and i/o
requirements.  The program is written in C, compiled under SunOS 4.0.3,
and uses approximately 20-30 Megabytes of memory at its peak, malloc()'ed
in 1, 2, and 8 Megabyte blocks.  I/O is via NFS, reading from a disk
served by a Sun3/280.  XDR is NOT used; however, all structures read from
the Sun3 consist SOLELY of 4-byte members.  The Sun3 runs SunOS 3.5; the
others run 4.0.3.  The Sun4/280 has approximately 128 MB of physical
memory.
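
For reference, the allocation and read pattern described in (1) looks
roughly like the sketch below.  This is not the actual program: the names
alloc_megabytes() and read_words() are hypothetical, and the sketch assumes
a compiler that provides <stdint.h>.  It only shows checking malloc() for
failure and telling a short read caused by an I/O error apart from a plain
end of file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MEGABYTE (1024UL * 1024UL)

/* Allocate a block of `mb` megabytes.  Returns NULL on failure,
 * so the caller must check the result before using it. */
void *alloc_megabytes(size_t mb)
{
    return malloc(mb * MEGABYTE);
}

/* Read up to `count` 4-byte words from `fp` into `dst`.
 * Returns the number of words actually read.  *io_error is set to 1
 * if the read came up short because of an I/O error, and to 0 if it
 * was complete or merely hit end of file. */
size_t read_words(FILE *fp, int32_t *dst, size_t count, int *io_error)
{
    size_t got;

    assert(fp != NULL && dst != NULL && io_error != NULL);
    got = fread(dst, sizeof *dst, count, fp);
    *io_error = (got < count && ferror(fp)) ? 1 : 0;
    return got;
}
```

A caller would test the pointer from alloc_megabytes() against NULL and
stop on any read_words() call that sets *io_error, rather than carry on
with a partly filled buffer.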

(2) The killer program behaves as follows:

Sometimes the program will execute properly.  Other times it will hang the
SparcStation.  This is occasionally accompanied by NFS read errors (NFS
Server (the Sun3/280) not responding).  When control returns to the
SparcStation, it is because the Sun4/280 has rebooted automatically.  Then
the message "Read Error From Network.  Connection reset by peer." appears,
and lo, I am rlogged out.

Often the program will appear to execute properly, only for the Sun4/280
to reboot later with the same message, "Read Error From Network.
Connection reset by peer."

1.  With structures consisting solely of 4-byte members, is it necessary
to use XDR when going from a Sun3/280 to a SparcStation?  A Sun3/280 to
Sun4/280?  A Sun4/280 to a SparcStation?  I had thought the answer to all
of these questions was "No!"
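
One way to test the assumption behind question 1 is to check it directly
on each machine.  The sketch below is illustrative only: struct record and
the function names are hypothetical, not taken from the program in
question, and it assumes a compiler with <stdint.h> and a 4-byte float.
It checks that a structure made solely of 4-byte members has no padding,
reports the host byte order, and includes a byte-order-independent decode
for comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout: every member is 4 bytes wide. */
struct record {
    int32_t id;
    int32_t count;
    float   value;   /* assumed to be a 4-byte float */
};

/* Returns 1 if this host stores the most significant byte first. */
int host_is_big_endian(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char bytes[4];

    memcpy(bytes, &probe, sizeof bytes);
    return bytes[0] == 0x01;
}

/* Returns 1 if struct record is laid out with no padding at all. */
int record_is_packed(void)
{
    return sizeof(struct record) == 12
        && offsetof(struct record, id) == 0
        && offsetof(struct record, count) == 4
        && offsetof(struct record, value) == 8;
}

/* Decode a big-endian 32-bit value from raw bytes.  The result is
 * the same whatever the byte order of the host running this code. */
uint32_t be32_decode(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Abort early if the layout assumption does not hold on this host. */
void check_record_layout(void)
{
    assert(record_is_packed());
}
```

If record_is_packed() and host_is_big_endian() give the same answers on
all three machines, raw structure reads should agree between them.  If
they differ, XDR or an explicit decode such as be32_decode() is the safer
route.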

2.  Are there any known bugs in 4.0.3 which would allow a user-mode
program to crash the server?

3.  What effect might rlogin have on this process?