[comp.sys.sgi] diskless clients hanging

gene@vis.toronto.edu (Eugene Amdur) (12/04/90)

Has anyone else seen this?  We have four diskless 4D/20's served from
a 4D/340VGX.  Periodically (two to three times a day) the diskless machines 
hang.  Users are typically running "man", "su", or even nothing.  NeWS is 
always running though.  These machines have 8Megs of memory and 20Megs of 
swap space each.  

Everything is running version 3.3.1 of the OS.  There is one strange oddity.
Whenever the machines are reset (by the button next to the on/off switch), the
time is reset to Dec 31 1969.  I'm not sure if this has anything to do with it
though.
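
One possible reading of the Dec 31 1969 date (an assumption, nothing in
this thread confirms it): a Unix clock value of zero means midnight UTC on
Jan 1 1970, and in any time zone west of Greenwich that instant displays as
the evening of Dec 31 1969.  That would suggest the clock is being cleared
on reset.  A minimal sketch, using a fixed UTC-5 offset as a stand-in for
Toronto's winter time zone:

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset stands in for Toronto's winter time zone.
EST = timezone(timedelta(hours=-5), "EST")


def epoch_zero_local(tz):
    """Return the local date and time that a clock value of 0 maps to."""
    return datetime.fromtimestamp(0, tz=tz)


# Prints the local rendering of clock value 0 in the assumed zone.
print(epoch_zero_local(EST).isoformat())
```

If the reset date shown on the clients matches this, the date itself is
probably a symptom of a cleared clock and not a separate fault.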

There are other things.  We used to run our clients at 3.3 with a /share as
set up by SGI's clinst.  But we found our clients hanging, so I switched to
a shared /usr and 3.3.1 of the OS (shared with the server) with programs that 
are machine dependent linked to /private (most of the machine dependent programs
on the PIs are from version 3.3 of the OS including the news_server and the
graphics libraries).  The hanging didn't go away with this modification though.

Finally, our 340VGX hasn't hung yet.  Nor have other diskfull 4D/25's.
Any clues?

--gene
gene@vis.toronto.edu (Internet)
...!{uunet, pyramid, watmath, ubc-cs}!utai!gene (UUCP)

mg@ (Mike Gigante) (12/05/90)

Yep. We have the same problem, less frequent, but still problems, problems,
problems.

We get daemons dying off with "interrupted system calls" (perror 5)
apparently a failed page request off the server - I don't know why.
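
A note on the error number: on most Unix systems errno 4 is EINTR
("Interrupted system call") and errno 5 is EIO ("I/O error"), so the
message and the number quoted here may refer to two different errors.
Which one the daemons actually hit is not stated.  A small sketch for
checking the mapping on whatever host is at hand, assuming a POSIX-like
system with Python available (the helper name is hypothetical):

```python
import errno
import os


def describe(name):
    """Return (number, message) for a symbolic errno name on this host."""
    code = getattr(errno, name)
    return code, os.strerror(code)


# Show the two candidates side by side; the message text varies by platform.
for name in ("EINTR", "EIO"):
    code, message = describe(name)
    print(f"{name} = {code}: {message}")
```

An I/O error would fit the failed-page-request theory better than an
interrupted call, but that is only a guess from the numbers.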

Of course, SGI (in the release notes) advised not to install 3.3.1 for
diskless. Our helpful (true -- not cynical) SEs tell me that we should
expect a maintenance release soon that will fix the diskless problems.

On the diskless issue, we damn well have to turn *off* the -s option to 
tftpd in order to boot a diskless machine - exposing a security hole.

So the whole diskless question is made worse by having to fiddle with
inetd before and after diskless booting.

I won't even mention what I think of the clinst procedures after our
*troublesome* 3.3 upgrades from 3.2.2 
(hopefully our SEs have passed back the extensive traumas we went through)

Also, until 3.3, I have been quite disappointed by diskless
performance (we have had them for about 15 months now)

The moral: we are buying 200Mb system disks this week to finally
solve the problem.

Mike Gigante
RMIT Australia