[comp.sys.att] 6386 shutdown: I CAN'T BELIEVE at&t was really this stupid!

jr@oglvee.UUCP (Jim Rosenberg) (06/14/89)

I'm in the process of configuring the first of several AT&T 6386en to be used
at our remote locations.  In working with our development 6386 it suddenly hit
me one day how incredibly dumb and potentially catastrophic is the following
lovely "feature" of the UNIX V.3.2 shutdown sequence.  Now as you all know, for
remote sites where no one is UNIX knowledgeable, shutdown is a most weighty
matter, and failing to do shutdown correctly is just about GUARANTEED to cause
big-time trouble.  I look over at the screen one day after a shutdown has gone
to completion, and what do I see as the final line on the screen:

Reboot the system now.

Think about it.  Not "You MAY reboot the system now."  Not "You may turn the
power off or reboot the system now."  Oh no.  This is a command.  It says:
"Reboot the system NOW!" [emphasis added].  Now isn't this fun.  You instruct
your users in the religion of shutdown.  They only know about UNIX what you've
told them.  You think you have dynamite menu stuff and shell script stuff and
the whole works.  One day they have to move the machine, so they have to unplug
it.  Like good little campers they go through shutdown.  They are all ready to
hit the power switch, and then they see this stern admonition:  "Reboot the
system now."  "Oh, OK, well the computer TOLD ME to reboot ..."  So they
reboot, chatter with someone else in the office for a minute, then turn the
power off ...

All right, so AT&T flubbed this, no big deal, I'll just edit whatever shell
script has this abortion in it.  I look at /etc/rc0.  Not there.  In
desperation I go through every file in /etc.  Not there.  /etc/rc0.d/*:  same
story.  Finally there is only one place left in all of /dev/universe.  I say
to myself "I DO NOT BELIEVE THIS!" and run strings on /unix.  Yup.  This idiocy
is HARD CODED INTO THE FLIBBING **KERNEL**!!!!
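
For anyone who wants to repeat the search, here is a minimal sketch of it in
shell.  It assumes strings(1) and grep are installed; the function name
find_kernel_msg is made up for illustration, and /unix is simply the kernel
path named above.  It only reads the file, so it changes nothing.

```shell
#!/bin/sh
# find_kernel_msg FILE TEXT
# Hypothetical helper: print every printable string in FILE that contains
# TEXT.  The exit status is grep's, so it is non-zero when nothing matches.
find_kernel_msg() {
    strings "$1" | grep -- "$2"
}

# Example (read-only):
#   find_kernel_msg /unix 'Reboot the system'
```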

So quick now, all you glorious VARs and systems integrators and other
adventurous souls using the 6386, forward march & adb that kernel -- and the
.o file for making new kernels and ...
-- 
Jim Rosenberg                        pitt
Oglevee Computer Systems                 >--!amanue!oglvee!jr
151 Oglevee Lane                      cgh
Connellsville, PA 15425                                #include <disclaimer.h>

wjc@ho5cad.ATT.COM (Bill Carpenter) (06/14/89)

In article <483@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:

> Reboot the system now.


Standard instructions on shampoo bottles:

1.  Lather
2.  Rinse
3.  Repeat

:-)
--
   Bill Carpenter         att!ho5cad!wjc  or  attmail!bill

tneff@bfmny0.UUCP (Tom Neff) (06/15/89)

If you think about it for a second, by the time UNIX is totally ready
to be "rebooted now," you shouldn't have an active file system to
read scripts from.  The kernel is ALL THAT'S LEFT.  So the message
has to go there.  (I suppose it could be linked in at rebuild time.)

It seems sufficient to tell people, "Always run SHUTDOWN before turning
off power.  Never power down a running system."  If they DO reboot
after a shutdown, they cannot hurt anything without violating your
instructions, since the system would once again be "running."

By the way, in my experience with my pair of 6386en, the only danger from
an inadvertent powerdown or crash is to the files you have changed since
the last 'sync' happened.  You can reboot the system every 5 minutes all
day long and never lose a thing if you don't change your files (edit etc.)
At worst, some log files might not get updated, but if you're sitting
there rebooting all the time, what's to log anyway.

To minimize the risk from power hits and crashes, I add a root cron job
that performs a 'sync ; sync' every 10 minutes.  I have not been reliably
persuaded that this is something the kernel does automatically on V/386,
although I know of UNIXen where that's true.  I feel more comfortable
doing it myself -- the overhead is negligible.  Only caveat - this doesn't
help files accessed remotely via RFS.  I wish there were an RFS equivalent
to force update of the cache but I haven't found one.
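
A minimal sketch of such a cron-driven sync, for concreteness.  The script
path /usr/local/etc/dosync and the function name dosync are assumptions for
illustration, not details from this installation.  The crontab line lists
the minutes explicitly because step syntax is not assumed to be available.

```shell
#!/bin/sh
# Hypothetical /usr/local/etc/dosync: ask the kernel to flush dirty buffers.
# sync(1) only schedules the writes; calling it twice mirrors the habit
# described above and is not a guarantee that the data is on disk.
dosync() {
    sync
    sync
}

dosync

# Root crontab entry to run it every 10 minutes:
#   0,10,20,30,40,50 * * * * /usr/local/etc/dosync
```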

Also - to simplify the shutdown procedure in the event of my team's absence
(say, if someone had to run in and shut things off due to an emergency in
the building or an impending power shutdown), I create the following
entry in /etc/passwd:

	shutdown:x:0:1:Shut down the system:/usr/shutdown:/bin/sh

and the following /usr/shutdown/.profile:

	cd /
	exec /etc/shutdown -y -g15

I could do the same thing directly by naming shutdown as the startup 
shell, but I hate having to edit /etc/passwd just to change things 
like delay options. 
-- 
You may not redistribute this article for profit without written permission.
--
Tom Neff				UUCP:     ...!uunet!bfmny0!tneff
    "Truisms aren't everything."	Internet: tneff@bfmny0.UU.NET

kremer@cs.odu.edu (Lloyd Kremer) (06/15/89)

In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:

>Also - to simplify the shutdown procedure in the event of my team's absence
>(say, if someone had to run in and shut things off due to an emergency in
>the building or an impending power shutdown), I create the following
>entry in /etc/passwd:
>
>	shutdown:x:0:1:Shut down the system:/usr/shutdown:/bin/sh
>
>and the following /usr/shutdown/.profile:
>
>	cd /
>	exec /etc/shutdown -y -g15
>
>I could do the same thing directly by naming shutdown as the startup 
>shell, but I hate having to edit /etc/passwd just to change things 
>like delay options. 


Small point, but I don't think you *could* do it directly in /etc/passwd.
In any AT&T system I know, the shell field of /etc/passwd does not allow
options or arguments.

-- 
					Lloyd Kremer
					Brooks Financial Systems
					...!uunet!xanth!brooks!lloyd
					Have terminal...will hack!

hjespersen@trillium.waterloo.edu (Hans Jespersen) (06/15/89)

In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:

>To minimize the risk from power hits and crashes, I add a root cron job
>that performs a 'sync ; sync' every 10 minutes.  I have not been reliably
                  ^^^^^^^^^^^  

Why do people always do this? Running sync twice does nothing that
running sync once wouldn't do. Remember that 'sync' does NOT guarantee
that all delayed writes are actually written out to disk. It merely
guarantees that they are in the queue to be written as soon as possible.
When you are at a shell prompt, running

# sync
# sync
# sync

usually guarantees enough time has passed (since the first sync) that
the files were written to disk. Running

# sync;sync;sync

is kind of stupid since the first sync is not performed until after
you have typed the whole line in. 

-- 
Hans Jespersen
hjespersen@trillium.waterloo.edu
uunet!watmath!trillium!hjespersen

pss@unh.UUCP (Paul S. Sawyer) (06/16/89)

In article <WJC.89Jun14170102@ho5cad.ho5cad.ATT.COM>, wjc@ho5cad.ATT.COM (Bill Carpenter) writes:
:> 
:> Standard instructions on shampoo bottles:
:> 
:> 1.  Lather
:> 2.  Rinse
:> 3.  Repeat
:> 
:> :-)
:> --
:>    Bill Carpenter         att!ho5cad!wjc  or  attmail!bill

Reaching EOF on the bottle usually terminates the loop in all but the
most poorly implemented systems...

(   B-)


-- 
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Paul S. Sawyer              uunet!unh!unhtel!paul     paul@unhtel.UUCP
UNH Telecommunications
Durham, NH  03824-3523      VOX: 603-862-3262         FAX: 603-862-2030

jfc@athena.mit.edu (John F Carr) (06/16/89)

In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:

>To minimize the risk from power hits and crashes, I add a root cron job
>that performs a 'sync ; sync' every 10 minutes.  I have not been reliably
>persuaded that this is something the kernel does automatically on V/386,
>although I know of UNIXen where that's true.

We have the following man page (and a program to go with it) on our BSD 4.3
system.

UPDATE(8)           UNIX Programmer's Manual            UPDATE(8)

NAME
     update - periodically update the super block

SYNOPSIS
     /etc/update [-n]

DESCRIPTION
     Update is a program that executes the sync(2) primitive
     every 30 seconds.  This insures that the file system is
     fairly up to date in case of a crash.  This command should
     not be executed directly, but should be executed out of the
     initialization shell command file.

     Normally, update opens certain system directories to keep
     them in the name translation cache.  If the -n option is
     given, subdirectories of /usr are not opened so that remote
     system libraries can be unmounted while the system is run-
     ning.


It doesn't appear to place much load on a system to do this twice
per minute (perhaps 2 minutes CPU per day of runtime).

    --John Carr (jfc@athena.mit.edu)

tneff@bfmny0.UUCP (Tom Neff) (06/16/89)

In article <14506@watdragon.waterloo.edu> hjespersen@trillium.waterloo.edu (Hans Jespersen) writes:
>In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:
>>To minimize the risk from power hits and crashes, I add a root cron job
>>that performs a 'sync ; sync' every 10 minutes.  I have not been reliably
>                  ^^^^^^^^^^^  
		I wrote it this way to conserve space;
		actually they're on separate lines of
		a shell script.

>Why do people always do this? Running sync twice does nothing that
>running sync once wouldn't do. 

I experimented with this fairly extensively when this installation was born.
Running 'sync' once, waiting for the hard disk light to go out and then
punching the button nearly always lost some data on the FS check.  Running
it twice never did.

I might add that regardless of the underlying reasons, it CANNOT hurt.

>				Remember that 'sync' does NOT guarantee
>that all delayed writes are actually written out to disk. It merely
>guarantees that they are in the queue to be written as soon as possible.

That's good enough for the purpose described above.  If you are doing it
by hand preparatory to an emergency powerdown, you should wait until the
disk accesses are visibly done.

-- 
You may not redistribute this article for profit without written permission.
--
Tom Neff				UUCP:     ...!uunet!bfmny0!tneff
    "Truisms aren't everything."	Internet: tneff@bfmny0.UU.NET

les@chinet.chi.il.us (Leslie Mikesell) (06/17/89)

In article <483@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:

>Think about it.  Not "You MAY reboot the system now."  Not "You may turn the
>power off or reboot the system now."  Oh no.  This is a command.  It says:
>"Reboot the system NOW!"

Well, a common reason to shut down a 386 machine is to bring it up under DOS
or a diagnostic disk, so the message isn't always inappropriate.

Has anyone else had a problem rebooting with ctl-alt-delete after getting
this message?  I generally get a rom diagnostic about a serial port failure
unless I push the reset button or power down after running unix.

Les Mikesell

jim@dfmp1.UUCP (Jim Murray) (06/20/89)

In article <8708@chinet.chi.il.us>, les@chinet.chi.il.us (Leslie Mikesell) writes:
> 
> Has anyone else had a problem rebooting with ctl-alt-delete after getting
> this message?  I generally get a rom diagnostic about a serial port failure
> unless I push the reset button or power down after running unix.
> 
> Les Mikesell

Yeah!  Only I haven't seen the pattern you've seen.  It seems intermittent like
crazy.  The problem I have is that not only do the ROM diagnostics complain,
but the serial port fails.  Has anyone else had the same problem?
Has anyone else gotten the following combination to work?

6383e with 5 meg of Ram
EGA monitor with a VDC 600
IPC 802
Starlan 
Bus Mouse

It's giving me fits.
Jim Murray

pjr@jjcac.uucp (Paul Rak) (06/20/89)

In article <8708@chinet.chi.il.us> les@chinet.chi.il.us (Leslie Mikesell) writes:
>
>Has anyone else had a problem rebooting with ctl-alt-delete after getting
>this message?  I generally get a rom diagnostic about a serial port failure
>unless I push the reset button or power down after running unix.
>
>Les Mikesell

The problem exists even with the latest revision of the BIOS that I've seen
(1.14), and the solution from the hotline is to use the RESET button, as
they can't figure out why it fails.  It occurs on every 6386 that I've
worked on, both 16 & 20 MHz desktops and 20 MHz towers.

Paul Rak

(School / Weekend Work)                    (Weekday Work)
Paul Rak, Jr.                              Paul Rak, Jr.
Sys Administrator/Lab Assistant            Member of Technical Staff
Joliet Junior College                      EMO Computer Products, Inc.
Academic Computing Center                  1701 Quincy Avenue, Suite 22
1216 Houbolt Avenue                        Naperville, IL 60540
Joliet, IL 60536                           (312) 369-1350
(815) 729-9020 x362                        (An AT&T VAR)

chris@cetia4.UUCP (Christian Bertin) (06/22/89)

In article <12044@bloom-beacon.MIT.EDU>, jfc@athena.mit.edu (John F Carr) writes:
  (talking about the 'update' daemon)
>
> It doesn't appear to place much load on a system to do this twice
> per minute (perhaps 2 minutes CPU per day of runtime).
> 
>     --John Carr (jfc@athena.mit.edu)

If you have a large buffer cache (1Mb on my system) and if you do a lot of
compiles, or if you do anything that creates large temporary files, you can
waste a LOT of time sync'ing files that should never have been written out
to disk. The last time I tried to measure this, a 3 hour compile session went
down to 2:40 after I changed the sync time from 30 seconds to 2 minutes.

At the very least, 'update' should take an optional argument to customize the
sync intervals.
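
A rough sketch of what that could look like as a shell loop rather than a
change to the real 'update' program.  The function name update_loop and its
optional COUNT argument are hypothetical, added only so the loop can be
tried out without running forever; the 30-second default is the figure from
the man page quoted earlier in the thread.  It uses POSIX sh arithmetic.

```shell
#!/bin/sh
# update_loop INTERVAL [COUNT]
# Hypothetical stand-in for an 'update' with a configurable sync interval.
# INTERVAL is in seconds (default 30).  If COUNT is given, the loop stops
# after that many syncs and prints the number performed; without COUNT it
# runs until killed, like a daemon.
update_loop() {
    interval=${1:-30}
    count=${2:-}
    done_syncs=0
    while :; do
        sync
        done_syncs=$((done_syncs + 1))
        if [ -n "$count" ] && [ "$done_syncs" -ge "$count" ]; then
            break
        fi
        sleep "$interval"
    done
    echo "$done_syncs"
}

# Example: sync every 2 minutes until killed
#   update_loop 120
```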

Chris
-- 
Chris Bertin	| -- CETIA -- 150, Av Marcelin Berthelot, Z.I. Toulon-Est
+33(94)212005	| 83088 Toulon Cedex, France
		| inria!cetia!chris

mrm@Sceard.COM (M.R.Murphy) (06/23/89)

In article <14506@watdragon.waterloo.edu> hjespersen@trillium.waterloo.edu (Hans Jespersen) writes:
+In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:
+ 
+>To minimize the risk from power hits and crashes, I add a root cron job
+>that performs a 'sync ; sync' every 10 minutes.  I have not been reliably
+                  ^^^^^^^^^^^  
+
+Why do people always do this? Running sync twice does nothing that
+running sync once wouldn't do. Remember that 'sync' does NOT guarantee
+that all delayed writes are actually written out to disk. It merely
+guarantees that they are in the queue to be written as soon as possible.
+When you are at a shell prompt, running

The reason that people do this comes from heeding the warning in the manual
section UPDATE(VIII) that was released 11/1/73 with Version 5 (not System V).
update did the sync every 30 seconds.

Quoting, albeit without permission, though in hope that none will complain
too bitterly,

	BUGS
		With update running, if the CPU is halted just as the sync
		is executed, a file system can be damaged. This is partially
		due to DEC hardware that writes zeros when NPR requests fail.
		A fix would be to have sync temporarily increment the system
		time by at least 30 seconds to trigger the execution of update.
		This would give 30 seconds grace to halt the CPU.

The entry for SYNC(II) in the same manual is similar to the entry in a System V
manual with the sentence, again quoted without permission,

	The writing, although scheduled, is not necessarily complete upon
	return from the sync.

present in the System V manual and missing in the Version 5 manual.

Since the paragraph in SYNC(II) in the Version 5 manual and in SYNC(2) in
the System V manual suggests that programs such as fsck and df that jerk
with file systems should sync, and since it states that sync is mandatory
before a boot, it is probably a good idea to sync before halt or
re-boot :-).

The reasoning for two syncs goes like this:

	1. Manually type sync. It is scheduled, but doesn't complete
	   all the important work.

	2. Shut off or halt the machine just as the really important
	   write is happening or just as some other sync occurs as a
	   result of something like the cron entry suggested above.

	3. Be real sad that the disk is corrupted.

or,

	1. sync twice with a little time between so that nothing remains
	   to be written and a shut off or halt won't interrupt a write.

+
+# sync
+# sync
+# sync
+
+usually guarantees enough time has passed (since the first sync) that
+the files were written to disk. Running
+
+# sync;sync;sync
+
+is kind of stupid since the first sync is not performed until after
+you have typed the whole line in. 
+
+-- 
+Hans Jespersen
+hjespersen@trillium.waterloo.edu
+uunet!watmath!trillium!hjespersen


Disk controllers change. System software techniques change. Old habits change,
but not so easily. If you don't type sync;sleep 3;sync after completing a big
edit and before you can back up onto something other than the one flakey disk
drive, you've probably never had some[one|thing] crash your system just before
you do the backup. And your changes are lost, and you can't remember the
brilliant technique that you used in the changes.

Some go so far as to have an entry for user "sync" in /etc/passwd with
no password and a shell of /bin/sync just in case the system is hung so
logins of a more complicated nature can't be performed and a sync is
required/desired before pulling the plug. At least it makes you feel better.
---
Mike Murphy  Sceard Systems, Inc.  544 South Pacific St. San Marcos, CA  92069
mrm@Sceard.COM        {hp-sdd,nosc,ucsd,uunet}!sceard!mrm      +1 619 471 0655

vjs@calcite.UUCP (Vernon Schryver) (06/23/89)

The continuing discussion of an old-fashioned update(1m) daemon for SVR3 is
puzzling.  SVR3 has the kernel process which ps lists as 'bdflush'.  It does
better write-behind than a periodic, complete buffer flush flood like the
update daemon.  In at least some versions, one can tune bdflush's aging
parameters.

Separately, some file systems are careful to invalidate buffers containing
data for unlinked files when the in-core inode reference count goes to 0.
Bdflush may never have a chance to write such disk blocks.  That strategy
can have a measurable effect on the speed of things like C compilers.

It is irritating as well as unusual to have to say something nice about
System V, but truth is truth.

Vernon Schryver
vjs@calcite.uucp

rk@bigbroth.UUCP (rohan kelley) (06/23/89)

In article <483@oglvee.UUCP>, jr@oglvee.UUCP (Jim Rosenberg) writes:
>                    I look over at the screen one day after a shutdown has gone
> to completion, and what do I see as the final line on the screen:
> Reboot the system now.
>                   One day they have to move the machine, so they have to unplug
> it.  Like good little campers they go through shutdown.  They are all ready to
> hit the power switch, and then they see this stern admonition:  "Reboot the
> system now."  "Oh, OK, well the computer TOLD ME to reboot ..."  So they
> reboot, chatter with someone else in the office for a minute, then turn the
> power off ...
> 
> All right, so AT&T flubbed this, no big deal, I'll just edit whatever shell
> script has this abortion in it.  I look at /etc/rc0.  Not there.  In

About the only thing you can do is write a shell script in /etc/rc0 like: 
echo "ANYONE WHO FAILS TO IGNORE THE NEXT MESSAGE WILL BE SHOT AT SUNRISE...."

=======================================================================
Rohan Kelley -- UNIleX Systems, Inc. (Systems and software for lawyers)
UUCP:  ...{gatech!uflorida,ucf-cs}!novavax!bigbroth!rk (office)
                                   novavax!mdlbrotr!rk (home)
ATTmail:  attmail!bigbroth!rk
3365 Galt Ocean Drive, Ft. Lauderdale, FL 33308 Phone: (305) 563-1504

"Go first class or your heirs will" -somebodyelse
=======================================================================

kdb@chinet.chi.il.us (Karl Botts) (07/22/89)

In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:
>If you think about it for a second, by the time UNIX is totally ready
>to be "rebooted now," you shouldn't have an active file system to
>read scripts from.  The kernel is ALL THAT'S LEFT.  So the message
>has to go there.  (I suppose it could be linked in at rebuild time.)

This is simply not true; the root file system should be and is still mounted
at reboot time.  When a normal Unix system comes up, it comes up by
default in single user mode and the root filesystem is already mounted.
You can read and write in the normal manner to the root filesystem, and
it is _your_ job to make sure it is synced before you reboot.  This is
the same state the system is in just before you reboot in the usual
shutdown script.
Incidentally, it is not possible for the reboot command itself (when it
is available as a software command) to automatically sync the system,
because there are times when you _must_ reboot the system without
syncing, e.g., after rebuilding the free list with fsck.

pat@orac.pgh.pa.us (Pat Barron) (07/23/89)

In article <8714@chinet.chi.il.us> kdb@chinet.chi.il.us (Karl Botts) writes:
>Incidentally, it is not possible for the reboot command itself (when it
>is available as a software command) to automatically sync the system,
>because there are times when you _must_ reboot the system without
>syncing, e.g., after rebuilding the free list with fsck.

Well, that's easy enough to fix.  In 4.3BSD, "reboot" and "halt" are
implemented as system calls, and they *do* automatically sync the
disks by default.  However, you can pass them an option which tells
them not to do the sync.

--Pat.
-- 
Pat Barron
Internet:  pat@orac.pgh.pa.us  - or -   orac!pat@gateway.sei.cmu.edu
UUCP:  ...!uunet!apexepa!sei!orac!pat  - or -  ...!pitt!darth!orac!pat

psfales@cbnewsc.ATT.COM (Peter Fales) (07/23/89)

In article <8708@chinet.chi.il.us>, les@chinet.chi.il.us (Leslie Mikesell) writes:

> Well, a common reason to shut down a 386 machine is to bring it up under DOS

> 
I have seen the same thing on my AT&T 6386 WGS.  I have formed the habit of
always using the reset button and everything seems to work fine.

-- 
Peter Fales			AT&T, Room 5B-420
				2000 N. Naperville Rd.
UUCP:	...att!ihlpb!psfales	Naperville, IL 60566
Domain: psfales@ihlpb.att.com	work:	(312) 979-8031

tneff@bfmny0.UUCP (Tom Neff) (08/02/89)

Boy I've been seeing some -v-e-r-y- old news followups and mail replies
the last day or so!  Was there some major logjam out there?
-- 
"We walked on the moon --	((	Tom Neff
	you be polite"		 )) 	tneff@bfmny0.UU.NET