jr@oglvee.UUCP (Jim Rosenberg) (06/14/89)
I'm in the process of configuring the first of several AT&T 6386en to be used at our remote locations. In working with our development 6386 it suddenly hit me one day how incredibly dumb and potentially catastrophic is the following lovely "feature" of the UNIX V.3.2 shutdown sequence.

Now as you all know, for remote sites where no one is UNIX-knowledgeable, shutdown is a most weighty matter, and failing to do shutdown correctly is just about GUARANTEED to cause big-time trouble. I look over at the screen one day after a shutdown has gone to completion, and what do I see as the final line on the screen:

    Reboot the system now.

Think about it. Not "You MAY reboot the system now." Not "You may turn the power off or reboot the system now." Oh no. This is a command. It says: "Reboot the system NOW!" [emphasis added].

Now isn't this fun. You instruct your users in the religion of shutdown. They only know about UNIX what you've told them. You think you have dynamite menu stuff and shell script stuff and the whole works. One day they have to move the machine, so they have to unplug it. Like good little campers they go through shutdown. They are all ready to hit the power switch, and then they see this stern admonition: "Reboot the system now." "Oh, OK, well the computer TOLD ME to reboot ..." So they reboot, chatter with someone else in the office for a minute, then turn the power off ...

All right, so AT&T flubbed this, no big deal, I'll just edit whatever shell script has this abortion in it. I look at /etc/rc0. Not there. In desperation I go through every file in /etc. Not there. /etc/rc0.d/*: same story. Finally there is only one place left in all of /dev/universe. I say to myself "I DO NOT BELIEVE THIS!" and run strings on /unix. Yup. This idiocy is HARD CODED INTO THE FLIBBING **KERNEL**!!!!

So quick now, all you glorious VARs and systems integrators and other adventurous souls using the 6386, forward march & adb that kernel -- and the .o file for making new kernels and ...
--
Jim Rosenberg                     pitt
Oglevee Computer Systems              >--!amanue!oglvee!jr
151 Oglevee Lane                  cgh
Connellsville, PA 15425                       #include <disclaimer.h>
wjc@ho5cad.ATT.COM (Bill Carpenter) (06/14/89)
In article <483@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
> Reboot the system now.

Standard instructions on shampoo bottles:

    1. Lather
    2. Rinse
    3. Repeat

:-)
--
Bill Carpenter att!ho5cad!wjc or attmail!bill
tneff@bfmny0.UUCP (Tom Neff) (06/15/89)
If you think about it for a second, by the time UNIX is totally ready to be "rebooted now," you shouldn't have an active file system to read scripts from. The kernel is ALL THAT'S LEFT. So the message has to go there. (I suppose it could be linked in at rebuild time.)

It seems sufficient to tell people, "Always run SHUTDOWN before turning off power. Never power down a running system." If they DO reboot after a shutdown, they cannot hurt anything without violating your instructions, since the system would once again be "running."

By the way, in my experience with my pair of 6386en, the only danger from an inadvertent powerdown or crash is to the files you have changed since the last 'sync' happened. You can reboot the system every 5 minutes all day long and never lose a thing if you don't change your files (edit etc.). At worst, some log files might not get updated, but if you're sitting there rebooting all the time, what's to log anyway?

To minimize the risk from power hits and crashes, I add a root cron job that performs a 'sync ; sync' every 10 minutes. I have not been reliably persuaded that this is something the kernel does automatically on V/386, although I know of UNIXen where that's true. I feel more comfortable doing it myself -- the overhead is negligible.

Only caveat - this doesn't help files accessed remotely via RFS. I wish there were an RFS equivalent to force update of the cache but I haven't found one.

Also - to simplify the shutdown procedure in the event of my team's absence (say, if someone had to run in and shut things off due to an emergency in the building or an impending power shutdown), I create the following entry in /etc/passwd:

    shutdown:x:0:1:Shut down the system:/usr/shutdown:/bin/sh

and the following /usr/shutdown/.profile:

    cd /
    exec /etc/shutdown -y -g15

I could do the same thing directly by naming shutdown as the startup shell, but I hate having to edit /etc/passwd just to change things like delay options.
--
You may not redistribute this article for profit without written permission.
--
Tom Neff                          UUCP:     ...!uunet!bfmny0!tneff
"Truisms aren't everything."      Internet: tneff@bfmny0.UU.NET
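The periodic sync described in the post above can be sketched as a small script run from root's crontab. This is only an illustration under stated assumptions: the function name periodic_sync and the script path /usr/local/lib/syncjob are hypothetical, and the minutes are listed explicitly because not every cron understands step syntax.

```shell
#!/bin/sh
# Hypothetical helper meant to be run from root's crontab.
# The name periodic_sync and the install path below are assumptions,
# not anything taken from the original post.
periodic_sync() {
    # Two syncs on separate lines of the script.
    sync
    sync
}

# Example crontab line for root, every 10 minutes (path is hypothetical):
#   0,10,20,30,40,50 * * * * /usr/local/lib/syncjob
#
# The installed script would end by calling the function:
#   periodic_sync
```

Keeping the commands in a script rather than in the crontab line itself means the schedule and the commands can be changed independently.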
kremer@cs.odu.edu (Lloyd Kremer) (06/15/89)
In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:
>Also - to simplify the shutdown procedure in the event of my team's absence
>(say, if someone had to run in and shut things off due to an emergency in
>the building or an impending power shutdown), I create the following
>entry in /etc/passwd:
>
>    shutdown:x:0:1:Shut down the system:/usr/shutdown:/bin/sh
>
>and the following /usr/shutdown/.profile:
>
>    cd /
>    exec /etc/shutdown -y -g15
>
>I could do the same thing directly by naming shutdown as the startup
>shell, but I hate having to edit /etc/passwd just to change things
>like delay options.

Small point, but I don't think you *could* do it directly in /etc/passwd. In any AT&T system I know, the shell field of /etc/passwd does not allow options or arguments.
--
Lloyd Kremer
Brooks Financial Systems
...!uunet!xanth!brooks!lloyd
Have terminal...will hack!
hjespersen@trillium.waterloo.edu (Hans Jespersen) (06/15/89)
In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:
>To minimize the risk from power hits and crashes, I add a root cron job
>that performs a 'sync ; sync' every 10 minutes. I have not been reliably
                   ^^^^^^^^^^^

Why do people always do this? Running sync twice does nothing that running sync once wouldn't do. Remember that 'sync' does NOT guarantee that all delayed writes are actually written out to disk. It merely guarantees that they are in the queue to be written as soon as possible. When you are at a shell prompt, running

# sync
# sync
# sync

usually guarantees enough time has passed (since the first sync) that the files were written to disk. Running

# sync;sync;sync

is kind of stupid since the first sync is not performed until after you have typed the whole line in.
--
Hans Jespersen
hjespersen@trillium.waterloo.edu
uunet!watmath!trillium!hjespersen
pss@unh.UUCP (Paul S. Sawyer) (06/16/89)
In article <WJC.89Jun14170102@ho5cad.ho5cad.ATT.COM>, wjc@ho5cad.ATT.COM (Bill Carpenter) writes:
:>
:> Standard instructions on shampoo bottles:
:>
:> 1. Lather
:> 2. Rinse
:> 3. Repeat
:>
:> :-)
:> --
:> Bill Carpenter att!ho5cad!wjc or attmail!bill
Reaching EOF on the bottle usually terminates the loop in all but the
most poorly implemented systems...
( B-)
--
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Paul S. Sawyer uunet!unh!unhtel!paul paul@unhtel.UUCP
UNH Telecommunications
Durham, NH 03824-3523 VOX: 603-862-3262 FAX: 603-862-2030
jfc@athena.mit.edu (John F Carr) (06/16/89)
In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:
>To minimize the risk from power hits and crashes, I add a root cron job
>that performs a 'sync ; sync' every 10 minutes. I have not been reliably
>persuaded that this is something the kernel does automatically on V/386,
>although I know of UNIXen where that's true.

We have the following man page (and a program to go with it) on our BSD 4.3 system.

UPDATE(8)           UNIX Programmer's Manual           UPDATE(8)

NAME
     update - periodically update the super block

SYNOPSIS
     /etc/update [-n]

DESCRIPTION
     Update is a program that executes the sync(2) primitive every 30 seconds. This insures that the file system is fairly up to date in case of a crash. This command should not be executed directly, but should be executed out of the initialization shell command file.

     Normally, update opens certain system directories to keep them in the name translation cache. If the -n option is given, subdirectories of /usr are not opened so that remote system libraries can be unmounted while the system is running.

It doesn't appear to place much load on a system to do this twice per minute (perhaps 2 minutes CPU per day of runtime).

--John Carr (jfc@athena.mit.edu)
tneff@bfmny0.UUCP (Tom Neff) (06/16/89)
In article <14506@watdragon.waterloo.edu> hjespersen@trillium.waterloo.edu (Hans Jespersen) writes:
>In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:
>>To minimize the risk from power hits and crashes, I add a root cron job
>>that performs a 'sync ; sync' every 10 minutes. I have not been reliably
>                   ^^^^^^^^^^^

I wrote it this way to conserve space; actually they're on separate lines of a shell script.

>Why do people always do this? Running sync twice does nothing that
>running sync once wouldn't do.

I experimented with this fairly extensively when this installation was born. Running 'sync' once, waiting for the hard disk light to go out and then punching the button nearly always lost some data on the FS check. Running it twice never did. I might add that regardless of the underlying reasons, it CANNOT hurt.

> Remember that 'sync' does NOT guarantee
>that all delayed writes are actually written out to disk. It merely
>guarantees that they are in the queue to be written as soon as possible.

That's good enough for the purpose described above. If you are doing it by hand preparatory to an emergency powerdown, you should wait until the disk accesses are visibly done.
--
You may not redistribute this article for profit without written permission.
--
Tom Neff                          UUCP:     ...!uunet!bfmny0!tneff
"Truisms aren't everything."      Internet: tneff@bfmny0.UU.NET
les@chinet.chi.il.us (Leslie Mikesell) (06/17/89)
In article <483@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>Think about it. Not "You MAY reboot the system now." Not "You may turn the
>power off or reboot the system now." Oh no. This is a command. It says:
>"Reboot the system NOW!"

Well, a common reason to shut down a 386 machine is to bring it up under DOS or a diagnostic disk, so the message isn't always inappropriate.

Has anyone else had a problem rebooting with ctl-alt-delete after getting this message? I generally get a rom diagnostic about a serial port failure unless I push the reset button or power down after running unix.

Les Mikesell
jim@dfmp1.UUCP (Jim Murray) (06/20/89)
In article <8708@chinet.chi.il.us>, les@chinet.chi.il.us (Leslie Mikesell) writes:
>
> Has anyone else had a problem rebooting with ctl-alt-delete after getting
> this message? I generally get a rom diagnostic about a serial port failure
> unless I push the reset button or power down after running unix.
>
> Les Mikesell

Yeah! Only I haven't seen the pattern you've seen. It seems intermittent like crazy. The problem I have is that not only do the ROM diagnostics complain, but the serial port fails. Has anyone else had the same problem??

Has anyone else gotten the following combination to work???

    6383e with 5 meg of RAM
    EGA monitor with a VDC 600
    IPC 802
    Starlan
    Bus Mouse

It's giving me fits???

Jim Murray
pjr@jjcac.uucp (Paul Rak) (06/20/89)
In article <8708@chinet.chi.il.us> les@chinet.chi.il.us (Leslie Mikesell) writes:
>
>Has anyone else had a problem rebooting with ctl-alt-delete after getting
>this message? I generally get a rom diagnostic about a serial port failure
>unless I push the reset button or power down after running unix.
>
>Les Mikesell

The problem exists even with the latest revision of the BIOS that I've seen (1.14), and the solution from the hotline is to use the RESET button, as they can't figure out why it fails. It occurs on every 6386 that I've worked on, both 16 & 20 MHz desktops and 20 MHz towers.

Paul Rak

(School / Weekend Work)               (Weekday Work)
Paul Rak, Jr.                         Paul Rak, Jr.
Sys Administrator/Lab Assistant       Member of Technical Staff
Joliet Junior College                 EMO Computer Products, Inc.
Academic Computing Center             1701 Quincy Avenue, Suite 22
1216 Houbolt Avenue                   Naperville, IL 60540
Joliet, IL 60536                      (312) 369-1350
(815) 729-9020 x362                   (An AT&T VAR)
chris@cetia4.UUCP (Christian Bertin) (06/22/89)
In article <12044@bloom-beacon.MIT.EDU>, jfc@athena.mit.edu (John F Carr) writes:
(talking about the 'update' daemon)
>
> It doesn't appear to place much load on a system to do this twice
> per minute (perhaps 2 minutes CPU per day of runtime).
>
> --John Carr (jfc@athena.mit.edu)

If you have a large buffer cache (1Mb on my system) and if you do a lot of compiles, or if you do anything that creates large temporary files, you can waste a LOT of time sync'ing files that should never have been written out to disk. The last time I tried to measure this, a 3 hour compile session went down to 2:40 after I changed the sync time from 30 seconds to 2 minutes. At the very least, 'update' should take an optional argument to customize the sync intervals.

Chris
--
Chris Bertin     |  -- CETIA -- 150, Av Marcelin Berthelot, Z.I. Toulon-Est
+33(94)212005    |  83088 Toulon Cedex, France
                 |  inria!cetia!chris
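An 'update'-style loop with a configurable interval, as suggested in the post above, might look roughly like the following. This is a sketch, not the BSD program: the function name update_loop, the optional pass count, and the printed pass total are all assumptions added for illustration and testing.

```shell
#!/bin/sh
# update_loop [INTERVAL [COUNT]]
# Hypothetical stand-in for an update daemon with a tunable interval.
# Calls sync, then sleeps INTERVAL seconds (default 30), and repeats.
# COUNT, if given, stops the loop after that many passes; without it
# the loop runs until the process is killed.
# On exit it prints the number of passes performed.
update_loop() {
    interval=${1:-30}
    count=${2:-}
    passes=0
    while :; do
        sync
        passes=$((passes + 1))
        if [ -n "$count" ] && [ "$passes" -ge "$count" ]; then
            break
        fi
        sleep "$interval"
    done
    echo "$passes"
}
```

Calling `update_loop 120` would give the 2 minute interval mentioned above, trading a longer window of unwritten data for less disk traffic during big compiles.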
mrm@Sceard.COM (M.R.Murphy) (06/23/89)
In article <14506@watdragon.waterloo.edu> hjespersen@trillium.waterloo.edu (Hans Jespersen) writes:
+In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:
+
+>To minimize the risk from power hits and crashes, I add a root cron job
+>that performs a 'sync ; sync' every 10 minutes. I have not been reliably
+                   ^^^^^^^^^^^
+
+Why do people always do this? Running sync twice does nothing that
+running sync once wouldn't do. Remember that 'sync' does NOT guarantee
+that all delayed writes are actually written out to disk. It merely
+guarantees that they are in the queue to be written as soon as possible.
+When you are at a shell prompt running

The reason that people do this comes from heeding the warning in the manual section UPDATE(VIII) that was released 11/1/73 with Version 5 (not System V). update did the sync every 30 seconds. Quoting, albeit without permission, though in hope that none will complain too bitterly,

    BUGS
    With update running, if the CPU is halted just as the sync is executed, a file system can be damaged. This is partially due to DEC hardware that writes zeros when NPR requests fail. A fix would be to have sync temporarily increment the system time by at least 30 seconds to trigger the execution of update. This would give 30 seconds grace to halt the CPU.

The entry for SYNC(II) in the same manual is similar to the entry in a System V manual, except that the sentence, again quoted without permission,

    The writing, although scheduled, is not necessarily complete upon return from the sync.

is present in the System V manual and missing in the Version 5 manual. Since both the Version 5 SYNC(II) entry and the System V SYNC(2) entry suggest that programs such as fsck and df that jerk with file systems should sync, and since the mandatory use of sync before boot is stated, it probably is a good idea to sync before halt or re-boot :-).

The reasoning for two syncs goes like this:

1. Manually type sync. It is scheduled, but doesn't complete all the important work.
2. Shut off or halt the machine just as the really important write is happening or just as some other sync occurs as a result of something like the cron entry suggested above.
3. Be real sad that the disk is corrupted.

or,

1. sync twice with a little time between so that nothing remains to be written and a shut off or halt won't interrupt a write.

+
+# sync
+# sync
+# sync
+
+usually guarantees enough time has passed (since the first sync) that
+the files were written to disk. Running
+
+# sync;sync;sync
+
+is kind of stupid since the first sync is not performed until after
+you have typed the whole line in.
+
+--
+Hans Jespersen
+hjespersen@trillium.waterloo.edu
+uunet!watmath!trillium!hjespersen

Disk controllers change. System software techniques change. Old habits change, but not so easily. If you don't type sync;sleep 3;sync after completing a big edit and before you can back up onto something other than the one flakey disk drive, you've probably never had some[one|thing] crash your system just before you do the backup. And your changes are lost, and you can't remember the brilliant technique that you used in the changes.

Some go so far as to have an entry for user "sync" in /etc/passwd with no password and a shell of /bin/sync, just in case the system is hung so that logins of a more complicated nature can't be performed and a sync is required/desired before pulling the plug. At least it makes you feel better.
---
Mike Murphy  Sceard Systems, Inc.  544 South Pacific St. San Marcos, CA 92069
mrm@Sceard.COM     {hp-sdd,nosc,ucsd,uunet}!sceard!mrm     +1 619 471 0655
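The "sync;sleep 3;sync" habit from the post above can be wrapped in a small shell function. A minimal sketch: the name sync_twice and the adjustable pause are assumptions for illustration, and since sync only schedules the writes, the pause improves the odds rather than guaranteeing anything.

```shell
#!/bin/sh
# sync_twice [PAUSE]
# Hypothetical wrapper: sync, wait PAUSE seconds (default 3) so the
# first batch of scheduled writes has a chance to reach the disk,
# then sync again to catch anything dirtied in the meantime.
sync_twice() {
    pause=${1:-3}
    sync || return 1
    sleep "$pause"
    sync
}
```

A typical use would be `sync_twice` right after a big edit, or `sync_twice 10` before a planned powerdown on a slow disk.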
vjs@calcite.UUCP (Vernon Schryver) (06/23/89)
The continuing discussion of an old fashioned update(1m) daemon for SVR3 is puzzling. SVR3 has the kernel process which ps lists as 'bdflush'. It does better write-behind than a periodic, complete buffer flush flood like the update daemon. In at least some versions, one can tune bdflush's aging parameters.

Separately, some file systems are careful to invalidate buffers containing data for unlinked files when the in-core inode reference count goes to 0. Bdflush may never have a chance to write such disk blocks. That strategy can have a measurable effect on the speed of things like C compilers.

It is irritating as well as unusual to have to say something nice about System V, but truth is truth.

Vernon Schryver
vjs@calcite.uucp
rk@bigbroth.UUCP (rohan kelley) (06/23/89)
In article <483@oglvee.UUCP>, jr@oglvee.UUCP (Jim Rosenberg) writes:
> I look over at the screen one day after a shutdown has gone
> to completion, and what do I see as the final line on the screen:
> Reboot the system now.
> One day they have to move the machine, so they have to unplug
> it. Like good little campers they go through shutdown. They are all ready to
> hit the power switch, and then they see this stern admonition: "Reboot the
> system now." "Oh, OK, well the computer TOLD ME to reboot ..." So they
> reboot, chatter with someone else in the office for a minute, then turn the
> power off ...
>
> All right, so AT&T flubbed this, no big deal, I'll just edit whatever shell
> script has this abortion in it. I look at /etc/rc0. Not there. In

About the only thing you can do is write a shell script in /etc/rc0 like:

    echo "ANYONE WHO FAILS TO IGNORE THE NEXT MESSAGE WILL BE SHOT AT SUNRISE...."

=======================================================================
Rohan Kelley -- UNIleX Systems, Inc. (Systems and software for lawyers)
UUCP: ...{gatech!uflorida,ucf-cs}!novavax!bigbroth!rk (office)
      novavax!mdlbrotr!rk (home)
ATTmail: attmail!bigbroth!rk
3365 Galt Ocean Drive, Ft. Lauderdale, FL 33308 Phone: (305) 563-1504
"Go first class or your heirs will" -somebodyelse
=======================================================================
kdb@chinet.chi.il.us (Karl Botts) (07/22/89)
In article <14401@bfmny0.UUCP> tneff@bfmny0.UUCP (Tom Neff) writes:
>If you think about it for a second, by the time UNIX is totally ready
>to be "rebooted now," you shouldn't have an active file system to
>read scripts from. The kernel is ALL THAT'S LEFT. So the message
>has to go there. (I suppose it could be linked in at rebuild time.)

This is simply not true; the root file system should be and is still mounted at reboot time. When a normal Unix system comes up, it comes up by default in single user mode and the root filesystem is already mounted. You can read and write in the normal manner to the root filesystem, and it is _your_ job to make sure it is synced before you reboot. This is the same state the system is in just before you reboot in the usual shutdown script.

Incidentally, it is not possible for the reboot command itself (when it is available as a software command) to automatically sync the system, because there are times when you _must_ reboot the system without syncing, i.e., after rebuilding the free list with fsck.
pat@orac.pgh.pa.us (Pat Barron) (07/23/89)
In article <8714@chinet.chi.il.us> kdb@chinet.chi.il.us (Karl Botts) writes:
>Incidentally, it is not possible for the reboot command itself (when it
>is available as a software command) to automatically sync the system,
>because there are times when you _must_ reboot the system without
>syncing, i.e., after rebuilding the free list with fsck.

Well, that's easy enough to fix. In 4.3BSD, "reboot" and "halt" are implemented as system calls, and they *do* automatically sync the disks by default. However, you can pass them an option which tells them not to do the sync.

--Pat.
--
Pat Barron
Internet: pat@orac.pgh.pa.us - or - orac!pat@gateway.sei.cmu.edu
UUCP: ...!uunet!apexepa!sei!orac!pat - or - ...!pitt!darth!orac!pat
psfales@cbnewsc.ATT.COM (Peter Fales) (07/23/89)
In article <8708@chinet.chi.il.us>, les@chinet.chi.il.us (Leslie Mikesell) writes:
> Well, a common reason to shut down a 386 machine is to bring it up under DOS
>

I have seen the same thing on my AT&T 6386 WGS. I have formed the habit of always using the reset button and everything seems to work fine.
--
Peter Fales                       AT&T, Room 5B-420
                                  2000 N. Naperville Rd.
UUCP:   ...att!ihlpb!psfales      Naperville, IL 60566
Domain: psfales@ihlpb.att.com     work: (312) 979-8031
tneff@bfmny0.UUCP (Tom Neff) (08/02/89)
Boy, I've been seeing some -v-e-r-y- old news followups and mail replies the last day or so! Was there some major logjam out there?
--
"We walked on the moon --      ((  Tom Neff
 you be polite"                 )) tneff@bfmny0.UU.NET