jcm@mtunb.ATT.COM (was-John McMillan) (08/03/89)
In article <802@bagend.UUCP> jan@bagend.UUCP (Jan Isley) writes:
>In article <9832@csli.Stanford.EDU> crimmins@csli.stanford.edu (Mark Crimmins) writes:
>>This has happened to me a couple of times, and I wonder if anyone
>>knows why. I turn on my 3b1 (3.5M 67HD rev. 3.5 sys and utils) and it
>>goes through the normal boot procedure until the "checking stored
>>files" screen turns to gibberish and then the boot procedure starts
                ^^^^^^^^^^^^^^^^^^
>>over (and over and over). The problem goes away when I "upgrade" all
>>system files from floppy, including utilities.
:
>FSCK has a nasty habit of saying it fixed a problem it found in the file
                           ^^^^^^^^^^^^^^^
>system when it really did not fix it. You know it found a problem if the
>system does a reboot after the "checking stored files" routine. Usually
>the system will come up after the second time through. But, sometimes
>the problem was not really fixed, and fsck will *never* be able to fix
>it on a mounted file system.
:

Maybe someone has clarified this by now. (Or maybe I'm missing the
exact scenario: I get lost in the technical jargon of "turns to
gibberish".) (Well, I guess I understand that term regarding a
friend's daughter....)

Occasionally, you get folks who miss the point that you can fool the
File System some of the time -- but not ALL of the time:

+ WRITING A FILE uses/consumes/alters the File System FREE-LIST.
+ FSCK (often) re-writes the File System FREE-LIST.

Ergo:
+ It's a less than brilliant move to have FSCK write a log FILE on
  the File System it's manipulating -- this intrinsically attempts
  to alter the very data fsck's correcting.

Example:
+ If the Free List is corrupted -- perhaps it was even the CAUSE of
  the crash -- the FSCK log file is building an INODE (file) using
  that corrupted data, and building it in RAM while the DISK is
  being fixed. Then it gets moved to the disk....
+ Or, maybe, that INODE is written to disk, but the Superblock, as
  created by FSCK, still marks some of those now-used blocks as
  FREE.... Time for "Duplicate BLOCKS" (or whatever).

I DON'T know of any FSCK errors -- FSCK probably DOES correct the
problems. But SOMEONE wrote an RC script that corrupts the data AS IT
IS BEING CORRECTED.

This has been discussed, here, many times. It will be discussed many
more. I have requested this be fixed w/in AT&T sources.

The only correction is to ELIMINATE any saving of FSCK
output IN A FILE on the same FS being checked. Period.

So far, I trust FSCK far more than most C programmers I know !-)

john mcmillan -- att!mtunb!jcm
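The rule stated in the post above (never save fsck output in a file on the filesystem being checked) could be enforced by a guard in a boot script. What follows is only a minimal sketch in POSIX shell, not anything from the AT&T sources. It assumes a modern `df -P`, which the 3b1's own `df` may not provide, and the helper names `mount_point_of`, `same_filesystem` and `safe_log_target` are hypothetical, introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from the 3b1 distribution.
# Assumes POSIX `df -P`, whose last column is the mount point.
# Mount points containing spaces are not handled.

# Print the mount point of the filesystem holding path $1.
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# Succeed if paths $1 and $2 live on the same mounted filesystem.
same_filesystem() {
    [ "$(mount_point_of "$1")" = "$(mount_point_of "$2")" ]
}

# $1 = mount point being checked, $2 = directory proposed for the log.
# Prints where fsck output may safely go: the proposed directory if it is
# on a different filesystem, otherwise the console (no file at all).
safe_log_target() {
    if same_filesystem "$1" "$2"; then
        echo /dev/console
    else
        echo "$2"
    fi
}
```

On a machine with a single filesystem, `safe_log_target / /etc` prints `/dev/console`, which matches the advice in the post: when the only place to put the log is the filesystem under repair, do not write a log file.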
jan@bagend.UUCP (Jan Isley) (08/04/89)
In article <1582@mtunb.ATT.COM> jcm@mtunb.UUCP (John McMillan) writes:
>In article <802@bagend.UUCP> jan@bagend.UUCP (Jan Isley) writes:
>>In article <9832@csli.Stanford.EDU> crimmins@csli.stanford.edu (Mark Crimmins) writes:

To summarize:

Mark describes a problem...
I offer a simple suggestion to fix his problem...
John offers a series of comments that are, IMHO, quite clearly designed
to make himself look totally enlightened about Mark's problem and
anything else one could possibly think of, while adding smug commentary
about our use of the wonderfully rich English language.

John then describes what I understand to be a correct assessment of the
behavior of fsck, far better than I did of course, then offers:

> The only correction is to ELIMINATE any saving of FSCK
> output IN A FILE on the same FS being checked. Period.

This is *exactly* what my suggestion does. I offered a way to do this.
Where is your suggestion? What is your point?

John, you may be smarter than the average bear, but the bears that I
have met have had better manners.

Jan
---
jan@bagend | gatech!bagend!jan | h (404)434-1335 | w (404)425-5700
Humankind cannot bear very much reality. T. S. Eliot
wilkes@mips.COM (John Wilkes) (08/05/89)
In article <822@bagend.UUCP> jan@bagend.UUCP (Jan Isley) writes:
>In article <1582@mtunb.ATT.COM> jcm@mtunb.UUCP (John McMillan) writes:
>>In article <802@bagend.UUCP> jan@bagend.UUCP (Jan Isley) writes:
>>>In article <9832@csli.Stanford.EDU> crimmins@csli.stanford.edu (Mark Crimmins) writes:
>
>To summarize:

[summary, flame, and whining deleted]

I was going to send private e-mail to Jan, but against my better
judgement decided to flame him publicly.

Jan, your simple suggestion to fix Mark's problem is certainly useful;
however, I believe John's description of the problem was concise and
also useful. You did not explain the nature of the problem at all,
merely offered a way to solve it. John did not really offer a concrete
solution, but he did describe what is going on when you redirect the
output of fsck to the file system being checked. He also suggested
that he's made some sort of "official" request within the bowels of ATT
to have this addressed by those who maintain the sources. Both
commentaries have value.

What's your problem? I did not infer that John was saying your
solution was incorrect in any way; did you? Were your feathers ruffled
somehow? Are you that thin-skinned? You will not find satori that
way. Your response sure sounded like a personal attack to me. If you
and John have some personal history that has now flared up in public,
please put it back in a box or pick nits somewhere else. This is not
appropriate for the unix-pc groups (not that I stake any claim to being
a net.policeman.)

Jan, your manners are every bit as bad as John's (and mine aren't
visibly better, either.)

Take note that followups have been directed to alt.dev.null and that I
will post no more on your or John McMillan's personal problems. I will
respond to private e-mail, however.

--
-wilkes
wilkes@mips.com -OR- {ames, decwrl, pyramid}!mips!wilkes
jbm@uncle.UUCP (John B. Milton) (08/07/89)
In article <1582@mtunb.ATT.COM> jcm@mtunb.UUCP (John McMillan) writes:
>In article <802@bagend.UUCP> jan@bagend.UUCP (Jan Isley) writes:
>>In article <9832@csli.Stanford.EDU> crimmins@csli.stanford.edu (Mark Crimmins) writes:
>>>This has happened to me a couple of times, and I wonder if anyone
>>>knows why. I turn on my 3b1 (3.5M 67HD rev. 3.5 sys and utils) and it
>>>goes through the normal boot procedure until the "checking stored
>>>files" screen turns to gibberish and then the boot procedure starts
>                ^^^^^^^^^^^^^^^^^^
>>>over (and over and over). The problem goes away when I "upgrade" all
>>>system files from floppy, including utilities.
>:

Hmm. Turn to gibberish. I would read that to mean the binary count
pattern the boot ROM puts up when it tests video RAM.

I modified my /etc/rc a long time ago.

Yeah yeah. Well, let's take a closer look. Let's look at the relevant
code without the bogus comments.

    /etc/fsck -pq > /dev/null || (
        if [ -r /etc/.installdate ]; then
            date > /etc/.lastfsck
            /etc/fsck -y >> /etc/.lastfsck
        else
            /bin/sh
        fi
    )

The -p for preen switch seems to be unique to the UNIXpc. The man page
for fsck specifically mentions this feature should be used in /etc/rc
for un-attended booting. Booting does not always happen when the stupid
comments say it will. When the "fsck -pq" finds and fixes minor
problems it WILL reboot the system. The vast majority of all file
system problems are minor. When minor fixes are complete and the system
reboots, the || part obviously never gets run. If the "fsck -pq" finds
something real nasty, it exits with a bad status and the || part does
get executed. If you are just now installing the system, the file
.installdate will not exist, and it will just dump you at a shell prompt
with scary error messages. If your system has been installed, the
redirection to /etc/.lastfsck is done with an "fsck -y". I do very much
agree that this is very stupid.

The correct way to do this would have been to overwrite a pre-existing,
pre-allocated file, much the same way /lost+found is used, using a
special switch to fsck, say -L. If too much is written, output just
stops.

The way it is with the installed /etc/rc though, when things are really
bad, /etc/rc makes it worse. Hmm. Maybe bad enough for a, ah, service
call?

John
--
John Bly Milton IV, jbm@uncle.UUCP, n8emr!uncle!jbm@osu-cis.cis.ohio-state.edu
(614) h:294-4823, w:785-1110; N8KSN, AMPR: 44.70.0.52; Don't FLAME, inform!
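One way to follow the "no log file on the filesystem being checked" advice from this thread is to drop the redirection to /etc/.lastfsck and let "fsck -y" report to the console. The sketch below is a possible rewrite of the /etc/rc fragment quoted in the post above; it is not the fix requested within AT&T, and it has not been tried on a 3b1. Only the "-pq" and "-y" invocations and the /etc/.installdate test come from the post. The variables `FSCK` and `INSTALL_MARKER` and the function name `check_filesystems` are hypothetical, added so the logic can be exercised with a stand-in command instead of a real fsck.

```shell
#!/bin/sh
# Hypothetical knobs (not in the original /etc/rc): they default to the
# paths used in the quoted fragment but can be overridden for a dry run.
FSCK=${FSCK:-/etc/fsck}
INSTALL_MARKER=${INSTALL_MARKER:-/etc/.installdate}

check_filesystems() {
    # Quiet preen pass first, as in the quoted fragment. $FSCK is left
    # unquoted on purpose so it may hold a command plus arguments.
    if $FSCK -pq > /dev/null; then
        return 0
    fi

    if [ -r "$INSTALL_MARKER" ]; then
        # Installed system: run the full repair, but write NO log file.
        # Output goes wherever stdout points (the console at boot), so
        # nothing is allocated on the filesystem being repaired.
        $FSCK -y
    else
        # Mid-installation: drop to a shell, as the original does.
        /bin/sh
    fi
}
```

The trade-off is that the repair report is not kept anywhere once it scrolls off the screen. The pre-allocated log file suggested in the post would keep a record without allocating new blocks, but it relies on an fsck switch that, as the post says, does not exist.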