edmund@turtlevax.UUCP (Ed Trujillo) (02/07/85)
Why does the following scenario happen under 4.2 running on a vax 750?

At boot time, fsck finds out that the root file system was modified so
it exits with a condition code of 4 that is passed on to /etc/rc which
immediately does an /etc/reboot -n. According to the documentation for
reboot the -n option avoids the sync. Why then does fsck do a sync()
before the call to exit(4) ??? Is there a logical reason for this?

BTW, this apparent bug didn't appear in the 4.2 buglist.
-- 
Ed(mund) Trujillo @ CADLINC, Menlo Park, CA
{amd,decwrl,nsc,seismo,spar}!turtlevax!edmund
tim@callan.UUCP (Tim Smith) (02/09/85)
In article <649@turtlevax.UUCP> edmund@turtlevax.UUCP (Ed Trujillo) writes:
>reboot the -n option avoids the sync. Why then does fsck do a sync()
>before the call to exit(4) ??? Is there a logical reason for this?

The above is for 4.2bsd, so what I say may be wrong. On Sys V it goes
like this....

When fsck is run on the root, the cooked (block) device is used. This is
because of the way fsck determines when it is doing root. So fsck must
do a sync() at the end to make sure the changes actually reach the disk.

This can cause problems. If there is a problem with the inode for the
console, fsck will fix it on the disk, but since fsck is using the
console, the modify time on the in-core copy of the inode gets changed,
and so the final sync() writes that in-core copy out over the repaired
one, putting you right back where you started!

Here at Callan, we changed fsck to NOT do the final sync when doing a
raw device. Then problems like the above can be fixed by using fsck on
the raw root.
-- 
Duty Now for the Future
Tim Smith
ihnp4!wlbr!callan!tim or ihnp4!cithep!tim
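[Editor's sketch] The console-inode problem described in this post can be illustrated with a toy model. This is not Sys V code: the class ToySysV, the function fsck, and the strings "damaged" and "repaired" are hypothetical names chosen for illustration. The model collapses the buffer cache and assumes, as the post implies, that the in-core inode is the last thing the final sync() writes.

```python
class ToySysV:
    """Toy model: one console inode, on disk and in core (hypothetical)."""

    def __init__(self):
        self.disk_inode = "damaged"    # console inode as stored on disk
        self.incore_inode = "damaged"  # kernel's copy, held while the console is open

    def sync(self):
        # The console is in use, so the in-core inode has a fresh modify
        # time and is treated as dirty: sync writes it over the disk copy.
        self.disk_inode = self.incore_inode


def fsck(system, final_sync):
    """Repair the on-disk inode, then optionally do the final sync."""
    system.disk_inode = "repaired"
    if final_sync:
        system.sync()  # stock behaviour described in the post
    return system.disk_inode


if __name__ == "__main__":
    print("with final sync:   ", fsck(ToySysV(), final_sync=True))
    print("without final sync:", fsck(ToySysV(), final_sync=False))
```

In this model the run with the final sync ends with the damaged inode back on disk, while the run without it (corresponding to the Callan change for the raw device) keeps the repair.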
steveg@hammer.UUCP (Steve Glaser) (02/10/85)
In article <649@turtlevax.UUCP> Ed(mund) Trujillo @ CADLINC, Menlo Park writes:
>Why does the following scenario happen under 4.2 running on a vax 750?
>
>At boot time, fsck finds out that the root file system was modified so
>it exits with a condition code of 4 that is passed on to /etc/rc which
>immediately does an /etc/reboot -n. According to the documentation for
>reboot the -n option avoids the sync. Why then does fsck do a sync()
>before the call to exit(4) ??? Is there a logical reason for this?

4.2 fsck uses the cooked device on the fsck of the root filesystem.

When fsck is done, if it had to modify the superblock of the filesystem
there will be TWO copies of the superblock around in kernel memory. One
is the copy kept by the kernel because the file system is mounted, and
the other is a normal block in the block buffer cache due to the write
by fsck when it fixed the superblock.

The sync(2) system call will write out BOTH of these copies (and always
has). The trick on 4.2 (4.1 too?) is that the kernel makes sure that the
block buffer version gets written out *after* the other one. Thus the
sync inside fsck is correct and gets the updated stuff onto the disk.

You must then avoid all syncs until the reboot, because the copy in the
block buffer cache is no longer marked dirty (the sync in fsck wrote it
out and nobody has changed it since then). Thus another sync would write
out only the wrong copy of the superblock onto disk, undoing some of the
work that fsck just did.

Summary: this is a case where one sync (inside fsck) is correct and more
than one will undo some of the work fsck just did for you.

Disclaimer: I'm not saying that I *like* this scheme. It works, but
seems kinda fragile. At bare minimum, it should be documented and the
"new expanded" semantics of sync(2) should be guaranteed by all future
systems.

Steve Glaser
tektronix!steveg
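[Editor's sketch] The ordering argument in this post can be traced with a small simulation. It is a toy model under the assumptions stated in the post, not 4.2BSD kernel code: the names ToyKernel, cooked_write, and run, and the strings "stale" and "fixed", are hypothetical. It assumes every sync writes the in-core superblock first and then writes the buffer-cache copy only if that copy is still dirty.

```python
class ToyKernel:
    """Toy model of the two superblock copies described above (hypothetical)."""

    def __init__(self, on_disk):
        self.disk = on_disk        # superblock contents on disk
        self.incore = on_disk      # copy kept because the root is mounted
        self.buffer = None         # buffer-cache copy of the same block
        self.buffer_dirty = False

    def cooked_write(self, data):
        """fsck writes the superblock through the cooked device."""
        self.buffer = data
        self.buffer_dirty = True

    def sync(self):
        """Write the in-core copy first, then the dirty cache block (if any)."""
        self.disk = self.incore
        if self.buffer_dirty:
            self.disk = self.buffer
            self.buffer_dirty = False


def run(extra_syncs):
    """Return what is on disk after fsck plus a number of later syncs."""
    k = ToyKernel("stale")
    k.cooked_write("fixed")  # fsck repairs the superblock
    k.sync()                 # the sync inside fsck
    for _ in range(extra_syncs):
        k.sync()             # any sync before the reboot, e.g. reboot without -n
    return k.disk


if __name__ == "__main__":
    print("fsck sync only:     ", run(0))
    print("one additional sync:", run(1))
```

With no further syncs the fixed superblock stays on disk; a single additional sync rewrites the stale in-core copy over it, which is why /etc/rc follows fsck with reboot -n.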
Ron Natalie <ron@BRL-TGR> (02/10/85)
It's not a bug. The sync doesn't kill you. When working on "hotroot",
fsck uses the cooked device. The sync is desirable in this instance as
it forces the changed superblock, etc. back to the disk.

-Ron