hastings@coherent.com (Reed Hastings) (02/22/90)
Two weeks ago I wrote:

>I have heard that there was an old BSD bug such that if you let your
>disk get more than 90% full you were likely to "lose files" or
>"create garbage inodes" or similar ugly things.

Thanks to all who responded.  Enclosed are the two most informative
answers.

	-Reed.  (hastings@coherent.com)

Date: Tue, 13 Feb 90 16:57:23 EST
From: ames!harvard!genrad.com!jpn (John P. Nelson)
Subject: Re: SunOS file corruption
Organization: GenRad, Inc., Concord, Mass.
UUCP: {decvax,mit-eddie}!genrad!jpn
smail: jpn@genrad.com

Boy.  These things really get corrupted after they are retold five or
six times by people who don't really understand them.  None of the
above things can ever happen.

The Berkeley "Fast Filesystem", first introduced in BSD 4.1c, I
believe, has certain characteristics: when the filesystem becomes 90%
full, performance begins to drop (both for writing, and for reading
files written after the filesystem got that full).  This is because it
becomes harder to find an "optimal" block for the file being written.

When the filesystem gets VERY full (98%-99%), there is the
possibility that the filesystem will not be able to allocate a full
disk block at all.  This is because the tail ends of files are stored
in "block fragments" (a fraction of a full block, 1/8th in the common
configuration).  It is possible that no full disk blocks are available
even though there appears to be free space, because all of that space
is taken up by fragments.  I'm not sure what happens in this
situation, but I believe you get a kernel "panic".  You certainly
don't lose files or get garbage inodes.

Starting with BSD 4.2, the kernel ENFORCED the 90% limit (actually a
tunable parameter, "minfree") for anyone but root.  In other words, if
you attempt to write to a "fast filesystem" disk as an ordinary user
and the disk is already 90% full, you will get an error: ENOSPC.  The
disk APPEARS to be full.
In addition, utilities like "df" report "kbytes free" and "% used"
only AFTER first subtracting that 10%.  So you can use up to 100% of
the space that df reports; after that, the disk appears to be full and
all writes fail.

>Can anyone confirm & clarify this, and most importantly, comment
>on whether it still exists in SunOS 4.x.

This behavior is still there.  You will never see it unless you
attempt to cram your disk past 100% full (as df reports it).  We have
occasionally run disks that are read-only to ordinary users at 108%
full with no problems.

Date: Wed, 14 Feb 90 08:35:53 PST
From: sun!bit!jayl (Jay Lessert)
Subject: Re: SunOS file corruption
Organization: BIT, Portland, OR

Jay Lessert   {ogccse,sun,decwrl}!bit!jayl
Bipolar Integrated Technology, Inc.
503-629-5490   (fax) 503-690-1498

We've never seen anything like this in 3+ years of heavy Sun3/Sun4
usage, and we've filled filesystems many a time.  HOWEVER, there are
at least two extremely serious, NFS-related file corruption problems
in SunOS 4.0.x, and Sun *won't* volunteer the information...  :-)

1) The first we call "NFS read corruption".  An NFS client starts
substituting random chunks of its NFS buffer cache for NFS file
reads.  The actual file contents on the server(s) are OK.  Once it
starts happening (it is sort of a "mode"), it keeps happening randomly
until the client is rebooted.  Amusingly enough, a fastboot(8) won't
do the job; fasthalt(8) followed by a boot is the quickest fix.  This
happens to us about once every two weeks.

2) The second we call "UFS fragment write corruption".  A UFS write
on an NFS server, followed by an NFS read of that file that happens
*before* the UFS write buffer is flushed to disk, can cause the UFS
fragment of the file to be replaced with random chunks of UFS buffer
cache.  In this case the physical file is truly corrupted.  It often
shows up when mail or news spool directories are NFS-mounted, as you
might imagine.

These bugs are present in all 4.0.x versions, through 4.0.3.
There are no patches.  I can dig out the Sun bug IDs if you're
interested.  Sun claims that they will not release 4.1 to production
until these are fixed; we'll see...
--