[comp.sys.sun] Can a filing system overflow?

loki@relay.eu.net (03/06/90)

We are running a WREN-IV as a second disk on a SUN 4/110, and recently some
rather alarming things have been happening, things I have always assumed
to be impossible on a clean filing system.  We have our news on this WREN,
and it has recently been filling up (oh what a surprise!!!).  Fine, I can
accept that.  HOWEVER, I noticed that the first 1k or so bytes of the
history file (the offending file, I assume), were found in one or two
files randomly scattered around about the filing system.  Like a fool I
did not check the inode numbers, but instead took the system down and
fsck'd everything in sight - no errors.

This problem does not seem to happen unless the disk fills up.

Has anyone ever witnessed such a thing?  Is it a bug in NFS?  In WRENs?
Would your advice be `back everything up NOW, and get a new disk'?

Any help much appreciated.

--
   Harry Fearnhamm, ,---.'\   EMAIL: loki@moncam.uucp
    Monotype ADG,  (, /@ )/          ...!ukc!acorn!moncam!loki
    Science Park,    /( _/ ') VOICE: +44 (0)223 420018
     Cambridge,      \,`---'    FAX: +44 (0)223 420911
      CB4 4FQ,           DISCLAIMER: Nothing is True.
      ENGLAND.                       Everything is Permitted.

glenn@uunet.uu.net (Glenn Herteg) (03/10/90)

moncam!loki@relay.eu.net writes:
> ... recently some rather alarming things have been happening, things I
> have always assumed to be impossible on a clean filing system.  ...  I
> noticed that the first 1k or so bytes of the history file (the
> offending file, I assume), were found in one or two files randomly
> scattered around about the filing system.  ...  This problem does not
> seem to happen unless the disk fills up.  Has anyone ever witnessed
> such a thing?

We have a disk here which has an overflow problem.  Certain files, which
we have now captured and caged (renamed as .badXXX), seem to read recently
cached blocks from the host system cache rather than their own disk blocks.  Don't ask
me how it happened, but some sequence of moves in SunOS 4.0 suninstall,
trying to modify disk partition sizes, seems to have caused the file
system to have been created slightly larger than the partition that holds
it.  Unfortunately, there is apparently no check against this in mkfs(8),
and no such check by fsck(8) either.  To see if this is the problem with
your partition, compare the dumpfs(8) "size" field (near the beginning of
the voluminous output) to the number of sectors reported for this
partition by dkinfo(8).  The size field should be exactly half the number
of sectors shown in the dkinfo output.  If it is larger, you have a problem; you'll
need to dump the file system and re-create it using mkfs(8) or newfs(8).
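
The comparison above can be sketched in a few lines of Python.  This is
only a minimal illustration: the function name and its parameters are
hypothetical, the two numbers have to be copied by hand from the dumpfs(8)
and dkinfo(8) output, and it assumes the size field counts 1024-byte units
while sectors are 512 bytes (which is what the "exactly half" rule implies;
other file system parameters would need different values).

```python
def fs_exceeds_partition(fs_size, partition_sectors,
                         frag_bytes=1024, sector_bytes=512):
    """Return True if the file system claims more space than its partition.

    fs_size           -- "size" value copied by hand from dumpfs(8) output
    partition_sectors -- sector count copied by hand from dkinfo(8) output
    frag_bytes        -- assumed bytes per unit of the size field
    sector_bytes      -- assumed bytes per disk sector
    """
    # Compare both quantities in bytes so the unit assumptions are explicit.
    return fs_size * frag_bytes > partition_sectors * sector_bytes


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(fs_exceeds_partition(50000, 100000))  # size is exactly half
    print(fs_exceeds_partition(50008, 100000))  # size is larger than half
```

With these made-up numbers the first call reports no overflow and the
second reports one; a True result would correspond to the "dump and
re-create" case described above.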

If you use dump(8) on a file system with this problem, you may get a
number of "(This should not happen)" messages as it tries to read the
offending files.

The problem only shows up when the file system gets full, because that's
when the tail end of the free list (containing these bogus blocks) finally
gets accessed, and the phantom blocks get allocated to files.  It has
nothing to do with brand of disk (ours is a Sun 327-MB drive, for
instance).

I suspect this problem may have been responsible for a number of strange
disk behaviors reported by other users over the years (an old Sun-Spots
article by Steve Simmons comes to mind, for instance); the problems may
have been blamed on disks or disk controllers instead of UNIX software.