[comp.sys.pyramid] Root File System Invisible File

kerr@wrdis01.af.mil (Kerr) (05/03/91)

I run a Pyramid MIServer 4/2 and have had a problem recently with the root
file system filling up (hold on, don't answer just yet) with no visible file
that accounts for the amount of space that is missing.

At times there will be 5 megabytes free and it will vanish with no apparent
cause. The free space reappears with the same mystery.

I have looked at every file (visible to find/ls/du) in the root file system
that has changed during the period when the space disappears, and none of
the files can account for the loss/gain of space.

Yesterday the local FE was here and watched as the only files changing
were the /etc/.*tmp files (they were getting bigger) and poof there was
1 megabyte more of FREE space than minutes earlier.

Has anyone else experienced this? Does any net.person have an explanation?
Are there programs that create files with no directory entries that can
change size at will? If there are such invisible files, how does one find out
what program is creating them?

Thanks for your time...

grant
(kerr@oodis01.af.mil)

Oh yes, /tmp/ is a separate file system, we are running OSx 5.0d.

romain@pyramid.pyramid.com (Romain Kang) (05/04/91)

This sounds like some program has opened a temporary file and unlinked
it so that it does not appear in the filesystem, while holding the
descriptor open.  I'm not aware of any standard OSx utilities that do
this in the root file system (other than those which use /tmp, and you
already said /tmp was a separate partition); perhaps it may be a local
application.  Your description suggests that there is one large file
holding the space, though you didn't say anything about getting inode
counts from df.

To find the open file, I would use "pstat" and "fstat".  (The latter
is not a standard part of OSx but you should be able to find it at a
reasonably well stocked public Pyramid source archive; I can send a
copy if you don't have it or time to go rummaging.)

"pstat -i" will show the active inodes on the system; look for inodes
with large sizes on your root device (8,0 if pdisk00a).  Next, weed out
the known big files (/vmunix, /etc/.*tmp) according to inode.  If you
don't see any obvious culprits, then getting fstat won't help.  If you're
feeling brave, you can try using "find" with -inum if you can limit the
search to directories that don't include mount points.  Chances are you
won't find the big file this way, since you've looked for it already
with the "normal" methods.

At this point, you can run "fstat {rootdev}", where {rootdev} is the
block device used to mount your root partition.  The output will look
something like this:
USER     CMD          PID   FD DEVICE  INODE    SIZE TYPE  NAME
root     swapper        0   wd  8,  0      2   10240 dir   /dev/iop/pdisk00a
root     init           1 text  8,  0   2948   36864 reg   /dev/iop/pdisk00a
root     init           1   wd  8,  0      2   10240 dir   /dev/iop/pdisk00a
root     init           1   12  8,  0   4118       0 chr   /dev/iop/pdisk00a
 :
 :
romain   fstat       9892    0  8,  0   2879       0 chr   /dev/iop/pdisk00a
romain   fstat       9892    1  8,  0   2879       0 chr   /dev/iop/pdisk00a
romain   fstat       9892    2  8,  0   2879       0 chr   /dev/iop/pdisk00a
romain   fstat       9892    3  8,  0   4797       0 chr   /dev/iop/pdisk00a
romain   fstat       9892    4  8,  0    723 1232896 reg   /dev/iop/pdisk00a
					    [^^^^^^^ umm, pyramid:/vmunix...]

You can match the inode number in this listing with the one from pstat
to find the user and the process holding the file open (if that is
indeed the cause of your space problem).  Note that you could run fstat
alone, and not bother with pstat at all if you already have fstat.

Another classic misoperation is for a user to write to a tape device name
that doesn't exist in /dev (e.g., /dev/rmt9 when rmt0 was intended), which
creates an ordinary file in the root file system; the user then notices the
error and deletes the file.  This probably isn't your problem, since (1) the
user would have to be root, and I assume your root users are known and
responsible, and (2) if user X catches the error, I don't think he would
repeat the same mistake so frequently.
--
"Eggheads unite!  You have nothing to lose but your yolks!"  -Adlai Stevenson

kenj@yarra.oz.au (Ken McDonell) (05/04/91)

I don't have an explanation, but some information that may help and a couple
of additional questions.

kerr@wrdis01.af.mil (Kerr) writes:

>I run a Pyramid MIServer 4/2 and have had a problem recently with the root
>file system filling up (hold on, don't answer just yet) with no visible file
>that accounts for the amount of space that is missing.
>At times there will be 5 megabytes free and it will vanish with no apparent
>cause. The free space reappears with the same mystery.

>I have looked at every file (visible to find/ls/du) in the root file system
>that has changed during the period when the space disappears, and none of
>the files can account for the loss/gain of space.

>Yesterday the local FE was here and watched as the only files changing
>were the /etc/.*tmp files (they were getting bigger) and poof there was
>1 megabyte more of FREE space than minutes earlier.

>Are there programs that create files with no directory entries that can
>change size at will? If there are such invisible files, how does one find out
>what program is creating them?

The answer to the first question is "yes", the second is "with great
difficulty".

There are 2 ways in which blocks in the file system can be allocated to
files that are not accessible through the directory naming structure
(and hence cannot be found using find, du, ls -Ra, ...)

1. mounting a file system on a non-empty directory

    + create a 1 Mbyte file in /mnt (blocks allocated in the root file system)
    + mount /dev/iop/pdisk... /mnt
    + you cannot find the file in /mnt any more, but the space is still
      allocated

   Technically this would be enough to recreate your scenario, but
   this would require file systems to be dynamically mounted and
   unmounted, and the equivalent of fiddling with files in /mnt when
   the file system is unmounted -- seems a little far fetched?

2. unlinking an open file

    + char buf[8192];                   /* contents don't matter */
      int f = creat("/trickyfile", 0644);   /* call me old-fashioned! */
      unlink("/trickyfile");            /* removes name from directory tree */

      do {
            write(f, buf, sizeof buf);  /* inode still allocated, so this
                                           is OK */
      } while (!bored);

      /* / has been filled up! */
      close(f);
      /* all blocks have been freed again */

   Some programmers really do this to avoid writing signal handlers
   to clean up temporary files when the program is aborted (exit() will
   also do the close() if need be).

   But this would mean you'd need a generally (or group) writeable
   directory in the root file system, and/or a setuid root or setgid
   some_group program.  Use find(1) to locate all likely candidates!


ps. please let us know what the final diagnosis proves to be.

>(kerr@oodis01.af.mil)

>Oh yes, /tmp/ is a separate file system, we are running OSx 5.0d.
-- 
Ken McDonell			  E-mail:     kenj@pyramid.com kenj@yarra.oz.au
Performance Analysis Group	  Phone:      +61 3 820 0711
Pyramid Technology Corporation	  Disclaimer: I speak for me alone, of course.
Melbourne, Australia