kerr@wrdis01.af.mil (Kerr) (05/03/91)
I run a Pyramid MIServer 4/2 and have recently had a problem with the root file system filling up (hold on, don't answer just yet) with no visible file that accounts for the missing space. At times there will be 5 megabytes free, and then it vanishes with no apparent cause. The free space reappears just as mysteriously.

I have looked at every file (visible to find/ls/du) in the root file system that changed during the period when the space disappeared, and none of them can account for the loss or gain of space.

Yesterday the local FE was here and watched: the only files changing were the /etc/.*tmp files (they were getting bigger), and poof, there was 1 megabyte more FREE space than minutes earlier.

Has anyone else experienced this? Does any net.person have an explanation? Are there programs that create files with no directory entries that can change size at will? If there are such invisible files, how does one find out what program is creating them?

Thanks for your time...

grant (kerr@oodis01.af.mil)

Oh yes, /tmp/ is a separate file system, we are running OSx 5.0d.
romain@pyramid.pyramid.com (Romain Kang) (05/04/91)
This sounds like some program has opened a temporary file and unlinked it so that it does not appear in the filesystem, while holding the descriptor open. I'm not aware of any standard OSx utilities that do this in the root file system (other than those which use /tmp, and you already said /tmp was a separate partition); perhaps it may be a local application. Your description suggests that there is one large file holding the space, though you didn't say anything about getting inode counts from df.

To find the open file, I would use "pstat" and "fstat". (The latter is not a standard part of OSx but you should be able to find it at a reasonably well stocked public Pyramid source archive; I can send a copy if you don't have it or time to go rummaging.)

"pstat -i" will show the active inodes on the system; look for inodes with large sizes on your root device (8,0 if pdisk00a). Next, weed out the known big files (/vmunix, /etc/.*tmp) according to inode. If you don't see any obvious culprits, then getting fstat won't help.

If you're feeling brave, you can try using "find" with -inum if you can limit the search to directories that don't include mount points. Chances are you won't find the big file this way, since you've looked for it already with the "normal" methods.

At this point, you can run "fstat {rootdev}", where {rootdev} is the block device used to mount your root partition.
The output will look something like this:

    USER    CMD        PID   FD  DEVICE  INODE     SIZE  TYPE  NAME
    root    swapper      0   wd  8, 0        2    10240  dir   /dev/iop/pdisk00a
    root    init         1 text  8, 0     2948    36864  reg   /dev/iop/pdisk00a
    root    init         1   wd  8, 0        2    10240  dir   /dev/iop/pdisk00a
    root    init         1   12  8, 0     4118        0  chr   /dev/iop/pdisk00a
     :
     :
    romain  fstat     9892    0  8, 0     2879        0  chr   /dev/iop/pdisk00a
    romain  fstat     9892    1  8, 0     2879        0  chr   /dev/iop/pdisk00a
    romain  fstat     9892    2  8, 0     2879        0  chr   /dev/iop/pdisk00a
    romain  fstat     9892    3  8, 0     4797        0  chr   /dev/iop/pdisk00a
    romain  fstat     9892    4  8, 0      723  1232896  reg   /dev/iop/pdisk00a
                                               [^^^^^^^ umm, pyramid:/vmunix...]

You can match the inode number in this listing with the one from pstat to find the user and the process holding the file open (if that is indeed the cause of your space problem). Note that you could run fstat alone, and not bother with pstat at all if you already have fstat.

Another classic misoperation is for users to write to a tape device that isn't in the file system (e.g., /dev/rmt9 when rmt0 was intended); the user then notices the error and deletes the file. This probably isn't your problem, since (1) the user would have to be root, and I assume your root users are known and responsible, and (2) if user X catches the error, I don't think he would repeat the same mistake so frequently.

--
"Eggheads unite! You have nothing to lose but your yolks!"
        -Adlai Stevenson
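On the remark about inode counts from df: comparing free blocks with free inodes over time is one way to tell a single growing hidden file from many small ones. What follows is only a minimal sketch, assuming a POSIX system that provides statvfs(3) (which OSx 5.0d may well not have). The names fs_sample, sample_fs and kb_consumed are hypothetical and are not part of any Pyramid utility.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/statvfs.h>

/* One snapshot of the counters that df reports for a file system. */
struct fs_sample {
    unsigned long long free_kb;       /* free space, in kilobytes  */
    unsigned long long total_kb;      /* total space, in kilobytes */
    unsigned long long free_inodes;   /* unallocated inodes        */
    unsigned long long total_inodes;  /* all inodes                */
};

/* Fill *out for the file system containing mount_point.
 * Returns 0 on success, -1 if statvfs() fails. */
int sample_fs(const char *mount_point, struct fs_sample *out)
{
    struct statvfs vfs;

    assert(out != NULL);
    if (statvfs(mount_point, &vfs) != 0)
        return -1;

    /* f_bfree and f_blocks are counted in units of f_frsize bytes. */
    out->free_kb = (unsigned long long)vfs.f_bfree * vfs.f_frsize / 1024;
    out->total_kb = (unsigned long long)vfs.f_blocks * vfs.f_frsize / 1024;
    out->free_inodes = vfs.f_ffree;
    out->total_inodes = vfs.f_files;
    return 0;
}

/* Kilobytes that disappeared between two snapshots
 * (negative if space came back). */
long long kb_consumed(const struct fs_sample *before,
                      const struct fs_sample *after)
{
    return (long long)before->free_kb - (long long)after->free_kb;
}
```

Sampling "/" before and after the space vanishes and passing both snapshots to kb_consumed() gives the loss in kilobytes. If free_inodes barely moves while free_kb drops, one growing file is the likelier explanation; if both drop together, many small files are being created.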
kenj@yarra.oz.au (Ken McDonell) (05/04/91)
I don't have an explanation, but some information that may help and a couple of additional questions.

kerr@wrdis01.af.mil (Kerr) writes:
>I run a Pyramid MIServer 4/2 and have had a problem recently with the root
>file system filling up (hold on don't answer just yet) with no visible file
>that has the amount of space that is missing.
>At times there will be 5 megabytes free and it will vanish with no apparent
>cause. The free space reappears with the same mystery.
>I have looked at every file (visible to find/ls/du) in the root file system
>that has changed during the period when the space disappears and none of
>the files can account for the loss/gain of space.
>Yesterday the local FE was here and watched as the only files changing
>were the /etc/.*tmp files (they were getting bigger) and poof there was
>1 megabyte more of FREE space than minutes earlier.

>Are there programs that create files with no directory entries that can
>change size at will? If there are such invisible files how does one find out
>what program is creating them?

The answer to the first question is "yes", the second is "with great difficulty".

There are 2 ways in which blocks in the file system can be allocated to files that are not accessible through the directory naming structure (and hence cannot be found using find, du, ls -Ra, ...)

1. mounting a file system on a non-empty directory
   + create a 1 Mbyte file in /mnt (blocks allocated in the root file system)
   + mount /dev/iop/pdisk... /mnt
   + you cannot find the file in /mnt any more, but the space is still allocated

   Technically this would be enough to reproduce your scenario, but it would require file systems to be dynamically mounted and unmounted, and the equivalent of fiddling with files in /mnt while the file system is unmounted -- seems a little far fetched?

2. unlinking an open file

    f = creat("/trickyfile", 0644);   /* call me old-fashioned! */
    unlink("/trickyfile");            /* removes name from directory tree */
    do {
        write(f, ....);               /* inode still allocated, so this is OK */
    } while (!bored);                 /* / has been filled up! */
    close(f);                         /* all blocks have been freed again */

   Some programmers really do this to avoid writing signal handlers to clean up temporary files when the program is aborted (exit() will also do the close() if need be). But this would mean you'd need a generally (or group) writeable directory in the root file system, and/or a setuid root or setgid some_group program. Use find(1) to locate all likely candidates!

ps. please let us know what the final diagnosis proves to be.

>(kerr@oodis01.af.mil)
>Oh yes, /tmp/ is a separate file system, we are running OSx 5.0d.

--
Ken McDonell                     E-mail: kenj@pyramid.com kenj@yarra.oz.au
Performance Analysis Group       Phone: +61 3 820 0711
Pyramid Technology Corporation   Disclaimer: I speak for me alone, of course.
Melbourne, Australia
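The unlink-while-open sequence in point 2 can be written out as compilable C. This is a sketch under assumptions, not Ken's code: it uses open(2) with O_EXCL rather than creat(), takes the path as a parameter instead of hard-coding /trickyfile, and the helper names open_unlinked, write_filler and fd_size are made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path`, then immediately remove its name from the directory.
 * Returns an open descriptor for the now-nameless file, or -1 on error. */
int open_unlinked(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    if (unlink(path) != 0) {      /* the inode stays allocated while fd is open */
        close(fd);
        return -1;
    }
    return fd;
}

/* Append `nbytes` of filler to fd.
 * Returns the number of bytes written, or -1 on error. */
long long write_filler(int fd, long long nbytes)
{
    char buf[4096];
    long long done = 0;

    assert(nbytes >= 0);
    memset(buf, 'x', sizeof buf);
    while (done < nbytes) {
        size_t want = sizeof buf;
        ssize_t n;

        if ((long long)want > nbytes - done)
            want = (size_t)(nbytes - done);
        n = write(fd, buf, want);
        if (n <= 0)               /* e.g. ENOSPC once the file system is full */
            return -1;
        done += n;
    }
    return done;
}

/* Size in bytes of the file behind fd, or -1 on error. */
long long fd_size(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

Calling open_unlinked() on a scratch path and then write_filler(fd, 1 << 20) leaves a 1 Mbyte file that ls, find and du cannot see, while fd_size() still reports its size. On most local file systems the blocks go back to the free list once the last descriptor is closed, whether by close() or by process exit.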