dennis%cod@NOSC.MIL.UUCP (02/04/87)
The disk on my DN3000 (SR9.2.3, IX9.2.3, TCP2.1) has been gradually filling up (down to ~2 Mbytes free as reported by LVOLFS), and response time seems to have slowed as well. After the node had been "up" for several weeks, I had to reboot it for an unrelated reason, and noticed that I had recovered 10 Mbytes on the disk! The machine seems faster, too.

As a check, we did a /SYSTEST/LST on a node that had been up for about 45 days, rebooted, and "found" another 5.5 Mbytes. But a subsequent LST could account for only a few tens of Kbytes in the user-visible file system.

Where does all this disk space disappear to? How can I get it back without rebooting?

Dennis Cottel
Naval Ocean Systems Center, San Diego, CA 92152
(619) 225-2406
dennis@NOSC.MIL   sdcsvax!noscvax!dennis
Erstad@HI-MULTICS.ARPA.UUCP (02/05/87)
There are a couple of places where storage seems to mysteriously disappear. The first place (aside from formatting losses) is the VTOC (Volume Table Of Contents?). The total size shown by LVOLFS takes into account formatting losses, but not the VTOC. This is around 5% or so, and depends in part on the parameters used in INVOLing the disk.

There is also file space which is not catalogued and thus is not "user visible". Most of this hangs around only for the duration of a process (transcripts, process paging, etc.) and should not be a long-term effect, although for a given task this can be a problem.

Another common thing to watch out for is excessive use of the EDACL command, which creates a new ACL object each time; EDACLing an entire tree with 10K objects uses 10MB! Use ACL to assign ACLs (EDACLing only a "template") and/or use SALACL to merge ACL structures. If you have processes which die horribly on a frequent basis, a FIND_ORPHANS might help.

The above explains only where disk space "disappears" to. These are things to watch out for, but they won't cause disk space to be "found" on rebooting. A little is normal (if you SALVOL, the amount you get is a meg or two lower than what the node will show you when booted), but we have never seen an (unexplained) loss of free space, and our site has 40+ Apollos of almost every type. If I had to guess (and this is ONLY a guess), you may have an application which is creating uncatalogued permanent objects.

Does the disk space reappear with a boot only, or with a SALVOL/boot combination only? If the latter, look at the SALVOL report and see if it tells you anything.

HI-MULTICS
"Disclaimer: My employer doesn't believe a word I say"
peterson@UTAH-CS.ARPA.UUCP (02/05/87)
Unlike native Unix boxes, Apollos do not have an explicit swap area. Instead, user paging space is allocated from the general pool of free disk space. If a diskless node crashes or a process dies abnormally (e.g., is killed with "sigp -blast" or vanishes with a "process not found"), space used by the process may not be reclaimed. The result is that over a period of weeks disk space leaks away. The best way to reclaim this space is to take the node down and run SALVOL.

There is another program, find_orphans, that tracks down objects that are allocated on the disk but don't have a directory entry (these can also result from crashes or abnormal process terminations). However, only run find_orphans when there is no activity on the disk. Otherwise, it might decide a file is an orphan if it finds it while it's being created.

There is a major advantage to this way of allocating swap space - it makes adding or removing diskless partners a fairly trivial process. (If you've ever watched a Sun administrator reformat a disk all night to build new swap partitions you'll know what I mean. I think Sun is switching to the global paging pool scheme in their next release...)
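
To make the find_orphans caveat concrete, here is a toy model in Python. It is only a sketch of the idea, not Apollo code: the function name, the object IDs, and the assumption that an object is allocated before its directory entry is written are all invented for illustration. An "orphan" is modeled as anything allocated on the disk that has no directory entry.

```python
def find_orphans(allocated_objects, directory_entries):
    """Return the IDs that are allocated but have no directory entry.

    Toy model only: both arguments are plain collections of made-up
    object IDs, not anything read from a real volume.
    """
    return sorted(set(allocated_objects) - set(directory_entries))


# Quiet disk: obj3 and obj4 were left behind by a crashed process.
allocated = {"obj1", "obj2", "obj3", "obj4"}
cataloged = {"obj1", "obj2"}
orphans = find_orphans(allocated, cataloged)

# Busy disk: obj5 is a file in the middle of being created. In this
# model (an assumption) it has been allocated, but its directory
# entry has not been written yet, so the scan flags it as well.
allocated_during_create = allocated | {"obj5"}
flagged_while_busy = find_orphans(allocated_during_create, cataloged)

print("orphans on a quiet disk:", orphans)
print("flagged on a busy disk: ", flagged_while_busy)
```

On the quiet disk only the two genuinely abandoned objects are reported. On the busy disk the half-created file shows up in the same list, which is the reason to run the scan only when nothing else is touching the disk.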