[comp.os.vms] problem with file limit

dww@stl.stc.co.uk (David Wright) (01/17/88)

There have been some replies on the problem of "File Limit" errors (met
running LINK) but they leave out some of the "gotchas" that hit you if your
disk is VERY fragmented.  We found out the hard way when one of our VAXes
got into that state, and the people responsible for it would not
image backup/restore the disk (the simplest way to fix the problem)
until a key project finished.  

Problem 1, the obvious one, is lack of disk space - that's easily seen
(show device/mount d is quite useful).

Problem 2 is when you are at the limit of the number of files allowed on the
disk - show device/full will tell you what that is.  You have to INIT the disk
to fix that - nasty!  We have never had that problem, but we do set a large
value when initialising a new disk (look through HELP INIT if you don't know
how).

Problem 3 is that directory files must be contiguous.  If you extend a
directory beyond its current size (i.e. by adding files to it) RMS will find
a new contiguous space to put it in.  If there is one.   If not, you can't
add any more files until you delete some (or rename them to another directory
which is smaller, or not at its allocation limit).  As directories are
usually small, you must be in a bad way to have this problem.  If you do,
it's a warning to do that backup/restore before you hit problem 4!
MAIL is the most likely sufferer as some users have very large mail 
directories.
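
To make problem 3 concrete, here is a minimal sketch in Python of why a
directory can fail to grow on a disk that still has plenty of free blocks.
It is only a toy model: the free-space list, the function names and the
block counts are all made up for illustration, and it ignores the fact that
the directory's old space is released once it has been moved.

```python
def largest_free_run(free_extents):
    """Return the size in blocks of the largest free extent (0 if none)."""
    return max((size for _start, size in free_extents), default=0)


def can_extend_directory(free_extents, new_size_blocks):
    """A contiguous file needs ONE free extent at least as big as itself."""
    return largest_free_run(free_extents) >= new_size_blocks


# Hypothetical badly fragmented disk: 10,000 free runs of 8 blocks each.
fragmented = [(start * 20, 8) for start in range(10_000)]
total_free = sum(size for _start, size in fragmented)

print(total_free)                             # total free blocks on the disk
print(can_extend_directory(fragmented, 8))    # a small directory still fits
print(can_extend_directory(fragmented, 60))   # a big mail directory does not
```

The point of the model is that the total free space never enters the
test; only the largest single free run matters.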

Problem 4 is to do with how RMS links bits of disk space (extents) into files.
The 'index block' (a 512 byte block in INDEXF.SYS) contains the file header
plus space for pointers to about 100 extents (this isn't fixed, I think it 
depends on what else is in the header).  If your file is big and your disk
fragmented, your file may need more than 100 extents - in this case a second
(or third or ...) block is allocated in the disk index file, and these are
linked together as needed.  We had files with over 3000 extents!  Think what
that must do to performance - fortunately the bad ones were mostly backup
listings etc. which were not read much.  HOWEVER some key system files CANNOT be
extended beyond one header block.  This includes PAGEFILE, SWAPFILE, DUMPFILE
and INDEXF.SYS itself.  I guess DEC decided that nobody would ever let such
important files get that fragmented.  Well, some people do!
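
As a rough illustration of the arithmetic, the Python sketch below counts
how many header blocks a file would need for a given number of extents.
The figure of 100 extents per header is just the approximate number quoted
above, not an exact VMS constant, and the function names are hypothetical.

```python
import math

# Approximate figure from the text above; the real number varies with
# what else is stored in the header.
EXTENTS_PER_HEADER = 100


def headers_needed(extent_count, per_header=EXTENTS_PER_HEADER):
    """Number of header blocks needed to map extent_count extents."""
    return max(1, math.ceil(extent_count / per_header))


def fits_single_header(extent_count, per_header=EXTENTS_PER_HEADER):
    """True if the file could be one of the 'one header only' files."""
    return headers_needed(extent_count, per_header) == 1


print(headers_needed(3000))       # one of our 3000-extent files
print(fits_single_header(250))    # a page file in 250 fragments
```

Under that assumption a 3000-extent file needs about 30 header blocks,
and a page file in 250 fragments already breaks the one-header rule.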

A certain inexperienced SYSMGR decided to alter the secondary page file size
by renaming it and rebooting (correct), then creating a new one WITHOUT first
deleting the original.  So the machine got a potential pagefile made up of
tiny fragments.  He then thought he'd finished.  All was well until the
next reboot.  The system refused to reboot - it simply didn't know how to
handle a pagefile with more than one index header block.
That took some fixing!

Remember that INDEXF.SYS is one of the files that can't have more than one
header block in the index file (i.e. in itself).  Once your INDEXF.SYS has
reached the ~100 extent limit you cannot add any more files, even if there is
plenty of disk space and you are well within the MAX FILES limit.  As your
large, badly fragmented files will take up multiple index blocks, things can
get bad quite quickly at this stage.

This situation can be confusing to users.   Because these situations are rare,
many programs simply treat them as "disk full", and the VMS error messages are
a bit ambiguous too ("file limit" looks like "disk full" to most people!).
"How can I get a disk full error when SPACE says there's 80,000 free blocks?"
(SPACE is a local command to say how much space there is left in the user's
quota and disk).
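
One way to keep the four cases apart is to check them in order.  The
Python sketch below does that against a simple description of the disk.
It is only a model of the reasoning in this post: the DiskState fields,
the diagnose_create function and the sample numbers are all hypothetical,
and the 100-extent limit is the approximate figure given earlier.

```python
from dataclasses import dataclass


@dataclass
class DiskState:
    free_blocks: int          # total free blocks on the disk
    largest_free_run: int     # biggest single contiguous free area
    files_in_use: int         # files currently on the disk
    max_files: int            # limit set when the disk was initialised
    indexf_extents: int       # extents currently used by INDEXF.SYS
    indexf_free_headers: int  # spare headers already inside INDEXF.SYS
    indexf_extent_limit: int = 100  # approximate, as described in the post


def diagnose_create(disk, blocks_needed, directory_new_size=0):
    """Say which of the four problems (if any) would stop a file creation."""
    if disk.free_blocks < blocks_needed:
        return "problem 1: not enough free blocks"
    if disk.files_in_use >= disk.max_files:
        return "problem 2: maximum number of files reached"
    if directory_new_size > disk.largest_free_run:
        return "problem 3: no contiguous space to extend the directory"
    if (disk.indexf_free_headers == 0
            and disk.indexf_extents >= disk.indexf_extent_limit):
        return "problem 4: INDEXF.SYS cannot be extended"
    return "ok"


# Made-up state resembling the complaint quoted above.
disk = DiskState(free_blocks=80_000, largest_free_run=8,
                 files_in_use=20_000, max_files=50_000,
                 indexf_extents=100, indexf_free_headers=0)
print(diagnose_create(disk, blocks_needed=50))
```

With these sample numbers the disk has 80,000 free blocks and is well
under its file limit, yet the model still reports problem 4.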

VMS/RMS is pretty resilient, and in fact the system in question carried on
running, and giving a service to up to 35 users (on a VAX 11/750!), reliably
but slowly for about 3 months in the "problem 4" state, until the disk was
at last backup/restored.    But I hope you will learn from our experience,
and never test the above problems on your system!

P.S.  I have reason to suspect that postings to comp.os.vms from here may not
get gatewayed properly onto ARPAnet/BITNET vms mailing lists.  If you see
this on a mailing list, I'd be grateful if you'd mail a reply.  Thanks.
-- 
Regards,
        David Wright           STL, London Road, Harlow, Essex  CM17 9NA, UK
dww@stl.stc.co.uk <or> ...uunet!mcvax!ukc!stl!dww <or> PSI%234237100122::DWW

NETWORK@UCVAX.ULSTER.AC.UK (01/20/88)

Well, David Wright's explanation did make it back this side of the
pond...

As I understand it, certain critical files *must* be contiguous, i.e.
only 1 allocation area. These include INDEXF.SYS and QUOTA.SYS.

If you let your disk get horrendously fragmented you will probably
find you can't set up quota entries for new users as well as other
side effects.

Once you get into this state you're going to HAVE to do the image
backup/restore to fix it. Keeping the disk less than 80% full helps
delay the problem developing. If you're not too bad to start with
it might be worth trying one of the defraggers but I'd suggest
knocking on wood, crossing fingers and doing a backup (just in case)
first...

                            Brian Beesley
                            SCO Data Comms
                            Univ of Ulster