[comp.os.vms] RMS File stuff

mac@gvg10.UUCP (05/26/88)

> Although I have read the "Guide to VAX/VMS File Applications" manual, I am
> still unclear about the use of the multibuffer count.
> I shall pose my questions below, but any general comments on this topic,
> helpful hints etc. are greatly appreciated.
>  
> My application may or may not open several indexed (or positional) files
> at execution time.  Should I specify the number of buffers at this time,
> or use the value which can be set from DCL with the command
>  
>     $ set rms/indexed/buffers=n
> 
> Furthermore, does the above cause n more buffers to be allocated for 
> each file I open (this seems like a Bad Thing in an arbitrary multi-user 
> environment). 
> 
> Alternatively, if I specify this number in the RAB, is it cumulative?
> That is, does the number of buffers continue to mount as more files
> are opened. 
> 

    Setting the multi-buffer count is a good thing to do for indexed files.
    Yes, the two ways to do it are in the RAB when you open the file and
    from DCL.  No, a RAB is file-specific, and the value you set for one
    file will not affect the other files that you open.  Yes, if you do it
    from DCL you will affect all subsequent file opens.
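
    As a concrete illustration, here is a rough sketch of the RAB route in
    C.  This is not code from our application; the file name, the count of
    8, and the routine name are made up, and most error handling is left
    out:

        #include <rms.h>        /* FAB/RAB definitions, cc$rms_* templates */
        #include <starlet.h>    /* sys$open, sys$connect                   */
        #include <string.h>

        static struct FAB fab;
        static struct RAB rab;

        /* Open an indexed file for read and ask RMS for 8 buckets worth
         * of cache on this record stream only.
         */
        int open_with_buffers(void)
        {
            int status;

            fab = cc$rms_fab;                  /* start from RMS defaults */
            fab.fab$l_fna = "history.dat";     /* made-up file name       */
            fab.fab$b_fns = strlen("history.dat");
            fab.fab$b_fac = FAB$M_GET;         /* read access             */
            fab.fab$b_shr = FAB$M_SHRGET;      /* allow other readers     */

            status = sys$open(&fab);
            if (!(status & 1)) return status;

            rab = cc$rms_rab;                  /* start from RMS defaults */
            rab.rab$l_fab = &fab;
            rab.rab$b_mbf = 8;                 /* the per-file knob       */

            return sys$connect(&rab);
        }

    The point is just that RAB$B_MBF travels with that one FAB/RAB pair;
    nothing you do here touches any other file the image has open.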
    
    In the best of all possible worlds, a world in which you have an
    infinite amount of time, you would examine each file you were going to
    open, decide how much of it you wanted to keep in memory, and set the
    multi-buffer count in the RAB at file open.  In our application this is
    just not practical: our integrated manufacturing and financial
    application has a data base of 300-400 files and 600-800 separate
    functions to maintain and report against that data base.  Typically,
    our larger functions open 20-30 files concurrently.  So what we do is
    worry about the worst offenders, namely the files that by their size or
    usage are the most critical to our processing. 
    
    We have not set a multi-buffer count in a RAB yet; what we do is spend
    lots of time looking at a few FDLs.  Take, for instance, our Sales
    History Detail file, which is currently just under 400k blocks.  We
    have split the file into two pieces, recent history and past history.
    The recent history gets updated daily with the daily order activity;
    the past history is static.  The recent history has an FDL that
    specifies an allocation about 20% larger than needed, to make additions
    to the file fast and to prevent bucket splits.  The past history file
    has no extra space in it; all the fill factors are 100%.
    
    We do set the multi-buffer count from DCL, but only for batch
    processes.  We have found that this is a quick and dirty way to improve
    throughput 5-10 percent without any analysis other than setting the
    value to something ``reasonable''.  Doing this consumes memory, but in
    our environment most batch jobs run at night when the users are asleep.
    Also, the maximum number of concurrent batch jobs is significantly less
    than the maximum number of concurrent interactive jobs.
    
    The general problem with setting the multi-buffer count globally for a
    job is deciding what is ``reasonable''.  We set it to 20; this improves
    throughput and does not exceed the UAF quotas (I can't remember which
    one bit us right now, but if I had to guess I would say page file
    quota).  Note that you can set this value too high and actually
    decrease throughput by forcing long sequential searches of in-memory
    buckets.
    
> Should I have some upper limit for each user coded into my program? 
    
    No one can answer this question definitively for you, but I would think
    that for most applications it would be a waste of time.  Consider our
    order entry example again.  When the order entry program starts up, it
    has to read in things like the terms codes and the FOB codes; these are
    stored in files that are read into an internal array in their entirety.
    The order entry program passes over each table file once and then
    closes it.  Why set a multi-buffer count for these files?  They are
    indexed files to make sure unique codes are defined, not because the
    order entry program requires random access to them.
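
    For what it is worth, the table-file case looks roughly like the sketch
    below; again this is not our code ("terms.dat" and the record size are
    made up), and it leaves the multi-buffer count at the default because a
    single front-to-back pass gains almost nothing from extra buffers:

        #include <rms.h>
        #include <rmsdef.h>     /* RMS$_EOF        */
        #include <starlet.h>    /* sys$get, etc.   */
        #include <string.h>

        /* Read an indexed table file once, sequentially, then close it. */
        int load_table(void)
        {
            static struct FAB fab;
            static struct RAB rab;
            char record[132];                      /* made-up record size */
            int status;

            fab = cc$rms_fab;
            fab.fab$l_fna = "terms.dat";           /* made-up file name   */
            fab.fab$b_fns = strlen("terms.dat");
            fab.fab$b_fac = FAB$M_GET;
            fab.fab$b_shr = FAB$M_SHRGET;
            if (!((status = sys$open(&fab)) & 1)) return status;

            rab = cc$rms_rab;
            rab.rab$l_fab = &fab;
            rab.rab$b_rac = RAB$C_SEQ;             /* sequential access   */
            rab.rab$l_ubf = record;
            rab.rab$w_usz = sizeof(record);
            if (!((status = sys$connect(&rab)) & 1)) return status;

            while (((status = sys$get(&rab)) & 1))
                ;   /* copy rab.rab$l_rbf/rab$w_rsz into the array here   */

            sys$close(&fab);
            return (status == RMS$_EOF) ? 1 : status;
        }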
    
    On the other hand, I really should set the count for our product
    structure file, the indented bill of materials file.  This file is read
    only randomly, it is large, and it is one of the central files in our
    data base.  Setting low-level codes in this file for our larger
    manufacturing divisions can take 4 to 6 hours.  But if I went about
    tuning the access to this file, I would start with the low-level
    setting routine, since that is where the most payback is.  I would
    never worry about setting the multi-buffer count for the person at the
    terminal who is entering things one at a time.

> 
> It seems that VMS's default allocation of anything to do with files, buffers 
> etc. is to allocate the smallest, lowest or worst possible value it 
    
    Yes, VMS does tend to set these values low.  By setting these values at
    their minimum, VMS can ensure that the file system will work, i.e. you
    will be able to open, read, and update files.  A working file system is
    a really large advantage for commercial applications.  Small is not
    necessarily bad.  Again using our application as an example, our data
    base follows relational policies (yes, you can have a relational data
    base without buying the name), which means that we have many files, and
    a large number of them are very small, 1 to 25 records.  Forcing large
    anything on these files would really hurt us, since they constitute
    about 50% of the files in our data base.

    There are no magic bullets.  Optimizing file access requires an
    understanding of the application as well as an understanding of the
    file system, and there are times when it is more important to
    understand the application completely than to have an intimate
    knowledge of the bits and bytes of the file system.

    Bill
    
----------------------------------------------------------------------
| Bill MacAllister          |    Email address:                      |
| The Grass Valley Group    |      Mac@GVG49.Email.Hub.Tek.Com       |
| PO Box 1114               |      Tektronix!GVGPSA!GVG49.uucp!Mac   |
| Grass Valley, Ca  95945   |                                        |
----------------------------------------------------------------------