[comp.os.vms] Suggestions On How To Handle LARGE Fortran Arrays At LINK/RUN Time

CLAYTON@xrt.upenn.EDU ("Paul D. Clayton") (06/23/87)

Information From TSO Financial - The Saga Continues...
Chapter 5 - June 22, 1987

In response to a question from Denis W. Haskin on how a FORTRAN program can
declare extremely large arrays such as,
      COMMON /PLUME/ [...omitted...], ACIXY(60,80,0:80), GCIXY(60,80,0:80),
                     FRACIX(60,80)
      COMMON /PTIME/ [...omitted...], TACIXY(0:50,60,80,0:30),
                     TGCIXY(0:50,60,80,0:30), IHOUR, IMIN, ISEC
and still link and run, I suggest the following.

1. By default in FORTRAN, as long as the arrays are NOT initialized to any 
value, the arrays are AUTOMATICALLY compressed into 'demand-zero' sections in 
the .EXE file on disk, so they take up almost no room in the file. If the 
arrays ARE set to a value, then they will be stored in the .EXE file at their 
full size and contain the preset values. Note that the default value for a 
variable in FORTRAN IV with DEC is '0' for numbers and the null character for 
strings.

2. If you get the error "LINK-F-EXPAGQUO, exceeded page file quota" at link 
time, raising the user's PGFLQUOTA is the only remedy.

3. If you get the error "LINK-F-MEMFUL, insufficient virtual address space to 
complete this link", the VIRTUALPAGCNT SYSGEN parameter needs a boost followed
by a system reboot. To get some idea what value to set the parameter to, I 
would recommend putting in a 'dummied' common section in which the large 
arrays are dimensioned to '1'. This would allow the program to complete the 
LINK phase, and by looking at the .MAP file you could tell how much space the 
code and minor arrays are taking up. Then calculate the size, in 512 byte 
pages, of the large arrays and add that to the number of pages for the image. 
Note that the number of pages for the image is NOT the number of blocks the 
.EXE file takes up. Refer to the .MAP file for this information. Set the 
SYSGEN parameter to the sum of these two numbers plus a comfortable margin. 
Then reboot the system to have the new parameter take effect. Also restore the 
original array sizes for the program to use.
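
As a rough worked example of the page arithmetic in step 3, the sketch below
(Python, used here only as a calculator) sizes the arrays quoted at the top of
this message. It assumes every element is a 4-byte REAL, which the original
question does not state, and it ignores the COMMON members marked
'[...omitted...]'. The image page count and the margin are hypothetical
placeholders; take the real image figure from your .MAP file.

```python
PAGE_BYTES = 512          # VAX page size, as used in step 3
BYTES_PER_ELEMENT = 4     # ASSUMPTION: REAL*4 elements (not stated in the post)

# (lower, upper) bounds of each dimension, from the COMMON declarations above.
ARRAYS = {
    "ACIXY":  [(1, 60), (1, 80), (0, 80)],
    "GCIXY":  [(1, 60), (1, 80), (0, 80)],
    "FRACIX": [(1, 60), (1, 80)],
    "TACIXY": [(0, 50), (1, 60), (1, 80), (0, 30)],
    "TGCIXY": [(0, 50), (1, 60), (1, 80), (0, 30)],
}

def element_count(bounds):
    """Number of elements in an array with the given dimension bounds."""
    n = 1
    for lo, hi in bounds:
        n *= hi - lo + 1
    return n

def array_pages(arrays=ARRAYS, bytes_per_element=BYTES_PER_ELEMENT):
    """Total size of the listed arrays in 512-byte pages, rounded up."""
    total_bytes = sum(element_count(b) for b in arrays.values()) * bytes_per_element
    return (total_bytes + PAGE_BYTES - 1) // PAGE_BYTES

def virtualpagcnt_estimate(image_pages, margin_pages):
    """Image pages (from the .MAP file) + array pages + a comfortable margin."""
    return image_pages + array_pages() + margin_pages

if __name__ == "__main__":
    print("array pages:", array_pages())
    # image_pages and margin_pages below are made-up numbers for illustration.
    print("VIRTUALPAGCNT estimate:",
          virtualpagcnt_estimate(image_pages=2000, margin_pages=10000))
```

Under these assumptions the listed arrays alone come to 124,688 pages (about
64 million bytes), before the omitted COMMON members and the image itself are
added in.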

4. The use of shared images is an alternative, but one I do not think is 
appropriate for your problem with the arrays. The major problem is at run 
time, as stated in the next steps.

5. Once the .EXE file is ready to run with the FULL size arrays, now is when
the problems can really start. All the steps to this point are onerous for
one very good reason. The system is making sure you REALLY want to kill the
system QUICKLY. The four items that are going to be the MOST critical to having
any sort of efficient and 'fast' run time are the following.
A. Considerable amount of pagefile space free,
B. Accessing of the arrays in the most efficient manner,
C. Sizable PGFLQUOTA, WSEXTENT and WSQUOTA UAF parameters,
D. Only process running on the system.

5A. Given the case where a sizable portion of the array space is declared
and USED, the system pagefile(s) need to have close to, if not more, free 
space than the total size of the array storage. If the UAF parameters are
set large enough then the size of the pagefile does not have to be the size of 
all the arrays. You should give CONSIDERABLE thought to having the SYSGEN 
parameters PAGEFILCNT and SWAPFILCNT set to at least 5 more than they are
currently set to so that additional pagefile(s) can be added 'on the fly' 
if they are needed. At the same time, one terminal should always be logged
into the system while the program is running. Should the system run out of 
pagefile space and everything stall, you may not be able to log in and create
another pagefile and have things continue. Plans should also be formulated 
prior to running on what disks the additional pagefiles will reside.

5B. Accessing of the arrays in the most efficient manner. The FORTRAN 
compiler stores the arrays in 'column' order, that is, the leftmost subscript
varies fastest in memory. This means that when the array slots are accessed 
down the columns, the system is actually accessing sequential memory 
locations. The result is fewer page faults incurred by the program. Fewer page 
faults means faster run times. Should the arrays be accessed in totally random 
or 'row' order then you are essentially jumping around the virtual address 
space of the program, causing excessive page faults and ending with 'slow' run 
times. You could also be consuming considerable amounts of the pagefile(s) in 
this case.
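
To make the effect of loop order in 5B concrete, here is a small model, in
Python rather than FORTRAN, of how ACIXY(60,80,0:80) is laid out. It assumes
4-byte elements and an array that starts on a page boundary, and it only
counts how often consecutive references land on a different 512-byte page. It
does not model the working set or the actual VMS pager, and the function names
are made up for illustration.

```python
PAGE_BYTES = 512
BYTES_PER_ELEMENT = 4            # ASSUMPTION: REAL*4 elements
N1, N2, N3 = 60, 80, 81          # extents of ACIXY(60,80,0:80)

def page_of(i, j, k):
    """Page holding ACIXY(i,j,k); column order: leftmost subscript varies fastest."""
    offset = (i - 1) + N1 * ((j - 1) + N2 * k)   # element offset from array start
    return offset * BYTES_PER_ELEMENT // PAGE_BYTES

def page_changes(refs):
    """Count consecutive references that land on a different page."""
    changes, last = 0, None
    for i, j, k in refs:
        page = page_of(i, j, k)
        if last is not None and page != last:
            changes += 1
        last = page
    return changes

def leftmost_innermost():
    """Loop nest with the leftmost subscript in the innermost loop ('column' order)."""
    for k in range(N3):
        for j in range(1, N2 + 1):
            for i in range(1, N1 + 1):
                yield i, j, k

def rightmost_innermost():
    """Loop nest with the rightmost subscript in the innermost loop ('row' order)."""
    for i in range(1, N1 + 1):
        for j in range(1, N2 + 1):
            for k in range(N3):
                yield i, j, k

if __name__ == "__main__":
    print("column order page changes:", page_changes(leftmost_innermost()))
    print("row order page changes:   ", page_changes(rightmost_innermost()))
```

In this model the 'column' ordering moves to a new page 3,037 times, while the
'row' ordering moves to a different page on every one of its 388,799 steps.
Whether each of those page changes turns into a real page fault depends on how
much of the array fits in the working set.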

5C. Sizable PGFLQUOTA, WSEXTENT and WSQUOTA UAF parameters. In order for this 
program to run, the above UAF parameters will also need to be increased. The
PGFLQUOTA parameter would probably need to be the size of the total space taken
by the image, to be on the safe side. You would not want to be in the final 
moments of execution when the program aborts due to pagefile quota exceeded. 
Once again, this size is taken from the .MAP file, not the block count of the 
.EXE image file. I would make WSEXTENT and WSQUOTA the same, with the value 
being the number of free memory pages available on your system when it is 
quiet. This way, you do not have as much trouble having your process 'grow' as 
it pagefaults during execution. If you cannot run the program by itself, then 
it is best to set WSQUOTA to some 'nominal' size and WSEXTENT to the number of 
free pages. You will also need to have the WSMAX SYSGEN parameter set to the 
number of free pages available on your system when it is quiet. If WSMAX is 
not set high enough, the physical pages used by the process will not grow to 
what you have set WSEXTENT and WSQUOTA to.

5D. Only process running on the system. By doing this you have the best chance
for your program to 'grow' as needed and reduce the overall page fault rate to
something that may be livable.

I hope this gives some ideas on what course of action I would take. My only
other suggestion would be to look towards some of the new FPU's that are 
being attached to the BI/NMI systems today and do parallel processing.

Paul D. Clayton - Manager Of Systems
TSO Financial - Horsham, Pa. USA
Address - CLAYTON%XRT@CIS.UPENN.EDU