[comp.os.vms] Considerations For Performing On-Line BACKUP Operations...

CLAYTON@xrt.upenn.EDU ("Clayton, Paul D.") (07/15/87)

Information From TSO Financial - The Saga Continues...
Chapter 11 - July 14, 1987

John Macallister has asked the following question and I am placing my thoughts
on the table for consideration and comment.

	What's the justification, nowadays, for doing standalone backups?

I do NOT hold the belief that backups have to be done on a system that is
totally quiet, which booting standalone backup would ensure.

I DO hold the belief that CONSIDERABLE thought has to go into the backup 
schedule and the impact of the operation itself on any files that may be open
for UPDATE at the time.

I have NO problem with doing the VMS system packs when there may be a handful
of people still on the machine. My 'normal' user community numbers around 500
so a handful is not bad, and I can live with users having password problems
in the event the system disk goes to lunch.

I DO have a problem with doing the data disk when there are ANY processes doing
UPDATE operations. I have a six, soon to be seven, node VAXCluster so the 
overall problem of who's doing what on which disk is NOT an easy matter to
understand.

The primary concern that I have with processes performing UPDATE operations
is that the internal buffers to the process are essentially IGNORED when a 
BACKUP/IGNORE=INTERLOCK command is given. This can have SEVERE implications
in the case of an RMS ISAM file which has undergone heavy bucket splitting
at both the index and data levels. If the process has a significant number of
buffers allocated, by whatever means, the PROBABILITY is HIGH that the buckets
cached internal to the process can have records that are new to the file or
ones that have been deleted. It also stands to reason that the buckets that
had to be written to disk to gain buffers for the new records MAY have 
references to buckets/records that are NOT written to disk at the moment 
BACKUP comes marching through and writes the CURRENT DISK version to tape. The
result is a corrupted file, or an incomplete one which may be worse, if a restore
operation is done from the tape for whatever reason.
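
The failure mode described above can be sketched with a toy model. This is
NOT how RMS is implemented; the class, the bucket names, and the two-buffer
limit are all made up for illustration. The only point is that a copy taken
from disk alone can hold an index bucket that points at a data bucket which
still lives only in the process buffers.

```python
from collections import OrderedDict
import copy


class BufferedProcess:
    """Hypothetical process that keeps modified buckets in private buffers."""

    def __init__(self, disk, max_buffers):
        self.disk = disk                  # dict: bucket id -> bucket contents
        self.max_buffers = max_buffers
        self.buffers = OrderedDict()      # modified buckets not yet on disk

    def write(self, bucket_id, contents):
        self.buffers[bucket_id] = contents
        self.buffers.move_to_end(bucket_id)
        # Only when buffers run out is the oldest bucket forced to disk.
        while len(self.buffers) > self.max_buffers:
            old_id, old_contents = self.buffers.popitem(last=False)
            self.disk[old_id] = old_contents

    def flush(self):
        self.disk.update(self.buffers)
        self.buffers.clear()


def dangling_pointers(image):
    """Index pointers that name a bucket missing from the given image."""
    return [p for p in image["index"]["pointers"] if p not in image]


disk = {
    "index": {"pointers": ["data1"]},
    "data1": {"records": ["A", "B", "C", "D"]},
}
proc = BufferedProcess(disk, max_buffers=2)

# Bucket split: half of data1 moves to a new bucket, index gains a pointer.
proc.write("index", {"pointers": ["data1", "data2"]})
proc.write("data1", {"records": ["A", "B"]})
proc.write("data2", {"records": ["C", "D", "E"]})   # evicts "index" to disk

backup_image = copy.deepcopy(disk)      # what an on-line backup sees: disk only
print(dangling_pointers(backup_image))  # ['data2']

proc.flush()                            # after the process flushes or closes
print(dangling_pointers(disk))          # []
```

In this sketch the backup image also lacks record "E" entirely, which is the
"incomplete" case: nothing on the tape hints that the record ever existed.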

Most of the data files for our systems are RMS ISAM files and this problem 
has added to my declining mass of gray hair. It has also spurred my search 
for fast and large tape subsystems. The one I have settled on is sold by EMULEX
and includes a TC-13 UNIBUS interface coupled with a cartridge drive from 
Megatape Corp. The model I am getting holds 650MB, formatted, and writes data
at 16,000 BPI and 125 IPS. It does an RA-81 in less than 40 minutes and puts
it ALL on ONE tape cartridge with room to spare. I am now planning to do the
data disks in a three hour, max, window starting at 23:00 when I can lock 
everyone out except read-only batch jobs. I will write a future blip on the
drives after they are installed and I run them through the paces.
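
As a rough sanity check on the window, the figures above can be run through
a few lines of arithmetic. The RA81 formatted capacity of 456 MB is an
assumption brought in for the calculation (it is not stated in this post),
and the 40 minute figure is treated as an upper bound per disk. Saveset
overhead and cartridge change time are ignored.

```python
RA81_MB = 456            # ASSUMPTION: formatted capacity of one RA81
CARTRIDGE_MB = 650       # from the post
MINUTES_PER_DISK = 40    # from the post, used as an upper bound
WINDOW_MINUTES = 3 * 60  # from the post: three hour window

# Whole disks that fit in the window, one after another on one drive.
disks_per_window = WINDOW_MINUTES // MINUTES_PER_DISK

# Sustained rate implied by moving a full RA81 in 40 minutes.
sustained_kb_per_s = RA81_MB * 1000 / (MINUTES_PER_DISK * 60)

# Room left on a cartridge after one full RA81.
headroom_mb = CARTRIDGE_MB - RA81_MB

print(disks_per_window)           # 4
print(round(sustained_kb_per_s))  # 190
print(headroom_mb)                # 194
```

Under those assumptions a single drive covers about four full data disks in
the window, so a larger disk farm would need more drives or a longer window.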

Another alternative I had looked at was the Volume Shadowing package offered 
by DEC for use in VAXClusters. We have the package and it does improve the
speed of read access to data, and does allow shadow set members to be added and
removed on-line without impact to the user community. What it DOES NOT do is trigger
a flush of process buffers of files that are OPEN on the shadowed disk when a 
dismount command is given. The result here is the same as having BACKUP 
cruise through and make a copy of the current disk contents, ignoring updated
buffers/buckets resident to the processes with files open on the disk. In other
words it's totally USELESS for solving this problem. I had the opportunity to
question the product manager for Volume Shadowing at Nashville about this 
area of concern, and the response was that the product was NOT intended to 
solve this problem, and there are NO plans for it to do such.

One last note on the use of, or lack of, /CRC on the BACKUP commands. For
any tapes I make at 1600 BPI I feel it's the only way to go. For 6250, the
question boils down to the tape drive being used and the CPU pushing it. I 
just did a test of a TA81 and 8700 combination and the time to WRITE one
reel of tape was 23 minutes both with /CRC and with /NOCRC. The thing to remember
is that the CRC is a SOFTWARE generated item and as such, if the machine is 
a smaller VAX then the difference in time can be apparent and may be
substantial. If it's a large VAX, it becomes a moot point.
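
To make the "software generated" point concrete, here is a minimal sketch
using Python's zlib.crc32 as a stand-in. It is NOT the CRC that BACKUP
computes, and the 8192 byte block size is only an illustrative assumption.
It shows the two things that matter: the CPU has to touch every byte written,
and in return a single damaged bit in a block is caught.

```python
import time
import zlib

BLOCK_SIZE = 8192   # ASSUMPTION: illustrative block size only
PASSES = 10_000     # number of blocks to checksum for the timing loop

block = bytes(range(256)) * (BLOCK_SIZE // 256)
crc = zlib.crc32(block)

# A single flipped bit in the block changes the checksum.
damaged = bytearray(block)
damaged[100] ^= 0x01
print(zlib.crc32(bytes(damaged)) != crc)   # True

# The work is done by the CPU, once per byte written to tape.
start = time.perf_counter()
for _ in range(PASSES):
    zlib.crc32(block)
elapsed = time.perf_counter() - start
print(f"{elapsed:.3f} s to checksum {PASSES * BLOCK_SIZE / 1e6:.0f} MB")
```

Whether that per-byte cost shows up in elapsed time depends on whether the
CPU or the tape drive is the bottleneck, which matches the test result above.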

Hope the information presented here is of some help. :-)

Paul D. Clayton - Manager Of Systems
TSO Financial - Horsham, Pa. USA
Address - CLAYTON%XRT@CIS.UPENN.EDU