[comp.unix.wizards] On backups

lewis@cel.fmc.COM (Bil Lewis) (05/07/87)

	Starting about six months ago I implemented a rather different
backup technique (which is completely unique to the best of my
knowledge) that you might find interesting.

				-----

	My site at FMC comprises some 16 unix machines of various
sizes & vendors. I have a total of about 4GB active storage in 60 or so
partitions.  Using standard unix backup procedures, this entailed ~60
tapes * 9 days for incrementals and a very harassed operator.  Level 0
dumps were really a terror.  [As a matter of fact, many got "lost" on
bad days.]  Restores were a pain.

	We purchased a new Eagle, and today backups are done as follows:

Incrementals:	At 2am, each machine tars up all the new user files
(ignoring .o, .psf, .tmp, ... files) and ships them to the Eagle where
they are kept for a month.  Any missed backups are noted by a script
and can be collected the next day.

Full dumps:	At 3am, one machine is chosen to do a complete dump of
some of its partitions to the Eagle.  These are then written to tape by
the operator, one copy on site, one off.  That partition is cleared &
ready for the next day.

Restores:	From an incremental I just look for the file in the
associated directory files & then restore it from whatever daily it was
on.  [The users could even do their own restores if you trusted them
enough!]

		From a full dump, just load the tape to the scratch
partition, then restore normally.
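
The incremental selection described above can be sketched roughly as
follows.  This is only an illustration in Python, not the scripts actually
run at FMC: the function name `new_user_files`, the `since` timestamp, and
the use of modification time to decide what counts as "new" are all
assumptions, and the "..." in the list of ignored suffixes is left as open
as it was in the description.

```python
import os

# Suffixes named in the post; its "..." is not filled in here, extend as needed.
SKIP_SUFFIXES = (".o", ".psf", ".tmp")


def new_user_files(root, since):
    """Return sorted paths under root modified after `since` (epoch seconds),
    skipping files whose names end in one of SKIP_SUFFIXES."""
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(SKIP_SUFFIXES):
                continue
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished between listing and stat
            if mtime > since:
                selected.append(path)
    return sorted(selected)
```

The resulting list would then be handed to tar and shipped to the backup
disk; that step is omitted here.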


	The advantages are obvious: Zero time spent on making
incrementals, near immediate access for recent restores, no holes in
the incrementals due to operator error/time problems/heavy dates on
Sundays/etc.  Full dumps are a simple routine and don't require any
down time, and there is also no problem if an operator misses a day;
the full dump just waits on the scratch disk (sending me nasty
messages) until dumped to tape.  By also dumping the incrementals to
tape, I get 100% backup capacity to restore for any day over the entire
year!

	The disadvantages may be more subtle: Do I lose anything by
not bringing the system down to single user level for the full dumps?
Do I lose any security by having the incrementals on-line?  [I think
not.  It requires TWO simultaneous head crashes to lose anything & I've
never heard of such a thing.]

				--------

	At this point in time I conclude this to be a case of extreme
winning.  For the price of an Eagle (~$10k), I save one operator-year
(~$50k), gobs of tapes, storage, hassles, and improve reliability
enormously.

-Bil

mike@BRL.ARPA (05/08/87)

It seems to me that the basic assumption that you make is that
your users are not working with large files, so that a significant
number of days worth of incremental dumps can fit onto a single
Eagle disk.

For facilities like BRL with heavy scientific and engineering applications,
it is not uncommon for a single scientist to produce 300+ Mbytes of
"new" (different) data.  Each machine (even the aging VAXen) tends to
have >2 Gbytes, and we have several dozen machines.
Your strategy, while nice, does not address this type of environment.

When I was visiting Martin Marietta in ~1980, they described their
backup system (for their PDP-11/70 IS/1 UNIX systems) that is virtually
the same as your system.  (sorry).
	Best,
	 -Mike

mark@elsie.UUCP (Mark J. Miller) (05/09/87)

In article <7295@brl-adm.ARPA>, mike@BRL.ARPA (Mike Muuss) writes:
> For facilities like BRL with heavy scientific and engineering applications,
> it is not uncommon for a single scientist to produce 300+ Mbytes of
> "new" (different) data.  Each machine (even the aging VAXen) tends to
> have >2 Gbytes, and we have several dozen machines.
> Your strategy, while nice, does not address this type of environment.
> 
I remember visiting Ft. Belvoir a few years ago. They did backups on a really
BIG tape: 15,000 ft, 1 inch wide (these numbers may be wrong -- it was several
years ago). I wonder if these things exist outside of the military world?
It would be a useful tool for an environment like the above.
-- 
Dr. Mark J. Miller
NIH/NCI/DCE/LEC
UUCP:	..!seismo!elsie!mark
Phone:	(301) 496-5688

todd@uhccux.UUCP (The Perplexed Wiz) (05/10/87)

Has anyone seriously evaluated the Sony erasable magneto optical disk
drives (which are supposed to start shipping in Oct87)?  It is a 5.25"
unit that stores 325M on each side of a removable platter (650M total).
The drive is supposed to cost around $6,600.  The cartridges are supposed
to cost around $200 each.

Without having seen one or knowing much more than what I wrote above,
these units sound like perfect backup devices.  I have six (6)
RA81 drives to worry about and it seems to me that the Sony units could
solve all my backup problems...todd

-- 
Todd Ogasawara, U. of Hawaii Computing Center
UUCP:		{ihnp4,seismo,ucbvax,dcdwest}!sdcsvax!nosc!uhccux!todd
ARPA:		uhccux!todd@nosc.MIL
INTERNET:	todd@uhccux.UHCC.HAWAII.EDU

det@herman.UUCP (Derek Terveer) (05/11/87)

Even our little machines (two vax 11/780s) doing development work generate an
*average* of ~45MB per day each of incremental saves.  There are definitely
days when our incremental fs overflows!
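
As a rough sketch of why such a filesystem overflows: the number of days of
incrementals that fit is just the capacity divided by the daily volume.  The
400 MB capacity in the usage line is a made-up figure for illustration (the
size of the incremental fs is not given above), and the function and
parameter names are likewise hypothetical.

```python
def days_retained(capacity_mb, daily_mb_per_machine, machines):
    """Whole days of incrementals that fit before the backup disk fills."""
    daily_total = daily_mb_per_machine * machines
    return capacity_mb // daily_total


# Two machines at the ~45MB/day average, on an assumed 400 MB filesystem.
print(days_retained(400, 45, 2))  # prints 4
```

Since 45MB is only an average, a single heavy day shortens this further.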

mangler@cit-vax.Caltech.Edu (System Mangler) (05/11/87)

In article <7272@brl-adm.ARPA>, lewis@cel.fmc.COM (Bil Lewis) writes:
> Incrementals: At 2am, each machine tars up all the new user files
> (ignoring .o, .psf, .tmp, ... files) and ships them to the Eagle where
> they are kept for a month.

> Do I lose any security by having the incrementals on-line?  [I think
> not.  It requires TWO simultaneous head crashes to lose anything & I've
> never heard of such a thing.]

Some consultants here did their backups by periodically copying the
active disk pack to a backup pack.  One night, the active pack crashed
in the middle of a backup.  They were left with half of an image copy
of the filesystem, which is completely useless; they lost the entire
sources and compilers for their proprietary operating system.  A year
later, they are still stuck with making binary patches.

So it doesn't require two simultaneous head crashes to lose a lot.

Think about what will happen if your disk controller goes nuts.
(My experience is that this is more likely than an Eagle crash).

Don Speck   speck@vlsi.caltech.edu  {seismo,rutgers,ames}!cit-vax!speck

preece%mycroft@gswd-vms.arpa (Scott E. Preece) (05/12/87)

  Derek Terveer:
> Even our little machines (two vax 11/780s) doing development work
> generate an *average* of ~45MB per day each of incremental saves.  There
> are definitely days when our incremental fs overflows!
----------
I suspect, though, that many sites don't generate nearly that much
SOURCE material (if you don't bother saving .o files and various other
temporaries your REAL daily save load may go way down).

Note also that keeping the incrementals online for a month is just
a convenience; most of the advantages of the idea are still obtained
if you do the incrementals and one other partition daily to a big
drive and then just do a daily level 0 on that one drive.  Restoring is
more of a pain, but the saving in tape handling is still there.  If you
keep an index of what went on what incremental you ease the restore
problem enough.
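
Such an index could be as simple as a table from each day to the files saved
that day.  The sketch below is one hypothetical layout, not anything the
original poster described: the dict-of-sets format, the YYYY-MM-DD date
strings, and the function name `latest_incremental` are all assumptions.

```python
def latest_incremental(index, path):
    """index maps a date string (YYYY-MM-DD) to the set of paths saved that day.
    Return the most recent date whose incremental holds `path`, or None."""
    hits = [day for day, paths in index.items() if path in paths]
    # ISO-style date strings sort chronologically, so max() is the newest.
    return max(hits) if hits else None


index = {
    "1987-05-10": {"/usr/bil/report.txt", "/usr/bil/main.c"},
    "1987-05-11": {"/usr/bil/main.c"},
}
print(latest_incremental(index, "/usr/bil/main.c"))  # prints 1987-05-11
```

With this in hand, a restore only needs the one tape or directory that the
lookup names.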

  Don Speck:
> Some consultants here did their backups by periodically copying the
> active disk pack to a backup pack.  One night, the active pack crashed
> in the middle of a backup.  They were left with half of an image copy of
> the filesystem, which is completely useless; they lost the entire
> sources and compilers for their proprietary operating system.  A year
> later, they are still stuck with making binary patches.
> 
> So it doesn't require two simultaneous head crashes to lose a lot.
----------
But that's a much less safe procedure than was suggested.  Doing a
direct copy-over backup without using alternating backup disks is
obviously dumb.  Assuming he's doing daily incrementals on his one
backup drive, the original poster is never at risk for more than a
day's backups.

-- 
scott preece
gould/csd - urbana
uucp:	ihnp4!uiucdcs!ccvaxa!preece
arpa:	preece@gswd-vms

davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) (05/14/87)

There is a company supposedly shipping a 1GB (yes GB as in "great big")
tape drive which connects to an SCSI interface and uses 8mm video tape.
The error rates quoted are comparable to standard mag tape; I believe the
transfer rate is slower. On the other hand, as long as the transfer rate
is faster than the ethernet rate, I wouldn't expect any loss in performance.

-- 
bill davidsen			sixhub \	ARPA: wedu@ge-crd.arpa
      ihnp4!seismo!rochester!steinmetz ->  crdos1!davidsen
				chinet /
"Stupidity, like virtue, is its own reward"

corwin@bsu-cs.UUCP (05/16/87)

In article <5996@steinmetz.steinmetz.UUCP>, davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) writes:
> 
> There is a company supposedly shipping a 1GB (yes GB as in "great big")
> tape drive which connects to an SCSI interface and uses 8mm video tape.
> The error rates quoted are comparable to standard mag tape; I believe the
> transfer rate is slower. On the other hand, as long as the transfer rate
> is faster than the ethernet rate, I wouldn't expect any loss in performance.
> 

According to this Digital Review article, the videotape subsystem is
marketed by Honeywell Inc. Test Instruments Division. It's called the
VLDS, transfers data in excess of 4MB/sec at .21 cents/Mbyte.

One version has SCSI. Current plans are to develop Unibus and BI capability.
The VLDS uses T-120 tapes with a bit-error rate of 1 in 10^12, costing
about $8 each. An end-to-end tape search can be completed in 90 seconds.

The VLDS currently lists at $44,000, and deliveries of the SCSI version
are scheduled for July. For more info:
	Honeywell Test Instruments Division
	P.O. Box 5227
	Denver, CO 80217
	(303) 773-4581

-- 
Paul "Corwin" Frommeyer        "Experience is no substitute for competence."
UUCP: 
	{seismo,ihnp4}!{iuvax,pur-ee}!bsu-cs!corwin

henry@utzoo.UUCP (Henry Spencer) (05/20/87)

> There is a company supposedly shipping a 1GB (yes GB as in "great big")
> tape drive which connects to an SCSI interface and uses 8mm video tape.

This is probably a poor bet unless you need something *right* *now*.  The
technology to bet on as the next magtape biggie is Digital Audio Tape,
assuming it actually gets introduced some century soon.  (Everybody has
prototypes now, but nobody is selling.)  The reason is that the specs are
decent, and the consumer market will push the price down.
-- 
"The average nutritional value    Henry Spencer @ U of Toronto Zoology
of promises is roughly zero."     {allegra,ihnp4,decvax,pyramid}!utzoo!henry

henry@utzoo.UUCP (Henry Spencer) (05/20/87)

> 	The disadvantages may be more subtle: Do I lose anything by
> not bringing the system down to single user level for the full dumps?

In general, there is no way to be absolutely certain you have got a clean
backup unless you take the system down, because otherwise you are trying
to get a snapshot of a moving target.  For example, if a database system
is active while you are doing backups, there is a significant chance that
the database files on your backup are not in a consistent state.  This can
be true even if the backup copies of individual files are perfect snapshots,
which is not easy to guarantee.

On the other hand, if the system is quiet and the dump program is a bit
conservative, you'll get something that is *close* to clean, possibly
good enough.
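
One possible reading of "a bit conservative" is a copy that checks whether
the file moved underneath it.  The sketch below is an assumption about how
such a check might look, not a description of any real dump program; the
name `copy_if_stable` is hypothetical.  It only catches a single file
changing during its own copy, says nothing about consistency between
related files (the database case above), and can miss a write that leaves
size and timestamp unchanged.

```python
import os
import shutil


def copy_if_stable(src, dst):
    """Copy src to dst; return True only if src's size and mtime were the same
    before and after the copy.  False means the copy may be torn."""
    before = os.stat(src)
    shutil.copyfile(src, dst)
    after = os.stat(src)
    return (before.st_size, before.st_mtime_ns) == (after.st_size, after.st_mtime_ns)
```

A caller would retry or flag any file for which this returns False.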

When I started running utzoo, I put my foot down, after too many experiences
with bad backups (n.b. modern backup programs are better):  we do *all* our
backups single-user.
-- 
"The average nutritional value    Henry Spencer @ U of Toronto Zoology
of promises is roughly zero."     {allegra,ihnp4,decvax,pyramid}!utzoo!henry

baron@transys.UUCP (Joe Portman) (05/20/87)

In article <2651@cit-vax.Caltech.Edu> mangler@cit-vax.Caltech.Edu (System Mangler) writes:

>Some consultants here did their backups by periodically copying the
>active disk pack to a backup pack.  One night, the active pack crashed
>in the middle of a backup.  They were left with half of an image copy
>of the filesystem, which is completely useless; they lost the entire
>sources and compilers for their proprietary operating system.  A year
>later, they are still stuck with making binary patches.
>So it doesn't require two simultaneous head crashes to lose a lot.


	Yup, what about fires, vandalism, burglary, etc... These are all
	very real considerations. On all of our machines we use three sets
	of backup media in rotation. One set is always kept off-site in a
	fireproof vault, and a master gen set is kept at the company president's
	home, updated about once a month. This may not be feasible for very
	large installations, but the principle applies: Spread out your 
	area of risk.
	
	Too much trouble, too expensive, you say?  Well, we have had a few
	disasters here (fried disk drives, etc.), all of which were fairly
	easily recovered, because fairly fresh backups are always available.

	As the saying goes, "YOU CAN'T HAVE TOO MUCH BACKUP".
	
	Just my .02 worth.

lewis@cel.fmc.COM (Bil Lewis) (05/26/87)

     For facilities like BRL with heavy scientific and engineering applications,
     it is not uncommon for a single scientist to produce 300+ Mbytes of
     "new" (different) data.  Each machine (even the aging VAXen) tends to
     have >2 Gbytes, and we have several dozen machines.
     Your strategy, while nice, does not address this type of environment.

But Mike, you certainly don't mean to claim that each scientist is
typing in at the rate of... 3k/second!  So what you mean is you are
CHANGING 300+ MB.  The real question is the regeneration costs vs. the
archiving costs.  For our vision folks, that cost is zilch.  What is it
for yours?
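
The 3k/second figure follows if the 300 Mbytes is taken as one day's output
spread over all 24 hours.  That period is an assumption (it is not stated in
the quoted text), as is counting a Mbyte as 10^6 bytes; the function name is
made up for the illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def bytes_per_second(mbytes_per_day):
    """Average sustained rate needed to produce this much data in one day."""
    return mbytes_per_day * 1_000_000 / SECONDS_PER_DAY


print(round(bytes_per_second(300)))  # prints 3472
```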

     When I was visiting Martin Marietta in ~1980, they described their
     backup system (for their PDP-11/70 IS/1 UNIX systems) that is virtually
     the same as your system.  (sorry).

Sorry?  EVERYTHING IN THE WORLD has already been thought of by someone.
The real question is what they do (or don't do) with the ideas.  We're
saving est. $50k/year with it.  We're quite pleased.  

Plus, I have seen absolutely nothing published, so I'll get a
cheap paper out of it too.  My very last act at FMC (I'm leaving Fri) is a
real winner.  No need for sadness here!

Regards
-Bil