[comp.unix.wizards] Backups on Live Systems

frank@dvm.UUCP (05/27/87)

I have a problem.

I am system administrator for an Alliant FX-8 running Concentrix 2.0 (based
on 4.2 BSD).  There are a number of large jobs here that need to run 24 hours
a day.  This means that the system cannot be taken down for nightly backups.
However, the powers that be feel (justifiably) that daily incremental backups
are a good idea.  How much am I risking if I try to dump a "live" filesystem?

If dumping a live filesystem -- one that is mounted with the machine running
in multiuser mode -- is a bad idea, is there any good way to suspend user
processes until the backup is over?

If anyone has any ideas, thoughts, or (preferably) experience they can share,
I'd really love to hear it.  Please send replies via mail;  I don't get many
chances to read news.  I will summarize and post useful/interesting responses
in about 2 weeks.

Many Thanks.

-- 
				Frank
				...!inhp4!allegra!phri!orville!dvm!frank

hayes@wizard.ucsd.edu (James Hayes) (05/29/87)

Just curious (not having any experience with things like this..) what would
be the harm of "suspending" any "multi-day" jobs while the backup is running 
and then restarting them?  It stands to reason the file system might be just 
a tad more "stable" then.

Maybe SIGSTOP and SIGCONT for BSD?
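For instance, a tiny wrapper (just a sketch, untested) that stops or
continues a whole process group around the backup:

/*
 * Sketch only: stop a long-running job's process group before the dump
 * and continue it afterward.  A negative pid to kill(2) signals every
 * process in that group on 4.2BSD-derived systems.
 */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	int pgrp;

	if (argc != 3) {
		fprintf(stderr, "usage: %s stop|cont pgrp\n", argv[0]);
		exit(1);
	}
	pgrp = atoi(argv[2]);

	if (kill(-pgrp, argv[1][0] == 's' ? SIGSTOP : SIGCONT) < 0) {
		perror("kill");
		exit(1);
	}
	exit(0);
}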

Jim Hayes, University of California at San Diego.

BITNET: hayes%sdcsvax@WISCVM.BITNET
ARPA:	hayes@sdcsvax.ucsd.edu
UUCP:   {pick one close to berkeley}!sdcsvax!hayes

kermit@BRL.ARPA (Chuck Kennedy) (05/29/87)

We dump our machines (including several Alliant FX/8s) incrementally
during the day.  There is the possibility that files can get lost
when the machine is up.  It's a risk we live with.  However, we do
level 0 dumps once a month and the machines are single-user when
that happens, to ensure good dump tapes, so that we never lose more
than a month of work at the very worst.  Generally I think you would
be safe, but it might be reasonable to schedule the dumps done while
the machine is up for times when people are not logged in (like
early morning or late evening).
	Good luck,
	 -Chuck Kennedy
	U.S. Army Ballistic Research Lab

dpz@aramis.rutgers.edu (David P. Zimmerman) (05/31/87)

Frank,

We have 15+ Unix boxes that we do level 0s weekly and level 1s daily
on, while the machines are up multiuser.  A few are by necessity, one
being in a galaxy far far away on the other end of a T1 line, but most
are just because it is convenient to not have to take all of those
machines down.  We have had no problems with backups on those live systems
or with restores from those backups.

						dpz
-- 
David P. Zimmerman           rutgers!dpz           dpz@rutgers.edu

shannon@sun.uucp (Bill Shannon) (05/31/87)

I'm curious, does anyone have any really good ideas about how to
do reliable backups on active filesystems?  Would such backups,
by necessity, have to be done through the filesystem, rather than
through the raw device as dump does?

Also, what would you *expect* from a full dump taken on an active
filesystem?  If you had to restore that dump, what state of the
filesystem would you expect to restore?  What sort of guarantees
of consistency would you want from such a dump scheme?  When the
filesystem is inactive, it's easy to guarantee a certain consistency,
but when it's (potentially very) active, what can you guarantee?

Does anyone know how other systems handle this problem?  What does
VMS do?  TOPS-20?  Multics?  others?

					Bill Shannon
					Sun Microsystems, Inc.

jas@rtech.UUCP (Jim Shankland) (05/31/87)

In article <20070@sun.uucp> shannon@sun.uucp (Bill Shannon) writes:
>I'm curious, does anyone have any really good ideas about how to
>do reliable backups on active filesystems?  Would such backups,
>by necessity, have to be done through the filesystem, rather than
>through the raw device as dump does?

Well, I'll let others judge what's a "really good" idea, but people
who need to back up active databases do "fuzzy dumps."  The idea is to
dump a live dataset, but keep a log of changes made to pages that have
already been dumped, and then apply the log to the dump that was made.
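To make the idea concrete, a rough user-level sketch (illustration only,
with a made-up log record format):

/*
 * Fuzzy-dump sketch: blocks are written to tape in order, and any block
 * that changes after it has been dumped gets a (block number, new
 * contents) record appended to a change log.  Replaying the log over
 * the restored image brings it back to a consistent state.
 */
#include <string.h>

#define BSIZE	1024

struct logrec {
	long	blkno;		/* block that changed after being dumped */
	char	data[BSIZE];	/* its new contents */
};

/* replay the change log over a restored filesystem image */
void
apply_log(char *image, struct logrec *log, int nrec)
{
	int i;

	for (i = 0; i < nrec; i++)
		memcpy(image + log[i].blkno * BSIZE, log[i].data, BSIZE);
}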

To do this to UNIX would require kernel support.  Whether a system
call to snapshot a cooked device really belongs in the kernel is
open to debate; but it would get the job done.  Anybody have a better
idea?
-- 
Jim Shankland
 ..!ihnp4!cpsc6a!\
                  rtech!jas
..!ucbvax!mtxinu!/

davy@pur-ee.UUCP (06/02/87)

	In article <20070@sun.uucp> shannon@sun.UUCP writes:
	>I'm curious, does anyone have any really good ideas about how to
	>do reliable backups on active filesystems?  Would such backups,
	>by necessity, have to be done through the filesystem, rather than
	>through the raw device as dump does?

Doing it through the file system wouldn't gain you anything (except a
big drop in speed).  The problem with dumps on live file systems is that
dump makes more than one pass over the data.  On the first pass it maps
regular files; on the second, it maps directories; on the third, it dumps
directories; and on the fourth, it dumps the files themselves.  Information can
change between the passes.  Going through the file system would not
alleviate this.

	>Also, what would you *expect* from a full dump taken on an active
	>filesystem?  If you had to restore that dump, what state of the
	>filesystem would you expect to restore?

For things like files getting deleted, created, what have you, you don't
have too much difficulty.  Files which get created are ignored by dump;
files which get deleted cause restore to get slightly upset (you get
"resyncing restore" messages).  The *big* problem is when you get an
inode which used to be a file and is now a directory, or vice-versa.
Things then start to get really ugly, and restore can't always deal with    
it properly.

Several months ago (last time this was discussed), I mentioned that I
had mods for 4.3BSD dump to allow you to dump live file systems.  I got
a lot of requests, so I am posting them here.  Basically, these mods
tell dump to ignore any inode which has changed since the dump started.
This has the minor disadvantage that some files (those being modified)
will be missing from the dump (you occasionally see "resync restore"
messages); but it has the major advantage of not having to shut the
machine down to single user each time you want to do a dump.

We have been running this code for nearly two years on our Vaxes
(4.3BSD) and Goulds (UTX/32); and I have been running it for about 3
months on our Sun systems (SunOS 3.3).  We have never had any problems
with it.  We do partials daily between midnight and 2am on the Vaxes, 4
to 6am on the Goulds, and 8 to 9am on the Suns.  We do full dumps from
10am to about 3pm on the Vaxes, Goulds, and Suns.  Sometimes there are
upwards of 50 people logged in during the dump, with no ill effects
(except they find the machine a little slow...).

These diffs are for 4.3BSD dump.  Applying them to 4.2BSD dump, Sun dump,
etc. should be trivial (I know I didn't have any difficulty...).

--Dave Curry
Purdue University
Engineering Computer Network

--------------------- dumpmain.c ---------------------
*** /tmp/,RCSt1016632	Tue Jun  2 12:41:48 1987
--- /tmp/,RCSt2016632	Tue Jun  2 12:41:49 1987
***************
*** 10,15
  
  #include "dump.h"
  
  int	notify = 0;	/* notify operator flag */
  int	blockswritten = 0;	/* number of blocks written on current tape */
  int	tapeno = 0;	/* current tape number */

--- 10,18 -----
  
  #include "dump.h"
  
+ #ifdef PURDUE_ECN
+ int	filepass;
+ #endif
  int	notify = 0;	/* notify operator flag */
  int	blockswritten = 0;	/* number of blocks written on current tape */
  int	tapeno = 0;	/* current tape number */
***************
*** 287,292
   	pass(dirdump, dirmap);
  
  	msg("dumping (Pass IV) [regular files]\n");
  	pass(dump, nodmap);
  
  	spcl.c_type = TS_END;

--- 290,298 -----
   	pass(dirdump, dirmap);
  
  	msg("dumping (Pass IV) [regular files]\n");
+ #ifdef PURDUE_ECN
+ 	filepass = 1;
+ #endif
  	pass(dump, nodmap);
  #ifdef PURDUE_ECN
  	filepass = 0;
***************
*** 288,293
  
  	msg("dumping (Pass IV) [regular files]\n");
  	pass(dump, nodmap);
  
  	spcl.c_type = TS_END;
  #ifndef RDUMP

--- 294,302 -----
  	filepass = 1;
  #endif
  	pass(dump, nodmap);
+ #ifdef PURDUE_ECN
+ 	filepass = 0;
+ #endif
  
  	spcl.c_type = TS_END;
  #ifndef RDUMP

--------------------- dumptraverse.c ---------------------
*** /tmp/,RCSt1016661	Tue Jun  2 12:43:06 1987
--- /tmp/,RCSt2016661	Tue Jun  2 12:43:08 1987
***************
*** 7,12
  #ifndef lint
  static char sccsid[] = "@(#)dumptraverse.c	5.3 (Berkeley) 1/9/86";
  #endif not lint
  
  #include "dump.h"
  

--- 7,15 -----
  #ifndef lint
  static char sccsid[] = "@(#)dumptraverse.c	5.3 (Berkeley) 1/9/86";
  #endif not lint
+ #ifdef PURDUE_ECN
+ extern int	filepass;
+ #endif
  
  #include "dump.h"
  
***************
*** 45,50
  		BIS(ino, dirmap);
  	if ((ip->di_mtime >= spcl.c_ddate || ip->di_ctime >= spcl.c_ddate) &&
  	    !BIT(ino, nodmap)) {
  		BIS(ino, nodmap);
  		if (f != IFREG && f != IFDIR && f != IFLNK) {
  			esize += 1;

--- 48,60 -----
  		BIS(ino, dirmap);
  	if ((ip->di_mtime >= spcl.c_ddate || ip->di_ctime >= spcl.c_ddate) &&
  	    !BIT(ino, nodmap)) {
+ #ifdef PURDUE_ECN
+ 		if (ip->di_mtime >= spcl.c_date || ip->di_ctime >= spcl.c_date){
+ 			if (f != IFDIR)
+ 				return;
+ 		}
+ #endif

  		BIS(ino, nodmap);
  		if (f != IFREG && f != IFDIR && f != IFLNK) {
  			esize += 1;
***************
*** 138,143
  	i = ip->di_mode & IFMT;
  	if (i == 0) /* free inode */
  		return;
  	if ((i != IFDIR && i != IFREG && i != IFLNK) || ip->di_size == 0) {
  		spclrec();
  		return;

--- 148,160 -----
  	i = ip->di_mode & IFMT;
  	if (i == 0) /* free inode */
  		return;

+ #ifdef PURDUE_ECN
+ 	if (ip->di_mtime >= spcl.c_date || ip->di_ctime >= spcl.c_date) {
+ 		if (filepass)
+ 			return;
+ 	}
+ #endif
  	if ((i != IFDIR && i != IFREG && i != IFLNK) || ip->di_size == 0) {
  		spclrec();
  		return;

rbj@icst-cmr.arpa (06/02/87)

   Also, what would you *expect* from a full dump taken on an active
   filesystem?  If you had to restore that dump, what state of the
   filesystem would you expect to restore?  What sort of guarantees
   of consistency would you want from such a dump scheme?  When the
   filesystem is inactive, it's easy to guarantee a certain consistency,
   but when it's (potentially very) active, what can you guarantee?

These are all good questions. I will limit my answers to what I
*expect*, which may or may not correlate with reality.

Ideally, I expect all files newer than some date (selected by level)
to be dumped on tape. I do not care that file deletions be noted,
because it is easier to delete extraneous data than to recreate it.
I also do not believe in incremental restores, as I tried it once
and it didn't seem to work. While I realize this is scant data and
I could have goofed, I would rather restore what I want into a
subdirectory (using either the `i' or `r' options) and sort it
out by hand than depend on something that might really be broken.
I also have little desire to alter my conceptual model in this case.

Practically, since inodes are scanned first, pathnames mapped,
and then dumped as a triple (inode, pathname, data) by inode, it
would appear that any inode that changed could be bogus. This could
include incomplete data if the file was being modified. If a file was
deleted and a new file created that used the same inode, a new file
could be dumped under a wrong (the old) name. Either of these
possibilities presents little problem if you assume that "files in
the process of being updated while dumps are going on deserve what
they get". 

Of course, if your aim is to preserve the *exact* state of a disk,
then deletions are important to you. But then I would say why not
dump the whole thing? Takes time? Gee, life is tough!

   Does anyone know how other systems handle this problem?  What does
   VMS do?  TOPS-20?  Multics?  others?

Of others, I will speak of SECURE under EXEC 8, from memory and
at least a decade ago. Files have a backup date in their equivalent
of inodes. Every so often, and/or when storage becomes scarce, a
daemon backs up files to tape and optionally deletes them, the oldest
files (not specifically excluded by special means) going first. This
is called rolling out. If a file does not exist when referenced, a
daemon is started to roll the file back in and a message to be patient
is emitted. In a while, the file magically reappears.

I see no reason why such daemons could not be implemented on any
UNIX system with operators onsite.


	(Root Boy) Jim Cottrell	<rbj@icst-cmr.arpa>
	National Bureau of Standards
	Flamer's Hotline: (301) 975-5688

mangler@cit-vax.Caltech.Edu (System Mangler) (06/03/87)

In article <132@dvm.UUCP>, frank@dvm.UUCP (Frank Wortner) asks:
> How much am I risking if I try to dump a "live" filesystem?

Once I did some experiments where I ran dump while concurrently doing
an "rm -r" of most of the filesystem.

Restore would dump core pretty early on if I used 4.2bsd dump; but it
made it most or all of the way through if I used 4.3bsd dump.  And if
I didn't start the "rm -r" until after Pass I, II, and III of 4.3 dump
were safely out of the way, restore always made it all the way through.

4.3bsd dump has several checks in it to suppress things that are known
to cause core dumps in restore.  4.2bsd dump has no checks at all.
(4.3bsd dump also *outran* the rm -r).

I once changed dump to prompt for a tape between passes II and III,
so that passes I and II could proceed while the tape was being mounted.
This was a disaster, because the first tape was the operator's favorite
time to go to lunch, which stretched the critical section to an hour,
resulting in a rash of useless dumps until I figured it out.

If you have the choice, try to do full backups on quiescent filesystems,
and check them with fsck first.  A lot is riding on those full backups.

If you're stuck with dumping active filesystems,
    a)	Realize that the bigger the dump, the more likely it is to be
	bad, and the worse the consequences; thus, full dumps must be done
	every week or two, to keep the probability of catastrophe down.
	Forget Towers of Hanoi; you need more redundancy than that.
    b)	Pick a time when not much is changing, like early morning.
    c)	Get through passes I, II, and III as fast as you can.  The time
	taken is linear in the number of inodes allocated, so it pays to
	allocate no more inodes than needed (also makes fsck go faster).

I've lost some data to bad dumps, but I've lost a lot more to dumps
that didn't get done.  It is probably better to do "live" dumps every
day than single-user dumps only weekly.  If you really care about the
risk of losing data, get Eagles, which are less likely to lose data
than most disk controllers.  (For this reason, I don't do disk-to-disk).

Don Speck   speck@vlsi.caltech.edu  {seismo,rutgers,ames}!cit-vax!speck

jbn@glacier.UUCP (06/03/87)

       The Multics incremental dumper, circa 1968, dealt with this problem
in a very elegant way; it may take some searching of the literature to turn
up a reference, but it's worth it.  The incremental dumper ran continuously,
backing up any file whose backup was more than some number of minutes or
hours out of date.  The design spec for the Multics file system stated that
"file system reliability should be good enough that users will be comfortable
keeping their only copy of important work in the system".

       UNIVAC's EXEC 8 Secure, with which I was at one time familiar in great
detail, did incremental dumps on live systems in a reliable way; this system,
incidentally, could write several tapes simultaneously.

       I would suggest the following rules for an on-line dumper:

    0.  The goal is to be able to recover everything not designated as 
	temporary without any loss and without any manual "cleanup" or
	"guru activity".  This is achievable and has been done for other
	systems.

    1.  Files should normally be opened for exclusive write access while
	being dumped, so that each file dumped is in an internally consistent
	state.  Alternatively, the file should be dumped and, if altered
	during the dump, dumped again later, towards the end of the dump process.

    2.  Means should be provided to request special handling for files, to
	include:

	- Do not dump (for temporary files)
	- Dump without locking (for log files)
	- Dump as a consistent group (all files in group must not change
		during dump of group, for interrelated files)
	- Signal process when dump requires file access (to indicate
	  that some task should make the file or files consistent and
	  avoid altering them during the dumping process).

    3.  The means for requesting such handling should be available to the
	owner of the file.  Most users will not use it, but some programs,
	particularly serious database systems, need it badly.
	

   These things are the difference between a system that works most of the
time and one that works reliably.
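A rough sketch of what rule 1 might look like with the facilities 4.2BSD
already has (flock(2) is only advisory, so this is really a change check
rather than true exclusive access):

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

/* stand-in for whatever actually writes the file to the dump tape */
static int
copy_to_tape(int fd)
{
	char buf[8192];
	int n;

	while ((n = read(fd, buf, sizeof buf)) > 0)
		;	/* write buf to the tape here */
	return n;	/* 0 on success, -1 on read error */
}

/* dump one file; return -1 if it must be re-queued and dumped again later */
int
dump_one_file(char *path)
{
	struct stat before, after;
	int fd, ok;

	if ((fd = open(path, O_RDONLY)) < 0)
		return -1;
	(void) flock(fd, LOCK_SH);	/* cooperating writers will block */

	if (fstat(fd, &before) < 0) {
		close(fd);
		return -1;
	}
	ok = copy_to_tape(fd);
	if (fstat(fd, &after) == 0 &&
	    (after.st_mtime != before.st_mtime || after.st_size != before.st_size))
		ok = -1;	/* changed while being dumped: try again later */

	(void) flock(fd, LOCK_UN);
	close(fd);
	return ok;
}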

					John Nagle

gnu@hoptoad.UUCP (06/04/87)

The first thing to do is make damn sure that there are *no* inputs that
can crash the restore program.  A program whose main use is recovery
from catastrophe should be utterly reliable, or what use are all those
dump tapes?  I didn't believe it when I tried restoring from a fulldump
plus an incdump, both done with the system single-user; restore screwed
up!  (SunOS 3.0)  This was, and is, unthinkable to me, but I have a
timesharing service bureau background (STSC), where we would change 30 disk
packs for empty ones once a week and RESTORE onto the new ones, to make
sure our tapes were good and because it also gave us a week-old full
backup on disks in case we needed it.  That data was not just valuable;
it was our CUSTOMERS' data and our lifeblood.  You just can't tell a
customer paying for computer time and storage that you zapped her files
and can she please recreate them.  They go elsewhere.

An APL timesharing system I used in Toronto developed a system
for online backups; they would 'freeze' all access to files in
each user's directory while dumping those files.  (It was not a
hierarchical file system.)

Freezing file access on Unix could be done for particular directory
trees, indicated to dump by a control file, or by the presence in 
the file-system-being-dumped of ".dumpfreeze" files or some such.
E.g. each user's home directory could contain one, so that user
would be likely to see a consistent dump.  Such a freeze should
probably hang new accesses (e.g. open or creat) while allowing
read/write/close to continue for a few seconds, to give running
programs a shot at getting files into a consistent state before the dump.

Bill asked whether getting a consistent picture would have to be done
through the file system rather than by reading the raw device.  I think
so.  This need not be construed as a performance liability; probably
adding a few well-chosen primitives to the file system could make dump
actually run faster, since the kernel can presumably do *anything*
faster than a user program if it wants to.  This would reduce the
number of programs that actually handle a raw file system down to 3
(mkfs, kernel and fsck).  The chosen primitives should be general, e.g.
should be usable by users doing their own dumps (no superuser required
if you have access to all the files you are dumping) and could also be
usable by other programs, e.g. 'find' or 'tar'.

One primitive could take a pathname, a time, a buffer, and an integer,
and search starting from the path for files whose inode timestamp is
later than the time, putting their relative pathnames into the buffer,
and optionally exclusive-opening up to N of them for the caller
(putting the fd's into the buffer too).  Files that could not be
exclusively opened would be returned in the buffer with an fd of -1.  I
can think of holes in this (e.g. once one bufferload has been returned,
how to restart the search?  how to indicate whether directories being
searched should remain locked against new links/creats?  etc...), but
clean filesystem interfaces for dumping is an area that would bear
fruit, if only sour grapes :-|.
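A purely hypothetical sketch of how that primitive might be declared
(nothing like this exists; all the names are made up):

#include <sys/types.h>

struct dumpent {
	int	de_fd;		/* exclusive-open fd, or -1 if it couldn't be */
	char	de_path[256];	/* pathname relative to the starting path */
};

/*
 * Fill 'buf' with up to 'nent' entries for files under 'path' whose
 * inodes changed after 'since', exclusive-opening at most 'maxopen' of
 * them.  Returns the number of entries filled in, or -1 on error.
 * Restarting the search after one bufferload is one of the holes
 * mentioned above.
 */
int dumpscan(char *path, time_t since, struct dumpent *buf,
	     int nent, int maxopen);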
-- 
Copyright 1987 John Gilmore; you may redistribute only if your recipients may.
(This is an effort to bend Stargate to work with Usenet, not against it.)
{sun,ptsfa,lll-crg,ihnp4,ucbvax}!hoptoad!gnu	       gnu@ingres.berkeley.edu

dricej@drilex.UUCP (Craig Jackson) (06/06/87)

Speaking of operating systems in general, and sites in general, there are
many sites that can't afford to be down several hours a day for backups.
In particular, I work at a Burroughs time-sharing site, with worldwide
users.  It's nearly always prime time for somebody around the world.

In article <20070@sun.uucp> shannon@sun.uucp (Bill Shannon) writes:
>I'm curious, does anyone have any really good ideas about how to
>do reliable backups on active filesystems?  Would such backups,
>by necessity, have to be done through the filesystem, rather than
>through the raw device as dump does?

You almost have to do it through the filesystem.  It's a price you pay to stay
live.

>Also, what would you *expect* from a full dump taken on an active
>filesystem?  If you had to restore that dump, what state of the
>filesystem would you expect to restore?  What sort of guarantees
>of consistency would you want from such a dump scheme?  When the
>filesystem is inactive, it's easy to guarantee a certain consistency,
>but when it's (potentially very) active, what can you guarantee?

I expect each file in the system to be restored to the state it had when it
was copied.  If it was actively being written to at the time of the copy,
then you take your lumps.  Most aren't; they're just being read.  If you
really care, DBMS's generally have special backup procedures that handle
this.

Most systems that I've seen that do things like this do not handle
file deletions; unless you've totally lost the disk, most users don't use
that feature anyway.  It's really only an issue for incrementals.

>Does anyone know how other systems handle this problem?  What does
>VMS do?  TOPS-20?  Multics?  others?

On our Burroughs, there is no specialized dump program, nor any concept of
the raw disk (except for maintenance).  The dumps are handled by the rough
analog of 'tar' or 'cpio', from automatically constructed batch jobs.
There is a separate system for figuring out what needs to be dumped &
constructing the jobs, and for retrieving something from a dump. We wrote
our own; most sites either do that, or buy a commercial package.  The vendor
does not market a specific dump program.

>					Bill Shannon
>					Sun Microsystems, Inc.


-- 
Craig Jackson
UUCP: {harvard!axiom,linus!axiom,ll-xn}!drilex!dricej
BIX:  cjackson

mangler@cit-vax.Caltech.Edu (System Mangler) (06/09/87)

In article <6279@pur-ee.UUCP>, davy@pur-ee.UUCP (Dave Curry) writes:
> Basically, these mods
> tell dump to ignore any inode which has changed since the dump started.
> This has the minor disadvantage that some files (those being modified)
> will be missing from the dump

For instance, /dev/rmt8, /dev/console, and /usr/adm/acct will be missing,
even on a full dump done in single-user mode.

This seems very much like closing the barn door after the horse
has run away.  It doesn't hurt for the file to be modified while
dump isn't looking (so long as it stays the same type); what hurts
is when the file is modified in the middle of being dumped, something
you don't detect and can't do anything about even if you detected it
(other than warn the operator to start over).  It is a problem common
to dump, tar, cpio and probably others, because they all assume that
st_sizes won't change while the file is being written to tape.

You do, at least, protect against regular inodes becoming directories
or vice-versa.	However, 4.3bsd dump already does that.  I fail to
see what you've gained.

(Didn't we say all these things last year?)

Don Speck   speck@vlsi.caltech.edu  {seismo,rutgers,ames}!cit-vax!speck

dhesi@rb442.UUCP (06/10/87)

In article <2990@cit-vax.Caltech.Edu> mangler@cit-vax.Caltech.Edu (System 
Mangler) writes:
>In article <6279@pur-ee.UUCP>, davy@pur-ee.UUCP (Dave Curry) writes:
[discussing how to deal with files that change during a dump]
>> Basically, these mods
>> tell dump to ignore any inode which has changed since the dump started.
>> This has the minor disadvantage that some files (those being modified)
>> will be missing from the dump
...
>It doesn't hurt for the file to be modified while
>dump isn't looking (so long as it stays the same type); what hurts
>is when the file is modified in the middle of being dumped, something
>you don't detect and can't do anything about even if you detected it
>(other than warn the operator to start over). 

The perfect solution is not difficult, though it will probably attract
a lot of flames.

The good folks at BBN Inc. invented it many years ago.  They called
it copy-on-write.  They used it to allow self-modifying programs
to be shareable.  We can use it to dump active filesystems.  It will
need some kernel changes.

Before you begin dumping a file, you mark it copy-on-write.  During
the dump, any attempt to write to the file doesn't change the original 
but always allocates a new block.  The dump process sees only the 
original unchanged file.  The process writing to the file sees only 
the new modified file.
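A user-level model of the idea (everything here is made up for
illustration; a real version would live in the filesystem's block code):

#include <stdlib.h>
#include <string.h>

#define BSIZE	1024
#define NBLK	16

struct cowfile {
	char	*blk[NBLK];	/* current block pointers */
	int	dumping;	/* set while a dump has the file open */
};

/*
 * Write one block.  While the file is being dumped, never modify a
 * block in place: copy it and swing the pointer, so the dumper, which
 * saved its own copy of the block list when the dump started, keeps
 * reading the original, unchanged data.  The old blocks are released
 * when the dump finishes and drops its saved pointers (not shown).
 */
void
cow_write(struct cowfile *f, int bn, char *data)
{
	char *copy;

	if (f->dumping && f->blk[bn] != NULL) {
		copy = malloc(BSIZE);
		memcpy(copy, f->blk[bn], BSIZE);
		f->blk[bn] = copy;	/* dumper still holds the old block */
	}
	if (f->blk[bn] == NULL)
		f->blk[bn] = malloc(BSIZE);
	memcpy(f->blk[bn], data, BSIZE);
}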
-- 
Rahul Dhesi         UUCP:  {ihnp4,seismo}!{iuvax,pur-ee}!bsu-cs!dhesi

jerry@oliveb.UUCP (Jerry F Aguirre) (06/25/87)

I think it is important to realize that there are two levels of
consistency that we are discussing here.  The first, and what most
people seem to be referring to, is file system consistency, i.e.
what fsck checks.

If all you are concerned about is getting a clean restore and fsck then
there are several means to do so.  Anything from locking out writes on
the raw interface to more complex strategies affecting only individual
files.  All these can give you a clean dump and restore.

But this DOESN'T mean that the dumps will be any good.  The object of
this is to have the user's files be recoverable, not to eliminate having
a guru patch the file system.  Even if all file IO was stopped for the
duration of the dump, user files could be missing.

The most obvious example of this is if the user is editing a file at the
time that the dump begins.  Say he has just decided to write out all his
laborious changes.  The first thing done is to "create" the file, which
truncates it to zero length.  Then the editor begins copying from the temp
file to the edited file.  If the dump begins at this point, the dumped
file will be zero length!
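In miniature (an illustration, with made-up filenames), the window looks
like this:

#include <fcntl.h>
#include <unistd.h>

int main()
{
	char buf[8192];
	int in, out, n;

	in = open("/tmp/Ex12345", O_RDONLY);	/* the editor's temp copy */
	out = creat("precious.c", 0644);	/* real file is now 0 length */

	/* a dump that maps this inode right now saves an empty file */

	while ((n = read(in, buf, sizeof buf)) > 0)
		write(out, buf, n);		/* contents trickle back in */
	close(in);
	close(out);
	return 0;
}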

This is actually worse than a system crash because vi knows how to
recover from those.  In a crash the temp file is saved and vi knows how
to re-create the edited file from that.  In this case a truncated file
is dumped, after which vi will continue and delete the temp file
necessary to re-create the original.  The dump is not likely to
contain the temp file because it is usually on a different file system.

If you are talking about a long running batch job that can be rerun to
create the output files then getting a "fsck" consistent restore from an
on-line dump is a worthwhile goal.  If you are talking about doing this
while users are banging away at random jobs then expect a few
disappointments.
					Jerry Aguirre

mangler@cit-vax.UUCP (06/29/87)

In article <1589@oliveb.UUCP>, jerry@oliveb.UUCP (Jerry F Aguirre) writes:
> The most obvious example of this is if the user is editing a file at the
> time that the dump begins.  Say he has just decided to write out all his
> laborious changes.  The first thing done is to "create" the file which
> truncates it to zero length.	Then it begins copying from the temp file
> to the edited file.  If the dump begins at this time then the dumped
> file will be at zero length!

The solution to this is file versions.	TOPS-10 did this fairly nicely;
when you did the equivalent of creat(), you got an unnamed temporary
file (actually, its name was blank), and when you were finished
writing it, you could either close it normally, in which case it
replaced the old version, or you could flush it (this was automatic
if your program died).	While the new version was being written,
everybody still saw the old version intact.  Just think, you could
do "sed s/foo/bar/ <myfile >myfile" without explicit temporary files.

To add this to Unix, you'd need two new facilities:  a relative of
creat() to create a file with zero links, yielding a file descriptor;
and an "flink" call to make a link to a file given a file descriptor.
[N.B. flink() should unlink the target if necessary, like rename()].
Perhaps nicer would be to build flink() into close(), making close()
take a filename, and if you give NULL as the filename, you get the
old behavior.  Nicer still would be to have the modified creat()
stash away the filename and pointer to the directory inode for later
use by the close routine, but with BSD long filenames, it might be
difficult to find a place for it.

None of this should be terribly difficult to implement, but you'd
have to change all your programs if you want to take advantage of
it.  (Although stdio might be able to do some of it for you, all
those programs that never call close() would lose...)
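Much of the visible effect is already available with an explicit
temporary name and 4.3BSD's rename(2), which atomically replaces the
target (a sketch; the proposal above just hides the temp name and makes
abandonment automatic when the process dies):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

/* replace 'path' with 'len' bytes of 'data'; the old version stays
 * intact until the new one is complete */
int
replace_file(char *path, char *data, int len)
{
	char tmp[1024];
	int fd;

	(void) sprintf(tmp, "%s.new", path);	/* visible temp name, alas */
	if ((fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0)
		return -1;
	if (write(fd, data, len) != len) {
		close(fd);
		unlink(tmp);		/* abandon; readers never saw it */
		return -1;
	}
	close(fd);
	return rename(tmp, path);	/* readers see old version, then new */
}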

Don Speck   speck@vlsi.caltech.edu  {seismo,rutgers,ames}!cit-vax!speck

preece%mycroft@gswd-vms.Gould.COM (06/29/87)

Well, we do our backups on live systems.  The idea of taking machines
out of service long enough to do dumps every day would not fly -- there
are people on our machines 24 hours a day.  We do several levels of
dump and keep some tapes for very long times.  I have never personally
lost anything due to dump confusion, though it's certainly possible.
Restoring a broken filesystem is a pain in the ass, but unless you
do daily level zeroes (are there enough hours in a day to do that?)
that pain is inevitable.

On the other hand, I'm sufficiently paranoid that I have emacs save a
spare copy of any file I modify.  The spare goes into a filesystem
on a device I don't otherwise work on and has the file's complete
path name in its name (for a while I was doing this over NFS, but the
irritation when the repository machine went offline rapidly became too
great).  Thanks to the repository copies I can usually get the
latest version of a file I'm working on even if the device it normally
lives on goes belly up (head down?).  This makes me less worried
about the 24 hours between dump cycles ("What do you mean I ONLY
lost one day's work?!").

-- 
scott preece
gould/csd - urbana
uucp:	ihnp4!uiucdcs!ccvaxa!preece
arpa:	preece@Gould.com

allbery@ncoast.UUCP (Brandon Allbery) (07/03/87)

As quoted from <3114@cit-vax.Caltech.Edu> by mangler@cit-vax.Caltech.Edu (System Mangler):
+---------------
| To add this to Unix, you'd need two new facilities:  a relative of
| creat() to create a file with zero links, yielding a file descriptor;
| and an "flink" call to make a link to a file given a file descriptor.
| [N.B. flink() should unlink the target if necessary, like rename()].
| Perhaps nicer would be to build flink() into close(), making close()
| take a filename, and if you give NULL as the filename, you get the
| old behavior.  Nicer still would be to have the modified creat()
| stash away the filename and pointer to the directory inode for later
| use by the close routine, but with BSD long filenames, it might be
| difficult to find a place for it.
+---------------

Why?  One easy and compatible way to do it is as follows:

Store a version number for a file in the inode.  (Yes, inode-based versions,
not path-based ones.  This is necessary given the UNIX way of doing things.)
A good place would be byte 39 of the block array, given that the blocks only
fill bytes 0-38.  On the other hand, 255 generations may be too few.  The
inode must also contain the inode number of the previous version, so that
previous versions can be retrieved.

When a file is creat()'ed, an empty, unnamed (link count = 0) inode is
allocated; data is written to this file.  When it is close()'ed, either
directly or by exit(), the inode is updated, then the new inode and the
original are swapped within the i-list for the file system.  Thus, the inode
number never changes between versions of a file (otherwise, every directory
entry the file has would have to change).

To specify a version of a file, you use an extension of the current naming
scheme; append a slash and the version number to the file name.  When namei()
gets to a file and there is a slash and a valid number afterward, it can
follow the chain back to the correct inode.
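A user-level sketch of that naming convention (illustration only; a real
version would live inside namei()): "src/foo.c/3" names version 3 of
src/foo.c.

#include <string.h>
#include <stdlib.h>
#include <ctype.h>

/*
 * If 'path' ends in "/<digits>", strip the suffix in place and return
 * the version number; otherwise return -1, meaning "latest version".
 */
int
split_version(char *path)
{
	char *slash = strrchr(path, '/');
	char *p;

	if (slash == NULL || slash[1] == '\0')
		return -1;
	for (p = slash + 1; *p; p++)
		if (!isdigit((unsigned char)*p))
			return -1;
	*slash = '\0';
	return atoi(slash + 1);
}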

Programs aborting via kill() (and possibly by exit(n) for n != 0 or n < 0 or
etc.) would have the new inodes deallocated, leaving the previous version
intact.

Programs which don't know about version numbers then get the version
semantics for free.  Also, you can implement GENERATION-RETENTION-COUNT as yet
another variable in the u block, with a system call to get or set it.

This means that directories can't have versions.  Then again, there isn't
much reason for them to have them anyway.

Any comments?

++Brandon
-- 
       ---- Moderator for comp.sources.misc and comp.binaries.ibm.pc ----
Brandon S. Allbery	<BACKBONE>!cbosgd!hal!ncoast!allbery ('til Aug. 1)
aXcess Company		{ames,mit-eddie,harvard,talcott}!necntc!ncoast!allbery
6615 Center St. #A1-105	{well,sun,pyramid,ihnp4}!hoptoad!ncoast!allbery
Mentor, OH 44060-4101	necntc!ncoast!allbery@harvard.HARVARD.EDU (Internet)
+01 216 974 9210	ncoast!allbery@CWRU.EDU (CSnet -- if you dare)
NCOAST ADMIN GROUP	Brandon Allbery on 157/504 (Fidonet/Matrix/whatever)
* ncoast -- Public Access UN*X -- (216) 781-6201, 24 hrs., 300/1200/2400 baud *
 * ncoast is proud to be carrying alt.all -- contact me for more information *