[net.bugs.4bsd] Problem with dump

gill (07/19/82)

	We at BTL physics are having a problem with 4.1 BSD dump(8) on
our /usr file system. Seems that it tries to seek to some very high (and
occasionally negative) block numbers and gets a "shouldn't happen"
error. This happens in the middle of doing an epoch dump.

	I suspect this is due to the file system being active, and
haven't looked into the problem very much yet. Fsck says things
are fine.

	Has anyone else had this problem and/or solved it?
Does anyone know how dump is supposed to deal with (what we
used to call) "phase errors"? (That is, when an inode points to different
blocks between the beginning phase of the dump, when the maps are made,
and the time the file is actually written to tape.)
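
	To make the term concrete, here is a rough sketch in C of how such
a phase error might be detected: keep a copy of the inode from the mapping
pass and compare it with a fresh read just before the file goes to tape.
The structure, the NADDR value and the function name are made up for
illustration; this is not dump's own code or its real inode layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NADDR 13    /* block pointers per inode; an assumed value */

/* Hypothetical, simplified inode; not the real on-disk structure. */
struct sketch_inode {
    long size;          /* file size in bytes */
    long mtime;         /* last modification time */
    long addr[NADDR];   /* direct and indirect block pointers */
};

/*
 * Compare the inode as seen during the mapping pass with the inode
 * as re-read just before the file is written to tape.
 * Returns 1 if anything relevant changed (a "phase error"), else 0.
 */
int phase_error(const struct sketch_inode *at_map,
                const struct sketch_inode *at_write)
{
    assert(at_map != NULL && at_write != NULL);

    if (at_map->size != at_write->size || at_map->mtime != at_write->mtime)
        return 1;
    return memcmp(at_map->addr, at_write->addr, sizeof at_map->addr) != 0;
}
```

A caller could skip the file or re-read it when phase_error() returns 1.
Even so, the file can still change between the check and the actual read,
so this narrows the window rather than closing it.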

	As a great deal of long-running (days) simulations are run on
our system (with output going to the file system we wish to back up),
we would like to avoid stopping these processes in order to do
dumps. This used to be possible under older dump programs.

	Thanx,

		Gill Pratt

		...  alice!rabbit!physics!gill
		or
		...  gill@mc

dmmartindale (07/19/82)

We've also seen the garbage block numbers when dumping an active file system.
The best explanation I can come up with is that the inode gets reallocated
(or just truncated and rewritten) during the period it is getting dumped,
and the copy of the inode (or a double-indirect block) in dump's memory which
points to an indirect block on disk is thus out of date.  Now, if this
stale inode points to a disk block as an indirect block when in fact it's
been reallocated as a data block, you'll get garbage when using its contents
as block pointers.  The few times this has happened here, we've aborted
the dump and restarted it, and it has always worked fine the second time.
(Although it's painful having to restart a 6-reel dump.)
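
As a rough illustration of that failure mode, here is a sketch in C of a
range check applied to the entries of a block that is about to be used as
an indirect block.  This is not dump's code; count_bad_pointers and the
parameters first_data and fs_blocks are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the entries in a would-be indirect block
 * that cannot be valid block numbers on this file system.
 *   indir      - the n entries read from disk
 *   first_data - first block number that may hold file data (assumed)
 *   fs_blocks  - total number of blocks in the file system (assumed)
 * A zero entry is treated as a hole and accepted.
 */
size_t count_bad_pointers(const long *indir, size_t n,
                          long first_data, long fs_blocks)
{
    size_t i;
    size_t bad = 0;

    assert(indir != NULL);

    for (i = 0; i < n; i++) {
        long b = indir[i];

        if (b == 0)
            continue;                   /* hole: nothing to read */
        if (b < first_data || b >= fs_blocks)
            bad++;                      /* very high or negative: garbage */
    }
    return bad;
}
```

A nonzero count suggests the block was reallocated as data after the inode
was read, so the file should be skipped or re-read rather than seeking to
those addresses.  The check is only a heuristic: stale data can contain
values that happen to fall in range.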

	Dave Martindale