[comp.unix.ultrix] _slow_ rdump

field@elvis.cs.pitt.edu (Gus) (10/14/90)

Last night I tried dumping a 300 MB file system to tape (via rdump) from
a DECstation 3100 (Ultrix 3.1) to a 1/4" SCSI tape drive hanging off
a Sun3 (4.0.3).  rdump reported this would take 24 hours!  Is anyone
else using a similar setup and getting reasonable backup times?  How
can I get the 3100 to stream to the remote tape?

Thanks
Brian
-----
field@cs.pitt.edu

grr@cbmvax.commodore.com (George Robbins) (10/14/90)

In article <8844@pitt.UUCP> field@elvis.cs.pitt.edu (Gus) writes:
> 
> Last night I tried dumping a 300 MB file system to tape (via rdump) from
> a DECstation 3100 (Ultrix 3.1) to a 1/4" SCSI tape drive hanging off
> a Sun3 (4.0.3).  rdump reported this would take 24 hours!  Is anyone
> else using a similar setup and getting reasonable backup times?  How
> can I get the 3100 to stream to the remote tape?

When I tried this, the dump wasn't as slow as the prediction...

-- 
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)

mjr@hussar.dco.dec.com (Marcus J. Ranum) (10/15/90)

In article <8844@pitt.UUCP> field@elvis.cs.pitt.edu (Gus) writes:

>Last night I tried dumping a 300 MB file system to tape (via rdump) from
>a DECstation 3100 (Ultrix 3.1) to a 1/4" SCSI tape drive hanging off
>a Sun3 (4.0.3).  rdump reported this would take 24 hours!

	Sometimes rdump exaggerates a little with its time estimates, but
it *IS* pretty slow. I've heard (haven't measured it) that using dd to
block the transfer can be much faster, e.g.:

dump 0f - /filesystem | rsh tapehost dd of=/dev/tapedrive

	I'm not sure why this is the case, to tell the truth. The rmt
protocol sends a return value after each remote read/write, which should
make it slower than the rsh/dd combination (which doesn't check errors
or return values on the write), but I can't imagine it would make it that
much slower. Has anyone ever measured this? This is a subject of some
interest to me.
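
A back-of-the-envelope model of that difference, not a measurement: it
assumes each rmt write costs one network round trip on top of the raw
transfer time, while the rsh/dd pipe pays the round trip only once. The
bandwidth and round-trip figures are made-up illustrative inputs, and the
model ignores the tape drive dropping out of streaming mode, which may
well be the larger effect.

```python
def lockstep_seconds(total_bytes, block_bytes, bytes_per_sec, round_trip_sec):
    """Each block is sent, then the sender waits for a reply before the next."""
    blocks = -(-total_bytes // block_bytes)  # ceiling division
    return total_bytes / bytes_per_sec + blocks * round_trip_sec


def streaming_seconds(total_bytes, bytes_per_sec, round_trip_sec):
    """Data flows continuously; only one round trip of latency is paid."""
    return total_bytes / bytes_per_sec + round_trip_sec


if __name__ == "__main__":
    total = 300 * 1000 * 1000   # the 300 MB file system from the question
    block = 10 * 1024           # dump's 10k records
    bandwidth = 500 * 1000      # ASSUMED usable bytes per second
    rtt = 0.005                 # ASSUMED round trip in seconds
    print("lock-step:", lockstep_seconds(total, block, bandwidth, rtt))
    print("streaming:", streaming_seconds(total, bandwidth, rtt))
```

With small round-trip times the per-block reply adds only a modest amount,
which fits the suspicion that the protocol alone does not explain the gap.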

mjr.
-- 
 coffeecoffeecoffeecoffeecoffeecoffeecoffeecoffeecoffeecoffeecoffeecoffeecoffee

idallen@watcgl.waterloo.edu (Ian! D. Allen [CGL]) (10/15/90)

Here's stuff on dump/rdump I sent to comp.unix.ultrix last summer.
We run Ultrix 3.1 and 3.1C.

From idallen Thu Jun  7 21:42:39 1990
To: comp.unix.ultrix
Subject: Why isn't dump maximally efficient with TK70 tapes?

DECsystem 5400, Ultrix 3.1C, RA90 disk, one user (me).

Watch the elapsed real times here.

Here's a plain root dump to tape (TK70):

    # time dump 0 /
      DUMP: Date of this level 0 dump: Thu Jun  7 21:17:43 1990
      DUMP: Date of last level 0 dump: the epoch
      DUMP: Dumping /dev/rra0a (/) to /dev/rmt0h
      DUMP: Mapping (Pass I) [regular files]
      DUMP: Mapping (Pass II) [directories]
      DUMP: Estimates based on 1200 feet of tape at a density of 10240 BPI...
      DUMP: This dump will occupy 1103 (10240 byte) blocks on 0.13 tape(s).
      DUMP: Dumping (Pass III) [directories]
      DUMP: Dumping (Pass IV) [regular files]
      DUMP: 57.43% done, finished in 0:03
      DUMP: 1103 tape blocks were dumped on 1 tape(s)
      DUMP: Tape rewinding
      DUMP: Dump is done
    0% real=9:29 usr=0.3 sys=1.9 rd=0 wr=4 mem=56 pg=3 rec=17 sw=0 sig=0 cs=2776

Here's the identical root dump piped to dd to tape:

    recorder# mt rew
    recorder# time sh -c "dump 0f - / | dd bs=32k rbuf=2 wbuf=2 of=/dev/rmt0h"
      DUMP: Date of this level 0 dump: Thu Jun  7 21:28:18 1990
      DUMP: Date of last level 0 dump: the epoch
      DUMP: Dumping /dev/rra0a (/) to standard output
      DUMP: Mapping (Pass I) [regular files]
      DUMP: Mapping (Pass II) [directories]
      DUMP: Estimated 11295744 bytes output to Standard Output
      DUMP: Dumping (Pass III) [directories]
      DUMP: Dumping (Pass IV) [regular files]
      DUMP: 11295744 bytes were dumped to Standard Output
      DUMP: Dump is done
    0+2780 records in
    0+2780 records out
    4% real=3:34 usr=0.7 sys=8.7 rd=1 wr=8 mem=37 pg=2 rec=17 sw=0 sig=0 cs=10111

That's almost three times faster!  Why can't dump be as good as dd?
Dumps are of major importance; I would have thought that dump would be
the most clever user of the tape drive.  I can't believe this.  Am I
missing something?  I must be missing something.
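
The elapsed times above can be turned into throughput figures. This sketch
only restates numbers already in the two transcripts (block count, byte
count, and real times); nothing is newly measured, and the helper names
are just for illustration.

```python
def to_seconds(elapsed):
    """Convert an 'M:SS' elapsed time, as printed by time, to seconds."""
    minutes, seconds = elapsed.split(":")
    return int(minutes) * 60 + int(seconds)


def bytes_per_second(total_bytes, elapsed):
    """Average throughput over the whole run."""
    return total_bytes / to_seconds(elapsed)


# Plain dump to tape: 1103 blocks of 10240 bytes in 9:29.
direct = bytes_per_second(1103 * 10240, "9:29")
# dump piped to dd: 11295744 bytes in 3:34.
piped = bytes_per_second(11295744, "3:34")

print(f"direct: {direct / 1024:.1f} KB/s")
print(f"piped:  {piped / 1024:.1f} KB/s")
print(f"ratio:  {piped / direct:.2f}")
```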

From idallen Fri Jun  8 02:46:42 1990
Subject: Fun with dump

Ultrix dump of root to nowhere:

    bandicoot# time dump 0f - / >/dev/null
    [dump stuff deleted]
    16% real=0:22 usr=0.9 sys=2.7 rd=8 wr=4 mem=332 pg=0 rec=0 sw=0
	sig=0 cs=959

Ultrix rdump of root to nowhere:

    bandicoot# time /bin/rdump -0f bandicoot:/dev/null /
    [dump stuff deleted]
    39% real=0:55 usr=2.8 sys=19.1 rd=2 wr=6 mem=282 pg=3 rec=60 sw=0
	sig=0 cs=4533

Ultrix rdump of root to a real tape:

    bandicoot# time rdump -0f recorder:/dev/nrmt0h /
    [dump stuff deleted]
    [I hit break after 6 minutes when dump estimated the dump
     would take another 20 minutes]

Ultrix dump of root to rsh/dd to a tape:

    bandicoot# time dump 0f - / | rsh rec dd bs=32k rbuf=2 wbuf=2 of=/dev/rmt0h
    [dump stuff deleted]
    7% real=1:31 usr=1.0 sys=6.0 rd=2 wr=4 mem=351 pg=0 rec=3 sw=0
	sig=0 cs=4900
    15% real=2:48 usr=2.5 sys=24.1 rd=15 wr=7 mem=206 pg=0 rec=3
	sw=0 sig=0 cs=10300

What I learned:

    Don't use rdump.  It's an order of magnitude slower than a pipe to dd.
    In fact, even dump is slower than dump to stdout piped into dd with
    wbuf=2, because of bugs in the Ultrix nbuf code.  At least Ultrix dd
    handles multiple tapes and multi-buffer writes; isn't that convenient?

From idallen Fri Jun  8 04:19:02 1990
Subject: More fun with dump on Ultrix

You'd think that the dump command would have the smarts in it to
write tapes efficiently.  Wrong.  I wrote a simple program that reads
stdin, builds a 32K buffer, and writes it out using Ultrix
double-buffer I/O.  I used it on 198525952 bytes of /usr file system
on our DS5400, sent to a TK70 295 MB tape cartridge:

    # time sh -c "dump 0f - /usr | ./a.out >/dev/rmt0h"
    [dump info deleted]
    5% real=43:01 usr=11.3 sys=131.8 rd=4 wr=8 mem=63 pg=2 rec=18 sw=0
	sig=0 cs=138864

43 minutes elapsed time.  Compare that with what the default gets you:

    # dump 0 /usr
    [dump info deleted]
    DUMP: Estimates based on 1200 feet of tape at a density of 10240 BPI...
    DUMP: This dump will occupy 19400 (10240 byte) blocks on 2.29 tape(s).

Whoops.  This dump won't even fit on the tape using the defaults.
Even if I kludged the tape size to make it seem to fit, it would still
take 3 *hours* to dump.  Ultrix dump also uses double-buffer I/O, but
it specifies 8 buffers instead of just 2.  The software release notes
for Ultrix 3.1C suggest that 2 buffers work better than more than 2,
and this sure bears that out.
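
The program behind ./a.out was not posted, so the following is only a
sketch of the reblocking idea it describes: read stdin until a full 32K
buffer is assembled, then issue a single write. It is written in Python
for illustration and leaves out the Ultrix double-buffer (nbuf) I/O
entirely. Writing a short final block rather than padding it is an
assumption, as are the names reblock and main.

```python
import sys

BLOCK_SIZE = 32 * 1024  # 32K, as in the post


def reblock(src, dst, block_size=BLOCK_SIZE):
    """Copy src to dst, issuing one write per assembled block.

    Reads from a pipe often return less than was asked for, so keep
    reading until a whole block is collected. Returns the number of
    writes issued.
    """
    writes = 0
    while True:
        block = bytearray()
        while len(block) < block_size:
            chunk = src.read(block_size - len(block))
            if not chunk:            # end of input
                break
            block += chunk
        if not block:
            return writes
        dst.write(block)             # last block may be short (assumption)
        writes += 1
        if len(block) < block_size:  # short block means input is exhausted
            return writes


def main():
    # Unbuffered binary stdio so each write() call reaches the device
    # as one write of the assembled block.
    source = open(sys.stdin.fileno(), "rb", buffering=0, closefd=False)
    sink = open(sys.stdout.fileno(), "wb", buffering=0, closefd=False)
    reblock(source, sink)
```

Calling main() at the end of the file would let it sit in the pipeline
where ./a.out does. Like the original, it makes no attempt to handle
multiple volumes.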

From idallen Fri Jun  8 17:08:17 1990
To: comp.unix.ultrix
Subject: More undocumented performance issues with dump

Dump to stdout (a tape):

    # time dump 0bf 32k - / >/dev/nrmt0h
      DUMP: 11318272 bytes were dumped to Standard Output
      DUMP: Dump is done
    0% real=15:34 usr=0.3 sys=1.6 rd=0 wr=4 mem=92 pg=2 rec=18 sw=0 sig=0 cs=2058

Dump to the same tape directly:

    # time dump 0bf 32k /dev/nrmt0h /
      DUMP: 345 tape blocks were dumped on 1 tape(s)
      DUMP: Dump is done
    0% real=7:16 usr=0.4 sys=1.6 rd=0 wr=4 mem=94 pg=2 rec=18 sw=0 sig=0 cs=2019

Ultrix dump assumes that any output to stdout is to a pipe; it doesn't do
the same stat() [fstat()] tests to determine device type that it does
when you give the file name on the command line.
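
For illustration, here is roughly what such a test on standard output
could look like. This is not the Ultrix dump source, just a sketch of the
fstat() idea in Python; the function name describe_fd is made up.

```python
import os
import stat


def describe_fd(fd):
    """Classify an open file descriptor by the type bits fstat() reports."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISCHR(mode):
        return "character device"   # raw tape devices show up here
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


if __name__ == "__main__":
    # Descriptor 1 is standard output; report on stderr so the answer
    # is visible even when stdout is redirected.
    os.write(2, (describe_fd(1) + "\n").encode())
```

Run with stdout redirected to a device and then piped into another
command, it reports different types, which is the distinction a program
would need before choosing its tape-writing strategy.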

From idallen Fri Jun  8 17:16:46 1990
To: jpe@egr.duke.edu
Subject: Re: Why isn't dump maximally efficient with TK70 tapes?

> Problem #2 -- according to my man pages for "dd" the rbuf and wbuf options
> cannot be used at the same time.  Besides, the default wbuf value is 8
> for devices that support it.

Indeed, you, the source, and the man page are correct.  The example in
section 1.1.13 of the Ultrix 3.1C release notes is wrong, and I copied
it.  Silly me -- I thought the release notes knew something the man
page did not.  The first option wins and overrides any following [rw]buf
values.  What is not wrong is the statement in 1.1.13: "to get the
most performance gain, use a value of 2 with rbuf and wbuf options".
The default 8 buffers cause *worse* performance than specifying 2.

> Problem #3 -- The block size you specified to "dd" was wrong.  Dump writes
> in 10k blocks, not 32k.  Also you need to specify the obs (instead of bs)
> and specify a cbs equal to the obs.  This will buffer the input to the
> output block size, then write it to tape.  Restore will read a tape
> created this way, I doubt if it can read yours.

No, I wanted to write 32k blocks; it's faster and more efficient. I write
dump tapes far more than I read them; I wanted to speed up the writing.
Restore reads such tapes just fine if unblocked first:

     # dd if=/dev/rmt0h bs=32k rbuf=2 | restore -if -

You're right about the failure to buffer up to the output buffer
size, but I don't want to pay the price of using dd to make the
conversion -- it's way too slow.  See my latest note in comp.unix.ultrix
about a little program that buffers up to 32k and writes.

> Problem #4 -- You should note instead the system times and percentage of
> CPU used.  On my VAXserver 3600 these times jumped dramatically in order
> to give me a few seconds real-time savings.  Also when I used a no-rewind
> device "dump" was actually faster than the "dd pipe."  On a CPU-loaded
> system you might not have such a big win.

I observed a factor of three in real-time performance; more than a few
seconds and of importance to us.  The tape rewind added 12 seconds to
any times I posted.

Looking at the source to dd, I see that if one doesn't use "bs=", it copies
the data painfully from the input buffer to the output buffer one byte at
a time.  No wonder that eats cpu.  I wrote a simple program to buffer
input and write it out in 32k chunks; this works much better than dd, but
it won't handle multiple volumes.  See my comp.unix.ultrix posting.

> Problem #5 -- What happens if one of your partitions becomes larger than
> a TK70?

Ultrix dd handles multi-volumes.

I think the problem is just that dump uses too many buffers and in Ultrix
3.1C that makes things worse rather than better.  Or perhaps dump's
buffers aren't aligned on page boundaries, and dd's are.  (See Ultrix
Version 3.1C Release Notes section 1.1.12.)
-- 
-IAN! (Ian! D. Allen) idallen@watcgl.uwaterloo.ca idallen@watcgl.waterloo.edu
 [129.97.128.64]  Computer Graphics Lab/University of Waterloo/Ontario/Canada

grr@cbmvax.commodore.com (George Robbins) (10/15/90)

In article <1990Oct15.034337.21119@watcgl.waterloo.edu> idallen@watcgl.waterloo.edu (Ian! D. Allen [CGL]) writes:
> Here's stuff on dump/rdump I sent to comp.unix.ultrix last summer.
> We run Ultrix 3.1 and 3.1C.
... ... ... ...
> 
> I think the problem is just that dump uses too many buffers and in Ultrix
> 3.1C that makes things worse rather than better.  Or perhaps dump's
> buffers aren't aligned on page boundaries, and dd's are.  (See Ultrix
> Version 3.1C Release Notes section 1.1.12.)

Somewhere in the Ultrix 4.0 Release Notes, it admits that multi-buffer I/O
sucks if the buffers aren't "properly aligned".  This might explain why
dump gets slow, but leaves open the question of why they didn't bother
to fix the underlying problems in dump or multi-buffer I/O...
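
To make "aligned" concrete: a buffer is page-aligned when its starting
address is a multiple of the page size. The sketch below checks that in
Python on a modern system; it says nothing about how Ultrix dump or dd
actually allocate their buffers, and the helper names are invented.

```python
import ctypes
import mmap


def buffer_address(buf):
    """Return the memory address of a writable buffer object."""
    view = ctypes.c_char.from_buffer(buf)
    try:
        return ctypes.addressof(view)
    finally:
        del view  # drop the buffer export so buf can be closed later


def is_page_aligned(buf):
    """True when the buffer starts on a page boundary."""
    return buffer_address(buf) % mmap.PAGESIZE == 0


def page_aligned_buffer(size):
    """Anonymous mmap memory always starts on a page boundary."""
    return mmap.mmap(-1, size)


if __name__ == "__main__":
    aligned = page_aligned_buffer(32 * 1024)
    ordinary = bytearray(32 * 1024)   # heap memory, alignment not guaranteed
    print("mmap buffer aligned:", is_page_aligned(aligned))
    print("bytearray aligned:  ", is_page_aligned(ordinary))
    aligned.close()
```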

Perhaps using Ultrix 1.2 dump is the ticket?  Before they kludged in the
multi-buffered I/O and the generic syscalls?  Where did I put that release
tape...   8-)  BTW, dump really seems to work just fine on my configuration,
5810/HSC50/TA78, but I don't have much to compare it with.

BTW, I'd still like to get the patches to BSD 4.3 dump to sleaze over
the differences between the 4.3 include files and what resolves out of
the ultrix [gvi]node equates.  Somebody last year claimed that this was
"easy", but perhaps this was a theoretical assertion?

-- 
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)