[comp.sys.ibm.pc] MKS: multi-volume backups

Isaac_K_Rabinovitch@cup.portal.com (02/05/88)

rdo031@tijc02.UUCP (Rick Odle           ) writes:
->I have been trying to create a combination of MKS tools to do, effectively,
->the job of DOS's backup and pkarc combined.
->
->MKS gives an example in their manual of doing something like this:
->
->	find /dir -type f | cpio -oc | dd of=a:
->
->which finds all of the files starting at root dir, creates an archive
->with ASCII headers ( -oc ), and pipes it to dd, which writes it to the a:
->diskette in a sequential record format (not DOS compatible).  I have
->modified this in the following manner:
->
->	find /dir -name '*' | cpio -oa | dd of=a:
->
->The find command as shown will also get directory entries, even those that
->are empty.  The -oc is not needed for dos-to-dos backups and restores, and
->the -oa keeps the modification time of the original file.  The real problem
->lies in the fact that dd will not write a large archive to a multi-diskette
->volume.  Also there is no compression of the files involved.  
->
->So there you have it.  Any suggestions would be welcome.

The above examples remind me of the sort of shell script I was always writing
when I was administering Unix boxes.  (I was never very good at it, so
no guru requests, please.)  Ever since I became a DOSier, I've been wondering
whether I should buy the MKS Toolkit -- will it really get me some Unix
functionality, or is MS-DOS just not up to dealing with that sort of thing?
Your examples express my worries.  The pipeline works efficiently under
Unix, because the three programs are all executing "at once" (working in
lock-step, taking advantage of each other's pauses), with the pipeline
keeping the three processes in sync.  By contrast, MS-DOS implements
this pipeline as three sequential program loads, with nothing happening
while a program waits on I/O, and two humungous temp files being created.
Time-consuming, and prodigal of disk space.

It's worth noting that, except for people like MKS who port Unix filters to
DOS, nobody seems interested in writing programs that use the DOS pipe
mechanism, which is why you can't find any DOS filters to do the
manipulations you need.

If my understanding of the problem deserves flames, I certainly want to
see them -- I'd rather be wrong about this!

Isaac Rabinovitch
Disclaimer:  Just because I think you're wrong, doesn't
             mean I don't think you're a fun person!
:-)

alex@mks.UUCP (Alex White) (02/08/88)

In article <2962@cup.portal.com>, Isaac_K_Rabinovitch@cup.portal.com writes:
> 
> rdo031@tijc02.UUCP (Rick Odle           ) writes:
> ->MKS gives an example in their manual of doing something like this:
> ->
> ->	find /dir -type f | cpio -oc | dd of=a:
> ->
> ->the -oa keeps the modification time of the original file.  The real problem
> ->lies in the fact that dd will not write a large archive to a multi-diskette
> ->volume.  Also there is no compression of the files involved.  
I always hate to refer to things that are going to be coming out later
[NO - the release date hasn't yet been set], but the new version of cpio
has several enhancements:
1) Built-in compression.  With the -z flag, `cpio -ocz' is entirely equivalent
to typing `cpio -oc | compress -b 14' on Unix [and `cpio -icz' to
`uncompress | cpio -ic'].
2) Multi-volume stuff.  Actually, it's sort of in there now, though not
as nicely.  If output is to a file, then when cpio gets an out-of-disk-space
error, it just asks for another filename.  Change floppies, put another one
in, and give it a new filename [or, since you changed floppies, the same
name].  Likewise on input: if end of file is reached and cpio's trailer hasn't
been read, it asks for a new filename (a sample session is sketched after
point 3).  Of course this doesn't help the pipe to dd if you want to go to a
raw floppy, but...
3) Raw-disk driver.  I don't know if this will ever see the light of day
because DOS makes these things so hard - it appears to go out of its way
to make sure you can't do reasonable things - but I do have a driver in
my config.sys file, and now can back up via
	find /dir -type f | cpio -ocvz >/dev/fdaq
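
To make points 1) and 2) concrete, a backup and restore session ought to look
roughly like this (the filenames are just examples, and the prompts are
paraphrased from the description above, not actual program output):

	C> find /dir -type f | cpio -ocz >a:back1.cpz
	[a: fills up; cpio asks for another output filename -- swap
	 floppies and answer a:back2.cpz, and so on]
	C> cpio -icz <a:back1.cpz
	[end of file before cpio's trailer; it asks for the next filename --
	 swap floppies and answer a:back2.cpz, ...]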

> Your examples express my worries.  The pipeline works efficiently under
> Unix, because the three programs are all executing "at once" (working in
> lock-step, taking advantage of each other's pauses), with the pipeline
> keeping the three processes in sync.  By contrast, MS-DOS implements
> this pipeline as three sequential program loads, with nothing happening
> while a program waits on I/O, and two humungous temp files being created.
> Time-consuming, and prodigal of disk space.
Absolutely - this pipeline is not something one would want to run on DOS.
However, most pipelines are fully as efficient on DOS as on Unix - assuming
that you don't want to watch the data coming out the far end while it's
going in.
Consider though, how A | B is implemented:
Unix:
	A runs, data copied to kernel,
	context switch to B
	data copy from kernel to B
	context switch back to A,
	...
DOS:
	A runs to completion: data copied to RAM disk
	Switch to B: data copied from RAM disk to B
Now, with the assumption of a RAM disk large enough for your largest pipe
[I have about a meg], can anyone think of any reason that the DOS method
should be slower in terms of time-to-complete?
[Certainly not in terms of time-to-first-output.]
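
In other words, COMMAND.COM effectively turns `A | B' into something like
this (the temp-file name here is invented for illustration; the real one is
chosen by COMMAND.COM and lands in the current directory, which is why a RAM
disk as the current drive helps):

	A >pipe.tmp	# A runs to completion; its output goes to a temp file
	B <pipe.tmp	# only then does B run, reading the temp file back
	rm pipe.tmp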

sears@sun.uucp (Daniel Sears) (02/11/88)

Here are two scripts that I use to split and reassemble large files (2-10Mb).
Maybe you can modify them for archiving.
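
Typical usage of the two scripts in the archive below looks something like
this (the filename and chunk count are only examples):

	sh split.ksh bigfile.arc	# prompts before writing each a:<n>.bak
	sh unsplit.ksh 5 bigfile.arc	# reads a:1.bak through a:5.bak back in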

This brings me to a related subject.  Floppy disks, even of the 1.2 megabyte
variety, are too small for backup or file transfer of large files.  I would
be interested in hearing from people who use the MKS Toolkit with tape drives
and streamers.

--Dan

#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create:
#	split.ksh
#	unsplit.ksh
# This archive created: Wed Feb 10 11:03:33 1988
export PATH; PATH=/bin:/usr/bin:$PATH
if test -f 'split.ksh'
then
	echo shar: "will not over-write existing file 'split.ksh'"
else
cat << \SHAR_EOF > 'split.ksh'
#
# split.ksh -- split a file into ~1Mb chunks (1000 1024-byte blocks) with dd(1)
# usage: sh split.ksh "input file"
#

# Work out how many chunks are needed from the file size.  This assumes the
# size is the third field of `ls -l' output; adjust the field number if your
# ls prints the size elsewhere.
chunks=$(ls -l $1 | awk '{ t = int($3/1000000 + 1) ; for(i=1; i<=t; i++) printf("%d ", i) }')

offset=0

for i in $chunks
do
    echo "change diskettes and press <RETURN>"
    read
    # copy the next 1000 blocks (1,024,000 bytes) of the input to a:<chunk>.bak
    dd if=$1 of=a:$i.bak bs=1024 skip=$offset count=1000
    offset=$(echo $offset | awk '{printf("%s", $0 + 1000)}')
done
SHAR_EOF
fi
if test -f 'unsplit.ksh'
then
	echo shar: "will not over-write existing file 'unsplit.ksh'"
else
cat << \SHAR_EOF > 'unsplit.ksh'
#
# unsplit.ksh -- reassemble a file from chunks with dd(1)
# usage: sh unsplit.ksh "n chunks" "output file"
#

# generate the chunk list "1 2 ... n" from the count given as the first argument
chunks=$(echo $1 | awk '{ for(i=1; i<=$0; i++) printf("%d ", i) }')

offset=0

for i in $chunks
do
    echo "change diskettes and press <RETURN>"
    read
    # write this chunk at block offset $offset of the output file
    dd if=a:$i.bak of=$2 bs=1024 seek=$offset count=1000
    offset=$(echo $offset | awk '{printf("%s", $0 + 1000)}')
done
SHAR_EOF
fi
exit 0
#	End of shell archive
-- 
Daniel Sears                Sun Microsystems, Inc.
Technical Publications      MS 5-42
(415) 691-7435              2550 Garcia Avenue
sears@sun.com               Mountain View, CA  94043

scott@ubvax.UB.Com (Scott Scheiman) (02/12/88)

In article <391@mks.UUCP>, alex@mks.UUCP (Alex White) writes:
< Consider though, how A | B is implemented:
< Unix:
< 	A runs, data copied to kernel,
< 	context switch to B
< 	data copy from kernel to B
< 	context switch back to A,
< 	...
< DOS:
< 	A runs to completion: data copied to RAM disk
< 	Switch to B: data copied from RAM disk to B
< Now, with the assumption of a RAM disk large enough for your largest pipe
< [I have about a meg], can anyone think of any reason that the DOS method
< should be slower in terms of time-to-complete?

Maybe the version of DOS I'm using is too old (I use 3.10), but in the
world I get to live in, there is no way that I know of to tell DOS to use
any disk drive but the current drive when it implements a 'pipe'.  For
many applications, having to have the RAM disk be the current drive
makes the command awkward to type (at the least) and occasionally
impossible (there are programs which have requirements about the current
drive/directory).
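
For example, with a RAM disk on (say) d:, the workaround is to make d: the
current drive first, which means prefixing every real filename with c:
(drive letters, paths, and filenames here are purely illustrative):

	C> d:
	D> find c:/dir -type f | cpio -oc >c:backup.cpo
	D> c: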

Does anyone know of a way to tell DOS what drive to use for the 'pipe'
intermediate file?
-- 
"Ribbit!"       Scott (Beam Me Up, Scotty!) Scheiman      Ungermann-Bass, Inc.
  ` /\/@\/@\/\       ..decvax!amd!ubvax!scott            3990 Freedom Circle
   _\ \ -  / /_           (408) 562-5572                Santa Clara, CA 95050