[comp.unix.wizards] multivol piped to tar

dan@prairie.UUCP (Daniel M. Frank) (11/20/86)

   I have a very strange problem.  I am working with the multivol
program posted to net.sources (or was it mod.sources?).  I piped
tar into it, which seemed to work fine, but when I tried to pipe
its output into tar (i.e. from the diskette), tar would get through
a couple files and then stop with no error messages.

   Upon some investigation, I found that there is a problem with
the way multivol stores block sizes.  As it comes, they are
stored right-justified within the 6-character field, which leads
to problems when the data block following starts with numeric
characters (sscanf counts field sizes starting at the first
non-blank character).  Anyway, a simple fix to the write block
routines fixed that, and everything worked fine, right?
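
The field-width behaviour described above can be reproduced in a few lines of C. This is only a sketch: the 6-character field comes from the post, but the function names, the sample data and the zero-padding fix are illustrative assumptions, not multivol's actual code or the fix that was actually applied.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FIELD_WIDTH 6   /* width of the size field, as described in the post */

/* Fragile parse: %6d skips leading blanks first and only then starts
 * counting its six characters, so it can run on into the data. */
int parse_size_naive(const char *block)
{
    int size = -1;

    assert(block != NULL);
    if (sscanf(block, "%6d", &size) != 1)
        return -1;
    return size;
}

/* Reader-side defence (hypothetical helper): copy exactly FIELD_WIDTH bytes
 * into a terminated buffer so the conversion cannot see past the field.
 * The caller must supply at least FIELD_WIDTH bytes. */
int parse_size_bounded(const char *block)
{
    char field[FIELD_WIDTH + 1];
    int size = -1;

    assert(block != NULL);
    memcpy(field, block, FIELD_WIDTH);
    field[FIELD_WIDTH] = '\0';
    if (sscanf(field, "%d", &size) != 1)
        return -1;
    return size;
}

/* Writer-side fix (one possibility): zero-pad, so there are no leading
 * blanks to skip. `out` must have room for FIELD_WIDTH + 1 bytes. */
void format_size_padded(char *out, int size)
{
    assert(out != NULL && size >= 0 && size <= 999999);
    snprintf(out, FIELD_WIDTH + 1, "%06d", size);
}
```

With a right-justified header "    42" followed by data beginning "1234", parse_size_naive() consumes the digits of the data as well, while parse_size_bounded() stops at the end of the field. Left-justifying the number in the field would be another way to avoid the leading blanks.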

   Wrong.  I can direct the output of the program to a file, 
and feed that file into tar, and it works.  I can cat the file,
and pipe cat's output into tar's standard input, and it works (so
the problem doesn't seem to be seeks).  I cannot pipe the output
of the program into tar, and I cannot pipe the output of the program
into cat, and pipe cat's output into tar (i.e. a three stage pipeline).
I should mention that this problem occurs two files into the first
diskette, so it is not related to changing diskettes.

   Does anyone know why tar should just stop silently?  The only
thing I can think is that multivol doesn't provide its data in
disk-sector-size blocks, since part of each multivol block is taken
up with an eight-byte header.  That doesn't make much sense across
a pipeline, unless tar is timing its reads, which doesn't make
much sense either.  Any ideas?

-- 
    Dan Frank
    uucp: ... uwvax!prairie!dan
    arpa: dan%caseus@spool.wisc.edu

dan@prairie.UUCP (Daniel M. Frank) (11/26/86)

   I would like to thank the people who've responded to this query so far.
Most have indicated that there is a bug in tar that causes it to hang if
used in a pipeline, because it forks a mkdir request and then doesn't
wait on the correct pid.
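
For reference, the kind of bug those replies describe is avoided by waiting for the specific child that was forked. The sketch below is an assumption about the general pattern, not tar's source; run_and_wait() is a hypothetical helper, and waitpid() is a modern POSIX call (on older systems the equivalent is to loop on wait() until it returns the pid you forked).

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments and wait for that child only.
 * Returns its exit status, or -1 if it could not be forked or waited for,
 * or if it was killed by a signal. */
int run_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }
    /* waitpid() names the child, so some unrelated child of this process
     * cannot be waited on by mistake. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller that wanted a directory made would pass something like { "mkdir", path, NULL } and check for a zero return.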

   I was probably not specific enough:  tar doesn't hang, it exits (or
multivol does).  It does so even when the command is `tar t', which
creates no directories.

   Any other ideas?  By the way, the OS here is SVR2.  I've seen tar
stop under BSD when it was being fed through a pipe by zcat.  Does tar
time its input?

-- 
    Dan Frank
    uucp: ... uwvax!prairie!dan
    arpa: dan%caseus@spool.wisc.edu

tony@xios.UUCP (Keeper Of News) (11/26/86)

I have a similar problem on a number of machines when I use the
'untarmail' script which we got with Compress 4.0.
What happens is that this script executes the pipeline

	atob | uncompress | tar xf -

and when tar has to create a directory, it hangs.  If you kill it and
restart it, it works until it has to make another directory.  If you
change this to

	atob > file
	uncompress < file | tar xf -

it works fine.  Related problem?? 
-- 
-------------------------------------------------------------------------------
Tony Lill	Keeper of News @ Xios Systems Corporation 
				 1600 Carling Avenue, Suite 150, Ottawa,
				 Ontario, Canada, K1Z 8R8	(613) 725-5411
				 xios!tony
-------------------------------------------------------------------------------
		Not the edge of the world, but we can see it from here.

stuart@bms-at.UUCP (Stuart D. Gathman) (11/30/86)

In article <367@xios.UUCP>, tony@xios.UUCP (Keeper Of News) writes:

> 	atob | uncompress | tar xf -

	*** doesn't work ***

> 	atob > file
> 	uncompress < file | tar xf -
> 
> it works fine.  Related problem?? 

This is a bug in 'tar'.  On our version (Xenix 2.1.3) I can fix
it with:

	atob | uncompress | tar xfb - 1

I have never seen the source to tar, so I don't know what the underlying
problem is.  Evidently default block sizes get messed up on extracts.
-- 
Stuart D. Gathman	<..!seismo!dgis!bms-at!stuart>

chris@mimsy.UUCP (Chris Torek) (11/30/86)

In article <360@prairie.UUCP> dan@prairie.UUCP (Daniel M. Frank) writes:
>   Does anyone know why tar should just stop silently?  The only
>thing I can think is that multivol doesn't provide its data in
>disk-sector-size blocks, since part of each multivol block is taken
>up with an eight-byte header.  That doesn't make much sense across
>a pipeline, unless tar is timing its reads, which doesn't make
>much sense either.  Any ideas?

Chances are that this is indeed the problem.  I cannot speak for
`The Standard', but for all us nonstandard folk running 4.2 or
4.3BSD, tar has the `B' option to re-block its input.

Tar has a section of code that looks about like this:

	n = read(tape, buffer, blocksize);
	if (n < blocksize)
		/* complain, or perhaps just exit */ ...

Because it is reading a tape drive, tar is rather fussy about the
number of bytes returned from read calls.  In particular, it should
always be the same, and it should be a multiple of 512 bytes.  But
wait, what is this?  Tar is *not* reading from a tape!  Ai, trouble.

When tar is reading from a pipe, read() returns a number that is
between one and the internal pipe buffer size (or blocksize,
whichever is less).  As it happens, if the writer of a pipe always
writes in multiples of buffer-size bytes, the reader (tar) will
always get buffer-size bytes back.  If the writer is slow enough,
and always writes $c$ bytes, where $c$ is less than buffer-size,
tar will always get $c$ bytes.  But it also sometimes happens that
the writer will not co-operate so nicely.
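
The short-read behaviour is easy to see in isolation. The following is a minimal sketch, assuming a POSIX system; the function name and the 512-byte BLOCKSIZE are chosen for illustration and are not taken from tar.

```c
#include <assert.h>
#include <unistd.h>

#define BLOCKSIZE 512

/* Write `chunk` bytes into a fresh pipe, then ask for a full BLOCKSIZE.
 * Returns whatever read() reported, or -1 on failure. */
long read_after_small_write(size_t chunk)
{
    char out[BLOCKSIZE] = {0};
    char in[BLOCKSIZE];
    int fds[2];
    ssize_t n;

    assert(chunk > 0 && chunk <= BLOCKSIZE);
    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], out, chunk) != (ssize_t)chunk) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    /* The request is for BLOCKSIZE bytes, but read() returns as soon as
     * it has the `chunk` bytes that are actually in the pipe. */
    n = read(fds[0], in, BLOCKSIZE);
    close(fds[0]);
    close(fds[1]);
    return (long)n;
}
```

Calling read_after_small_write(100) yields 100 rather than 512, which a check of the form `n < blocksize' would treat as a failure even though nothing has gone wrong.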

If you supply the `B' option, or in 4.3BSD, if you tell tar to read
from standard input, that section of code will be replaced with
one more like this:

	left = blocksize;
	p = buffer;
	do {
		n = read(tape, p, left);
		if (n <= 0)
			/* handle eof or error */ ...
		p += n;
		left -= n;
	} while (left);
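
Filled out into a complete function, the loop above might look like the following. This is a sketch rather than tar's code: the name read_full(), the return convention and the EINTR retry are assumptions added for the example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Keep reading until `want` bytes have arrived, EOF is reached, or an
 * error occurs.  Returns the number of bytes stored (less than `want`
 * only at EOF), or -1 on error. */
long read_full(int fd, char *buf, size_t want)
{
    size_t got = 0;

    assert(buf != NULL);
    while (got < want) {
        ssize_t n = read(fd, buf + got, want - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;              /* EOF: return what we have */
        got += (size_t)n;
    }
    return (long)got;
}
```

A reader that calls read_full(fd, buffer, blocksize) in place of a bare read() sees a short count only at the true end of the input, however the writer chose to chunk its output.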

For those of you suffering with `The Standard', if your tar does not
have a re-blocking option, there is always this trick:

	(commands) | cat | tar xbf 1 -

(assuming your `cat' will do the re-blocking).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP:	seismo!mimsy!chris	ARPA/CSNet:	chris@mimsy.umd.edu

brad@bradley.UUCP (12/02/86)

We too had the same problem on 2 different 3b5's.  After looking
at the code I decided it wasn't that great.  The reason?  Well, the
code has the same problem as cpio (on the 3b5's).  It seems that
if you write a partial block on the end of the tape, the hardware
on the tape drive has problems and doesn't put an end of tape marker
out there.  This can make cpio backups no-good.

brad

gnu@hoptoad.uucp (John Gilmore) (12/02/86)

In article <4613@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
>                  ...for all us nonstandard folk running 4.2 or
> 4.3BSD, tar has the `B' option to re-block its input.
> For those of you suffering with `The Standard', if your tar does not
> have a re-blocking option, there is always this trick...

Or you could get a copy of my public domain tar from mod.sources.
PD tar supports the "B" option as do 4.2 and 4.3 tar, it has been ported
to several different `The Standard' systems, and it's free.  It also
supports the new POSIX stuff (owner/group names in addition to numbers;
dumping directories, fifos, /dev, etc).  And it's faster than Unix tar.
I have not tried it with multivol, though I've piped a lot of stuff to it.

I submitted it about a month ago and it should be coming out Real Soon
Now.  Please don't send me mail asking for copies.  Send mail to Rich
$alz and ask him where mod.sources has gone and whether you can
volunteer to help him clean out the backlog.
-- 
John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu   jgilmore@lll-crg.arpa
Call +1 800 854 7179 or +1 714 540 9870 and order X3.159-198x (ANSI C) for $65.
Then spend two weeks reading it and weeping.  THEN send in formal comments!

tony@uqcspe.UUCP (12/10/86)

In article <10900001@bradley> brad@bradley.UUCP writes:
>
>We too had the same problem on 2 different 3b5's.  After looking
>at the code I decided It wasn't that great.  The reason? Well the
>code has the same problem as cpio (on the 3b5's).  It seems that
>if you write a partial block on the end of the tape, the hardware
>on the tape drive has problems and doesn't put an end of tape marker
>out there.  This can make cpio backups no-good.

	When writing multivol and testing it on several floppies/tapes
I came across the same problem on some devices.  This is one reason why
multivol permits you to specify a limit on the number of blocks written
to a volume.  If an incomplete block is detected when writing at end of tape,
multivol rewrites it at the start of the next volume.  However, on our
MicroVAX tape drive the last (incomplete) block appears to be written in full,
yet it is missing when the volume is re-read.  Also, some floppies I tested
pretend to write successfully past the last possible block, and so likewise
require a block limit.

	Tony O'Hagan
==============================================================================
Tony O'Hagan		Australia: (07) 3774125  International: +61 7 3774125
University of Queensland	CSNET:	tony@uqcspe.oz	ACSnet:	tony@uqcspe.oz
Dept. of Computer Science	UUCP:	...!seismo!munnari!uqcspe.oz!tony
St. Lucia, Brisbane, 		ARPA:	tony%uqcspe.oz@seismo.css.gov
AUSTRALIA  4067	 		JANET:	uqcspe.oz!tony@ukc

wedgingt@udenva.UUCP (Will Edgington/Ejeo) (12/12/86)

A more generic method (for those of you with tars that don't understand the
'B' or 'b' blocking flags) is to use "dd obs=10240" in place of the "cat".
("dd bs=10240" runs faster, but with "bs" dd copies each short read straight
through as a short write, so it will not re-block a pipe.)

	program-which-doesn't-block | dd obs=10240 | tar xf -

The "10240" will have to be adjusted to the block size the tar that created
the file used in some cases.  Also, having not used System V for several years
now, I can't remember whether "dd" comes with System V or not, though I would
be *very* surprised if it didn't ...  "dd" is basically a glorified "cat";
"obs" is output block size and "bs" is, you guessed it, "block size".

I've used this method to do tar's over ethernets via BSD's rsh; ethernets
don't like 10K blocked files much !! :-)
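
What the "dd obs=10240" stage does to the stream can be sketched in C as follows. This is an illustration of the re-blocking idea only, assuming a POSIX system; reblock() is a hypothetical function, not dd's implementation, and it does none of dd's conversions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Copy `in` to `out`, collecting input until `obs` bytes are on hand and
 * then passing them to write() as one block.  Only the final block may be
 * short.  Returns the number of blocks written, or -1 on error. */
long reblock(int in, int out, size_t obs)
{
    char *buf;
    size_t fill = 0;
    long blocks = 0;
    int eof = 0;

    assert(obs > 0);
    buf = malloc(obs);
    if (buf == NULL)
        return -1;

    while (!eof) {
        ssize_t n = read(in, buf + fill, obs - fill);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            free(buf);
            return -1;
        }
        if (n == 0)
            eof = 1;
        else
            fill += (size_t)n;

        if (fill == obs || (eof && fill > 0)) {
            size_t off = 0;
            /* Normally one write(); the loop only covers partial writes. */
            while (off < fill) {
                ssize_t w = write(out, buf + off, fill - off);
                if (w < 0) {
                    if (errno == EINTR)
                        continue;
                    free(buf);
                    return -1;
                }
                off += (size_t)w;
            }
            blocks++;
            fill = 0;
        }
    }
    free(buf);
    return blocks;
}
```

Run as a filter with reblock(0, 1, 10240), it would sit in a pipeline where the dd stage sits above. Whether the reader then sees whole blocks still depends on the pipe between the two, which is why a reader that loops on read() is the more robust fix.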
--
Will Edgington, Computing and Information Resources, University of Denver
		BusAd 469, 2020 S. Race, Denver CO 80208, (303) 871-2081
{{hplabs,seismo}!hao,ucbvax!nbires,boulder,cires,cisden}!udenva!wedgingt
TESTING: WEDGINGT@DUCAIR.BITNET ( == RHESUS on DU's VMS Cluster )
COMING SOON: wedgingt@nike.cair.du.edu, wedgingt@du.edu