[comp.sys.ibm.pc] ARC/ZOO/TAR

amit@umn-cs.cs.umn.edu (Neta Amit) (11/30/87)

ARC (and derivatives) has been around for quite some time, and has developed
    into the MS-DOS de-facto standard for archiving and info-exchange.

To me, the main advantage of ZOO is its ability to store structure,
    as well as contents. There are two disadvantages: (1) it is not widely
    accepted, and (2) it needs an external source to create the structure
    for it. I.e., look up the ZOO manual under the -I switch, and notice
    that under Unix, the find command creates the structure, which is
    subsequently given to ZOO.  Under MS-DOS, find is non-existent,
    and you need to specify the structure manually -- a tedious job.
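A minimal sketch of that Unix workflow; the exact `zoo aI archive`
invocation is recalled from the zoo manual and may differ on your
version:

```shell
# Build a tiny tree, then let find emit the relative pathnames -- the
# "structure" that under MS-DOS you would have to type out by hand:
mkdir -p demo/sub
: > demo/sub/file.txt
find demo -type f -print
# zoo's I modifier reads that list from standard input, roughly:
#   find demo -type f -print | zoo aI archive
```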

This weekend, a public domain TAR (courtesy John Gilmore) has been posted
on comp.sources.unix, and is now implemented under Unix and MS-DOS. It
is likely to be ported to VMS, MAC, Amiga.

PDTAR offers a number of significant advantages over both ZOO and ARC:
  - It is the de-facto standard in the Unix world. Info-exchange with
    Unix machines is much easier with TAR.
  - It creates the structure it needs
  - It is fast; on the small sample that I did -- faster than ARC or ZOO
  - It can compress, and the resulting archive is small; on the sample above, 
    smaller than the .arc or .zoo files
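The second point above -- that tar records and recreates the directory
tree itself -- can be sketched with a plain round trip (any Unix tar;
the pathnames here are made up for illustration):

```shell
# Build a small tree, archive it, wipe it, and restore it from the archive.
# tar recreates the nested directories itself, with no externally supplied list:
mkdir -p project/src
echo 'hello' > project/src/main.c
tar cf project.tar project
rm -r project
tar xf project.tar          # project/src/main.c is back
```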

Standards should occasionally be replaced by better standards, not
necessarily offering downward compatibility.  After PDTAR has
stabilized, I suggest that BBS's and national archives adhere to it.



-- 
  Neta Amit 
  U of Minnesota CSci
  Arpanet: amit@umn-cs.cs.umn.edu

davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) (12/01/87)

In article <3027@umn-cs.cs.umn.edu> amit@umn-cs.UUCP (Neta Amit) writes:
| ... discussion of zoo and arc ...
| This weekend, a public domain TAR (courtesy John Gilmore) has been posted
| on comp.sources.unix, and is now implemented under Unix and MS-DOS. It
| is likely to be ported to VMS, MAC, Amiga.
| 
| PDTAR offers a number of significant advantages over both ZOO and ARC:
|   - It is the de-facto standard in the Unix world. Info-exchange with
|     Unix machines is much easier with TAR.
The quantity of discussion in std.unix would make me think that's not
decided yet. Isn't cpio in posix?

|   - It creates the structure it needs
Somewhat. I lost eight hours this weekend redumping stuff on one machine
and loading it on another. The first time I used tar, the second cpio.
tar was faster, and the archive smaller (it handles links at dump rather
than load time).
tar doesn't save directories, as you said it creates them. You lose all
info about owner, permissions, time modified, etc. It also *doesn't
create empty directories!!* Many programs which keep status and restart
info in directories will leave the directories empty when shut down.

|   - It is fast; on the small sample that I did -- faster than ARC or ZOO
|   - It can compress, and the resulting archive is small; on the sample above, 
|     smaller than the .arc or .zoo files
The "standard" tar doesn't compress, at least on V7, SysIII, SysV, or
Ultrix. What you are proposing is (yet) another file format entirely.
This is not a bad thing, but it somewhat negates your earlier argument
about standards. A regular (uncompressed) tar file will be more widely
readable than the compressed format.

---
I'm not saying that your idea is without merit, and it should be
considered as another alternative format. However, the question is not
as clear as you believe. The advantage of archivers is that they allow
easy random access to the files in an archive. The price of this is
compressing them separately, which reduces the compression and increases
the cpu needed. They allow easy replacement of individual files in the
archive.

I will be evaluating pdtar in the next few weeks, and I expect it to be
useful, and reliable (J.G. does good stuff). I don't expect it to be the
only archiver I use, on UNIX, on PCs, and most of all on (yecch) VMS. At
the moment zoo is the only thing I have which runs in all environments.

Please post all reasonable discussion, mail flames to me directly.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

manes@dasys1.UUCP (Steve Manes) (12/01/87)

Where's the Unix portability in a compressing 'tar'?

What's the relative speed of PDTAR to ZOO?  Also, what machines does it
support?  ZOO is the only archiving compressor I've found that will compile
and run reliably on Microport V/AT and with compilers not supporting a
HUGE model.

-- 
+-----------------------------------------------------------------------
+ Steve Manes         Roxy Recorders, Inc.                 NYC
+ decvax!philabs!cmcl2!hombre!magpie!manes       Magpie BBS: 212-420-0527
+ uunet!iuvax!bsu-cs!zoo-hq!magpie!manes              300/1200/2400

jgray@toad.pilchuck.Data-IO.COM (Jerry Late Nite Gray) (12/01/87)

In article <3027@umn-cs.cs.umn.edu>, amit@umn-cs.cs.umn.edu (Neta Amit) writes:
> ARC (and derivatives) has been around for quite some time, and has developed
>     into the MS-DOS de-facto standard for archiving and info-exchange.
> 
> To me, the main advantage of ZOO is its ability to store structure,
>     as well as contents. There are two disadvantages: (1) it is not widely
>     accepted, and (2) it needs an external source to create the structure
> ....

> This weekend, a public domain TAR (courtesy John Gilmore) has been posted
> on comp.sources.unix, and is now implemented under Unix and MS-DOS. It
> is likely to be ported to VMS, MAC, Amiga.
> 
> PDTAR offers a number of significant advantages over both ZOO and ARC:
>  - It is the de-facto standard in the Unix world. Info-exchange with
>    Unix machines is much easier with TAR.
>  - It creates the structure it needs
>  - It is fast; on the small sample that I did -- faster than ARC or ZOO
>  - It can compress, and the resulting archive is small; on the sample above, 
>    smaller than the .arc or .zoo files
> 
Yes, I can see TAR and ZOO being more widely used since they deal with
file structures, but I see one possible problem with respect to doing something
like making backups. When you are creating an archive of a single directory
it is easy to see whether the amount of information you are archiving will
fit on the target media (floppy or tape). When archiving a whole structure
it is much more difficult. Do TAR and/or ZOO allow you to archive a whole
directory structure onto more than one disk in much the same way that
DOS's BACKUP command (or the FASTBACK utility) does?

Presently I use FASTBACK for backups and PKARC for carting around small
collections of files. I have occasionally used FASTBACK to transfer file
structures from one machine to another. This is very nice but it has a
few limitations. Since FASTBACK uses its own formatting method, the archived
files aren't readable by anything else and can't be shipped around the net.


Just some thoughts.

---------------
					Jerrold L. Gray

UUCP:{ihnp4|caip|tektronix|ucbvax}!uw-beaver!tikal!pilchuck!jgray

USNAIL:	10525 Willows Road N.E. /C-46
	Redmond, Wa.  98052
	(206) 881 - 6444 x470

Telex:  15-2167

caf@omen.UUCP (Chuck Forsberg WA7KGX) (12/02/87)

In article <3027@umn-cs.cs.umn.edu> amit@umn-cs.UUCP (Neta Amit) writes:
:Standards should occasionally be replaced by better standards, not
:necessarily offering downward compatibility.  After PDTAR has
:stabilized, I suggest that BBS's and national archives adhere to it.

The current PDTAR has a few shortcomings:
	1.  No support for multiple floppies
	2.  Compression not built in
	3.  MSDOS can't pipe to compress
	4.  MSDOS compress - 12 bit??

It appears PDTAR is not one standard, but several: Classic TAR, New TAR,
and multiple flavors of compressed New TAR, and ne'er the twain shall
meet, and the probability of being able to dearchive on a particular
machine is somewhat less than unity.  What a zoo.

danoc@clyde.UUCP (12/03/87)

In article <3027@umn-cs.cs.umn.edu> amit@umn-cs.UUCP (Neta Amit) writes:
> ...
>PDTAR offers a number of significant advantages over both ZOO and ARC:
>  - It can compress, and the resulting archive is small; on the sample above, 
>    smaller than the .arc or .zoo files
>

pdtar will not compress on ms-dos systems. the code to handle compression
uses fork() and is surrounded by #ifdef; if MSDOS is defined that section
is not compiled.

i haven't seen a version of compress (or arc for that matter) that will
compress a file on **ix (vax svr2 style) and uncompress on ms-dos.  is
there one?  i guess i don't actually need it now that i have zoo.

zoo works beautifully.  i've packed up files into a zoo archive on the vax
and unpacked it on the 6300; zoo's the first program i know about that
handles that.  some seat-of-the-pants testing (on the zoo source) showed that
the zoo archive is smaller than those created by arc and pkarc.  zoo
is much faster than arc and about the same speed as pkarc on the pc (in
creating an archive).

-dan

gnu@hoptoad.uucp (John Gilmore) (12/03/87)

I don't normally read comp.sys.ibm.pc, but a friend pointed me at the
discussion of PD tar here.

Mostly tar was designed to run on larger systems, and to be compatible
with the Unix tar program.  When the Unix Standards effort (POSIX)
looked reasonable enough to pick tar as a standard tape format, it was
also intended as an easy way to read/write that format.  Now it looks
like the marching morons on the committee will end up picking cpio because
it's System V specific, so tar is just for the folks who want a good
tar program.  It was ported to MSDOS by Mike Rendell (uunet!garfield!michael)
and I rolled his changes back into the mainline sources for the release.

Chuck Forsberg said that tar had a "shortcoming" that "compression
wasn't built in".  I see how people on MSDOS are forced to build tools
that do everything by hand (e.g. compression, searching for files, etc)
but I do not build things that way.  Since my tar is truly public
domain, you can take it and hack in the guts of compress somehow, but I
will not take the changes back.  Tar and compress are perfectly good
tools, like a hammer and a chisel.  I don't want to build a "hasel", I
like them separate.  If your OS and/or shell can't manage to connect
two perfectly good tools, I guess you had better go build yourself a
hasel.

This problem is rampant on MSDOS and I wish you all luck in getting
past it without throwing all your software away.  Somehow I don't think
OS/2 will be it, given Microsoft's past taste in software and current
partnership with the biggest hardware company and the worst software
company.  But it's hard for me to run down an undocumented, unreleased
system (except for its being announced with no doc and no release).

Someone said tar doesn't write directory ownership, permission, etc.
He's using an ancient tar program.  Mine does, as does Berkeley's and
the one spec'd by POSIX.  It *does* archive empty directories, too.
It can read tar archives with or without directories (old or new).
Chuck also complained that there is no apparent standard for tar.
When I started writing it, I followed up every case I heard of (on the net
and off it) of people not being able to read tar tapes from some other
system.  The two problems I found were:  Some systems write tape blocks
larger than other systems' tape drives can read; and: some minor systems
write their tapes out byte swapped because the idiot who programmed
the driver didn't notice and then was too lily-livered to admit it was
a bug.  In short, there are no compatibility problems with tar.  People
with old tars can read the output of mine, and the worst they get is
a few error messages.

One thing Chuck mentioned that I *would* like to support is multiple
volumes (floppies or tapes) but I want to do it such that:

	* You can read back any subset of the volumes, as long as
	  they contain the data you want.  You don't have to start
	  from #1 and go to #N.

	* You don't have to tell tar "how big" a volume is.  This
	  doesn't work on any kind of tapes -- how much fits depends
	  on how many errors occur, how often the tape stops, etc.

	* The design is reasonable enough that Unix tar's will pick it
	  up.

I've heard that DEC has done a multi-volume tar in their newest Ultrix
release, but so far have seen no description of exactly what they did.

I would be tickled if my Unix tar program became an MSDOS standard but
it's not what I expect or hope for.  I wrote it for the GNU project,
the free Unix clone in source for everybody project.  If you can use it,
fine, if not, also fine.
-- 
{pyramid,ptsfa,amdahl,sun,ihnp4}!hoptoad!gnu			  gnu@toad.com
		"Watch me change my world..." -- Liquid Theatre

rmtodd@uokmax.UUCP (12/03/87)

In article <8042@steinmetz.steinmetz.UUCP> davidsen@crdos1.UUCP (bill davidsen) writes:
>In article <3027@umn-cs.cs.umn.edu> amit@umn-cs.UUCP (Neta Amit) writes:
>tar doesn't save directories, as you said it creates them. You lose all
>info about owner, permissions, time modified, etc. It also *doesn't
>create empty directories!!* Many programs which keep status and restart
>info in directories will leave the directories empty when shut down.
  When run in the normal 'new (ANSI) format' mode, PD Tar does make entries
in the tar-file for the directories.  It saves their owner, permissions, time
modified.  If I remember correctly, the old (V7) tar format doesn't include
entries for directories, just for the files, and tar has to construct the
directories itself.  PD Tar, when reading new-format tarfiles, should create
the directories with appropriate mode and owner (if you're root).  It doesn't
handle mod times correctly (it creates the dir with the correct mod time, but
the mod time gets trashed when the files in the directory are extracted); John
Gilmore admits this is a bug which he hopes to fix in future releases.  
I don't see offhand why PD Tar would have any problem with empty directories;
if I get the chance I'll try it out this weekend and see.
  As a side note, the difference between the old and new tarfile format isn't
very great.  The only differences are that the new format has entries for
directories, and the new format headers have extra fields for the user and
group names, not just the uids.  Most standard tar programs should handle
PD Tar's output without problem, and vice versa.
>The "standard" tar doesn't compress, at least on V7, SysIII, SysV, or
>Ultrix. What you are proposing is (yet) another file format entirely.
>This is not a bad thing, but it somewhat negates your earlier argument
>about standards. A regular (uncompressed) tar file will be more widely
>readable than the compressed format.
The standard tar does not compress in and of itself;  it can produce 
compressed files if you do the appropriate pipe into compress.  You can
extract from a compressed tar-file just by
	zcat tarfile.Z | tar xvf -
The PDTar command
	tar xvfZ tarfile.Z 
just does the piping and executing of compress automatically.  Similarly, 
PDTar's
	tar cvfZ tarfile.Z dir
is equivalent to
	tar cvf - dir | compress >tarfile.Z
All you need to read compressed tarfiles on older systems is a copy of
compress.
>I'm not saying that your idea is without merit, and it should be
>considered as another alternative format. However, the question is not
>as clear as you believe. The advantage of archivers is that they allow
>easy random access to the files in an archive. The price of this is
>compressing them separately, which reduces the compression and increases
>the cpu needed. They allow easy replacement of individual files in the
>archive.
Well, tar allows random *access* to files in the archive; it just doesn't
allow random *updating* of the archive.  I can extract as few files as I want,
but if I change one file I have to rebuild the entire archive.  
>I will be evaluating pdtar in the next few weeks, and I expect it to be
>useful, and reliable (J.G. does good stuff). I don't expect it to be the
>only archiver I use, on UNIX, on PCs, and most of all on (yecch) VMS. At
>the moment zoo is the only thing I have which runs in all environments.
Zoo is awfully nice, and I use it a great deal myself.  Unfortunately, I also
do a great deal of work with MINIX, and ZOO just won't fit in the 64Kcode/
64Kdata space that MINIX allows.  PD Tar does.  (It also beats the living
daylights out of the tar that came with MINIX).
--------------------------------------------------------------------------
Richard Todd
USSnail:820 Annie Court,Norman OK 73069
UUCP: {allegra!cbosgd|ihnp4}!occrsh!uokmax!rmtodd

haim@gvax.cs.cornell.edu (Haim Shvaytser) (12/04/87)

Could somebody please post an executable version of PDTAR for those
of us that do not have MSC 3.0?

davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) (12/09/87)

In article <3456@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
| Chuck Forsberg said that tar had a "shortcoming" that "compression
| wasn't built in".  I see how people on MSDOS are forced to build tools
| that do everything by hand (e.g. compression, searching for files, etc)
| but I do not build things that way.  Since my tar is truly public
| domain, you can take it and hack in the guts of compress somehow, but I
| will not take the changes back.  Tar and compress are perfectly good
| tools, like a hammer and a chisel.  I don't want to build a "hasel", I
| like them separate.  If your OS and/or shell can't manage to connect
| two perfectly good tools, I guess you had better go build yourself a
| hasel.

Why bother to build a DOS version at all, if you take the "my o/s is
better than your o/s" attitude?
| much MS-DOS bashing here, and complaining about OS/2.

| Someone said tar doesn't write directory ownership, permission, etc.
| He's using an ancient tar program.  Mine does, as does Berkeley's and
| the one spec'd by POSIX.  It *does* archive empty directories, too.

No, I'm using a brand new SysV.3 tar. Another case of "my o/s is
better..." The reason people like zoo is that it runs on UNIX, and on
MS-DOS, and on VMS, and it works correctly and compatibly in all cases.
It does compression, allows adding, extracting and deleting individual
files, and expands wildcards on non-UNIX systems to use UNIX conventions
and present a seamless and consistent user interface.

Portable to me means "runs in many places." About half the UNIX users in
the world are on USG versions, and to cut them off, and DOS users, and
not even *think* about VMS, is certainly not the portability I need.

I believe that you bash the idea of putting all the functions into one
program, but encourage the idea of using one tool for all jobs ("When
all you have is a hammer, everything looks like a nail"). Tar is a fine
tool for interchange of information. It is a lot better than nothing,
but does not replace other formats, such as cpio. It does not replace
real archive programs, because they allow random file access.

| It can read tar archives with or without directories (old or new).
| Chuck also complained that there is no apparent standard for tar.
| When I started writing it, I followed up every case I heard of (on the net
| and off it) of people not being able to read tar tapes from some other
| system.  The two problems I found were:  Some systems write tape blocks
| larger than other systems' tape drives can read; and: some minor systems
| write their tapes out byte swapped because the idiot who programmed
| the driver didn't notice and then was too lily-livered to admit it was
| a bug.  In short, there are no compatibility problems with tar.  People

Anyone who doesn't do it like a VAX is lily-livered and didn't conform
to the (sorry) non-standard? This kind of senseless flame does little to
enhance your position.

| I would be tickled if my Unix tar program became an MSDOS standard but
| it's not what I expect or hope for.  I wrote it for the GNU project,
| the free Unix clone in source for everybody project.  If you can use it,
| fine, if not, also fine.

This is the type of rational thinking I would expect.

| {pyramid,ptsfa,amdahl,sun,ihnp4}!hoptoad!gnu			  gnu@toad.com
| 		"Watch me change my world..." -- Liquid Theatre


-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me