mwm@ucbopal.berkeley.edu (Mike (I'll be mellow when I'm dead) Meyer) (05/11/86)
In article <1209@lsuc.UUCP> jimomura@lsuc.UUCP (Jim Omura) writes:
>>The par.c mechanism uses the BSD ar format for the file. This format,
>>unlike the SysV ar format, is pure ascii. i.e. if the files par'ed
>>together are all ascii files, the entire file is ascii.
>
> That's essentially what I wanted to find out. If there's been
>a substantial effort in that direction, then I think we should use
>'par.c' for a standard archive. I didn't know how much work had been
>done in either direction.
>
> Any other comments either way?

YES! The 4BSD ar format is unsuitable as an OS-9 (and AmigaDOS, Unix,
MS-DOS, etc.) standard archive for one simple reason: it doesn't
understand directories. This is why James Jones wrote something for
OS-9 that correctly disassembles BSD tar files, including the directory
creation, and why I ported it to AmigaDOS - so we could download
directory structures (like microemacs). [For those interested, this
code - in a form that should compile on both OS-9 and AmigaDOS - has
been posted to net.micro.amiga.]

Unfortunately, the 4BSD tar format uses lots of NUL bytes, which will
get eaten by mailers. Also, there isn't anything in the public domain
to build tar files (yet).

Might I suggest this problem be tackled a different way: decide what
the tool should be, then what features it has to have, then which of
the public domain archivers will be easiest to modify to do that?

To start it off, I think that what we should really be looking for is
an archive format for moving source through various mailers, and the
archive should be suitable for any system with a Unix-like file
structure, not just OS-9. The discussion I've seen tends to suggest
that this is what people are really looking for, but the Subject line
(which I've changed) didn't suggest that.

I feel that the minimum set of features should be:

1) PD versions available for most major OS's, both for micros and
   Internet hosts.
2) The headers contain only characters that exist in both ASCII and
   EBCDIC, and no tabs.
3) A checksum of some kind is included on each file.
4) The format includes provisions for creating directories.
5) It shouldn't choke on binary data in the archive.

Some discussion:

1) Obviously, to get as wide a distribution as possible. Those on
   non-Unix Internet hosts will probably need someone to write a
   version for their host.

2) Since people on BITNET are interested in sources, we shouldn't make
   the headers incompatible with their mailers. This means no
   non-EBCDIC characters. Also, BITNET (for some reason) eats tabs, so
   we shouldn't put those in the header either. Of course, most source
   will have them, but why make things more difficult than we have to?

3) Of course; required for sending data through the mail.

4) This is harder, as some of the hosts don't have the same directory
   format as Unix/OS-9/AmigaDOS. The archive format should specify
   what directories look like (probably Unix), and those implementing
   versions for other systems can decide how to handle things. For
   instance, what does OS-9 do with an archive that has both README
   and Readme in it?

5) Obviously, otherwise we'll have a different format for local use.

Most of this is obvious and straightforward; I just thought that it
ought to get said before a decision is made.

Also, will net.micro readers please excuse the cross-posting. I posted
to the original discussion groups, and to net.micro as I thought it
was important enough to need to be seen by that group. Followups have
been pointed to net.micro ONLY.

	Thanx for the time,
	<mike
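
To make features 2 through 5 concrete, here is a rough sketch in
Python of what such a format might look like. Everything in it is an
assumption made up for illustration, not par, ar, tar, or any existing
archiver: the ARCH1 marker, the field order, the use of CRC-32 as the
checksum, and the function names write_member and read_members are all
hypothetical. "Printable ASCII, no tabs" stands in for the stricter
"characters common to ASCII and EBCDIC" rule, which a real format
would have to spell out.

```python
import io
import zlib

MAGIC = "ARCH1"  # hypothetical marker, invented for this sketch


def write_member(out, path, data=None):
    """Append one member to `out`; data=None records a directory."""
    kind = "D" if data is None else "F"
    body = b"" if data is None else data
    crc = zlib.crc32(body) & 0xFFFFFFFF  # per-member checksum (assumed CRC-32)
    # Space-separated text header; the path goes last so it may contain spaces.
    header = "%s %s %d %08x %s\n" % (MAGIC, kind, len(body), crc, path)
    # Feature 2: header is printable ASCII only, which also excludes tabs.
    if not all(32 <= ord(c) < 127 for c in header[:-1]):
        raise ValueError("header contains a tab or non-printable character")
    out.write(header.encode("ascii"))
    out.write(body)  # feature 5: the body is stored as raw bytes


def read_members(inp):
    """Yield (kind, path, body) tuples, verifying each checksum."""
    while True:
        line = inp.readline()
        if not line:
            return
        fields = line.decode("ascii").rstrip("\n").split(" ", 4)
        magic, kind, size, crc, path = fields
        if magic != MAGIC:
            raise ValueError("bad header: %r" % line)
        body = inp.read(int(size))  # read by length, so binary data is safe
        if len(body) != int(size) or zlib.crc32(body) & 0xFFFFFFFF != int(crc, 16):
            raise ValueError("checksum mismatch for " + path)
        yield kind, path, body


if __name__ == "__main__":
    buf = io.BytesIO()
    write_member(buf, "src")  # feature 4: directory entry, Unix-style path
    write_member(buf, "src/main.c", b"int main() { return 0; }\n")
    write_member(buf, "src/blob.bin", bytes(range(256)))
    buf.seek(0)
    for kind, path, body in read_members(buf):
        print(kind, path, len(body))
```

The reader takes each body by its declared length rather than scanning
for a terminator, which is what keeps binary members from confusing it.
The sketch only covers local use: raw binary bodies would still need
some text encoding before going through a mailer, and how a host with
different file naming rules maps the Unix-style paths is left to the
implementation on that host.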