john@chance.UUCP (John R. MacMillan) (10/01/89)
In order to make it easier for unshar programs to work without using /bin/sh, perhaps we should agree (hah!) upon some keyword directives that shar programs would include as comments. E.g.:

# FILE filename
# NOEXIST
# DATA prefix end_delimiter
# SIZE [l LINES] [w WORDS] [c CHARS]
# CONTINUE filename
# EXIST
# DECODE how
# SUBDIR directory

This is just a first shot, but you get the idea. The unpacker could be as paranoid as it wants. If the shar wants to get trickier than the keywords allow, it could, and the unpacker would just not do the tricky parts (perhaps a SKIP or WARN directive). But notice that the above could handle a uuencoded compressed file split across two parts that ends up in a subdirectory.

The tough part would be getting people to make their shar programs generate it.
--
John R. MacMillan          "Don't you miss it...don't you miss it...
john@chance.UUCP            Some of you people just about missed it."
...!utcsri!hcr!chance!john  -- Talking Heads
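A minimal sketch of how an unpacker might pick such directives out of a shar without running /bin/sh. It assumes, which the proposal does not spell out, that each directive sits on its own comment line in the form `# KEYWORD args...` and that the keyword set is exactly the eight listed above. The name `parse_directives` is hypothetical.

```python
import re

# Keywords taken from the proposal above; anything else is treated as an
# ordinary shell comment and ignored.
KEYWORDS = {"FILE", "NOEXIST", "DATA", "SIZE", "CONTINUE",
            "EXIST", "DECODE", "SUBDIR"}

# Assumed layout: '#', whitespace, an upper-case keyword, optional arguments.
DIRECTIVE_RE = re.compile(r"^#\s+([A-Z]+)(?:\s+(.*))?$")


def parse_directives(lines):
    """Return a list of (keyword, [args]) for every recognised directive line."""
    found = []
    for line in lines:
        m = DIRECTIVE_RE.match(line.rstrip("\n"))
        if m and m.group(1) in KEYWORDS:
            args = m.group(2).split() if m.group(2) else []
            found.append((m.group(1), args))
    return found
```

Lines that are not recognised directives (the `#!` line, free-text comments, shell commands) are simply skipped, which matches the idea that the unpacker "would just not do the tricky parts".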
djm@wam.UMD.EDU (10/02/89)
In article <1989Sep30.171114.12550@chance.UUCP> john@chance.UUCP (John R. MacMillan) writes:
>In order to make it easier for unshar programs to work without
>using /bin/sh, perhaps we should agree (hah!) upon some keyword
>directives that shar programs would include as comments. Eg.

This suggestion seems to be moving in the direction of making archives that plain old /bin/sh can't unpack at all. Perhaps it's not a bad idea. An easier to parse, more standardized pure-ASCII archiving format than a shell archive would certainly be more appropriate for Amiga, MS-DOS, VMS, etc. postings, and would allow the packing and unpacking programs more versatility, security and control on Unix systems as well.

Right now there is a profusion of shar programs that generate all kinds of codes to split up the included files, using sed, cat, wc, etc. and starting some or all lines with 'X' or '|' or '\tx' or who knows what else; secure unshar programs written in C have to simulate that, and as is becoming clear in this discussion, interpreting all of those formats requires implementing a substantial subset of the /bin/sh syntax -- a task much more difficult than unpacking an ASCII archive ought to require.

Of course, only one unshar program really need exist, as long as it is comprehensive and portable. Perhaps Rich Salz's new release will satisfy everyone.

I would like to see a replacement for shell archives that would have a simple to parse format similar to the one John suggested.
It would have a header section for the whole archive, giving information like:

# PARTS      total number of parts
# PART       number of this part
# CREATED    date of creation (ctime format would do, I guess)
# CONTAINS   names of the files it contains

There would be another header section for each file extracted, with information like:

# FILE       file name
or
# DIRECTORY  directory name
# OFFSET     starting offset of this part, to allow continuation of long files
# BYTES      file length
# CHECKSUM   checksum for original file
# MODIFIED   last modification date of file
# ENCODING   encoding method: ASCII, atob, others?

Comments could have the same format as shell comments. The ASCII encoding could be accomplished by adding an extra '#' at the start of all lines in enclosed files that start with a '#', and then changing an initial '##' back to '#' when unpacking.

I haven't decided whether this format should require the presence of external programs to do part of the work, like atob, compress, and sum.

>The tough part would be getting people to make their shar
>programs generate it.

I think the harder part would be getting the programs that generate the suggested format into the hands of everyone who wants to distribute source code, and the programs that decode it into the hands of everyone who wants to use programs distributed in that format. In addition to tar, cpio, uu*code, [ab]to[ba], compress, arc, zip, and perhaps unshar, people would need to have another archiver/unarchiver. Tower of Babel!
--
David J. MacKenzie <djm@wam.umd.edu>
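The '#' escaping rule described above can be sketched in a few lines. This is only an illustration of the rule as stated in the post (prefix an extra '#' when packing, turn an initial '##' back into '#' when unpacking); the function names `encode_lines` and `decode_lines` are hypothetical, and the sketch works on lines without their trailing newlines.

```python
def encode_lines(lines):
    """Prefix an extra '#' to every line that already starts with '#'."""
    return ["#" + ln if ln.startswith("#") else ln for ln in lines]


def decode_lines(lines):
    """Turn an initial '##' back into '#'; other lines pass through unchanged."""
    return [ln[1:] if ln.startswith("##") else ln for ln in lines]
```

After encoding, no line of file data starts with a single '#', so any line that does can only be a header or comment, and a parser can tell the two apart by looking at the first two characters.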
ok@cs.mu.oz.au (Richard O'Keefe) (10/02/89)
In article <8910020054.AA08811@cscwam.UMD.EDU>, djm@wam.UMD.EDU writes:
> This suggestion seems to be moving in the direction of making archives
> that plain old /bin/sh can't unpack at all. Perhaps it's not a bad
> idea. An easier to parse, more standardized pure-ASCII archiving
> format than a shell archive would certainly be more appropriate for
> Amiga, MS-DOS, VMS, etc. postings, and would allow the packing and
> unpacking programs more versatility, security and control on Unix
> systems as well.

Let's not forget why we use sharchives in the first place. The point was to have a format for distributing sources which could be used by people who HAVEN'T got any specialised "unshar". If I am away from the net for a couple of months and find when I get back that all the sources are in some new format that I can't process, I am not going to be very happy. And saying that something is held on an archive somewhere is not very helpful either; lots of people have no FTP access.

It would be ok to go over to a new format IF each of the source groups that used it posted a fresh copy of the decoding program every month, along with the index for the previous month.

MS-DOS people can get a shell for a small sum. VMS people can get DEC/Shell; and if they haven't got it, you should remember that a posting in C is useless to many VMS sites anyway.

On the other hand, if you're interested in "more standardised" stuff, don't forget that ASCII is (a) a *national* standard, not an international one, and (b) superseded by the ISO 8859 family, and (c) a pain for BITNET mail links. Your new format should let an MS-DOS-using donor mail text containing e-acute and other such characters to a MAC-using recipient with no harm resulting from an intermediate passage through EBCDIC. Get _that_ right first, and then worry about shar.
lhf@aries5.uucp (Luiz H de Figueiredo) (10/02/89)
How about starting with arc from Software Tools? I think there are a number of implementations already available. It might need some modification to allow for automatic decompression, but should serve fine as a start.
-------------------------------------------------------------------------------
Luiz Henrique de Figueiredo        internet: lhf@aries5.uwaterloo.ca
Computer Systems Group             bitnet:   lhf@watcsg.bitnet
University of Waterloo
-------------------------------------------------------------------------------
karl@ficc.uu.net (Karl Lehenbauer) (10/02/89)
I think a portable archiver that writes headers in character format is the way to go as a long-term replacement for shar, something along the lines of "cpio -oc".

It is still very desirable to be able to examine the contents of an archive without having to uudecode, unzoo, etc. This argues for a textual archive, continuing current practice.
--
-- uunet!ficc!karl   "The last thing one knows in constructing a work
                      is what to put first." -- Pascal
dhesi@sun505.UUCP (Rahul Dhesi) (10/03/89)
The development of my "rap" archiver was suspended due to my move. When I get a chance I will complete it.

The idea is to create an archive that is extractable by feeding it to /bin/sh, but which has a rigid enough format that a relatively simple program written in C can also extract it, includes 32-bit CRCs for error-checking, can be split into pieces at arbitrary points and later concatenated without worrying about message headers, and escapes tabs and characters so they survive IBM mainframe-based networks.

Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP: oliveb!cirrusl!dhesi
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/03/89)
In article <923@cirrusl.UUCP>, dhesi@sun505.UUCP (Rahul Dhesi) writes:
|
| The development of my "rap" archiver was suspended due to my move.
| When I get a chance I will complete it.

Since Rahul has a good reputation for being able to write portable software, I think it would be good to wait and see this, as opposed to having a number of standards.

Rahul: if you want a copy of the latest shar2 for inspiration I'll mail it to you. Since Rich has expressed dislike for it I suspect that the source I submitted in March is not going to be posted, along with all the subsequent stuff I packaged with it. You might be able to use some of the file splitting or binary file code as a model.
--
bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called 'reason' in the face of the church and common sense. Any fool can see that the world is flat!" - anon
jm36+@andrew.cmu.edu (John Gardiner Myers) (10/04/89)
I came up with an archive format with a rigid format which could be extracted by a relatively simple but secure unpacker and could also be fed to /bin/sh. The trick was to include a small C program in the archive to allow those who hadn't obtained the unpacking program to extract the files. A sample archive follows:

#! /bin/sh
# This is a mail archive. To unpack it, use the 'unmar' program from
# comp.sources.unix. Alternatively, you can remove anything before this
# line, then unpack it by saving it into a file and typing "sh file".
# Contents: TEST
# Wrapped by jm36@beak.andrew.cmu.edu on Tue Oct 3 16:32:24 1989
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
cat >,sunmar.c <<'EOF'
#include <stdio.h>
char *mystrchr(p,c) char *p; int c;
{ while (*p && *p != c) p++; return *p ? p : 0;}
main()
{
    char *p, buf[4096];
    FILE *ofile = NULL;
    while (fgets(buf, sizeof(buf), stdin)) {
        if (ofile) {
            if (!strcmp(buf, "END\n")) {
                fclose(ofile);
                ofile = NULL;
            }
            else fputs(buf+(buf[0]=='X'), ofile);
        }
        else {
            if (!strncmp(buf, "BEGIN ", 6)) {
                if (p = mystrchr(buf+6, ' ')) *p = '\0';
                if (!(ofile = fopen(buf+6, "w"))) {
                    perror(buf+6);
                }
                else printf("Extracting file %s\n", buf+6);
            }
            else if (!strncmp(buf, "DIRECTORY ", 10)) {
                if (p = mystrchr(buf+10, ' ')) *p = '\0';
                strncpy(buf+4, "mkdir", 5);
                system(buf+4);
            }}}}
EOF
cc -o ,sunmar ,sunmar.c
./,sunmar <<'END_OF_ARCHIVE'
BEGIN TEST - 16
XThis is a test.
END
END_ARCHIVE
END_OF_ARCHIVE
rm -f ,sunmar ,sunmar.c
exit 0

I haven't released this format because it would give any reasonable implementation of "unshar" a severe case of indigestion. The format would only be worth releasing if it had a decent chance of becoming more common than the shar format. It would only become the standard if the moderators of the sources groups adopted it. The moderators will only adopt it if it becomes the de-facto standard. Catch-22.
--
_.John G. Myers   Internet: John.G.Myers@andrew.cmu.edu
(412) 268-2984    LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up
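For readers who want to see the data part of the sample without compiling anything, here is a rough Python sketch of the same line protocol. It relies only on what the sample shows: a `BEGIN name ...` line opens a file, body lines lose one leading 'X', and a bare `END` line closes the file. The post does not say what the trailing fields on the BEGIN line (`- 16`) mean, so the sketch ignores them. Unlike the C helper, it returns the contents in memory instead of writing files or running mkdir; `unpack_mar` is a hypothetical name, not the author's unmar program.

```python
def unpack_mar(text):
    """Parse archive text; return (files, directories) without touching disk."""
    files = {}
    directories = []
    name = None          # name of the file currently being collected
    body = []
    for line in text.splitlines(keepends=True):
        if name is not None:
            if line == "END\n":
                files[name] = "".join(body)
                name, body = None, []
            else:
                # Strip the single leading 'X' guard, as the C helper does.
                body.append(line[1:] if line.startswith("X") else line)
        elif line.startswith("BEGIN "):
            # The name runs up to the next space; later fields are ignored.
            name = line[6:].split(" ", 1)[0].rstrip("\n")
        elif line.startswith("DIRECTORY "):
            directories.append(line[10:].split(" ", 1)[0].rstrip("\n"))
    return files, directories
```

Because every body line carries the 'X' guard, file data that itself contains a line reading `END` arrives as `XEND` and is not mistaken for the terminator. Everything outside a BEGIN/END pair, including the shell preamble, is skipped.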
john@chance.UUCP (John R. MacMillan) (10/04/89)
In article <2270@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes:
|In article <8910020054.AA08811@cscwam.UMD.EDU>, djm@wam.UMD.EDU writes:
|> This suggestion seems to be moving in the direction of making archives
|> that plain old /bin/sh can't unpack at all. Perhaps it's not a bad
|> idea.
|
|Let's not forget why we use sharchives in the first place.
|The point was to have a format for distributing sources which could
|be used by people who HAVEN'T got any specialised "unshar".

That's why I suggested what I did; it still works for everyone who's happy with shar format, and it makes it easier on people without /bin/sh or who don't trust running /bin/sh on someone else's shars. (I'm neither, by the way).
--
John R. MacMillan          "Don't you miss it...don't you miss it...
john@chance.UUCP            Some of you people just about missed it."
...!utcsri!hcr!chance!john  -- Talking Heads
allbery@NCoast.ORG (Brandon S. Allbery) (10/04/89)
As quoted from <2270@munnari.oz.au> by ok@cs.mu.oz.au (Richard O'Keefe):
+---------------
| In article <8910020054.AA08811@cscwam.UMD.EDU>, djm@wam.UMD.EDU writes:
| > This suggestion seems to be moving in the direction of making archives
| > that plain old /bin/sh can't unpack at all. Perhaps it's not a bad
| > idea. An easier to parse, more standardized pure-ASCII archiving
|
| On the other hand, if you're interested in "more standardised" stuff,
| don't forget that ASCII is (a) a *national* standard, not an international
| one, and (b) superceded by the ISO 8859 family, and (c) a pain for BITNET
| mail links. Your new format should let an MS-DOS-using donor mail text
| containing e-acute and other such characters to a MAC-using recipient
| with no harm resulting from an intermediate passage through EBCDIC. Get
| _that_ right first, and then worry about shar.
+---------------

This sounds a lot like Brad's ABE, which is already in the c.s.misc archives; and I think it can be configured to produce archives with a small dearchiver prepended.

Of course, your proposed PC-to-Mac transfer will still have the small problem that Apple and IBM disagree on where to put e-acute....

++Brandon
--
Brandon S. Allbery, moderator of comp.sources.misc   allbery@NCoast.ORG
uunet!hal.cwru.edu!ncoast!allbery   ncoast!allbery@hal.cwru.edu
bsa@telotech.uucp, 161-7070 BALLBERY (MCI), ALLBERY (Delphi), B.ALLBERY (GEnie)
Is that enough addresses for you? no? then: allbery@uunet.UU.NET (c.s.misc)
peter@ficc.uu.net (Peter da Silva) (10/04/89)
The software tools archiver has apparently been suggested before. It can be found in Volume 4 of comp.sources.unix. -- Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation. Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-' ``I feel that any [environment] with users in it is "adverse".'' 'U` -- Eric Peterson <lcc.eric@seas.ucla.edu>
jm36+@andrew.cmu.edu (John Gardiner Myers) (10/05/89)
tron!moran@umbc3 (Harvey R Moran) writes:
> You have a decent idea, but your implementation leaves something to
> be desired.
[...]
> Your program assumes a working sh to prime it. It also does a
> compile, one of the things which would raise my paranoia level. Worse
> yet, it deletes the thing that was compiled so I "can't" see what was
> done.

You miss the point. Anyone with half an interest in security would use the "unmar" program which I would have published in comp.sources.unix. This program would ignore everything before the first "BEGIN", could only create files and directories, would not allow absolute pathnames and "..", would handle the "Part M of N" foolishness, etc. I believe the version I have is portable to non-unix systems, but I haven't actually gone through the trouble of beta-testing it.

The short C program in the archive is only for people who don't want to hunt down the "unmar" program. In that case, the format is no less secure than the shar format. People on systems where the compiler is not invokable as "cc" can simply cut out the small program, compile it themselves, and feed it the archive.
--
_.John G. Myers   Internet: John.G.Myers@andrew.cmu.edu
(412) 268-2984    LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up
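The pathname rules mentioned above (no absolute pathnames, no "..") could be checked along these lines. This is a sketch only: the unmar source is not shown in the thread, `is_safe_member` is a hypothetical name, and the check assumes Unix-style '/' separators.

```python
import posixpath


def is_safe_member(name):
    """Accept only non-empty relative paths with no '..' component."""
    if not name or posixpath.isabs(name):
        return False
    return ".." not in name.split("/")
```

An unpacker would call this on every file and directory name before creating anything, and skip or abort on a name that fails. Comparing whole path components rather than searching for the substring ".." keeps harmless names such as `a..b` usable.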
jmm@eci386.uucp (John Macdonald) (10/06/89)
In article <1989Oct3.225620.17825@chance.UUCP> john@chance.UUCP (John R. MacMillan) writes:
>That's why I suggested what I did; it still works for everyone who's
>happy with shar format, and it makes it easier on people without
>/bin/sh or who don't trust running /bin/sh on someone elses shars.
>(I'm neither, by the way).

I'm also neither - if I'm going to compile and run somebody else's C program, the danger in also running their shar program to unpack it seems minimal. The same nasty trojan effects can be put in either place by a dastardly villain, so closing the "sh" door does not do much to improve safety.

Of course, I'm sufficiently rarely able to spend enough time on net activities to both get a new set of source from the net and unpack it and try to run it all in the same session. Thus, I have the benefit of expecting that by the time I *do* get around to trying something out the lack of flames on the net implies a lack of trojans in the source. (Thank you to the brave pioneers who offer their file systems up in sacrifice to the Trojan demons. May your offerings never be accepted.)
--
"Software and cathedrals are much the same -        | John Macdonald
 first we build them, then we pray" (Sam Redwine)   | jmm@eci386