barrett@Daisy.EE.UND.AC.ZA (Alan P. Barrett) (12/29/90)
In article <MEISSNER.90Dec28123513@curley.osf.org>, meissner@osf.org (Michael Meissner) writes:
> On the other hand, ever since I've switched to compress + uuencode +
> shar for shipping out large sets of patches, I've had fewer people
> complain about news/mail trashing the file or subsequent patches
> failing, since some 'helpful' intermediary decided to put a newline in
> column 79, or change tabs to spaces, or....

I sympathise, having been on the receiving end of a feed that changed tabs to spaces, or appended extra spaces to the ends of some lines, or inserted extra newlines, or broke messages into lots of little pieces without labelling the parts properly. (This is no reflection on the people who provided that feed -- they did their best.)

I think that the correct way to fix this is to use an encoding that is both readable and robust. A version of shar that does stuff like encoding tabs as \t and wrapping lines in a reversible way would do it. In fact, there was a lot of discussion on this topic here several months ago. Sorry, I don't remember details, but I thought that somebody was going to do some real work on coming up with a suitable standard?

--apb
Alan Barrett, Dept. of Electronic Eng., Univ. of Natal, Durban, South Africa
Internet: barrett@ee.und.ac.za (or %ee.und.ac.za@saqqara.cis.ohio-state.edu)
UUCP: m2xenix!quagga!undeed!barrett PSI-Mail: PSI%(6550)13601353::BARRETT
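As an illustration of the kind of encoding suggested in the post above (tabs written as \t, long lines wrapped reversibly), here is a minimal Python sketch. It is not any existing shar variant: the escape set (`\\`, `\t`, `\s` for a trailing space), the trailing-backslash continuation mark, the 72-column width, and the function names are all assumptions made for the example.

```python
# Hypothetical readable-and-reversible line encoding (illustrative only).
ESCAPES = {"\\": "\\\\", "\t": "\\t"}
UNESCAPES = {"\\": "\\", "t": "\t", "s": " "}


def encode_line(line, width=72):
    """Return the physical lines that carry one logical line."""
    # Each unit is either a plain character or a complete two-character escape,
    # so wrapping can never split an escape sequence in half.
    units = [ESCAPES.get(ch, ch) for ch in line]
    if units and units[-1] == " ":
        units[-1] = "\\s"  # protect a trailing space from being stripped
    out, cur = [], ""
    for unit in units:
        # Leave one column free for the continuation backslash.
        if cur and len(cur) + len(unit) > width - 1:
            out.append(cur + "\\")
            cur = ""
        cur += unit
    out.append(cur)
    return out


def decode_lines(physical):
    """Rebuild the logical lines from physical lines, ignoring appended blanks."""
    logical, cur = [], ""
    for raw in physical:
        text = raw.rstrip(" \t\r\n")  # whitespace a gateway may have appended
        i, continued = 0, False
        while i < len(text):
            ch = text[i]
            if ch != "\\":
                cur += ch
                i += 1
            elif i + 1 == len(text):
                continued = True  # lone backslash at end of line: keep joining
                i += 1
            else:
                cur += UNESCAPES[text[i + 1]]  # KeyError means damaged input
                i += 2
        if not continued:
            logical.append(cur)
            cur = ""
    return logical
```

The encoded text contains no tab characters and no significant trailing blanks, so tab expansion and appended spaces cannot alter it. Inserted blank lines and articles split into pieces are not handled by this sketch.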
allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (12/30/90)
As quoted from <1990Dec29.114801.5895@Daisy.EE.UND.AC.ZA> by barrett@Daisy.EE.UND.AC.ZA (Alan P. Barrett):
+---------------
| I think that the correct way to fix this is to use an encoding that is
| both readable and robust. A version of shar that does stuff like
| encoding tabs as \t and wrapping lines in a reversible way would do it.
| In fact, there was a lot of discussion on this topic here several months
| ago. Sorry, I don't remember details, but I thought that somebody was
| going to do some real work on coming up with a suitable standard?
+---------------

Brad Templeton's ABE is in the comp.sources.misc archives. It is a robust, readable encoding that includes line-by-line checksums and mapping of characters that don't survive EBCDIC translations (and others). I would personally consider starting from there, as Brad decided not to make it shareware/commercial in part as a possible solution to this issue.

++Brandon
--
Me: Brandon S. Allbery                    VHF/UHF: KB8JRR on 220, 2m, 440
Internet: allbery@NCoast.ORG              Packet: KB8JRR @ WA8BXN
America OnLine: KB8JRR                    AMPR: KB8JRR.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery  Delphi: ALLBERY
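To make "line-by-line checksums" concrete, here is a small Python sketch of the general idea. It is emphatically not ABE's actual format, which is not described in this thread: the two-hex-digit tag, the byte-sum modulo 256, and the function names are assumptions chosen only for illustration.

```python
# Hypothetical per-line checksum scheme (NOT the real ABE format).

def add_checksums(lines):
    """Append a two-hex-digit checksum (byte sum mod 256) to each line."""
    return [f"{line} {sum(line.encode('ascii')) % 256:02x}" for line in lines]


def damaged_lines(stamped):
    """Return the 1-based numbers of lines whose checksum does not match."""
    bad = []
    for number, text in enumerate(stamped, 1):
        # Blanks appended after the checksum tag are harmless, so drop them.
        body, _, tag = text.rstrip().rpartition(" ")
        try:
            expected = int(tag, 16)
        except ValueError:
            bad.append(number)  # tag missing or not hexadecimal
            continue
        # errors="replace" keeps the check alive if non-ASCII bytes crept in.
        if sum(body.encode("ascii", errors="replace")) % 256 != expected:
            bad.append(number)
    return bad
```

With a check like this, a receiver can report exactly which lines were altered in transit (for example, a tab expanded to spaces) instead of finding out when a later patch fails to apply. A simple byte sum will miss some changes, such as two characters swapped within a line.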
darcy@druid.uucp (D'Arcy J.M. Cain) (12/31/90)
In article <1990Dec29.114801.5895@Daisy.EE.UND.AC.ZA> Alan P. Barrett writes:
> [...]
>I think that the correct way to fix this is to use an encoding that is
>both readable and robust. A version of shar that does stuff like
>encoding tabs as \t and wrapping lines in a reversible way would do it.
>In fact, there was a lot of discussion on this topic here several months
>ago. Sorry, I don't remember details, but I thought that somebody was
>going to do some real work on coming up with a suitable standard?

I posted my genfiles program which I hoped would be a jumping off point for such an effort. Has anyone looked at it and have suggestions to enhance the protocols I suggested? Here is the Readme from the distribution:

-------------------------------------------------------------------------
This is my file generation utility. The genfiles program reads in a script from the standard input and creates files based on the contents. There is some parameter substitution as well. The mkscript program is an easy way of creating the scripts used by genfiles. See the source files for further details.

These programs are being offered as a possible solution to problems of transferring files between different networks without changing them. The utilities in this distribution were originally written for different purposes and have been hacked on in order to make a start on some sort of solution. Some of the issues addressed (and hopefully solved) are:

    The lines can be split if desired and restored on the receiving end.

    Many troublesome characters are translated to less troublesome ones
    and restored on the receiving end.

    The files are transmitted in a form that can be read without any
    further processing. This is ***NOT*** a uuencoding type program.

By modifying the code, systems that need some characters converted to trigraphs can do so by simply commenting out the case statement that converts the troublesome character(s).

What it doesn't have yet is multi-part support other than splitting up the resulting file and restoring it by hand. I will try to do something about this. It also doesn't unpack itself like shar files do but the program is fairly simple and can easily be written for systems that can't use this one for some reason.

In order to use the program that creates the script file you either have to pick up my getarg program which I recently posted or else hack the source to use normal getopt. If you can't get getarg from a local archive site you can get it from my machine's mail server. Send mail to unix-server@druid.UUCP with the following line in the body of the message:

    send getarg.c

Note that if the mail to the server gets too heavy I will have to shut it down for my neighbours' sake, so please use it as a last resort. This is just a lowly leaf node with a single 2400 baud modem. The program to unpack the files is self contained.

D'Arcy J.M. Cain
D'Arcy Cain Consulting
West Hill, Ontario
darcy@druid.UUCP
---------------------------------------------------------------------

--
D'Arcy J.M. Cain (darcy@druid) |
D'Arcy Cain Consulting         |   There's no government
West Hill, Ontario, Canada     |   like no government!
+1 416 281 6094                |
rhys@batserver.cs.uq.oz.au (Rhys Weatherley) (12/31/90)
In <1990Dec30.170302.21665@druid.uucp> darcy@druid.uucp (D'Arcy J.M. Cain) writes:
>In article <1990Dec29.114801.5895@Daisy.EE.UND.AC.ZA> Alan P. Barrett writes:
>> [...]
>>I think that the correct way to fix this is to use an encoding that is
>>both readable and robust. A version of shar that does stuff like
>>encoding tabs as \t and wrapping lines in a reversible way would do it.
>
>I posted my genfiles program which I hoped would be a jumping off point for
>such an effort. Has anyone looked at it and have suggestions to enhance
>the protocols I suggested?

I missed the original discussion, so I may be repeating things, but the central problem I think there will be in getting a new transmission standard off the ground is actually making it a standard :-). unshar, uuencode and the like are very widespread, and trying to shake their ground may be very hard. Maybe in the interim a cut-down "encoder" is needed that can be wrapped up in a shar archive, and will be unpacked, compiled and run to unpack the rest. E.g. the shar archive could look something like this:

    ... head information ...
    sed ... >/tmp/decode.c <<EOF
    ... source code for decode.c ...
    EOF
    cc -o /tmp/decode /tmp/decode.c
    sed ... | /tmp/decode >file <<EOF
    ... file contents ...
    EOF

It should be possible to get a very compact decoding program that could be wrapped up with the shell archives. It won't solve all the problems but may help, as well as being reasonably compatible with the existing shar archiving system. Well, those are my thoughts on the matter; what do you think?

Rhys.

P.S. D'Arcy, could you tell us where your program may be found, since I missed it first time around.

+===============================+==================================+
||  Rhys Weatherley             |  The University of Queensland,  ||
||  rhys@batserver.cs.uq.oz.au  |  Australia.  G'day!!            ||
+===============================+==================================+
terry@galaxia.newport.ri.us (01/01/91)
As long as the discussion about packing source code has been reopened, how about extending the discussion to include alternatives to shar?

In addition to USENET postings being routed to different networks with some character incompatibilities, there are also now a large number of non-unix machines connected to the network. While these machines do not have SH with which to unpack a shar posting, there are unshar programs that will unpack some of the shar postings; I have a couple. Note, I said some of the postings. The trouble is there is always a new version of shar being used which breaks the old unpackers; mine cannot unpack the latest distributions. Also, the source code for these unpackers is not widely distributed, which makes it difficult to change it or for newcomers to obtain one. Furthermore, there has been some concern expressed about the security of using shar.

Therefore, I am suggesting that there be some serious discussion about using a packing format, with distributed code as is done with uudecode, to replace shar.

Comments anyone?

raymond!terry@galaxia.newport.ri.us
{rayssd,xanth,lazlo,mirror,att}!galaxia!raymond!terry
xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (01/01/91)
rhys@batserver.cs.uq.oz.au writes:
> darcy@druid.uucp (D'Arcy J.M. Cain) writes:
>>In article <1990Dec29.114801.5895@Daisy.EE.UND.AC.ZA> Alan P. Barrett writes:
>>> [...]
>>>I think that the correct way to fix this is to use an encoding that is
>>>both readable and robust. A version of shar that does stuff like
>>>encoding tabs as \t and wrapping lines in a reversible way would do it.
>>I posted my genfiles program which I hoped would be a jumping off point for
>>such an effort. Has anyone looked at it and have suggestions to enhance
>>the protocols I suggested?
>I missed the original discussion, so I may be repeating things, but
>the central problem I think there will be in getting a new transmission
>standard off the ground is actually making it a standard :-). unshar,
>uuencode and the like are very widespread, and trying to shake their
>ground may be very hard. Maybe in the interim a cut-down "encoder"
>is needed that can be wrapped-up in a shar archive, and will be unpacked,
>compiled and run to unpack the rest. e.g. the shar archive could look
>something like this:
> ... head information ...
> sed ... >/tmp/decode.c <<EOF
> ... source code for decode.c ...
> EOF
> cc -o /tmp/decode /tmp/decode.c
> sed ... | /tmp/decode >file <<EOF
> ... file contents ...
> EOF
>It should be possible to get a very compact decoding program that could
>be wrapped up with the shell archives. Won't solve all the problems
>but may help, as well as its being reasonably compatible with the
>existing shar archiving system. Well, that's my thoughts on the matter,
>what do you think?

Problem is, lots of shars are unpacked on systems where the C compiler command isn't spelled "cc", and lots of shars don't contain C code and may be unpacked on systems where, e.g., Modula-2 is the only compilable language. In fact, I unpack lots of shars on my Amiga, where "sed" doesn't exist, and the "unshar" program fakes it by knowing the format of ordinary shar file "sed" commands and doing what's right.
Probably, despite the calls here for clear text, a much more robust way to transmit source files is the one used in, for example, comp.binaries.ibm.pc, where the expected resources at a site are "uudecode", which can be transmitted in clear text as a BASIC or C program, and some widely available archiving program; the one of choice now is zoo, but lharc is coming up fast due to a superior packing algorithm. Add to that the "brik" CRC check, the zoo internal CRC checks, and the short line, limited character set, uuencode format with line by line checksums, and you have an extremely robust encoding that can transit ASCII to EBCDIC to ASCII intact, and doesn't challenge developmentally disabled news software, which we will always have with us.

The major requirement for this method is that there needs to be a very explicit clear text explanation of the purpose and contents of the archive to let the reader make a decision whether it is worth unpacking. I'm not thrilled when I take the time to unpack and catenate and uudecode an archive with an interesting description from the PC-clone universe, to find out that it doesn't contain the source code I was seeking/expecting, in hopes of stealing some code and ideas for a port of the functionality. A minimal description should include source or not, data types, platforms, compiler technology required, functionality, and copyright status.

To another poster's comments that folks on EBCDIC systems have to solve their own character set and newline encoding problems: that misses the point. Lots of ASCII to ASCII routings these days arrive with a BITNET host as an intermediary, so even the ASCII destination sites have to be concerned about the problem of an encoding that can survive the transit.

I think the current pleas to keep the comp.sources.{unix,games,misc} and alt.sources postings all clear text, while understandable, are misdirected on today's net.
And, again to another posting, no, the world is not all becoming USENet, to live under our way of doing things, just because the nets are being gatewayed together and sharing code in a much larger universe. The greater net is a community of peer networks, each with its own peculiar needs and requirements, not a set of subordinates to the least organized and most contentious member of the set, USENet. Thus it behooves us to find methods that cause as few problems as possible in getting code across this wider universe of communication, and clear text transmission doesn't seem to be the appropriate technique anymore.

In my opinion, but I pack and unpack a _lot_ of source; .6 gigabytes compressed, at last count, not bad for a personal archive. That translates into several thousand archives of various sorts that I've unpacked.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>
tneff@bfmny0.BFM.COM (Tom Neff) (01/01/91)
In article <62-raymond-terry> raymond!terry@galaxia.newport.ri.us (Terry Raymond) writes:
>The trouble is there is always a new version of shar being used which breaks
>the old unpackers, mine cannot unpack the latest distributions.

Yes indeedy, because there is always someone out there to gild the lily. Witness the latest pointless "TOUCH=cannot" oddity.

What would be nice would be for someone to concoct a Perl or C program which accepts a signature prefix to look for (/^X/ for instance) and writes files from raw shar's, regardless of format.

>I am suggesting that there be some serious discussion about using a packing
>format, with distributed code as is done with uudecode, to replace shar.

There are ASCII archives out there. The problem is enforcing a standard.
emv@ox.com (Ed Vielmetti) (01/01/91)
In article <1990Dec31.232624.23510@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
> The major requirement for this method is that there needs to be a very
> explicit clear text explanation of the purpose and contents of the
> archive to let the reader make a decision whether it is worth unpacking.

The text explanation should be in a separate article from the globs of
binary encoded stuff. It should probably be cross-posted to a
relevant group, so that people who don't read the binary-encoded
sources group can be informed in a timely fashion of new stuff. In
addition it should have a very clear and precise description of where
a reasonably fresh version can be FTP'd from, if the code is available
in that way; that will facilitate reposting (just) the announcement
into comp.archives.
For that matter, a separate "this is what it is and where you can get
it" would be useful for any set of postings to alt.sources, sort of a
"part 0 of 15" which would be sent around more widely than the other
half a megabyte blortful.
--Ed
emv@ox.com
darcy@druid.uucp (D'Arcy J.M. Cain) (01/01/91)
In article <6540@uqcspe.cs.uq.oz.au> rhys@batserver.cs.uq.oz.au writes:
>In <1990Dec30.170302.21665@druid.uucp> darcy@druid.uucp (D'Arcy J.M. Cain) writes:
>>I posted my genfiles program which I hoped would be a jumping off point for
>>such an effort. Has anyone looked at it and have suggestions to enhance
>>the protocols I suggested?
>
>I missed the original discussion, so I may be repeating things, but
>the central problem I think there will be in getting a new transmission
>standard off the ground is actually making it a standard :-). unshar,

True.

>uuencode and the like are very widespread, and trying to shake their
>ground may be very hard. Maybe in the interim a cut-down "encoder"
>is needed that can be wrapped-up in a shar archive, and will be unpacked,
>compiled and run to unpack the rest. e.g. the shar archive could look

This is still not universal. It only works on Unix-like systems. A standard should operate under any OS. That is why my system was made simple, so that unpackers can be easily written for any platform. Also note that using shell to unpack can be a security hole.

[stuff deleted]
>
>P.S. D'Arcy, could you tell us where your program may be found, since
> I missed it first time around.

I posted to alt.sources so check with the local archive sites. I was going to post to comp.sources.misc once I got some feedback and fixed up any problems people had with it, but so far there have been very few suggestions for fixing it up. Naturally this means that the code is perfect and bug-free and no fixes are necessary. :-)

Actually I have been adding support for multiple input files, which was not present in the first version. As the shar is only 13634 bytes (13415 for a genfiles script) I could probably post another interim version, but I don't want to clutter up everyone's archives unnecessarily so I will probably wait till I have done a few more fixes and tested it. In particular the program to unpack is almost done but the program to create the scripts, while working, can use some more enhancements.

--
D'Arcy J.M. Cain (darcy@druid) |
D'Arcy Cain Consulting         |   There's no government
West Hill, Ontario, Canada     |   like no government!
+1 416 281 6094                |
rhys@batserver.cs.uq.oz.au (Rhys Weatherley) (01/01/91)
In <1990Dec31.232624.23510@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>Problem is, lots of shars are unpacked on systems where the C compiler
>command isn't spelled "cc", lots of shars don't contain C code and may
>be unpacked on systems where, e.g., Modula-2 is the only compilable
>language, in fact, I unpack lots of shars on my Amiga, where "sed"
>doesn't exist, and the "unshar" program fakes it by knowing the format
>of ordinary shar file "sed" commands and doing what's right.

So much for that idea :-)

Then again, maybe we just need better co-ordination between the source and binary groups and the FTP sites around the world, since they have the best transmission method: store it the way it was meant to be! How about the moderators refuse to submit a program unless it has first been submitted to one or more of the major FTP sites, and they are obliged to list the FTP sites where it can be found in the initial blurb about the program?

Rhys.

+===============================+==================================+
||  Rhys Weatherley             |  The University of Queensland,  ||
||  rhys@batserver.cs.uq.oz.au  |  Australia.  G'day!!            ||
+===============================+==================================+
barrett@Daisy.EE.UND.AC.ZA (Alan P. Barrett) (01/02/91)
In article <1990Dec31.232624.23510@zorch.SF-Bay.ORG>, xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
> Probably, despite the calls here for clear text, a much more robust
> way to transmit source files is [some uuencoded binary format].
>
> The major requirement for this method is that there needs to be a very
> explicit clear text explanation of the purpose and contents of the
> archive to let the reader make a decision whether it is worth
> unpacking.

The lack of such descriptions in the postings that we do get is a big argument in favour of readable source postings. If the posting is readable then we can at least look through it in an attempt to find out what it does, and whether it relies on system features we do not have. Even a short (say 30 line) description is frequently too little to allow a decision to be made.

--apb
Alan Barrett, Dept. of Electronic Eng., Univ. of Natal, Durban, South Africa
Internet: barrett@ee.und.ac.za (or %ee.und.ac.za@saqqara.cis.ohio-state.edu)
UUCP: m2xenix!quagga!undeed!barrett PSI-Mail: PSI%(6550)13601353::BARRETT
bengtl@maths.lth.se (Bengt Larsson) (01/05/91)
In article <75110375@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
>In article <62-raymond-terry> raymond!terry@galaxia.newport.ri.us (Terry Raymond) writes:
>>I am suggesting that there be some serious discussion about using a packing
>>format, with distributed code as is done with uudecode, to replace shar.
>
>There are ASCII archives out there. The problem is enforcing a
>standard.

How about an RFC for a basic "shar" format? After all, there are RFCs for mail digest formats and such. The RFC could define a pretty fixed format (easy to parse for foreign "unshars"), and it just happens to unpack itself using "sh".

I think the "standard shar" should contain the "sed" command with "X" starting lines, the "wc" check for (a little) security, and (maybe) the check for overwriting files. Plus "#" for comments, "echo" for messages, and "exit" to skip signatures at the end. That should do it, short and simple (and KISS!)

Admittedly, the "shar" is Unix-centric, but it would be standardizing existing practice (normally considered to be a Good Thing).

Comments?

Bengt L.

PS. I had a text archive format proposed last time this discussion was around. This would be more general than "shar". I suppose I could repost that, if there's any interest. DS.

--
Bengt Larsson - Dep. of Math. Statistics, Lund University, Sweden
Internet: bengtl@maths.lth.se SUNET: TYCHE::BENGT_L
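The fixed-format "standard shar" listed in the post above (sed with X-prefixed lines, a wc check, an overwrite check, "#" comments, echo messages, and a final exit) can be sketched as a small generator. This is only one possible reading of that list, written in Python for illustration: the function name, the SHAR_EOF marker, the message wording, and the restriction to ASCII text files with simple names (no quotes or spaces) are all assumptions, not part of any agreed format.

```python
# Hypothetical generator for a minimal, rigidly formatted shell archive.

def make_shar(files, marker="SHAR_EOF"):
    """Build a shell archive from {name: text}; assumes simple ASCII names."""
    out = ["#!/bin/sh", "# This is a shell archive; unpack it with sh."]
    for name, text in files.items():
        body = text.splitlines()
        # sed writes every line with a newline, so count one per line.
        nchars = sum(len(line) + 1 for line in body)
        out.append(f"if test -f '{name}'; then")
        out.append(f"  echo 'shar: will not overwrite {name}'")
        out.append("else")
        out.append(f"  echo 'x - {name}'")
        out.append(f"  sed 's/^X//' > '{name}' << '{marker}'")
        out.extend("X" + line for line in body)  # every body line starts with X
        out.append(marker)                       # terminator must be in column 1
        out.append(f"  if test {nchars} -ne `wc -c < '{name}'`; then")
        out.append(f"    echo 'shar: {name} damaged in transit'")
        out.append("  fi")
        out.append("fi")
    out.append("exit 0")  # anything after this line (e.g. a signature) is skipped
    return "\n".join(out) + "\n"
```

Because every file follows the same handful of line shapes, a non-Unix unpacker only has to recognise the `sed ... > name << marker` line, strip the leading X from the lines that follow, and stop at the marker.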