heiby@cuae2.UUCP (Ron Heiby) (08/26/86)
This article is in response to an article in net.sources which is just one example of a misguided trend. That is, the use of compression and uuencoding by posters of "large" text information. I'm sure that the posters who do this have the best of intentions, and I don't blame them for not understanding how the net works.
A recent posting of the adventure source was compressed on the poster's machine, then uuencoded into ASCII. This actually reduced the number of characters being posted by a fair amount. I presume that the same is the case with the above referenced article.
There are two (at least) factors working against doing this. The first is that many sites don't have the uudecode program or the compress program, so can't read the posting at all. This means that they are paying to transmit "less" (see below) bytes, but none of them are usable by that site.
The other factor against the scheme is that it doesn't actually save any money. I took the bodies of the adventure source articles and put them together in a directory. I recorded the total size of the files, then ran them through uudecode (which I had a heck of a time getting, see above), and then through uncompress. I noted that the files were now a fair amount larger. (Looks pretty good, huh?)
The rub is that many/most news feeds that are concerned with phone costs or dialer time are already using compress on the batches of news being sent, so what really needs to be compared is the size of the original text (compressed) and the compressed uuencoded files (compressed).
When I checked the adventure source, the original files, when compressed individually, totalled just under 300 blocks of disk space. The files as actually posted (original | compress | uuencode), when compressed individually, totalled about 375 blocks, an *increase* of 25% over the very links that are concerned enough about costs to use compress on their news links.
The "bottom line" is, "Post CLEAR text. Let the transmission mechanism worry about compression."
Thanks.
--
Ron Heiby {NAC|ihnp4}!cuae2!heiby Moderator: mod.newprod & mod.os.unix
AT&T-IS, /app/eng, Lisle, IL (312) 810-6109
"'Cause there's lots of things in this world that need to BE turned around."
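The comparison described in the post above can be tried on any machine with a rough sketch like the following. It is only an illustration under stated assumptions: zlib (DEFLATE) stands in for the LZW-based compress program, the sample text is synthetic, and the names `uuencode_body`, `sample_text` and `compare_sizes` are made up for this example. The exact numbers will differ from the block counts quoted in the post; only the direction of the effect is of interest.

```python
import binascii
import random
import zlib


def uuencode_body(data: bytes) -> bytes:
    """uuencode data 45 bytes per line (the most b2a_uu accepts per call)."""
    lines = [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]
    return b"".join(lines)


def sample_text(words: int = 20000, seed: int = 0) -> bytes:
    """Deterministic, synthetic stand-in for a large text posting."""
    rng = random.Random(seed)
    vocab = [f"word{n}" for n in range(300)]  # hypothetical vocabulary
    return " ".join(rng.choice(vocab) for _ in range(words)).encode("ascii")


def compare_sizes(text: bytes) -> dict:
    """Sizes of a posting before and after the news feed compresses it."""
    # What the poster did: compress, then uuencode into ASCII.
    posted = uuencode_body(zlib.compress(text))
    return {
        "clear": len(text),                           # clear text as posted
        "posted": len(posted),                        # compressed + uuencoded
        "clear_batched": len(zlib.compress(text)),    # clear text on a compressing feed
        "posted_batched": len(zlib.compress(posted)), # encoded posting on the same feed
    }


if __name__ == "__main__":
    for name, size in compare_sizes(sample_text()).items():
        print(f"{name:15s} {size:8d} bytes")
```

The encoded posting looks smaller than the clear text, but once both pass through a compressing feed the encoded one is the larger of the two, because already-compressed data has little redundancy left and uuencoding adds roughly a third to its size.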
campbell@maynard.UUCP (Larry Campbell) (08/28/86)
In article <2315@cuae2.UUCP> heiby@cuae2.UUCP (-Ron Heiby) writes:
>This article is in response to an article in net.sources which is just
>one example of a misguided trend. That is, the use of compression and
>uuencoding by posters of "large" text information. ...
> ... many sites don't
>have the uudecode program or the compress program, so can't read the
>posting at all. This means that they are paying to transmit "less" (see
>below) bytes, but none of them are usable by that site.
And some sites (like mine) are 16-bit machines that can't uncompress files compressed with "-b 13" or greater (the S2575 text was posted with "-b 14"). I would be most grateful if someone could mail me the clear text of S2575, or the text compressed with "-b 12" (AFTER first offering and getting a response from me, of course -- I don't want to get fifteen copies!).
--
Larry Campbell The Boston Software Works, Inc.
ARPA: campbell%maynard.uucp@harvard.ARPA 120 Fulton Street, Boston MA 02109
UUCP: {alliant,wjh12}!maynard!campbell (617) 367-6846
jose@utcs.uucp (08/28/86)
In article <2315@cuae2.UUCP> heiby@cuae2.UUCP (-Ron Heiby) writes:
>This article is in response to an article in net.sources which is just
>one example of a misguided trend. That is, the use of compression and
>uuencoding by posters of "large" text information. I'm sure that the
>posters who do this have the best of intentions, and I don't blame them
>for not understanding how the net works.
>
I second the motion! Just imagine that 5 people at 1 site decide that they want to look at these files, never mind trying to use them. Each and every user has to:
1) save the posted file somewhere (probably in a NEW directory)
2) uudecode/de-compress/whatever to get the REAL files
3) look at them and decide
While each person is doing this his process is using cpu cycles and LOTS of disk space. And all for nothing if the files are not appropriate!
Please folks: let's not waste our time trying to "improve" the system by out-thinking what some system administrator has spent hours working on! Send files as they are and let the software worry about it!
--
Jose A. Dias University of Toronto Computing Services
-------------------------------------------------------------------------------
The above ascii characters are not, have not ever been, or will ever be, the opinion of anybody, being, or super-intelligent shade of the colour blue. They were just a fluke. They were put together by randomly selecting phrases from Vogon poetry...
-------------------------------------------------------------------------------
uucp: {decvax,ihnp4,utcsri,{allegra,linus}!utzoo}!utcs!jose
bitnet: JOSE@UTORONTO
300/1200: (416)535-5360 (As the crow flies... :-)
bogstad@brl-smoke.ARPA (William Bogstad ) (08/28/86)
[I apologize in advance to the people on the ARPANET who will get this
in UNIX-SOURCES. Maybe there should be a UNIX-SOURCES-D mailing list?]
I decided to take a look at the effect of compress as suggested.
I used as a sample the full text of the Aug 12th Senate bill 2575.
(Note this was never posted to the net, but is a result of the two
separate postings made by myself and Glenn Tenney.) I used the
following notation for the file names:
Roots                           Suffix
=====                           ======
whole - the original text       .Z  - compressed (b16 - default on vax?)
part? - one of a # of parts     .uu - uuencoded
total - sum of all parts        (read from left to right with each
  (not always the same as whole) conversion being done in turn.)
[A list of files and their sizes is at the end of this posting.]
The first thing to note is that because the text as a whole is
>64K it should not be posted uncompressed as a single message. Too many
sites truncate messages at that limit to make this a viable option.
The figures also indicate that if you are going to post a
uuencoded compressed message it is better to compress the whole rather
than breaking it into parts. The figures to compare are for whole.Z.uu
and total.Z.uu.
You now have to compare a single compressed posting with two
uncompressed postings. For sites that do not use compression on their
newsfeeds the gain is large (total - whole.Z ~= 40K). Sites that
do compress their news feeds have a small loss (total.Z - whole.Z.uu.Z
~= 5.5K). Using the figures from mod.newslists of 800 baud throughput
and $.15 a minute cost this loss per such feed is ~= 18 cents.
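The phone-cost estimate can be reproduced with a short calculation. This is a sketch, not the poster's own working: it assumes "800 baud" means 800 bits per second on the line and that each byte costs 10 line bits (8 data bits plus start and stop bits), neither of which the post states. The function name `transfer_cost_dollars` is made up for the example; the byte counts come from the size list at the end of the post.

```python
def transfer_cost_dollars(extra_bytes: float,
                          line_bits_per_second: float = 800.0,
                          bits_per_byte: float = 10.0,      # assumption: async framing
                          dollars_per_minute: float = 0.15) -> float:
    """Phone cost of sending extra_bytes over a dial-up news feed."""
    seconds = extra_bytes * bits_per_byte / line_bits_per_second
    return seconds / 60.0 * dollars_per_minute


# Extra bytes a compressing feed carries for the uuencoded posting,
# taken from the size list: whole.Z.uu.Z minus total.Z.
loss_from_table = 35934 - 30647

if __name__ == "__main__":
    print(loss_from_table, "extra bytes")
    print(f"{100 * transfer_cost_dollars(loss_from_table):.1f} cents from the table")
    print(f"{100 * transfer_cost_dollars(5.5 * 1024):.1f} cents for a 5.5K loss")
```

Under these assumptions a 5.5K loss works out to a little under 18 cents, in line with the figure quoted; the difference taken straight from the size list is slightly smaller, so the cost is too.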
Thus far we have looked at the effects of compression on the cost of
transmission. In addition, we have assumed that any sites that "pay"
for their calls will use compression on their newsfeeds and that line
charges are the only cost. Let's add some "reasonable?" figures for
disk storage. I will use the following figures: $10,000 for 400M
drive, 3 year life span, 2 week expiration times. This translates to $1
for 40K for 3 years (no I didn't fudge the figures). 2 weeks is 1/78 of
that period which gives a cost differential (for EVERY site) of 1.2
cents. Many sites use longer expiration times on sources so the average
could easily be higher.
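The disk-cost figures can be checked the same way. The sketch below takes the stated numbers ($10,000 for a 400M drive, 3 year life, 2 week expiry) and assumes 400M means 400,000,000 bytes and a year of 52 weeks. The post does not say which size difference the 1.2 cents refers to, so the byte count is left as a parameter; plugging in total minus whole.Z.uu from the size list is this example's assumption, and `storage_cost_dollars` is a made-up name.

```python
def storage_cost_dollars(extra_bytes: float,
                         drive_dollars: float = 10_000.0,
                         drive_bytes: float = 400e6,    # assumption: 400M = 400,000,000 bytes
                         drive_life_weeks: float = 3 * 52,
                         expire_weeks: float = 2.0) -> float:
    """Share of the drive's price used by extra_bytes held until expiry."""
    dollars_per_byte_for_life = drive_dollars / drive_bytes
    return extra_bytes * dollars_per_byte_for_life * expire_weeks / drive_life_weeks


# Extra bytes on disk when the posting is stored as clear text
# instead of compressed and uuencoded: total minus whole.Z.uu.
extra_on_disk = 70490 - 39320

if __name__ == "__main__":
    print(f"40,000 bytes for 2 weeks: {100 * storage_cost_dollars(40_000):.2f} cents")
    print(f"{extra_on_disk} bytes for 2 weeks: {100 * storage_cost_dollars(extra_on_disk):.2f} cents")
```

With these inputs, 40,000 bytes kept for the drive's whole life cost $1, matching the "$1 for 40K for 3 years" step, and two weeks of that is 1/78 of a dollar, so the per-site difference comes out at roughly a cent.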
I'd like to come to a conclusion here, but I'm afraid I still
can't do so. I don't know what the ratio of long distance (LD) feeds to
sites is and I don't know what the ratio of compressed LD to non-compressed
LD sites is. In addition, there is the cost for the CPU time used for
the per site compression/uncompression and actual transmission of the
feed. Many sites, however, can "hide" these costs and only have to
justify their LD charges.
Perhaps the whole thing should hinge on the ease of use. Some
people apparently do not have access to compress and uuencode or have
machines that can not handle the larger -b values which is the default
with compress on many machines. For myself, I will probably post
straight text in the future in order to avoid having to mail copies to
people who can't use the original. I do think, however, that before a
net-wide rule (suggestion?) is made, these other factors should be
considered. If you do use compress be sure to use the -b 12 option so
anyone with compress can read it. It really doesn't save much
additional space to use the larger bit values.
70490 whole
28524 whole.Z
39320 whole.Z.uu
35934 whole.Z.uu.Z

38669 part1            70490 total
17013 part1.Z          30647 total.Z
23466 part1.Z.uu       42276 total.Z.uu
21748 part1.Z.uu.Z     39449 total.Z.uu.Z

31821 part2
13634 part2.Z
18810 part2.Z.uu
17701 part2.Z.uu.Z
Bill Bogstad
bogstad@hopkins-eecs-bravo.arpa
bogstad@brl-smoke.arpa
tenney@well.UUCP (Glenn S. Tenney) (08/29/86)
As the offending poster of S.2575 I want to say I'm sorry. I had never posted anything that large before and was concerned about sending out something that large. I had asked the question of a knowledgeable net user, but the response came back the day after my posting. Boy, I'll never do that again.
Now I'm faced with the 4 or 5 people that need the clear text of S2575. I'll wait a couple of days to see if there are any responses, but should I: (1) repost it clear; or (2) mail to those that need it clear?
With egg on my posting...
-- Glenn
woods@hao.UUCP (Greg Woods) (08/30/86)
Sites that are really concerned about cost will compress their news. Those that do not compress obviously have local feeds, money to burn, or more money for phone bills than spare CPU cycles. At any rate, from all this discussion, it seems as though uuencode/compress penalizes the very sites that have expressed enough concern about phone costs to do something about it. Please don't do it. --Greg
bobmon@iuvax.UUCP (Robert Montante) (08/31/86)
As one of those who has contributed in a small way to this "misguided trend", I want to offer a couple of excuses for my belief that user-compression was useful.
In my case I was thinking more of binary files than of text files, so that readability wasn't such an issue ('So what are they doing in NET.SOURCES?' you ask. Well, uh... oops?) Since I wasn't aware that many mailers DO compress things, I never took that conflict into consideration.
Again in my case, a much more significant point is that I want to retransmit many of the more interesting (read: LARGE) files to my home computer. I do this via kermit on a 7-bit phone line, and kermit in binary mode is a pig-dog. So I find it desirable to compress the file somehow for the same reason the mailers do, and then I want to uuencode it for kermit's sake. If the originator of the posting uploaded it from a personal computer, then it may have gone through compression/uuencoding in the first place. AND, since the popular compression programs for personal comp's aren't all compatible with compression programs on the mainframes, it would be a real problem to compress, upload, decompress, and post, followed by copy, compress, download, decompress at home.
The preceding paragraph is at best an argument for compression of files that are specific to personal computers. In general, I have to concede the argument that the inter-USENET mailing programs should manage compression/transmission (I said these were excuses, not defenses :-).
As a final follow-up to all that, I would like someone in the know to explain 'shar' postings to me. As far as I can see they merely repackage text files into a slightly different text format, and they seem to cost a percent or so in size. What advantage am I unaware of that makes them worth the effort?
Thanx to all for the education...
...Bob...
*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*-=-*
Opinion + Enforcement => Fact ; 'believe it or else!'
RAMontante Computer Science "Have you hugged ME today?" Indiana University
campbell@maynard.UUCP (Larry Campbell) (09/01/86)
In article <1280@iuvax.UUCP> bobmon@iuvax.UUCP (Robert Montante) writes:
>As a final follow-up to all that, I would like someone in the know to explain
>'shar' postings to me. As far as I can see they merely repackage text files
>into a slightly different text format, and they seem to cost a percent or so
>in size. What advantage am I unaware of that makes them worth the effort?
"shar" postings are shell scripts that can be fed directly to the shell. Thus, with one simple command, you can get your news-reading program to automatically split the posting up into files and directories. If it's just a single file, it's true that there's not much point (except uniformity). But when the posting consists of three or four directories containing fifteen or twenty files, it sure is nice to be able to say "s | sh" and have the files pop into existence. The alternative, picking the files apart by hand with an editor, would be a royal pain.
In addition, some flavors of shar postings contain consistency checks that help detect whether the files have been munged somehow in transit. Shar postings also can set permission bits so that, for example, shell scripts are made executable, saving you the trouble of doing it by hand.
And finally, most versions of shar prepend an 'X<tab>' to the start of each line. This helps prevent mungage by certain brain-damaged mail software that truncates messages at any lines containing only a single period. Remember that Usenet postings must often traverse some pretty convoluted and unreliable paths before they reach certain readers.
These are the reasons that shar postings are, and ought to be, the de facto standard for posting files in net.sources.
--
Larry Campbell The Boston Software Works, Inc.
ARPA: campbell%maynard.uucp@harvard.ARPA 120 Fulton Street, Boston MA 02109
UUCP: {alliant,wjh12}!maynard!campbell (617) 367-6846
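To make the description above concrete, here is a minimal sketch of what a shar generator does. It is not any particular shar program: `make_shar`, the `SHAR_EOF` marker, and the choice of a bare `X` prefix (rather than the 'X<tab>' mentioned above) are assumptions for illustration, and the sketch leaves out directories and consistency checks.

```python
import shlex


def make_shar(files: dict, executable: frozenset = frozenset()) -> str:
    """Build a shell script that recreates the given text files.

    files maps a file name to its text; names listed in executable
    get a chmod +x after they are unpacked.
    """
    out = ["#!/bin/sh", "# This is a shell archive; unpack it by feeding it to sh."]
    for name, body in files.items():
        quoted = shlex.quote(name)
        out.append(f"echo x - {quoted}")
        # sed strips the leading X from each line of the here-document.
        out.append(f"sed 's/^X//' > {quoted} << 'SHAR_EOF'")
        # With the prefix, a line holding only "." becomes "X.", so mail
        # software that stops at a lone period does not cut the file short.
        out.extend("X" + line for line in body.splitlines())
        out.append("SHAR_EOF")
        if name in executable:
            out.append(f"chmod +x {quoted}")
    out.append("exit 0")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    archive = make_shar(
        {"hello.sh": "echo hello\n.\n", "README": "Run hello.sh\n"},
        executable=frozenset({"hello.sh"}),
    )
    print(archive, end="")
```

Saving the printed text to a file and running it with `sh` in an empty directory should recreate both files, with `hello.sh` marked executable.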
WDMCU@CUNYVM.BITNET (09/11/86)
I would appreciate a plain-text posting of S.2575 or a copy sent to me as I am on a VM system and the very UNIX things you are talking about are unavailable. Thanks.
/*--------------------------------------------------------------------*/
/* Bill Michtom - work: (212) 903-3685 home: (718) 788-5946 */
/* */
/* WDMCU@CUNYVM (Bitnet) Timelessness is transient */
/* BILL@BITNIC (Bitnet) */
/* */
/* Never blame on malice that which can be adequately */
/* explained by stupidity. */
/* A conclusion is the place where you got tired of thinking. */
/*--------------------------------------------------------------------*/