time@ox.com (Tim Endres) (05/30/90)
I am working on an implementation of News for the lowly Macintosh. I have
had many requests to support compressed/batched news. I have easily added
the support for batched news, but have found the support of compressed
news to be a dilemma.

Most people have indicated that "compress", the PD version, is what is
normally used for news compression. This would *seem* fine, but the darn
thing requires 500K RAM just to uncompress. This not only seems
extraordinary, but I can not see how implementations on a PC limited to
640K could even work.

QUESTIONS:
Is "compress" the common news compression algorithm?
Do all news feeds compress at 16 bits or 12 bits?
What are the implications of using compress in commercial sw?
Are there any other, more miserly, compress programs available?

Thanks, tim.
time@ox.com
jmv@sppy00.UUCP (J. Vickroy +1 614 764 4343) (05/30/90)
In article <1990May29.202056.26271@ox.com> time@ox.com (Tim Endres) writes:
=>
=> [ stuff deleted ]
=>
=>QUESTIONS:
=>Is "compress" the common news compression algorithm?
=>Do all news feeds compress at 16 bits or 12 bits?
=>What are the implications of using compress in commercial sw?
=>Are there any other, more miserly, compress programs available?
I've been told that the compression algorithm that sendbatch uses is
a "modified" L-Z. If this is true, then the PD compress should work.
This is the theory that I'm working from, at least. I'm in the
middle of a port to Think C 4.0.
jim
--
Jim Vickroy | Voice: +1 614 764 4343
Telecommunications Systems Engineering | Internet: jmv@rsch.oclc.org
Online Computer Library Center, Inc. | uucp: osu-cis!sppy00!jmv
6565 Frantz Road, Dublin Ohio, 43017-0702 | CompuServe: 73777,662
det@hawkmoon.MN.ORG (Derek E. Terveer) (05/31/90)
In article <1990May29.202056.26271@ox.com> time@ox.com (Tim Endres) writes:
>
> I am working on an implementation of News for the lowly Macintosh.
> I have had many requests to support compressed/batched news.
> I have easily added the support for batched news, but have found
> the support of compressed news to be a dilemma.
>
> Most people have indicated that "compress", the PD version, is what
> is normally used for news compression. This would *seem* fine, but
> the darn thing requires 500K RAM just to uncompress. This not only
> seems extraordinary, but I can not see how implementations on a PC
> limited to 640K could even work.

We ran into the same problem with PCnews -- compress is just too big to
run effectively in the (limited) pc environment. So, we found a pd program
called "u16" (i can send it to you if you like), which does nothing but
uncompress, is written in assembly and is optimized for space. We are
therefore able to support standard 16-bit and 12-bit newsfeeds on the ibm
pc with this program. It still wasn't easy wedging it in, but it worked!

Unfortunately, this program may not work on a mac (probably doesn't, since
it is written at least partly, i believe, in 8086 code). But the original
authors may have a 68k version as well, or perhaps you can port the
assembly version of it.

derek
--
Derek Terveer		det@hawkmoon.MN.ORG
SA44@LIVERPOOL.AC.UK (Kevin Maguire) (05/31/90)
In article <1990May29.202056.26271@ox.com>, time@ox.com (Tim Endres) says:
>I am working on an implementation of News for the lowly Macintosh.
>I have had many requests to support compressed/batched news.
>Most people have indicated that "compress", the PD version, is what
>is normally used for news compression. This would *seem* fine, but
>the darn thing requires 500K RAM just to uncompress. This not only
>seems extraordinary, but I can not see how implementations on a PC
>limited to 640K could even work.

I had a look at the compress.c (comp.c ??) file in C news and it does
indeed seem to be the standard UN*X compress(1) command incognito. It
depends which defines you choose when compiling this file whether you get
16-bit/12-bit/whatever compression.

Yes, 16 bit does need ~400K to compress in, but uncompressing even 16-bit
data needs much less, ~70K (?). So just get this file (compress.c) and
compile it in 16-bit mode. Once compiled, compress automatically
determines the compression mode a file was compressed in, so on your Mac
you can compress your batches in 13-bit mode and send them out, and
uncompress incoming 16-bit batches as well. The command-line option is
"compress -b13" for thirteen bits. This way you'll only need about 70K of
work space to send out 13-bit batches (which any compress should
automatically handle) and uncompress 16-bit batches.

Alternatively, ask your feed site to use 13-bit (or 12-bit) compression on
your batch (I don't think either C news or B news automatically allows
this, however :-()

Kevin Maguire
Nsfnet : sa44%liv.ac.uk@nsfnet-relay.ac.uk
Uucp   : ...!mcsun!ukc!liv-ib!sa44
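[The auto-detection Kevin describes works because every .Z stream carries
its own maxbits in the file header: the magic bytes 0x1F 0x9D are followed
by a flags byte whose low five bits are the bit width the compressor used.
A minimal sketch in C -- an illustration, not the compress(1) source, and
the function name is mine:]

```c
/* Sketch: read the maxbits a sender used from the 3-byte .Z header.
 * The format begins with magic bytes 0x1F 0x9D; the low 5 bits of the
 * third byte give maxbits (9..16), and bit 0x80 marks block mode. */

#define Z_MAGIC_1   0x1F
#define Z_MAGIC_2   0x9D
#define Z_BITS_MASK 0x1F   /* low 5 bits of the flags byte: maxbits */

/* Return maxbits (9..16) from a .Z header, or -1 if the stream is
 * not compress output. */
int z_maxbits(const unsigned char hdr[3])
{
    if (hdr[0] != Z_MAGIC_1 || hdr[1] != Z_MAGIC_2)
        return -1;
    return hdr[2] & Z_BITS_MASK;
}
```

[So a 13-bit batch going out and a 16-bit batch coming in each announce
their own width; the receiving uncompress only has to honor what the
header says, or refuse anything above what it was built for.]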
gz@cambridge.apple.com (Gail Zacharias) (06/01/90)
In article <1990May29.202056.26271@ox.com> time@ox.com (Tim Endres) writes:
>Most people have indicated that "compress", the PD version, is what
>is normally used for news compression. This would *seem* fine, but
>the darn thing requires 500K RAM just to uncompress. This not only
>seems extraordinary, but I can not see how implementations on a PC
>limited to 640K could even work.

I have a program (actually, a set of library routines) that will do
unix-compatible compression on the Mac. The worst case uncompression
(16 bit) requires 200K. I'd be happy to send it to anybody who asks.
It's about 20K of MPW assembler source, for functions meant to be called
from MPW C.
---
Home: gz@entity.com or ...mit-eddie!spt!gz
Work: gz@cambridge.apple.com
guy@auspex.auspex.com (Guy Harris) (06/01/90)
>I've been told that the compression algorithm that sendbatch uses is
>a "modified" L-Z.

Uhh, the compression algorithm that "sendbatch" uses *is* "compress":

	-c) COMP='| $LIB/compress $cflags'
	    ECHO='echo "#! cunbatch"'
	    continue;;

and "/usr/lib/news/compress" is simply a symlink to "/usr/ucb/compress"
on our system. There is a variant of "compress" supplied in source form
with the netnews source, but it's basically the same program as the one
that comes with, say, BSD.
bob@MorningStar.Com (Bob Sutterfield) (06/01/90)
In article <1990May29.202056.26271@ox.com> time@ox.com (Tim Endres) writes:
Is "compress" the common news compression algorithm?
Yes.
Do all news feeds compress at 16 bits or 12 bits?
Hard to say, but I don't know of any doing 12-bit compression. But
then, I don't run in those circles.
What are the implications of using compress in commercial sw?
Ask your lawyer. Net advice is worth exactly what you pay for it :-)
Are there any other, more miserly, compress programs available?
Some years ago, someone posted a micro-zcat to net.sources. It
performed the uncompress-in-a-pipe function in about two dozen lines
of C, really pretty elegantly. Of course, I can't dredge it up any
more. Perhaps your friendly neighborhood source archive site would
have a copy?
clewis@eci386.uucp (Chris Lewis) (06/01/90)
In article <861@sppy00.UUCP> jmv@sppy00.UUCP (J. Vickroy +1 614 764 4343) writes:
> In article <1990May29.202056.26271@ox.com> time@ox.com (Tim Endres) writes:
> =>QUESTIONS:
> =>Is "compress" the common news compression algorithm?

In B-news and C-news "compress" is what is invoked. compress source comes
with B-news, but apparently not C-news. I suspect TMN (aka news 3.0) uses
compress too.

> =>Do all news feeds compress at 16 bits or 12 bits?

B-news by default uses whatever the compress source is compiled for.
C-news by default uses 12 bits (because >12 bits isn't always possible on
things like 286 Xenix or other small address space machines like pdp11's)
regardless of how compress is compiled. 16 versus 12 bit usually only
makes a difference of a few percent in size, but oodles in time (and
memory space). B-news sites (particularly those with scant CPU cycles and
memory) would be well advised to use -b12 anyway for outgoing batches (and
possibly recompile for 12 bit, provided that your feeders know about it).

Build your Mac thingie with compress *compiled* for 12-bit compression,
and include in the software notes that anybody feeding you has to set
their compression to 12. Which means C-news feeders probably don't have to
do anything, and B-news feeders have to stuff the "-b12" option into
sendbatch's compress invocation (sendbatch can be parameterized for this,
I believe).

Compress compiled for 12-bit compression has a data area of something like
32K, compared to 400K+ when compiled for 16 bits. There's absolutely *no*
problem with feeding 12-bit compressed data to a 16-bit compress program;
it figures it out itself (eg: a 12 bit feeding a 16 bit). The problem
arises when a 16-bit compress program generates 16-bit output and tries to
run it through a compress compiled for 12 bits (eg: a 16 bit feeding a 12
bit). Just make sure that your compress source is at least version 4.0.

> =>What are the implications of using compress in commercial sw?

The source appears PD. Don't charge money for compress itself - the
authors will get pissed.... Nor pretend you wrote it. A mention of the
authors in your documentation would be nice... Given the source, you
should be able to contact the original authors to make sure.

> =>Are there any other, more miserly, compress programs available?

None that would be compatible with the majority of existing news sites.
Please don't introduce another batch protocol! There appear to be about a
dozen "supported" in B-news and C-news land, plus untold numbers in actual
practice. (Though *all* of the compressed-batch protocols that are
directly supported by the batchers use "compress" somewhere.)

The "standard" compressed batch format is a normal uncompressed news
batch, run through compress, and occasionally (unmodified B-news)
prepended with "#! cunbatch" or "#! rnews". C-news doesn't do the
prepends, but will *accept* either prefix, or none, automatically (which
is what I suggest yours do too).

> I've been told that the compression algorithm that sendbatch uses is
> a "modified" L-Z. If this is true, then the PD compress should work.

In true USENET tradition, the de facto standard is "compress". Which just
so happens to be "an" implementation of L-Z. Without comparing output
formats, there's no way of telling whether any other L-Z implementation
would be compatible.
--
Chris Lewis, Elegant Communications Inc, {uunet!attcan,utzoo}!lsuc!eci386!clewis
Ferret mailing list: eci386!ferret-list, psroff mailing list: eci386!psroff-list
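[The prefix handling Chris recommends -- accept "#! cunbatch", "#! rnews",
or no prefix at all -- can be sketched in a few lines of C. This is an
illustration under my own assumptions, not C News source, and the helper
name is hypothetical:]

```c
/* Sketch: a compressed batch may arrive with a "#! cunbatch" or
 * "#! rnews" line in front of the .Z data, or bare.  Peek at the
 * start of the stream; if a known header line is present, consume
 * it, otherwise rewind so the decompressor sees the stream intact. */
#include <stdio.h>
#include <string.h>

/* Consume a batch-prefix line if one is present.
 * Returns 1 if a prefix was stripped, 0 otherwise. */
int strip_batch_prefix(FILE *in)
{
    char line[64];
    long start = ftell(in);

    if (fgets(line, sizeof line, in) == NULL)
        return 0;
    if (strncmp(line, "#! cunbatch", 11) == 0 ||
        strncmp(line, "#! rnews", 8) == 0)
        return 1;               /* compressed data follows the prefix */
    fseek(in, start, SEEK_SET); /* bare batch: put the bytes back */
    return 0;
}
```

[After this call, the stream is positioned at the compressed data either
way, so the same uncompress path handles both B-news and C-news feeds.]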
wcf@psuhcx.psu.edu (Bill Fenner) (06/01/90)
In article <BOB.90May31152322@volitans.MorningStar.Com> bob@MorningStar.Com (Bob Sutterfield) writes:
|In article <1990May29.202056.26271@ox.com> time@ox.com (Tim Endres) writes:
| Do all news feeds compress at 16 bits or 12 bits?
|
|Hard to say, but I don't know of any doing 12-bit compression. But
|then, I don't run in those circles.

hogbbs<>psuhcx used to be 12 bit, before I got 16-bit compression
working. sendbatch simply passes the -b option to compress, if it's on
the sendbatch command line, so I said "sendbatch -b12 hogbbs".

Bill
--
Bill Fenner    psuhcx is going away 5/31. Use wcf@wcfpc.scol.pa.us or
sysop@hogbbs.fidonet.org (1:129/87 - 814/238-9633) ..psuvax1!hogbbs!wcfpc!wcf
jaw@riacs.edu (James A. Woods) (06/01/90)
# "don't compress that dwarf, hand me the pliers." -- after firesign theatre > What are the implications of using compress in commercial sw? since 'compress' is a component of s5r4 unix, it's been done very publicly. you, too, can appropriate it in much the same manner as has at&t. > Are there any other, more miserly, compress programs available? >>Some years ago, someone posted a micro-zcat to net.sources. It >>performed the uncompress-in-a-pipe function in about two dozen lines >>of C, really pretty elegantly. Of course, I can't dredge it up any >>more. Perhaps your friendly neighborhood source archive site would >>have a copy? the first micro-zcat was done in 1987 by karl f. fox of morningstar technologies. since then, it's become both more (and less), as discussed under the rubric of other cult postings which never were directed to an official public archive. please excuse the ellipticity here, but since mr. fox, myself, and paul eggert of twinsun.com have a closely-related official entry in chongo's 7th intl. obfuscated C contest, we ask you *not* to dredge up the old code for publicity here. after mid-june, when the judges have pronounced, you'll come to know more than you'd ever wanted to about this twisted effort. ames!jaw
steve@thelake.mn.org (Steve Yelvington) (06/01/90)
[In article <1990May30.200519.12409@hawkmoon.MN.ORG>, det@hawkmoon.MN.ORG
(Derek E. Terveer) writes ... ]
> In article <1990May29.202056.26271@ox.com> time@ox.com (Tim Endres) writes:
(...)
>> Most people have indicated that "compress", the PD version, is what
>> is normally used for news compression. This would *seem* fine, but
>> the darn thing requires 500K RAM just to uncompress. This not only
>> seems extraordinary, but I can not see how implementations on a PC
>> limited to 640K could even work.
>
> We ran into the same problem with PCnews -- compress is just too big to run
> effectively in the (limited) pc environment. So, we found a pd program called
> "u16" (i can send it to you if you like), which does nothing but uncompress,

I have not been motivated to add compression to our home-grown news
software for the Atari ST. It seems to me that anybody who's shoveling a
large quantity of news is soon going to have a modem that does
compression on the fly. Is compression by the modem an adequate
substitute for compression by the CPU? I'd like to hear from somebody who
has tried it both ways.
--
Steve Yelvington at the lake in Minnesota (Ah, summer... leech season...)
chip@tct.uucp (Chip Salzenberg) (06/01/90)
According to bob@MorningStar.Com (Bob Sutterfield):
>In article <1990May29.202056.26271@ox.com> time@ox.com (Tim Endres) writes:
>>Do all news feeds compress at 16 bits or 12 bits?
>Hard to say, but I don't know of any doing 12-bit compression. But
>then, I don't run in those circles.

By default, C News uses 12-bit compression. Its authors, Geoff Collyer
and Henry Spencer, are compulsive measurers [:-)]. Their measurements of
the relative efficiency of 12- and 16-bit compression on news batches
indicated that 16 bits wasn't that much of a gain, especially considering
how large a 16-bit compression process is compared to a 12-bit one.
--
Chip, the new t.b answer man <chip%tct@ateng.com>, <uunet!ateng!tct!chip>
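[The memory side of that tradeoff is easy to see from the table sizes: an
LZW decompressor keeps one table entry per code, and the table has
2^maxbits entries. A back-of-the-envelope sketch, assuming 2 bytes of
prefix code plus 1 suffix byte per entry (the common layout; real
compress adds buffers and, on the compression side, a much larger hash
table -- the function name is mine):]

```c
/* Rough arithmetic behind the 12-vs-16-bit memory gap: the LZW
 * decode table needs 2^maxbits entries, each holding a prefix code
 * (2 bytes) and a suffix byte.  Order-of-magnitude only. */

/* Approximate decode-table size in bytes for a given maxbits. */
long decode_table_bytes(int maxbits)
{
    long entries = 1L << maxbits;
    return entries * (2 + 1);   /* 2-byte prefix code + 1 suffix byte */
}
```

[That gives about 12K of table at 12 bits against roughly 192K at 16
bits -- the same order as the 200K worst case quoted elsewhere in this
thread for 16-bit uncompression on the Mac.]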
zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (06/01/90)
>Hard to say, but I don't know of any doing 12-bit compression. But
>then, I don't run in those circles.

C News tends to use 12-bit compression. That's unfortunate, since even
small address space sites can do 14-bit uncompression by using a
dedicated uncompress program or something like my rnews.c.
--
Jon Zeeff (NIC handle JZ) 	zeeff@b-tech.ann-arbor.mi.us
I found a groundhog chewing on my car!
smith@groucho (06/01/90)
In article <3411@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>>I've been told that the compression algorithm that sendbatch uses is
>>a "modified" L-Z.

You're right.

>Uhh, the compression algorithm that "sendbatch" uses *is* "compress":
>
>	-c) COMP='| $LIB/compress $cflags'
>	    ECHO='echo "#! cunbatch"'
>	    continue;;
>
>and "/usr/lib/news/compress" is simply a symlink to "/usr/ucb/compress"
>on our system. There is a variant of "compress" supplied in source form
>with the netnews source, but it's basically the same program as the one
>that comes with, say, BSD.

You're right but wrong. The compress program uses adaptive Lempel-Ziv
coding. Also, /usr/lib/news/compress is only a symlink to
/usr/ucb/compress on Berkeley systems.

William Smith
Microelectronics Research Center, University of Idaho
Moscow, ID 83843  (208)885-6500
E-mail: wsmith@groucho.mrc.uidaho.edu
guy@auspex.auspex.com (Guy Harris) (06/03/90)
>>Uhh, the compression algorithm that "sendbatch" uses *is* "compress": ...
>>and "/usr/lib/news/compress" is simply a symlink to "/usr/ucb/compress"
>>on our system. There is a variant of "compress" supplied in source form
>>with the netnews source, but it's basically the same program as the one
>>that comes with, say, BSD.
>
>Your right but wrong. The compress program used adaptive Lempel-Ziv
>coding.

Uhh, did I say it didn't? I just pointed out something that the original
poster seemed not to know, namely that netnews used "compress" itself to
compress batches.

>Also /usr/lib/news/compress is only a symlink to /usr/ucp/compress on
>Berkely systems.

I *did* say "on our system" (which I guess you could call a "Berkeley
system"; it's SunOS, which while it resembles BSD in a number of ways, is
not BSD), and *did* point out that a variant of the "compress" program
was supplied in source form for systems that don't have "compress". Of
course, the Makefile for netnews may need to be modified if System V
Release 4 puts "compress" in "/usr/bin", as I expect it does....
zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (06/03/90)
>C-news by default uses 12 bits (because >12 bits isn't always
>possible on things like 286 Xenix or other small address space
>machines like pdp11's) regardless of how compress is compiled.

You can use something like my rnews.c, which is not only faster and more
secure, and never runs out of disk space, but also allows 14-bit
uncompression even on small machines.
--
Jon Zeeff (NIC handle JZ) 	zeeff@b-tech.ann-arbor.mi.us
zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (06/03/90)
>By default, C News uses 12-bit compression. Its authors, Geoff
>Collyer and Henry Spencer, are compulsive measurers [:-)]. Their
>measurements of the relative efficiency of 12- and 16-bit compression
>on news batches indicated that it wasn't that much of a gain,

Let's look at some actual figures for fairly large batches (250k after
compress). Believe me, with C News you want to use large batches,
especially if your link is reliable:

	16 bit = 1.0
	14 bit = 1.12
	12 bit = 1.28

I consider a 28% difference quite significant. Also consider that even
small address space sites can do 14-bit uncompression with the right
software (and no efficiency loss).
--
Jon Zeeff (NIC handle JZ) 	zeeff@b-tech.ann-arbor.mi.us
henry@utzoo.uucp (Henry Spencer) (06/03/90)
In article <90151.173150SA44@LIVERPOOL.AC.UK> SA44@LIVERPOOL.AC.UK (Kevin Maguire) writes:
>I had a look at the compress.c (comp.c ??) file in C news and it does
>indeed seem to be the standard UN*X compress(1) command incognito.

Actually, the C News distribution as shipped from here doesn't include a
compressor at all; we simply assumed that everyone had compress. (We
realize this isn't a safe assumption, but we have to cut things off
somewhere.)

>... Alternatively, ask your feed site to use 13bit (or 12)
>compression on your batch (don't think either C news or B news automatically
>allow this however :-()

C News defaults to 12-bit compression, precisely to be compatible with
small machines. In fact, you have to go in and work at it to get 16-bit
compression. (That probably should be made easier.)
--
As a user I'll take speed over| Henry Spencer at U of Toronto Zoology
features any day. -A.Tanenbaum| uunet!attcan!utzoo!henry henry@zoo.toronto.edu
root@ozdaltx.UUCP (root) (06/03/90)
Here lately I've noticed that compress seems to "choke" on certain news
files. (We're running Bnews PL17 on a 286 XENIX.) I've checked compress
and recompiled it several times with various options, always with the
same results. It hits a point in the file and just stops. I suspect that
some of the recent uuencoded files that have been passing through the net
might have something to do with this, but I don't really have a way to
check it.

Out of curiosity, could someone e-mail me a working make for SCO XENIX
286 for compress - maybe there is something I've missed and am not aware
of.

Thanks in advance...
scotty
ozdaltx!sysop