[comp.sys.misc] Copyright status of ARC, LZW, and COMPRESS programs questioned

W8SDZ@SIMTEL20.arpa (Keith Petersen) (05/09/87)

After announcing the availability of a recent update of SEA's ARC
program, I received the following message which raises serious doubts
as to the validity of the copyrights of SEA, Phil Katz, and Vernon
Berg's ARC programs and well as other LZW-type compression programs
and the status of the popular Unix "compress" program.

--Keith Petersen
Arpa: W8SDZ@SIMTEL20.ARPA
Uucp: {bellcore,decwrl,harvard,lll-crg,ucbvax,uw-beaver}!simtel20.arpa!w8sdz
GEnie Mail: W8SDZ

--forwarded message--
To: Keith Petersen <W8SDZ@SIMTEL20.ARPA>
Re: Message for the authors of ARC

I don't know how to get in touch with the authors of ARC (I didn't see
any addresses in INFO-IBMPC), but since you seem to be posting information
about new versions, etc., I thought that you might be able to forward the
following mail to them.

1) The correct spelling of the name is Ziv.  So you should call it
Lempel-Ziv (or Ziv-Lempel because that was the order of the author's
names in the original paper) encoding.

2) The original Ziv-Lempel method is patented (#4,464,650 -- Willard
Eastman, Abraham Lempel, Jacob Ziv, Martin Cohen) assigned to Sperry
Univac (now Unisys).  Since the Welch modifications are to this
method, I would think that some sort of license agreement from Unisys
would be necessary (this is really only a practical problem for
commercial customers).  Does such an agreement exist?

--end forwarded message--

jaw@aurora.UUCP (05/12/87)

# "Don't compress that dwarf -- hand me the pliers!" -- after Firesign Theatre

> 2) The original Ziv-Lempel method is patented (#4,464,650 -- Willard
> Eastman, Abraham Lempel, Jacob Ziv, Martin Cohen) assigned to Sperry
> Univac (now Unisys).  Since the Welch modifications are to this
> method, I would think that some sort of license agreement from Unisys
> would be necessary (this is really only a practical problem for
> commercial customers).  Does such an agreement exist?
> 
     Professor Lempel once telephoned me to praise the existence of a
public domain implementation of the LZ algorithm.  Though I can take credit
only for the encoder hashing method currently used in 'compress', as well
as its "block-adaptive" table reset strategy (we remain indebted to Spencer
Thomas of the Univ. of Utah who gave USENET the basic framework), I'll
repeat here a comment relayed to me after a Lempel lecture at HP Labs.

     The story goes like this:  apparently the Welch paper came to the
light of day only after Sperry Research Labs was disbanded, this occurring
long before the Burroughs acquisition.  Supposedly, discoveries revert
to the general public when a lab ceases to exist.  Note this is *not*
the same as the situation engendered by the recent asset transfer from GE
to SRI of the RCA David Sarnoff Labs.

     In any event, the Welch implementation (Computer, vol. 17, #6, 1984)
only claimed that the presented "hardware" string table creation and access
method was "Sperry proprietary".  Details of any software-based code/index
storage scheme in existence at Sperry were deliberately left fuzzy in the paper.

     Since patents cover only "apparatus", no one is making claims for
any of the algorithmic variants of LZ, of which there are many (see below).
As for the copyright status of 'compress', Thomas and I (who both work at
public institutions) have signed (meaningless?) waivers not only to UCB
for the 4.3 distribution, but to Hewlett Packard for inclusion in their
own Unix release.  The latter is most ironic, since HP retained Lempel
as a consultant for a year on sabbatical leave from Technion in Israel,
where he was chairman of the computer science department.

     ARC is another matter, of which I know little.  It is fine by me
if someone sells a value-added 'compress' (you'd pay for the packaging
and "support").  Other companies sell the Unix LZ as part of a product
(the Talaris 'troff' software includes compressed fonts this way).  Now
I hear that Dan Robinson of Telebit (our friendly neighborhood 18 kbps
modem supplier) has valiantly jammed compression into the modem ROM,
adding a few tricks of his own, no doubt.  Speaking again only for myself,
it doesn't matter even if raw unadorned 'compress' were sold for a megabuck --
word would get around very quickly that it's available free from other
sources.

     LZ algorithms are not the be-all-end-all of data compression techniques.
They don't particularly work well (unmodified), for digital sound processing
or color picture reduction, for example.  Many variants employ equally many
time-space tradeoffs, with software implementations using data structures
ranging from binary trees, to "trie forests", to hash coding, to a direct
sparse array access exercise (for multi-megabyte machines) I posted to
USENET back in 1984/5.  Software work continuing at the Univ. of Calgary
should be mentioned, where Tim Bell claims a 5-10% rate improvement (for
ASCII-only input, alas), and unfortunately using an encoder which runs
a hefty order-of-magnitude slower, limiting application.  (See his IEEE Trans.
Comm. paper of Dec. 1986, which oddly sidesteps direct comparisons with
'compress').  Also, many ad hoc and not-so-ad hoc methods have been offered
to squeeze data, including the very involved Markov schemes of Cleary
and Witten, and the nouveau self-adaptive splay-tree amortization 
algorithms of Bentley and Tarjan.

     I could go on, but close by indicating that though optimal data
compression in general is unsolvable in the Turing sense, and though
many problem subclasses are NP-complete, the beautifully simple, linear,
and general method of Ziv and Lempel is hard to improve upon, and certainly
affords many approaches not subject to legal intervention.

     -- James A. Woods (ames!jaw)