jaw@ames.UUCP (James A. Woods) (03/05/85)
# What are words worth?
        -- The Tom Tom Club

  Proper words in proper places, make the true definition of a style.
        -- Jonathan Swift, Letter to a Young Clergyman, 1720

  Words butter no parsnips.
        -- Southern proverb
_____

As promised long ago, I am making available the wordlist from Webster's
Second International Dictionary to those with 'ftp' access to ARPAnet.
The kind soul at Bell Labs who provided me with this word hoard maintains
that it is public domain. The note inside the covers of Webster's Third
indicates a copyright date of 1934 for 'web2'; legal protection for 'web3'
began in 1961 and is still in effect. Since dictionaries are living
entities, you be the judge of its efficacy -- we will have to wait for
the likes of Lawrence Urdang and Univ. of Toronto to finish input of the OED
for the (pen)ultimate word on the English language.

Web2 is by far too large for 'uucp' transmission. In fact, I have shrunk
the files for ARPA xmission by a factor of four (to about one MB) by
using a combination of the ever-popular 'compress' program and a
specialized "incremental encoder" written in a few lines of C. This has
been done in order to lighten the load on our gracious host (RIACS --
Research Institute for Advanced Computer Science), at the expense of
increased decoding time on the recipient machine. This should all be
invisible to you, if you wish, since the procedure is simply:

    - login via "anonymous ftp" to riacs.ARPA
    - cd ~ftp/pub/web2
    - retrieve web2.shar, web2.sq.Z, and web2a.sq.Z

followed by installation with

    sh web2.shar
    make web2

which also makes 'compress' and 'unsqueeze' before turning over 2.4MB
of output to 'sort -f'. If you think that this is also a ploy to get
you to install the second-generation 'compress' on your system, indeed it
is such. This way, ARPAnauts can do some one-stop shopping.

Web2a is a supplementary list of hyphenated terms as well as assorted
noun and adverbial phrases.
Web2 has already served me and others well in conducting certain
frivolous research into "word jazz". Inquire within.

-- James A. Woods   {ihnp4,hplabs}!ames!jaw   (or, jaw@riacs)
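
The post does not show the "incremental encoder". One common scheme that fits the description is front compression of a sorted word list: each word is stored as the number of leading characters it shares with the previous word, plus the remaining suffix. What follows is a minimal sketch in Python under that assumption. The names front_encode and front_decode are hypothetical, and the actual format of the web2.sq files may well differ.

```python
def front_encode(words):
    """Encode an ordered word list as (shared_prefix_length, suffix) pairs.

    Assumption: this is generic front compression, not necessarily the
    exact encoding used for the web2 files.
    """
    encoded = []
    previous = ""
    for word in words:
        # Count the leading characters this word shares with the previous one.
        shared = 0
        limit = min(len(previous), len(word))
        while shared < limit and previous[shared] == word[shared]:
            shared += 1
        encoded.append((shared, word[shared:]))
        previous = word
    return encoded


def front_decode(pairs):
    """Rebuild the original word list from (shared_prefix_length, suffix) pairs."""
    words = []
    previous = ""
    for shared, suffix in pairs:
        # Reuse the shared prefix of the previous word, then append the suffix.
        word = previous[:shared] + suffix
        words.append(word)
        previous = word
    return words


if __name__ == "__main__":
    sample = ["aback", "abacus", "abandon", "abandoned", "abase"]
    pairs = front_encode(sample)
    print(pairs)
    # The round trip must reproduce the input exactly.
    assert front_decode(pairs) == sample
```

In a sorted dictionary most neighbouring words share long prefixes, so the stored suffixes are short. A general-purpose compressor such as 'compress' can then be run over the result, which is the two-stage combination the post describes.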
jaw@ames.UUCP (James A. Woods) (03/07/85)
# The word "accident" should be erased from the dictionary.
        -- Isaac Bashevis Singer, 1978

Loose ends:

(1) The wordlist is just that, no definitions. You'll just have to wait
    until one of the copywronged databases (Wang owns the rights for the
    Random House Dictionary) shows up on one of those cute 550 MB digital
    CD ROMs.

(2) It has been moved to the 'riacs-icarus.arpa' mail gateway to soothe
    a swamped time-sharing CPU. (Sorry, Dave!)

(3) Four-to-one data compression is sub-optimal, I know. (I used, in
    addition to the incremental encoder [remember zippy.h, K.T.?],
    "compress -b12" so that machines with tiny address space can
    unbutton the list.)

(4) Good luck if you use it as a spelling detector/corrector list!
    The pitfalls induced by this are nicely discussed in M. D. McIlroy's
    "Development of a Spelling List", in IEEE Trans. Commun.,
    January 1982.

-- James A. Woods   {hplabs,ihnp4}!ames!jaw   (or jaw@riacs)
mwherman@watcgl.UUCP (Michael W. Herman) (03/07/85)
> Since dictionaries are living
> entities, you be the judge of its efficacy -- we will have to wait for
> the likes of Lawrence Urdang and Univ. of Toronto to finish input of the OED
> for the (pen)ultimate word on the English language.

HOLD EVERYTHING. It's the University of Waterloo that is undertaking
the New Oxford Dictionary Project; not that other university down the road.