[sci.crypt] Need information on data compression

kadie@uiucdcsb.cs.uiuc.edu (04/23/87)

I think the number of English words is an order of magnitude
greater than 2^14-1 (16383), especially if we count all the
forms of words (word, words, wording, worded) and common proper
names (California, Fred, ...).



Carl Kadie
University of Illinois at Urbana-Champaign
UUCP: {ihnp4,pur-ee,convex}!uiucdcs!kadie
CSNET: kadie@UIUC.CSNET
ARPA: kadie@M.CS.UIUC.EDU (kadie@UIUC.ARPA)

pdg@ihdev.UUCP (04/23/87)

In article <161200002@uiucdcsb> kadie@uiucdcsb.cs.uiuc.edu writes:
>I think the number of English words is an order of magnitude
>greater than 2^14-1 (16383), especially if we count all the
>forms of words (word,words,wording,worded) and common proper
>names (California, Fred, ...).

The issue is not the total number of words, but the number of
frequently used words.  My understanding (although I have no sources
to back up this claim - anyone??) is that the average person has a
vocabulary of approximately 10,000 words.  Actually, a list of the
most frequently used words would be easy to derive from all of the
news articles (a good cross section of (reasonably :-) educated
people).  I wasn't planning on going through the dictionary in
alphabetical order (AARDVARK=1, etc.), but rather on numbering words
by how often they are used: something like (THE=1, AN=2, A=3, TO=4,
etc.).
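
To make the idea concrete, here is a rough sketch in Python of numbering
words by frequency.  Nothing here comes from an existing program: the names
build_codebook, encode and decode are made up for illustration, the tiny
corpus stands in for "all of the news articles", and leaving unknown words
as literal strings is just one assumed way to handle words outside the table.

```python
from collections import Counter

MAX_CODES = 2**14 - 1  # 16383 codes, the figure quoted in the thread


def build_codebook(corpus_words, max_codes=MAX_CODES):
    """Assign code 1 to the most frequent word, 2 to the next, and so on."""
    counts = Counter(corpus_words)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: code for code, word in enumerate(ranked[:max_codes], start=1)}


def encode(words, codebook):
    """Replace known words with their code; leave unknown words as literals."""
    return [codebook.get(w, w) for w in words]


def decode(tokens, codebook):
    """Invert encode(): integers map back to words, literals pass through."""
    reverse = {code: word for word, code in codebook.items()}
    return [reverse[t] if isinstance(t, int) else t for t in tokens]


# Hypothetical stand-in for a real corpus of news articles.
corpus = "THE CAT SAW THE DOG AND THE DOG SAW A CAT".split()
codebook = build_codebook(corpus)

message = "THE DOG SAW AARDVARK".split()
packed = encode(message, codebook)
assert decode(packed, codebook) == message
```

With a real corpus the table would be capped at 16383 entries, so every
frequent word fits in 14 bits and only the rare ones (AARDVARK and friends)
have to be spelled out.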

-- 

Paul Guthrie
ihnp4!ihdev!pdg			This Brain left intentionally blank.