[comp.text] Text Statistics, Letter Frequencies

jms@close.columbia.edu (Jonathan M. Smith) (06/17/88)

Sorry, I lost the article that this responds to. A fine source of letter
frequencies is the paper:
%A L. E. McMahon
%A L. L. Cherry
%A R. Morris
%T Statistical Text Processing
%J The Bell System Technical Journal
%D July-August 1978
%P 2137-2154

It also describes a methodology for gathering your own statistics, and
includes a short reference list that may help you.

I have some 1- to 4-gram statistics gathered from man pages (/usr/man)
and C programs (/usr/src/cmd, System V Release 2) that I'll provide if
you're interested.
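
For anyone who would rather gather their own counts, here is a minimal
sketch of character n-gram counting for n = 1 through 4. It is written in
Python and is not the tooling used for the statistics mentioned above or
the method from the paper; the function name ngram_counts and the
decisions to fold case and to count across whitespace are assumptions
made only for illustration.

```python
from collections import Counter


def ngram_counts(text, max_n=4):
    """Return a dict mapping n to a Counter of character n-grams.

    Assumptions (illustrative only): case is folded to lowercase, and
    n-grams are allowed to span spaces and newlines.
    """
    text = text.lower()
    counts = {}
    for n in range(1, max_n + 1):
        # Slide a window of width n across the text, one character at a time.
        counts[n] = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return counts


if __name__ == "__main__":
    sample = "the theory of the thing"
    stats = ngram_counts(sample)
    for n in sorted(stats):
        # Show the three most frequent n-grams of each length.
        print(n, stats[n].most_common(3))
```

To turn counts into relative frequencies, divide each count by the total
for that n (sum(stats[n].values())). For a real corpus you would read the
files in and probably decide up front how to treat punctuation and
whitespace, since that choice changes the numbers noticeably.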

							-Jonathan