[net.religion] Statistical Authorship Analysis for the Book of Mormon

russ@dadla-a.UUCP (11/09/83)

In this article I am going to discuss a recent technique that is used for
determining authorship and the application of this technique to the Book of
Mormon.  I will leave most of the details for later articles if there is
any interest.

Recently, computer analysis techniques have been used to establish the
authorship of several disputed documents.  An example is the Federalist
papers.  Although they were published anonymously, the authorship of 73 of
them was determined: John Jay wrote 5, and the rest were divided between
Alexander Hamilton and James Madison.  Twelve others were left open to
question.  Using the frequency of usage of the small filler words,
researchers found overwhelming evidence favoring Madison as the author of
all twelve disputed papers.

A second example deals with a novel that Jane Austen left unfinished when
she died in 1817.  A skilled author later completed the novel and had it
published.  Although she duplicated Austen's style, she failed to duplicate
the subconscious habits of detail.  When these habit patterns were
examined, the difference was clearly evident.

The noncontextual words which have been most successful in discriminating
among authors are the filler words of the language such as prepositions and
conjunctions, and sometimes adjectives and adverbs.  Authors differ in
their rates of usage of these filler words.
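
As a rough illustration of what a "rate of usage" means here, the sketch
below counts how often each filler word occurs per 1000 words of a text.
It is not the procedure used in the study; the function name, the
tokenizing rule, and the short word list are assumptions made for the
example.

```python
import re
from collections import Counter

# Illustrative word list only (an assumption, not the study's full list).
FILLER_WORDS = ["and", "the", "of", "that", "to", "unto", "in", "it", "for", "be"]

def filler_rates(text, words=FILLER_WORDS, per=1000):
    """Return the number of occurrences of each filler word per `per` words."""
    # Lowercase the text and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in words}
    counts = Counter(tokens)
    return {w: per * counts[w] / total for w in words}

# A made-up 16-word sample, just to show the shape of the output.
sample = "And it came to pass that the people of the land did go unto the city"
rates = filler_rates(sample)
print(rates["the"])  # "the" occurs 3 times in 16 words
```

Each block of text then becomes a vector of such rates, and it is these
vectors that the statistical methods compare.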

Three different types of "wordprints" or stylometry were used in examining
the authors of the Book of Mormon:  (1) frequency of letters, (2) frequency
of commonly occurring noncontextual words, (3) frequency of rarely
occurring noncontextual words.  Three types of statistical methods were
used with this data: Multivariate Analysis of Variance (MANOVA), Cluster
Analysis, and Discriminant or Classification Analysis.

Most of the Book of Mormon was abridged by Mormon and his son Moroni.  A
section of plates called the small plates of Nephi includes mainly the
writing of Nephi and Jacob.  Additionally, several sections appear to quote
from other authors.  These are counted as additional authors.  We end up
with a total of 22 authors, each represented by at least 1000 words.

By comparing the 10 most frequent words (and, the, of, that, to, unto, in,
it, for and be) the statistical odds of a single author having written the
book were found to be 1 in 100 billion.  "However, this number should not
be taken too literally.  It depends on several assumptions, one of which is
that we have a random sample of each author's writings." Using all 38
frequently occurring words, 42 uncommon words, and the frequency of
letters, a similar result was obtained.

Writings of Joseph Smith and his contemporaries were also included. Ninety
blocks of words were used from Joseph Smith, W. W. Phelps, Oliver Cowdery,
Parley P. Pratt, Sidney Rigdon, and Solomon Spaulding.  Two important
points came out of this comparison: (1) There is some evidence of a
wordprint time trend within the Book of Mormon; i.e. writers are more
similar to their contemporaries than to writers in other time periods. (2)
None of the Book of Mormon selections resembled the writing of any of the
suggested nineteenth-century authors.  This remained true even when formal
words such as hath, unto, etc. were removed from the analysis.  The results
depended as much on words such as [and, for, of] as on any other of the
words.

The preceding information was derived using MANOVA.  Cluster analysis tries
to group similar word blocks in a multidimensional comparison.  Using the 9
words which discriminated best in the MANOVA, the cluster analysis yielded
groupings such as the following: (1) Nephi's word blocks paired with those
of his father, Lehi, and they together paired with Nephi's brother Jacob
and with Isaiah, the prophet most quoted by Nephi and Jacob; (2) Alma's
word blocks grouped with those of Amulek, his missionary companion, and
they both paired with Abinadi, the man who converted Alma's father; (3)
Samuel the Lamanite and Nephi, son of Helaman, who were contemporaries,
grouped together.  When
the nineteenth-century authors were added, in general, word blocks of Book
of Mormon authors clustered with those of Book of Mormon authors, and word
blocks of non-Book of Mormon authors clustered with those of non-Book of
Mormon authors.
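
To make the idea of "pairing" word blocks concrete, here is a minimal
sketch of one common form of cluster analysis: agglomerative clustering
with average linkage on Euclidean distance.  The article does not say
which clustering method or distance measure the study used, so both are
assumptions, and the block labels and rate vectors below are made up for
illustration.

```python
import math
from itertools import combinations

def cluster_merges(blocks):
    """Repeatedly merge the two closest clusters; return the merge history.

    blocks maps a block label to its vector of word rates.
    Distance between clusters is the average pairwise Euclidean
    distance between their members (average linkage, an assumption).
    """
    def linkage(pair):
        a, b = pair
        dists = [math.dist(blocks[i], blocks[j]) for i in a for j in b]
        return sum(dists) / len(dists)

    clusters = [(label,) for label in blocks]  # start with one cluster per block
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b))
    return merges

# Made-up rates (per 1000 words) for three words, two blocks per author.
blocks = {
    "A1": (60, 40, 30),
    "A2": (62, 41, 29),
    "B1": (45, 55, 20),
    "B2": (44, 53, 22),
}
merges = cluster_merges(blocks)
for left, right in merges:
    print(left, "+", right)
```

With these invented numbers the two blocks of each author pair up first,
and the two pairs join only at the last step, which is the kind of
pattern the groupings above describe.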

Discriminant Analysis was used with 2000 word groups comparing 18 words.
93.3 percent of the blocks were correctly identified.  This is very high
for this many authors and was unexpected.  When each block was left out
while computing the classification functions and then classified, the
results were still in the 70 to 80 percent range.
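
The leave-one-out check described above can be sketched as follows.  Note
that this sketch swaps in a much simpler classifier (nearest centroid on
Euclidean distance) in place of the discriminant functions used in the
study, and the author labels and rate vectors are made up; only the
hold-one-block-out procedure is the point.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nearest_author(vector, centroids):
    """Return the author whose centroid is closest to the vector."""
    return min(centroids, key=lambda a: math.dist(vector, centroids[a]))

def leave_one_out_accuracy(samples):
    """Classify each block using centroids built from all OTHER blocks.

    samples is a list of (author, rate_vector) pairs.
    Returns the fraction of blocks assigned to their true author.
    """
    correct = 0
    for i, (author, vector) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # hold out block i
        by_author = {}
        for a, v in rest:
            by_author.setdefault(a, []).append(v)
        centroids = {a: centroid(vs) for a, vs in by_author.items()}
        if nearest_author(vector, centroids) == author:
            correct += 1
    return correct / len(samples)

# Made-up rates (per 1000 words) for three words, three blocks per author.
samples = [
    ("A", (60, 40, 30)), ("A", (62, 41, 29)), ("A", (59, 42, 31)),
    ("B", (45, 55, 20)), ("B", (44, 53, 22)), ("B", (46, 54, 21)),
]
accuracy = leave_one_out_accuracy(samples)
print(accuracy)
```

Because the held-out block plays no part in building the classifier, this
figure is a fairer estimate than the percentage obtained when every block
is used both to fit and to test, which is why the reported rate drops.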

Another test was to compute two discriminant functions that allowed the
authors to be plotted in a two-dimensional plot.  The Book of Mormon and
non-Book of Mormon authors were clearly separate and a straight line could
be drawn between both groups.

Joseph Smith was also compared with the four major Book of Mormon authors
and plotted in two dimensions.  Again each author's blocks clustered by
themselves, and Joseph Smith's writing was very definitely distinct from
that of the authors in the Book of Mormon.

Russell Anderson
Tektronix

-------------------------
Wayne A. Larsen and Alvin C. Rencher, "Who Wrote the Book of Mormon?  An
Analysis of Wordprints,"  Book of Mormon Authorship, ed. Noel B. Reynolds.
(Bookcraft, 1982)