flinn@seismo.UUCP (E. A. Flinn) (01/09/84)
Allen Lichtman and Volodya Keilis-Borok published a paper in the Proceedings of the National Academy of Sciences in 1981 showing that it should be possible, using a straightforward pattern recognition technique, to predict the outcome of presidential elections. The paper is "Pattern Recognition Applied to Presidential Elections in the United States, 1860-1980: Role of integral social, economic, and political traits" (PNAS vol. 78, pp. 7230-7234, 1981).
The rule which assigns a given election into class "I" (favoring the incumbent party) or "C" (favoring the challenging party) is based on the answers to 12 questions:
 1. Has the incumbent party been in office more than a single term? (no)
 2. Did the incumbent party gain more than 50% of the popular vote cast in the previous election? (yes)
 3. Was there major third party activity during the election year? (no)
 4. Was there a serious contest for the nomination of the incumbent party candidate? (no)
 5. Was the incumbent party candidate the sitting president? (yes)
 6. Was the election year a time of recession or depression? (no)
 7. Was the yearly mean per capita rate of growth in real gross national product during the incumbent administration equal to or greater than the mean rate in the previous eight years, and equal to or greater than 1%? (yes)
 8. Did the incumbent president initiate major changes in national policy? (yes)
 9. Was there major social unrest in the nation during the incumbent administration? (no)
10. Was the incumbent administration tainted by scandal? (no)
11. Is the incumbent party candidate charismatic or a national hero? (yes)
12. Is the challenging party candidate charismatic or a national hero? (no)
The answers in parentheses favor the victory of the incumbent party.
The algorithm used is the Hamming distance algorithm, in which each election year is described by the binary vector Y sub i, whose components are X sub 1, X sub 2, ..., X sub n, where X sub i = 1 or X sub i = 0 is the answer to the i'th question. For each question, two numbers are computed to indicate the distribution of X sub i in class I and in class C:
    P(i/I) = n(i,I)/n(I)
    P(i/C) = n(i,C)/n(C)
where n(i,I) is the number of class I elections in which X sub i = 1, similarly for n(i,C), and n(I) and n(C) are the numbers of class I and class C elections included in the learning material.
The P's are used to form a kernel representing the traits favoring I or C. The kernel is a binary vector whose components are K sub i = 1 if P(i/I) - P(i/C) >= k, and K sub i = 0 if P(i/C) - P(i/I) >= k; otherwise X sub i is not used in the kernel. Lichtman and Keilis-Borok used k = 0.1, but found that the set of questions used is stable to variations in k.
The distance D between the kernel and a given election is defined as
    D = sum over i of W sub i delta(K sub i, X sub i)
where delta(K sub i, X sub i) = 1 if K sub i != X sub i (i.e., the value of X sub i for that election differs from the value favoring I victory) and delta(K sub i, X sub i) = 0 if K sub i = X sub i. Lichtman and Keilis-Borok used W sub i = 1, so that D is the Hamming distance, the number of answers favoring C victory.
Defining the maximum value of D for all preceding elections won by I as DI+, and the minimum value of D for all preceding elections won by C as DC-, the prediction is that an election will be won by I if D < DC- and D <= DI+, and that it will be won by C if D > DI+ and D >= DC-. If neither condition is satisfied, the prediction is indeterminate.
Lichtman and Keilis-Borok performed a number of statistical tests on their questions, and were satisfied that alternate hypotheses can be rejected, and that a number of logical-sounding other questions do not add to the reliability of the predicting algorithm.
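
For anyone who wants to experiment, the kernel, distance, and decision rule described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the authors' program: the function names (build_kernel, distance, predict) are made up for this sketch, the elections in the demo are invented toy data rather than the historical answers, and exact fractions are used so that the comparison with k = 0.1 is not upset by floating-point rounding.

```python
from fractions import Fraction
from typing import List, Optional, Sequence

Answers = Sequence[int]  # one 0/1 answer per question (X sub i)


def build_kernel(class_i: Sequence[Answers], class_c: Sequence[Answers],
                 k: Fraction = Fraction(1, 10)) -> List[Optional[int]]:
    """Return K sub i for each question; None means the question is not used."""
    n_i, n_c = len(class_i), len(class_c)
    kernel: List[Optional[int]] = []
    for q in range(len(class_i[0])):
        p_i = Fraction(sum(e[q] for e in class_i), n_i)  # P(i/I)
        p_c = Fraction(sum(e[q] for e in class_c), n_c)  # P(i/C)
        if p_i - p_c >= k:
            kernel.append(1)      # "yes" is typical of incumbent wins
        elif p_c - p_i >= k:
            kernel.append(0)      # "no" is typical of incumbent wins
        else:
            kernel.append(None)   # question does not discriminate
    return kernel


def distance(kernel: Sequence[Optional[int]], answers: Answers,
             weights: Optional[Sequence[int]] = None) -> int:
    """Weighted count of used questions whose answer differs from the kernel."""
    if weights is None:
        weights = [1] * len(kernel)  # W sub i = 1, as in the paper
    return sum(w for kq, x, w in zip(kernel, answers, weights)
               if kq is not None and kq != x)


def predict(class_i: Sequence[Answers], class_c: Sequence[Answers],
            answers: Answers, k: Fraction = Fraction(1, 10)) -> str:
    """Apply the DI+ / DC- rule; returns "I", "C", or "indeterminate"."""
    kernel = build_kernel(class_i, class_c, k)
    di_plus = max(distance(kernel, e) for e in class_i)   # DI+
    dc_minus = min(distance(kernel, e) for e in class_c)  # DC-
    d = distance(kernel, answers)
    if d < dc_minus and d <= di_plus:
        return "I"
    if d > di_plus and d >= dc_minus:
        return "C"
    return "indeterminate"


if __name__ == "__main__":
    # Invented toy learning material with four questions (NOT real elections).
    won_by_i = [(1, 1, 0, 1), (1, 0, 0, 0), (1, 1, 1, 1)]
    won_by_c = [(0, 0, 1, 1), (0, 1, 1, 1), (1, 0, 1, 0)]
    print(build_kernel(won_by_i, won_by_c))
    print(predict(won_by_i, won_by_c, (1, 1, 0, 0)))
    print(predict(won_by_i, won_by_c, (0, 0, 1, 0)))
```

To reproduce the test described in the paper, one would call predict once per election year, with only the earlier elections as learning material each time.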
They also concluded that the circumstances favorable to the same party retaining control of the Presidency have not changed since Lincoln's time. Taking the elections one by one since 1860 and predicting the next election, Lichtman and Keilis-Borok found that every election except those of 1908 and 1912 was successfully predicted: the 1908 prediction was incorrect, and the 1912 prediction was indeterminate. Using the 1860-1976 results as learning material, the algorithm predicted that Reagan would defeat Carter if the answers to six or more of the questions favored the challenging party.
Keilis-Borok is a geophysicist who has done a lot of work on pattern recognition approaches to earthquake prediction; Lichtman is a professor of history at American University in Washington, D.C. This work was done in early 1979 when both were Fairchild Scholars at Caltech. I was at Caltech at the time, working with Keilis-Borok on earthquake prediction, and had lively discussions with the authors about the best assumed answers for the 1980 election. The paper was not published until after that election, although it contains their prediction that it was overwhelmingly likely that Carter would lose to Reagan: their calculation was for D = 10.
To predict the result of the 1984 election using this algorithm it would be necessary to program the algorithm to include 1980. However, it seems to me that nine of the twelve questions favor Reagan (2, 4, 5, 6, 7, 9, 10, 11, and 12), only one favors the Democrats (1), and two are indeterminate at present (3 and 8). On this basis it appears to me that there is no way any of the Democratic candidates could defeat Reagan this fall.
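
The arithmetic behind that last estimate can be made explicit with a short Python sketch. It rests on assumptions that go beyond the post: that all twelve questions are in the kernel with unit weights, and that the threshold of six quoted above for the 1860-1976 learning material still applies once 1980 is added. The question groupings are my own reading of the 1984 situation, and the name distance_bounds is made up for this sketch.

```python
from typing import Set, Tuple

# My assessment of the twelve questions for 1984 (question numbers as listed above).
FAVOR_INCUMBENT: Set[int] = {2, 4, 5, 6, 7, 9, 10, 11, 12}
FAVOR_CHALLENGER: Set[int] = {1}
UNDECIDED: Set[int] = {3, 8}

# Assumption: the 1860-1976 threshold (six or more answers favoring the
# challenger) carries over unchanged after 1980 is added.
CHALLENGER_THRESHOLD = 6


def distance_bounds(favor_challenger: Set[int], undecided: Set[int]) -> Tuple[int, int]:
    """Smallest and largest possible D, assuming unit weights on every question."""
    lowest = len(favor_challenger)                    # undecided all break for I
    highest = len(favor_challenger) + len(undecided)  # undecided all break for C
    return lowest, highest


if __name__ == "__main__":
    low, high = distance_bounds(FAVOR_CHALLENGER, UNDECIDED)
    print(f"D lies between {low} and {high}")
    print("challenger can reach threshold:", high >= CHALLENGER_THRESHOLD)
```

Even if both undecided questions break against the incumbent, D comes to 3, well short of six.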