paulv@cwi.NL (Paul Vitanyi) (07/18/90)
The following result on average-case complexity, contained in a `learning' paper by Ming Li and myself, FOCS89, 34-39, may be of interest to a wider audience.

Theorem: Average-case complexity (like time or space complexity) under the universal distribution on the inputs is THETA(worst-case complexity).

Example: the average running time of Quicksort under the universal distribution is THETA(n^2).

Comments: For many algorithms the average-case running time under some distributions on the inputs is less than the worst-case running time. For Quicksort under the uniform distribution the former is n log n, while the latter is n^2. Clearly, for each algorithm we can find a distribution on the inputs such that the average running time equals the worst-case running time (concentrate the probability on the worst cases). The question arises whether, for some fixed natural distribution, there are problems whose average time equals their worst-case time. L.A. Levin (1985) has shown that there are NP-complete problems (like tiling) such that the average time under the uniform distribution is polynomial iff P = NP. The theorem above shows that there is one fixed distribution, the universal one, under which the average-case time complexity equals the worst-case time complexity (up to a constant factor) for *all* problems.

Background: The universal distribution (R.J. Solomonoff 1964, L.A. Levin 1970, 1974) is m(x) = 2^-K(x), where K(x) is the prefix variant of Kolmogorov complexity: the length of the shortest program from which the reference universal TM computes x, the set of programs of this machine being prefix-free. Thus, a string x = 11...1 of length n has K(x) <= log n + 2 log log n and m(x) >= 1/(n log^2 n). A string y of length n generated by fair coin tosses has, with overwhelming probability, K(y) >= n and m(y) <= 1/2^n. (Everything up to fixed constant terms/factors.)
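A toy illustration of the intuition (mine, not from the paper): already-sorted arrays are "simple" strings with small K, hence large m-probability, and they drive a deterministic first-element-pivot Quicksort to its THETA(n^2) worst case, while a typical (incompressible) permutation gives the familiar n log n behavior.

```python
import random

def quicksort(a, comparisons):
    # Deterministic Quicksort, first element as pivot; counts comparisons.
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]
    comparisons[0] += len(rest)           # one comparison per remaining element
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return quicksort(left, comparisons) + [pivot] + quicksort(right, comparisons)

def count_comparisons(a):
    c = [0]
    quicksort(a, c)
    return c[0]

n = 500
simple = list(range(n))        # highly compressible input: K = O(log n), m is large
shuffled = simple[:]
random.seed(1)                 # fixed seed, for reproducibility of the demo
random.shuffle(shuffled)       # a typical permutation: essentially incompressible

worst = count_comparisons(simple)      # sorted input: n(n-1)/2 comparisons, THETA(n^2)
typical = count_comparisons(shuffled)  # typical input: THETA(n log n) comparisons
```

Under m the simple (sorted) inputs carry enough probability that the n^2 behavior dominates the average, which is the content of the theorem for this example.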
Thus, m(x) embodies Occam's razor by assigning high probability to simple objects and low probability to complex objects.

Is the universal distribution natural to consider? One would think so. It dominates all computable distributions P: m(x) >= c P(x) for a fixed constant c depending only on P (c = 2^-K(P)), by a result of Levin. This implies that, for any computable distribution P(x), the P-probability of P(x) being close to m(x) is high:

    sum { P(x) : k m(x) >= P(x) >= m(x)/k } >= 1 - 1/k,  for all k >= K(P).

These statements actually hold for an even wider class: the distributions which can be approximated from below by recursive functions, the enumerable distributions. All distributions which have a name, or have been used in statistics, appear to be enumerable. In the absence of any a priori knowledge of the actual distribution, therefore, apart from the fact that it is enumerable, studying the average behavior under m may be more meaningful than studying the average behavior under any other particular enumerable distribution. By the above result, this average is the worst case.

Copies of the full paper, mercifully short, can be obtained upon request at the above email address.

Since K(x) is not computable, the question arises about feasible analogues of the above result. This has since been studied by Peter Bro Miltersen, The complexity of difficult measures, Manuscript, DAIMI, Univ. Aarhus, May 1990.
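Since K(x) is incomputable, any concrete demonstration must substitute a computable proxy. The following sketch (my illustration only, not a feasible analogue in the technical sense studied by Miltersen) uses the zlib-compressed length as a crude upper bound on K(x), so that 2^-compressed_length(x) plays the role of m(x): a simple string then receives exponentially more weight than a random one.

```python
import random
import zlib

def compressed_length(s: bytes) -> int:
    # Compressed length in bits: a computable upper-bound proxy for the
    # (incomputable) prefix complexity K(x). Includes zlib header overhead.
    return 8 * len(zlib.compress(s, 9))

simple = b"1" * 1000                   # like x = 11...1: K(x) = O(log n)
random.seed(0)
rand = bytes(random.getrandbits(8) for _ in range(1000))  # near-incompressible

# 2^-compressed_length(simple) >> 2^-compressed_length(rand): under this
# proxy for m, the simple string gets vastly more probability mass.
```

Any real compressor only upper-bounds K, so this proxy over-weights strings the compressor fails on; it illustrates the shape of m rather than computing it.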