[sci.math] Lankinen's recursive factoring algorithm

ld231782@longs.LANCE.ColoState.EDU (L. Detweiler) (03/17/91)

Let f(u,v,b,n) := | [otherwise]
                  |  f(u+b,v,2b,(n-v)/2) or f(u,v+b,2b,(n-u)/2) if n odd
                  |  f(u,v,2b,n/2) or f(u+b,v+b,2b,(n-u-v-b)/2) if n even
 
Thm: f(1,1,2,(m-1)/2) = (p,q) iff pq=m for odd m.
---
 
Pf:
--
 
Let the term `call' refer to any (recursive) instance of the function
(including the `base case' of the thm above).  More formally, if in the
definition above `or' is taken as set union, and undefined cases as the
empty set, then the range of the function f (when called in the specified
way) is a set of pairs, and a pair (p,q) is contained in this set iff pq=m
(the domains of all parameters are integers).  For purposes of explanation
let the term `return' informally denote this membership.
 
Main Thm:
--------
 
Then the thm becomes
 
f returns (p,q) iff pq=m for odd m when called in the above form.
 
That is, f returns the set of all factors of m.  The pf will be established
inductively.
 
Inductive Hypotheses:
--------------------
 
  (i) f is called for all uv=m (mod b) with
 (ii) u,v odd, b even, u<b, v<b,
(iii) n=(m-uv)/b, uv<=m, and
 (iv) all if n>=0.
 
Inductive Thm:
-------------
 
The inductive thm to be established here is:
 
If b>m, then f returns (u,v), otherwise f is called for all u'v'=m (mod b')
with u',v' odd, b' even, u'<b', v'<b', n'<(m-u'v')/b', and u'v'<=m.
 
It will be shown that there is an exact correlation between the inductive
thm, instances of the function, and ultimately the main thm above.  So if
the inductive thm holds and the base case satisfies the hypotheses then the
main thm would be established, because the thm either satisfies the main
assertion or `delegates' the proof to another case that does by preserving
all inductive conditions (other than n'>=0, but the function is undefined
when n'<0).
 
Hence, for the inductive hypotheses and thm, the variables can be taken for
those in any (recursive) instance of the function, where the `prime'
variables denote the new values of the recursive call.
 
Main Thm by Hypotheses:
----------------------
 
Proceeding, note that for all recursive calls in f, n'<n and b'>b, so
eventually b>m or n<0.  The case where n<0 is irrelevant to the question of
what f returns (since it is undefined, it `doesn't return anything').
Otherwise, b>m.
 
Since, by hypothesis,                     uv = m (mod b)
                                          (u mod b)(v mod b) mod b = m mod b
Also, since u<b and v<b it follows that   (uv) mod b = m mod b
Then with uv<=m<b,                        uv=m
Finally, with n=(m-uv)/b,                 n=0
 
That is, when n>=0, b>m iff n=0 iff uv=m.  this establishes the exact
correspondence of the inductive thm, instances of the function, and the main
thm above.
 
It remains to establish that the inductive conditions hold in the recursive
function calls.
 
Inductive Conditions:
--------------------
 
(ii)
----
 
For a convenient abbreviation, herein let the variable `z' designate `both u
and v' or `either u or v' as determined by context.  That is, whatever holds
for z holds for both u and v, or either u or v, as the case may be.
 
By hypothesis, z is odd, and all recursive calls have either z'=z or z'=z+b
where b is even.  Hence z' is always odd.  Also, in all cases b'=2b, so b'
is even (in fact it must be a power of two from the base case).  For the
inequalities, if z'=z then z'<b' because z<b<2b.  Otherwise z'=z+b and z'<b'
because z+b<2b where z<b by hypothesis.
 
(iii)
-----
 
The equality is established for all cases from the equation n=(m-uv)/b of
the hypotheses.
 
                 n-v   m-uv-vb
In one case n' = --- = ------- and u'=u+b,v'=v, so n'=(m-u'v')/b'.
                  2      2b
 
By symmetry this holds for the other case when n is odd.  When n is even,
in one case
 
     n   m-uv
n' = - = ---- with u'=u and v'=v, so again n'=(m-u'v')/b'.
     2    2b
 
In the final case where u'=u+b, v'=v+b,
 
                           2
     n-u-v-b   m-uv-ub-vb-b    m-(u+b)(v+b)   m-u'v'
n' = ------- = ------------- = ------------ = ------
        2           2b              2b          b'
 
Finally, u'v'<=m holds because with n'=(m-u'v')/b' in all cases,
m=b'n'+u'v', and u'v'<=b'n'+u'v' for all (relevant) n'>=0.
 
(iv)
----
 
As noted, since the function is undefined when n<0 the case is
inconsequential (it is automatically satisfied).
 
(i)
---
 
First the justification of the `all' qualifier will be postponed, so that
it remains to show that u'v'=m (mod b').  All cases are established with
m=nb+uv by hypothesis. When u'=u+b and v'=v,
 
 
      u'v' = m (mod b')
(u + b)(v) = (nb + uv) (mod 2b)
   uv + vb = nb + uv (mod 2b)
        vb = nb (mod 2b)
         v = n (mod 2)
 
which follows because v is odd by hypothesis and n is odd in this case.
(``The reversion to synthesis is easy.'' --Fermat)  By symmetry
u'v'=m (mod b') holds for the other case of odd n.  When u'=u and v'=v,
 
u'v'= m (mod b')
 uv = m (mod 2b)
 uv = nb + uv (mod 2b)
  0 = nb (mod 2b)
  0 = n (mod 2)
 
which follows because n is even in this case.  Finally, when u'=u+b,v'=v+b,
 
              u'v' = m (mod b')
    (u + b)(v + b) = nb + uv (mod 2b)
uv + ub + vb + b^2 = nb + uv (mod 2b)
     ub + vb + b^2 = nb (mod 2b)
    (u + v)b + b^2 = nb (mod 2b)
             u + v = n (mod 2)
 
which follows because u and v are odd by hypothesis and n is even in this
case.
 
Finally, the `all' qualifier must be justified to hold as an inductive
condition.  It asserts that every case that satisfies all conditions
(i)-(iv) is encompassed by the multiple recursive calls (as well as the base
case).  For this some additional notation will be introduced.  For any pq=m
let
 
    p - p mod b      q - q mod b
x = -----------, y = -----------
        b                 b
 
for each instance of the function (in every recursive call).  With u<b and
v<b, the substitutions xb+u=p, yb+v=q imply u = p mod b, v = q mod b, and
since n=(m-uv)/b, n can then be expressed as
                                                         2               2
m-(p mod b)(q mod b)   pq-(p-bx)(q-by)   pq-(pq-qbx-pby+b xy)   qbx+pby-b xy
-------------------- = --------------- = -------------------- = ------------
          b                   b                    b                  b
 
or n=qx+py-bxy.  From this the following is readily verified, given that
that b is even and p,q odd: x,y have the same or opposite parity according
as n is even or odd, respectively.  Hence when n is odd x and y have
opposite parity, so let x=2x'+1 and y=2y'.
 
        xb + u = p      yb + v = q
(2x' + 1)b + u = p    2y'b + v = q
  x'2b + b + u = p    y'2b + v = q
 
With these variables the considerations above establish that x'b'+u'=p and
y'b'+v'=q.  This requires that u'=u+b and v'=v, as is exactly the case in
the function itself when n is odd (symmetry gives the other assignment with
u'=u and v'=v+b).  Similar considerations show that when n is even, the
inductive hypotheses alone require that u'=u and v'=v or u'=u+b and v'=v+b,
again as in the function.  Thus the function is called recursively for all
permissable values in the prime variables (that is, all those that satisfy
the inductive conditions).
 
Base Case:
---------
 
Because the main thm either holds for the inductive hypothesis or preserves
the inductive conditions, it only remains to show that the base case
satisfies the conditions as well to prove the thm.  However, this is trivial
to verify for all conditions (i)-(iv).  Since the base case is the *only*
form that satisfies (i) for b=2, the function is called for *all* parameters
satisfying (ii)-(iv), thereby satisfying (i) and completing the proof of the
main thm.
 
* * *
 
It may be illuminating to consider the relation of the Lankinen function in
a `computational hierarchy' of other factoring functions.*  Assumptions are
made herein on the basis of conventional digital (binary) computers.  Also,
complexity orders are given for the worst case scenarios (when the number to
be factored is prime).  However, all algorithms would probably perform to
the same constant multiple of the given orders for complete composite
factorizations.
 
Thm: Eratosthenes' Sieve is very roughtly O(ln(n)/n) in time and
     O(n*log2(n)) in space.
Pf: It works with all prime factors less than n (about ln(n)/n by the prime
    number thm), requiring an array of size proportional to n with log2(n)
    space for each entry.
 
Thm: `Odd factors' is O((sqrt(n)/2)*log2(n)) in time and O(log2(n)) in
     space.
Pf: It tests all odd factors less than the square root of n (about
    sqrt(n)/2), with log2(n) time for each division.  It requires only
    log2(n) space for the number and divisors.
 
Thm: Lankinen's algorithm is O(sqrt(n)/2) in time and O((sqrt(n)/2)*log2(n))
     in space.
Pf: The algorithm is easily modified to seach only for factors p<q for all
    pq=m.  Then the recursive call tree forms a geometric progression
    starting at one, and doubling until reaching sqrt(n)/2, or a length of
    log2(sqrt(n)/2).  From the formula for a geometric progression, there is
    a total of about 2^log2(sqrt(n)/2) = sqrt(n)/2 calls.  Assuming that
    addition, subtraction, comparison, and multiplication/division by two
    occur in constant time, this implies O(sqrt(n)/2) time and a
    O((sqrt(n)/2)*log2(n)) requirement of stack space.
 
Hence, if a ranking of factoring algorithms is viewed as a continuum, with
Eratothenes' sieve and `odd factors' at the extremes (the former with
relatively large space requirements but minimal time demand, and vice-versa
with the latter), Lankinen's function comes somewhere in the middle.  In
regard to timing characteristics, there is probably some number N such that,
in factoring it, Lankinen's algorithm and `odd factors' have the same
execution time (the large time overhead associated with the stack
maintenance of the former would cause a delay in overtaking the latter).
However, because Lankinen's is faster by a factor of about O(log2(n)), for
numbers around the range of twice N the expectation is that Lankinen's
algorithm would be twice as fast, and for numbers around twice that
(quadruple N), the algorithm should be three times faster, and so on.  Of
course, space requirements are much more significant, but still less than
the extreme.
 
* The author admits to some loss of rigor and confidence here, but hopefully
  nonetheless within the bounds of error.  This section isn't intended to be
  authoritative so much as it is to spur interest and research.
 
P.S. Lankinen's function might be useful in attacking the numerous
     unresolved conjectures about primes (e.g. on the infinitude of `twins'
     or Goldbach's).
 
P.P.S. Email the author for C code that implements Lankinen's algorithm or
       information on its historical development.
 

ld231782@longs.LANCE.ColoState.EDU

ld231782@longs.LANCE.ColoState.EDU (Lawrence Detweiler) (03/18/91)

- - -

Curse the phantom USENET lineater! <13559@ccncsu.ColoState.EDU> should read:

]                   |  undefined if n<0,
]                   |  (u,v) if n=0,
] Let f(u,v,b,n) := | [otherwise]
]                   |  f(u+b,v,2b,(n-v)/2) or f(u,v+b,2b,(n-u)/2) if n odd
]                   |  f(u,v,2b,n/2) or f(u+b,v+b,2b,(n-u-v-b)/2) if n even
]
] Thm: f(1,1,2,(m-1)/2) = (p,q) iff pq=m for odd m.
] ---


ld231782@longs.LANCE.ColoState.EDU