bs@faron.UUCP (Robert D. Silverman) (11/16/84)
I have implemented both in C and in VAX assembler a set of multiple precision arithmetic routines which are quite fast. Using them I have also implemented numerous state-of the art factorization and prime testing routines including the Continued Fraction Algorithm, Pollard P-1 and P+1, Pollard Rho, and primality testing based upon Lucasian criteria. They are all quite fast and I believe that they run about as fast as possible on the VAX. The continued fraction program, for example, factors typical 40 digit numbers in about an hour, and 50 digits numbers in about 35 hours. The key problems involved in implementing multi-precision routines on the VAX is the high cost of subroutine calls and the relatively slow speed of the EMUL and EDIV instructions. A typical CALLS instruction on the VAX with 3 arguments takes about 15 usec to execute. Thus, calling a routine to (say) add 2 multi-precise numbers is quite expensive. The EMUL instruction takes slightly less than 7 usec to multiply 2 32 bit numbers giving a 64 bit product and EDIV takes about 12 usec to divide a 64 bit number by a 32 bit number. One cannot generate code to utilise EDIV and EMUL from C so the low level arithmetic primitives must either be done in assembler, or you restrict yourself to working in radix 2^15 so you can multiply 2 integers without overflow. I would be happy to privately respond to any queries. P.S. I am also currently engaged in implementing the new Cohen-Lenstra prime testing algorithm and would like to hear from anyone who is interested in it.
keith@reed.UUCP (Packard) (11/29/84)
I have a very good use for multi precision arithmetic. I have already written my own (not the best but functional) in C for the VAX as a part of a package that includes a pseudo machine that executes code generated by a compiler doing multi precision rational arithmetic. The compiler accepts a large subset of C as input. The pseudo machine is implemented as a stack machine with several built in functions: scanf, printf, gcd, ... As a bench mark, my pseudo machine computes the primes to 100 in 3.4 seconds (sorry it's so slow): /* * compute prime numbers using a silly algorithm */ main () { auto i,j,k, limit, c; printf ("limit: "); scanf ("%d", limit); c = 2; printf ("2\t3\t"); for (i = 5; i < limit; i = i + 2) { j = sqrt (i, 4); for (k = 3; k <= j; k += 2) { if (~ i % k) { goto bad; } } printf ("%d", i); if ((c += 1) == 8) { c = 0; printf ("\n"); } else printf ("\t"); bad: ; } if (c) printf ("\n"); } sqrt (n, m) { auto i, t, j; j = int (n / 2); /* note built in function int */ for (i = 0; i < m; i += 1) { if ((t = int (n / j)) == j) return j; j = int ((t + j) / 2); } return j; } but it computes the factorial of 100: 9332621544394415268169923885626670049071596826438162146859296389521759999322991 5608941463976156518286253697920827223758251185210916864000000000000000000000000 in only 0.3 seconds main () { /* * note storage class used instead of type, no types in * this language */ auto i; while (scanf ("%d", i) == 1) printf ("%d\n", i!); exit (0); /* actually returns 0 all the way back to the shell! */ } I have basically finished with this project but if I found better arithmetic routines (better than those found in Knuth) I would certainly like to include them. I am a compiler writer by nature, this complicated math stuff takes too much work to verify, out of ~5000 lines of code for this system, the 1500 lines of math routines took at least 50% of the time to get working. I am willing to distribute this package to anyone interested. Unfortunately, the math routines are rather VAX specific I think they will work on a 16032 or any other 32 bit machine with correct byte order but, I have not tested this (even though I have access to a 16032 machine running 4.2 unix, if you want to know if it works I could ethernet the sources down there and try it before sending the whole mess.) keith packard ...tektronix!reed!keith or ...tektronix!tekmdp!keithp - no sources here but a better connection
steven@mcvax.UUCP (Steven Pemberton) (12/03/84)
We are producing here a new programming language, B, that includes unbounded-length rational arithmetic. The implementation we have is an interpreter rather than a compiler, but despite that it can generate the primes to 100 in about the same time as that mentioned (actually slightly faster), though factorial 100 takes longer. In comparing it with bc on Unix, it compares well, often being very much faster. The whole system, including the arithmetic, is largely machine independent - we've ported it to dozens of machines - though it is Unix dependent. We're currently producing a non-Unix version for the IBM PC. For non-commercial use it only costs the price of the media - if you're interested mail me. One lovely example of the use of unbounded arithmetic is this following B program by Lambert Meertens for producing the decimal digits of pi (after printing 80 digits of pi, a, b, c, and d are all larger than 10**200). HOW'TO PI: WRITE '3.' PUT 3, 0, 40, 4, 24, 0, 1 IN k, a, b, c, d, e, f WHILE 1=1: PUT k**2, 2*k+1, k+1 IN p, q, k PUT b, p*a+q*b, d, p*c+q*d IN a, b, c, d PUT f, floor(b/d) IN e, f WHILE e=f: WRITE e<<1 PUT 10*(a-e*c), 10*(b-f*d) IN a, b PUT floor(a/c), floor(b/d) IN e, f:> Most of it should be obvious: PUT ... IN ... is the assignment command; WRITE e<<1 writes e in width 1. Rather than let the program run to completion :-), here is a little of its output: 3.14159265358979323846264338327950288419716939937510582097494459230781640628620899 Steven Pemberton, CWI, Amsterdam.
west@sphinx.UChicago.UUCP (Steve Westfall) (12/05/84)
I think that you should rename your new language something other than "B". I have never seen it, but a number of books mention an already- existing language called "B" that was a predecessor of C (see, for example, S.R. Bourne, The Unix System, page 74). -- Steve Westfall, Staff Analyst University of Chicago Computation Center uucp: ihnp4!gargoyle!sphinx!west Mailnet: staff.westfall@UChicago.Mailnet
keithd@cadovax.UUCP (Keith Doyle) (12/06/84)
Speaking of PI and multiple precision arithmetic, after once upon a time looking at the infinite series for computing PI, it seemed possible to write a routine that could run forever, computing digits for PI. No need for umpteen digit precision arithmetic, as it can be determined when a digit will no longer be affected by further computations, thus allowing the digit to be produced, and the 'remainder' of the PI computation to be scaled up in such a way as no precision is lost, and only a reasonable integer precision is necessary for any of the intermediate results. I've always wanted to do this, and produce a subroutine that would return the 'next' digit of PI, but never got around to it. Has anyone out there done this? Thought it would be fun to see how it performs as a random number generator, etc.. Keith Doyle {ucbvax,ihnp4,decvax}!trwrb!cadovax!keithd "You'll PAY to know what you REALLY think!" P.S. Who knows, maybe I'll finally get around to working this one out myself. Hardly high priority.
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (12/08/84)
> ... it seemed possible to write a routine that could run forever, > computing digits for PI. ... I once saw a MACRO-11 routine (quite small) for computing the digits of "e" indefinitely, using only a small amount of storage. I tried it and it worked just fine. Does anyone recall how this is done?
tstorm@vu44.UUCP (Theo van der Storm) (12/09/84)
Here is another (faster) algorithm to calculate pi. The algorithm that was posted by mcvax!steven seems to be working with unnecessary big numbers, although it is in some sense "incremental". This algorithm is not "incremental". ::::::::::::::::::::::::::: First a B implementation of the algorithm: ::::::::::::::::::::::::::: HOW'TO PI n: \ Print the first n decimals of pi. mcvax!vu44!tstorm WRITE 'Approximation of pi in `n` digits' / WRITE (arcam(48,18,n))+(arcam(32,57,n))+(arcam(-20,239,n)) / YIELD arcam(a, m, n): \ Calculate (10**n)*a*arctan(1/m), mcvax!vu44!tstorm PUT floor(((10**n)*a)/m), -m*m, 0, 1, 0 IN t, mc, s, kk, i WHILE t<>0: PUT s+floor(t/kk) IN s PUT floor(t/mc), kk+2, i+1 IN t, kk, i IF m=18: WRITE 'Error < `i+i` units of the last decimal' / RETURN s+ceiling(i/2) ::::::::::::::::::::::::::: And now a BC implementation Old BC versions use the =OP notation the version I'm working with uses OP= notation. Change line 5 if necessary. ::::::::::::::::::::::::::: define t(a, m, n){ /* calculate (10^n)*a*arctan(1/m), mcvax!vu44!tstorm */ t = ((10^n)*a)/m; c = -m*m; s = 0; k = 1 while (t != 0){ s += t/k; t /= c; k += 2 } return(s) } define p(n){ "Approximation of pi. Number of digits: "; n scale = 0; return(t(48, 18, n)+t(32, 57, n)+t(-20, 239, n)) } ::::::::::::::::::::::::::: -- Theo van der Storm, 52 20'N / 4 52'E, {seismo|decvax}!mcvax!vu44!tstorm
jfh@browngr.UUCP (John "Spike" Hughes) (12/10/84)
Computing 'e', using finite precision: I had this as an assignment in a programming course at Princeton in 1975 (although I never did do the asst.) The idea (at least in part) was to use variable radix expansions, i.e. to express a number not in 10th's, hundreth's, etc., but instead as some number of 1/(2!)'s, some number of 1/(3!)'s, etc. In this form, e = 1.1111111111111... etc. I can't remember what one did from there...
cooper@pbsvax.DEC (Topher Cooper HLO2-3/M08 DTN225-5819) (12/12/84)
lines: 45 > ... it seemed possible to write a routine that could run forever computing > digits for PI. No need for umpteen digit precision arithmetic. Sounds neat, but I'm afraid its not possible; at least not unless PI is rational. I seem to remember that the non-rationality of PI has never been proven, but the smart money is on that side. Let's start by assuming an iterative (non-recursive) algorithm which cranks along with a fixed amount of space, emitting digits of PI every once in a while. The state at the end of the code being iterated over, and what is emitted during its execution, is completely determined by the state at the start of the code's execution. Since there is a fixed amount of storage (say N bits) there is a fixed number of states which it could be in. After at most 2^N iterations a state would be repeated and the a repeating pattern would start to be emitted. Of course, very good approximations can be done in principal with very little storage. Conceivably, a single 16 bit state variable could result in 64Kbits (roughly 20,000 digits). The same argument applies to recursive algorithms, but in this case the extra state may be hidden in the stack. An exact algorithm which didn't start with infinite storage would have to grow (perhaps slowly). This is usually done with dynamic alocation of multiple precision integers, floating-points, or rationals; but could be done in other ways. > I once saw a MACRO-11 routine (quite small) for computing the digits > of "e" indefinitely, using only a small amount of storage. The same argument applies. Either the routine was an approximation to some large number of digits, or it used a trick like a slowly increasing recursive depth to approximate infinite storage. I think any elegant algorithms along these lines would be of interest. Topher USENET: ...decvax!decwrl!dec-rhea!dec-pbsvax!cooper ARPA: cooper%pbsvax.DEC@decwrl.ARPA