bvs@light.uucp (Bakul Shah) (11/01/90)
In article <90302.140837CXT105@psuvm.psu.edu> CXT105@psuvm.psu.edu (Christopher Tate) writes:
> ...
>I will mention again, though, that this code contains a very well-behaved
>implementation of Euclid's classic algorithm for calculating the Greatest
>Common Divisor of two integers.  Anyone who's looking for such a beast,
>feel free to borrow it!

A couple of comments on the GCD algorithm.

- To justify posting this to a PostScript group, here is a faster
  version of basically the same algorithm:

	/gcd {			% given m, n on the stack
	    dup 3 1 roll mod	% leaves n, m mod n on the stack
	    dup 0 eq {
		pop		% leaves n on the stack when done
	    } {
		gcd		% find gcd of the new pair
	    } ifelse
	} bind def

  Not too surprisingly, this recursive version is faster than the one
  below, which uses an explicit loop:

	/gcd {
	    { dup 3 1 roll mod dup 0 eq { pop exit } if } loop
	} bind def

- I discovered a curious fact when I looked through a bunch of math/CS
  books for an exact description of Euclid's algorithm.  The three math
  books I looked at describe the _faster_ algorithm (the one used in the
  spirograph program) as Euclid's algorithm, while the three CS books
  describe the following _slower_ one as Euclid's:

	/* using guarded commands... */
	loop {
	    a > b  =>  a -= b;
	    b > a  =>  b -= a;
	}

  The loop terminates when neither guard is true, i.e. when a == b.
  This extends in a nice (but slow) way to the GCD of N numbers.  As an
  example, gcd(a, b, c, ..., n) is

	loop {
	    a > b  =>  a -= b;
	    b > c  =>  b -= c;
	    c > d  =>  c -= d;
	    ...
	    n > a  =>  n -= a;
	}

So....  Who is right?  How did two different algorithms come to be known
as Euclid's?  Does the faster algorithm also have such a nice
generalization for the GCD of N numbers?  The slower algorithm can be
modified slightly to simultaneously compute the LCM.  Can the same thing
be done with the faster one?  Time to visit a library!
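The N-number version of the slow algorithm is easy enough to try
directly in PostScript.  Here is a minimal sketch (gcdn and its
array-on-the-stack calling convention are my own invention, nothing
standard): it keeps subtracting around the cycle of guards until no
guard fires, at which point every element equals the GCD.

	/gcdn {				% array of positive ints -> gcd
	    /a exch def
	    /n a length def
	    {
		/changed false def
		0 1 n 1 sub {		% scan the guards a[i] > a[(i+1) mod n]
		    /i exch def
		    /j i 1 add n mod def
		    a i get a j get gt {
			a i a i get a j get sub put	% a[i] -= a[j]
			/changed true def
		    } if
		} for
		changed not { exit } if	% no guard fired: all elements equal
	    } loop
	    a 0 get
	} def

	[ 12 18 30 ] gcdn ==		% prints 6

Each subtraction leaves the GCD of the whole set unchanged, and the sum
of the elements strictly decreases while all stay positive, so the loop
must stop with every element equal to the GCD.

--
Bakul Shah	bvs@BitBlocks.COM
...!{ames,decwrl,sun,uunet}!amdcad!light!bvs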
CXT105@psuvm.psu.edu (Christopher Tate) (11/01/90)
In article <1990Oct31.182705.3184@light.uucp>, bvs@light.uucp (Bakul Shah) says:

>- To justify posting this to a PostScript group, here is a faster
>  version of basically the same algorithm:
>
>	/gcd {			% given m, n on the stack
>	    dup 3 1 roll mod	% leaves n, m mod n on the stack
>	    dup 0 eq {
>		pop		% leaves n on the stack when done
>	    } {
>		gcd		% find gcd of the new pair
>	    } ifelse
>	} bind def
>
>  Not too surprisingly, this recursive version is faster than
>  the one below which uses an explicit loop:
>
>	/gcd {
>	    { dup 3 1 roll mod dup 0 eq { pop exit } if } loop
>	} bind def
>
>  [stuff deleted]
>
>  So....  Who is right?  How did two different algorithms come to be
>  known as Euclid's?  Does the faster algorithm also have such a nice
>  generalization for the GCD of N numbers?  The slower algorithm can be
>  modified slightly to simultaneously compute the LCM.  Can the same
>  thing be done with the faster one?  Time to visit a library!

As it happens, I'm taking a computational number theory course even as
we speak :-), so here's what my text has to say on the subject of
Euclid's GCD algorithm....

Essentially, the whole thing works because of this odd property of the
GCD; namely, that gcd(a, b) = m*a + n*b for some integers m and n (not
necessarily unique).  From this, we can do some fiddling with the
properties of division and so on to show that gcd(a, b) = gcd(b, a - m*b)
for ANY integer m.  In Euclid's algorithm, we essentially take m as
large as possible while a - m*b stays nonnegative, i.e. a - m*b is as
close to zero as possible.  Since we can write r = a mod b = a - m*b
(for that choice of m), we simply use a mod b for the next iteration,
and try to calculate gcd(b, a mod b).

I once ran across a version of this algorithm (in C) which looked like
this:

	int gcd(int a, int b)
	{
	    if (a > b)
		return gcd(a - b, b);
	    else if (a < b)
		return gcd(a, b - a);
	    else
		return a;
	}

Essentially, this takes the smallest possible steps down (using a - b
instead of a mod b), but avoids any division.  Still used an awful lot
of stack space sometimes, though....

I'm very surprised that the recursive version is faster than the
iterative one; I suspect it has to do with the way recursion is
implemented in PostScript.  Is it still faster if you don't bind the
gcd procedures?  How about if you give it two large, almost-equal
primes?  In most programming languages (i.e. native code compilation,
etc.), given an iterative algorithm and an equivalent recursive
algorithm, the iterative algorithm will be faster, if only because it
won't have to go through all the overhead of the recursive function
calls....

There's a version of this algorithm known as the binary GCD algorithm,
which works on the binary representation of the numbers (indirectly)
instead of on the numbers themselves.  This allows it to take advantage
of the fact that just finding a modulus is computationally icky (as it
requires division), while dividing an integer by two is what computers
do best.  It's not usually faster than the above, though, unless
implemented directly in machine code.

And, for everyone who's been following this, the version of the GCD
algorithm shown above IS much faster than mine -- use it instead!  Mine
was a quick-and-dirty rewrite of a Rexx version of the algorithm, and
so was not at all suited to the real stack-handling capabilities of
PostScript.
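Out of curiosity, the binary GCD goes into PostScript readily enough,
since the language has bitshift, and, and or; here is a minimal sketch
(the name bgcd is made up, and, being interpreted, this is exactly the
sort of version that will NOT beat the mod-based one):

	/bgcd {				% a b -> gcd  (assumes a, b >= 0)
	    /b exch def  /a exch def
	    a 0 eq { b } {
		b 0 eq { a } {
		    /shift 0 def
		    {			% strip the common factors of two
			a b or 1 and 0 ne { exit } if
			/a a -1 bitshift def
			/b b -1 bitshift def
			/shift shift 1 add def
		    } loop
		    { a 1 and 0 ne { exit } if /a a -1 bitshift def } loop
		    {			% invariant: a is odd
			b 0 eq { exit } if
			{ b 1 and 0 ne { exit } if /b b -1 bitshift def } loop
			a b gt { /t a def /a b def /b t def } if
			/b b a sub def	% odd - odd = even
		    } loop
		    a shift bitshift	% put the common twos back
		} ifelse
	    } ifelse
	} def

	123456 23456 bgcd ==		% prints 32

-------
Christopher Tate                  |  "Pardon me, but is this your bar
                                  |   of soap?"
cxt105@psuvm.psu.edu              |  "Why, yes, I suppose it is...."
{...}!psuvax1!psuvm.bitnet!cxt105 |
cxt105@psuvm.bitnet               |  "So do we."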
glenn@heaven.woodside.ca.us (Glenn Reid) (11/02/90)
In article <90305.105824CXT105@psuvm.psu.edu> CXT105@psuvm.psu.edu (Christopher Tate) writes:
>In article <1990Oct31.182705.3184@light.uucp>, bvs@light.uucp (Bakul Shah) says:
>
>>- To justify posting this to a PostScript group, here is a faster
>>  version of basically the same algorithm:
>>
>>	/gcd {			% given m, n on the stack
>>	    dup 3 1 roll mod	% leaves n, m mod n on the stack
>>	    dup 0 eq {
>>		pop		% leaves n on the stack when done
>>	    } {
>>		gcd		% find gcd of the new pair
>>	    } ifelse
>>	} bind def

[ stuff deleted ]

>And, for everyone who's been following this, the version of the GCD
>algorithm shown above IS much faster than mine -- use it instead!  Mine
>was a quick-and-dirty rewrite of a Rexx version of the algorithm, and
>so was not at all suited to the real stack-handling capabilities of
>PostScript.

I put in the newer version of "gcd" and it went a bit faster.  Even more
dramatic was to bind the for loop at the end, like this:

    0 3 limit {
	dup frac mul /phi exch def		% calculate phi
	/theta exch def				% recalculate theta
	theta limit div 1.0 1.0 sethsbcolor	% go through all colors
	theta cos diff mul phi cos a mul add	% this is x
	theta sin diff mul phi sin a mul add	% this is y
	2 copy lineto stroke moveto	% same as x y lineto stroke x y moveto
    } bind for
      ^^^^ (add "bind" here)
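The effect of bind is easy to measure by itself.  A minimal sketch,
assuming your interpreter provides usertime (the procedure names
unbound and bound are made up for the test):

    /unbound { 100 200 add pop } def
    /bound   { 100 200 add pop } bind def

    usertime 10000 { unbound } repeat usertime exch sub ==  % add, pop looked up per call
    usertime 10000 { bound   } repeat usertime exch sub ==  % add, pop bound at def time

The bound version wins because bind replaces the operator names in the
procedure body with the operators themselves at definition time, so
every later execution skips those dictionary lookups.

(Glenn) cvn pop
--
 Glenn Reid				RightBrain Software
 glenn@heaven.woodside.ca.us		PostScript/NeXT developers
 ..{adobe,next}!heaven!glenn		415-851-1785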
bvs@light.uucp (Bakul Shah) (11/04/90)
In an earlier article I gave two versions of a GCD algorithm and claimed
the recursive version was faster.  In response Christopher Tate writes:

> I'm very surprised that the recursive version is faster than the iterative
> one; I suspect it has to do with the way recursion is implemented in
> PostScript.  Is it still faster if you don't bind the gcd procedures?  How
> about if you give it two large, almost-equal primes?  In most programming
> languages (i.e. native code compilation, etc.), given an iterative algorithm
> and an equivalent recursive algorithm, the iterative algorithm will be faster
> if only because it won't have to go through all the overhead of the recursive
> function calls....

In this case it seems the overhead to set up a loop and take it down
when done is higher than just calling a procedure.  Anyway, after some
more timing tests I find that which version is faster depends on the
input numbers, on binding, and on the interpreter.  See the test results
below.  One surprise was that ghostscript 2.0 on a Sun 3/50 was about
twice as fast as the QMS PS410, even though the 3/50 cpu is slower and
spends some fraction of its time refreshing the screen.

--
Bakul Shah	bvs@BitBlocks.COM
..!{ames,att,decwrl,pyramid,sun,uunet}!amdcad!light!bvs

The test procedures:

% {recursive, iterative}{bind def, def} versions of Euclid's gcd alg.
/gcd1 { dup 3 1 roll mod dup 0 eq { pop } { gcd1 } ifelse } def
/gcd2 { dup 3 1 roll mod dup 0 eq { pop } { gcd2 } ifelse } bind def
/gcd3 { { dup 3 1 roll mod dup 0 eq { pop exit } if } loop } def
/gcd4 { { dup 3 1 roll mod dup 0 eq { pop exit } if } loop } bind def

% time to exec a procedure
/time { /t0 usertime def exec usertime t0 sub } bind def

/test {
    /ncycles exch def
    /num1 exch def
    /num0 exch def
    % compute overhead time for running the test
    /overhead { ncycles { num0 num1 pop pop } repeat } time def
    % collect test times in an array
    [
	[ /gcd1 load /gcd2 load /gcd3 load /gcd4 load ]
	% run each procedure ncycles times and compute run time.
	{ /gcd exch def { ncycles { num0 num1 gcd pop } repeat } time }
	forall
    ]
    % subtract overhead from test times and print
    { overhead sub == ( ) print } forall
} bind def

Some sample tests:

31415 31415 1000 test
111111 11111 1000 test
11111 111111 1000 test
123456 23456 1000 test

Results (times in milliseconds; gs and RALpage run on a Sun 3/50):

		gcd1	gcd2	gcd3	gcd4
ghostscript:
		 340	 440	 480	 600	31415 31415 1000 test
		 660	 860	 820	1020	111111 11111 1000 test
		1020	1280	1200	1480	11111 111111 1000 test
		2300	2940	2600	3240	123456 23456 1000 test

QMS PS410:
		 736	 860	 900	1064	31415 31415 1000 test
		1472	1708	1564	1840	111111 11111 1000 test
		2204	2548	2224	2604	11111 111111 1000 test
		5132	5908	4872	5684	123456 23456 1000 test

RALpage:
		2450	3484	3000	4351	31415 31415 1000 test
		4934	6867	5267	7518	111111 11111 1000 test
		7433	10267	7516	10666	11111 111111 1000 test
		17634	23701	16534	23351	123456 23456 1000 test

BTW, the C version is about 50 times faster than the fastest PostScript
version under ghostscript.
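A quick sanity check before timing, in case of typos: all four
procedures should print the same value for the last sample pair,
since gcd(123456, 23456) = 32.

123456 23456 gcd1 ==	% prints 32
123456 23456 gcd2 ==	% prints 32
123456 23456 gcd3 ==	% prints 32
123456 23456 gcd4 ==	% prints 32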