[comp.graphics] Mandelbrot algorithm summary

cbuckley@vax1.tcd.ie (Colm Buckley) (02/05/90)

Thanks to everybody who replied to me concerning my earlier query (in
comp.lang.c) about Mandelbrot sets - here is a summary of the suggestions I
received.

-------------------------------------------------------------------------------

The main thrust of the replies I received concerning the acceleration of
mand() was that, since the only interesting part of the complex plane lies
inside a disc of radius 2 centered on the origin (0+0i), the use of full-blown
floating-point variables is very wasteful of processor time.
     
Of course, on machines with an FP coprocessor, the extra overhead involved in
using "double" variables in the function is very small, because floating-point
arithmetic is performed by the copro (e.g. an 80387) almost as fast as integer
arithmetic is performed by the main processor itself. However, many machines,
particularly PCs, do not have FP copros, so the best way to improve the speed
of iterative functions such as mand(), which involve many thousands of
arithmetic operations, is to switch to fixed-point arithmetic. Acting on this
suggestion, I implemented a fixed-point version of mand(), using 32-bit
"long"s to store the variables.
     
The numbers are stored multiplied by 2**28, so, for example, 1.0 becomes
268435456L, 0.5 becomes 134217728L, and so on. Using this convention, numbers
from -8 to 8 can be represented, to an accuracy of (1/(2**28)), or about
3.725E-9. While this is accurate enough for fairly low-resolution or
low-magnification pictures, it is not sufficient if you want to go really
"deep" into the set. Unfortunately, most versions of C do not provide any
integer type with greater precision than "long", so any further advances
require an assembler program (don't rush me, I'm working on it!)
     
Addition and subtraction can be performed as usual, using ordinary integer +
and - (  (x * 2**28) + (y * 2**28) = (x+y) * 2**28  ). However, multiplication
poses a problem. When two numbers stored in this fashion are multiplied
together, the result will be 2**28 times too big ( (x * 2**28) * (y * 2**28) =
(x * y) * 2**56, not (x * y) * 2**28 ). Therefore, when multiplication is
necessary, the multiplicand and multiplier must both first be divided by 2**14.
This is easily and quickly accomplished by right-shifting x and y by 14 bits
( x >> 14 = x / (2**14) ).
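
Just to make the convention concrete, the two operations can be wrapped up
like this (illustration only - these little helpers are NOT used in fmand()
below, and the names are just placeholders) :

/* 2**28-scaled fixed-point helpers.  fix_mul() throws away the low 14  */
/* bits of each operand, as described above; see the note on            */
/* sign-extension after fmand() for why ">> 14" needs care with         */
/* negative values.                                                      */
long fix_add(long x, long y)
{
        return x + y;                   /* scale is unchanged by + and - */
}

long fix_mul(long x, long y)
{
        return (x >> 14) * (y >> 14);   /* 2**14 * 2**14 = 2**28 */
}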
     
The function fmand() given below is an implementation of mand() using this
convention. As mentioned before, it is not as accurate as the full version
using "double"s, but it is between 5 and 10 times faster, depending on the
details of the machine architecture.
     
#define fixp(x) (long)((x) * (1L << 28))
#define abs(x) ((x) < 0 ? -(x) : (x))
#define FTWO  536870912L                /* 2.0 in 2**28-scaled form */
#define FFOUR 1073741824L               /* 4.0 in 2**28-scaled form */
     
int fmand(long x, long y) {
        int i = 255;
        long a = 0L, b = 0L, m = 0L, n = 0L;
     
        if (abs(x) >= FTWO || abs(y) >= FTWO) return 0;
        do {
                /* divide a and b so multiplication is possible */
                m = (a >> 14) | ((a & -2147483648L) ? -262144L : 0L);
                n = (b >> 14) | ((b & -2147483648L) ? -262144L : 0L);
     
                /* b = 2 * a * b + y */
                b = (m * n << 1) + y;
     
                m *= m;
                n *= n;
                a = m - n + x;
                if (m >= FFOUR - n || m + n < 0L ||
                    abs(b) >= FTWO || abs(a) >= FTWO) break;
        } while (--i);
        return 255 - i;
}
     
Note the peculiar-looking construction after the "m = (a >> 14)". This is a
bit-mask which handles negative numbers. Dividing a signed integer by a power
of two is not simply a matter of bit-shifting; the number must be sign-extended
as well (the leftmost bit must be copied into the bit positions left empty by
the shift), and C does not guarantee that ">>" does this for signed values.
For example, -1.0 is stored as -268435456L, which is
11110000000000000000000000000000 in binary. When this is divided by 2**14 we
should get -16384L, which is 11111111111111111100000000000000 in binary.
However, on a machine whose ">>" shifts in zeroes, (-268435456L >> 14) is
00000000000000111100000000000000 - the first 14 bits are 0 instead of 1.
Therefore, if the original number is negative (the leftmost bit is 1 - this is
the (a & -2147483648L) test), the number is logically ORed with -262144L, which
is 11111111111111000000000000000000 - i.e. the first 14 bits are set to 1.
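
If you find the bit-mask hard to read, the same sign-extending shift can be
written as a little function (this is just a restatement of the line in
fmand() - it tests "v < 0" rather than ANDing with the sign bit, but the
effect is the same) :

/* Divide a 2**28-scaled value by 2**14, forcing the top 14 bits to 1   */
/* when the value is negative, so the result is correct even on         */
/* machines whose ">>" shifts in zeroes.                                 */
long shift14(long v)
{
        long r = v >> 14;
        if (v < 0L)
                r |= -262144L;      /* 11111111111111000000000000000000 */
        return r;
}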
     
Note that this routine takes "long" parameters, not "double"s - therefore the
values must be converted BEFORE the routine is called. The macro fixp(x)
converts the number (x) (of any numeric type) into the fixed-point
representation. An example call to fmand() would be
     
        double x = 0.4, y = -1.02;
        int i;
     
        .
        .
        .
     
        i = fmand(fixp(x), fixp(y));
     
This type of optimisation seems to be about as far as one can go with the
actual mand() function itself, but several people emailed me with algorithms to
reduce the number of times that the function has to be called. One of the most
interesting came from Alastair Sutherland at SCO in California. He suggests
looking for a rectangle whose edge pixels are all the same colour, and filling
it in, on the assumption that, if the edge is solid, then the interior is too.
See the "diagram" below :
     
                *********       *********
                *       *       *********
                *       *  ->   *********
                *       *       *********
                *********       *********
     
As yet, I don't have a source-level algorithm for this, as Alastair's machine
is temporarily down - perhaps I'll have it in a few weeks; if you mail me then,
I might be able to send it along.
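
In the meantime, here is a rough sketch of how such a scheme might look.
This is NOT Alastair's code - I'm only guessing at the structure, and the
helpers colour_at(), plot() and fill_rect() are placeholders for whatever
your own plotting code provides (in a real program you would also cache the
edge colours rather than recomputing them).

extern int  colour_at(int x, int y);   /* colour of one pixel (calls fmand) */
extern void plot(int x, int y, int c);
extern void fill_rect(int x0, int y0, int x1, int y1, int c);

/* If every pixel on the edge of the rectangle is the same colour, fill */
/* the interior with that colour; otherwise split into quarters and     */
/* recurse, computing small rectangles point by point.                  */
void mand_rect(int x0, int y0, int x1, int y1)
{
        int x, y, solid = 1;
        int c = colour_at(x0, y0);

        for (x = x0; x <= x1 && solid; x++)
                if (colour_at(x, y0) != c || colour_at(x, y1) != c)
                        solid = 0;
        for (y = y0; y <= y1 && solid; y++)
                if (colour_at(x0, y) != c || colour_at(x1, y) != c)
                        solid = 0;

        if (solid) {
                fill_rect(x0, y0, x1, y1, c);
        } else if (x1 - x0 < 4 || y1 - y0 < 4) {
                for (y = y0; y <= y1; y++)           /* too small to split */
                        for (x = x0; x <= x1; x++)
                                plot(x, y, colour_at(x, y));
        } else {                                     /* split into quarters */
                int xm = (x0 + x1) / 2, ym = (y0 + y1) / 2;
                mand_rect(x0, y0, xm, ym);
                mand_rect(xm + 1, y0, x1, ym);
                mand_rect(x0, ym + 1, xm, y1);
                mand_rect(xm + 1, ym + 1, x1, y1);
        }
}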
     
Anyway, thanks for the interest in mand() - if you have any better ideas, let
me know!
     
                        C o l m.
     
+-----------------------------------------------------------------+
| Colm Buckley. (Computer Science, Trinity College, Dublin)       |
|                                                                 |
| EMAIL      : cbuckley@vax1.tcd.ie                               |
| Phone      : Dublin 770101                                      |
+---+---------------------------------------------------------+---+
    |           "I'm a lean, mean computing machine!"         |
    +---------------------------------------------------------+

hascall@cs.iastate.edu (John Hascall) (02/07/90)

In article <5480@vax1.tcd.ie> cbuckley@vax1.tcd.ie (Colm Buckley) writes:
      
    [using 32bit fixed point for mandelbrot computations]

}The numbers are stored multiplied by 2**28, ...
 
    [addition and subtraction work ok, but multiply...]

}poses a problem. When two numbers stored in this fashion are multiplied
}together, the result will be 2**28 times too big ( (x * 2**28) * (y * 2**28) =
}(x * y * 2**56), not (x * y * 2**28). Therefore, when multiplication is
}necessary, the multiplicand and multiplier must both be divided by (2**14).
}This is easily and quickly accomplished by right-shifting x and y by 14 bits
}(x >> 14 = x / (2**14) ).
      
     Umm, when you right shift aren't you losing a great deal of
     your precision?  Or am I missing something here?

John Hascall

doug@xdos.UUCP (Doug Merritt) (02/08/90)

In article <5480@vax1.tcd.ie> cbuckley@vax1.tcd.ie (Colm Buckley) writes:
>Unfortunately, there does not exist in most versions of C any integer type with
>greater precision than "long", so any further advances require an assembler
>program (don't rush me, I'm working on it!)

What? Not at all. You can do extended precision in C. You can't do
a 32x32 -> 64 bit multiply in a single step, which *some* (but not all)
machines allow in assembler. So, granted, there may be good reason to use
assembler for speeding up such operations. But that's really no different
than any other algorithm. As usual the smart approach is to get it
working in C first, and then (if desired) recode that inner loop in
assembler to get some extra speed. But don't go off thinking it's *necessary*.
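
For instance, here's one (untested, and by no means optimal) way to sketch a
32x32 fixed-point multiply for the 2**28 representation in portable C: split
each operand into 16-bit halves so that no single multiply overflows a 32-bit
long.  It costs three multiplies instead of one, but it only loses a couple of
low-order bits instead of 14 bits of each operand:

/* Multiply two 2**28-scaled longs without pre-shifting away 14 bits   */
/* of each operand.  Assumes |x| and |y| correspond to values below    */
/* 2.0, which is true inside the iteration loop.                       */
long fixmul(long x, long y)
{
        int neg = 0;
        unsigned long xh, xl, yh, yl, result;

        if (x < 0L) { x = -x; neg = !neg; }
        if (y < 0L) { y = -y; neg = !neg; }

        xh = (unsigned long)x >> 16;  xl = (unsigned long)x & 0xFFFFL;
        yh = (unsigned long)y >> 16;  yl = (unsigned long)y & 0xFFFFL;

        /* full product = xh*yh*2**32 + (xh*yl + xl*yh)*2**16 + xl*yl, */
        /* and we want that product divided by 2**28                   */
        result  = (xh * yh) << 4;               /* 2**32 / 2**28 */
        result += (xh * yl + xl * yh) >> 12;    /* 2**16 / 2**28 */
        result += (xl * yl) >> 28;

        return neg ? -(long)result : (long)result;
}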

>number of times that the function has to be called. One of the most interesting
>came from Alastair Sutherland in SCO in California. He suggests looking for a
>rectangle with all edge pixels the same colour, and filling it in, on the
>assumption that, if the edge is solid, then the interior is too. See the

Although this is a useful heuristic, note that it can fail. Some regions
of the set will have a "tendril" that sneaks between two pixels and then
expands to visible size inside the region you've filled in. You should be
able to see this effect by trying the algorithm in one of the areas
that contains a (pseudo-) duplicate of the entire set. These little M's are
generally connected to the main body by extremely thin tendrils, so it should
be easy to get a tendril to fall between two pixels.

For this reason, it's helpful to enhance the basic idea by adding a little
double checking. Always probe some number of random points inside the
rectangle; if one of them is a different color than you'd expect from
the heuristic, do the interior "the hard way".
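
Something along these lines would do (sketch only; the probe count is
arbitrary, and colour_at() is assumed to compute, or look up, the colour of
one pixel):

#include <stdlib.h>

extern int colour_at(int x, int y);

/* Probe a few random interior points of a solid-edged rectangle; if   */
/* any of them disagrees with the edge colour, report failure so the   */
/* caller computes the interior the hard way.  Assumes the rectangle   */
/* is at least 3 pixels wide and 3 pixels tall.                        */
int rect_really_solid(int x0, int y0, int x1, int y1, int edge_colour)
{
        int i;

        for (i = 0; i < 10; i++) {
                int x = x0 + 1 + rand() % (x1 - x0 - 1);
                int y = y0 + 1 + rand() % (y1 - y0 - 1);
                if (colour_at(x, y) != edge_colour)
                        return 0;       /* a tendril sneaked through */
        }
        return 1;
}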
	Doug
-- 
Doug Merritt		{pyramid,apple}!xdos!doug
Member, Crusaders for a Better Tomorrow		Professional Wildeyed Visionary

rspangle@jarthur.Claremont.EDU (Randy Spangler) (02/08/90)

>In article <5480@vax1.tcd.ie> cbuckley@vax1.tcd.ie (Colm Buckley) writes:
>      
>    [using 32bit fixed point for mandelbrot computations]
>
>}The numbers are stored multiplied by 2**28, ...
> 
>    [addition and subtraction work ok, but muliply...]
>
>}poses a problem. When two numbers stored in this fashion are multiplied
>}together, the result will be 2**28 times too big ( (x * 2**28) * (y * 2**28) =
>}(x * y * 2**56), not (x * y * 2**28). Therefore, when multiplication is
>}necessary, the multiplicand and multiplier must both be divided by (2**14).
>}This is easily and quickly accomplished by right-shifting x and y by 14 bits
>}(x >> 14 = x / (2**14) ).

Well, those of us with 80386's can multiply two 32-bit numbers and get a
64-bit number in EDX:EAX (high-order 32 bits in EDX, low-order 32 bits in
EAX) with the following code:

	MOV	eax,number1
	IMUL	number2

where number1 and number2 are 32-bit memory operands.  I don't know if the
68000 chips have similar instructions, but I would assume so.

The next suggestion is pretty system-specific (PCs with an 80x87 only).  I
have a program I wrote in assembly to plot Mandelbrot sets using the floating
point chip.  I optimized it so that all the calculations are done in the
chip's registers - no conversion to regular memory operands.  It's slower than
32-bit integer math, but more accurate than any other program I've seen for
the IBM at high resolution, since the numbers never leave the 10-byte extended
format that the 80x87 uses internally (as opposed to the 8-byte "double"
format used for memory operands).

Since the code only applies to a small segment of the readers, I'll email
copies to interested people, unless I get so many responses that I think it's
worth posting.


-- 
 --------------------------------------------------------------------------
|    Randy Spangler                    |    The less things change, the    |
|    rspangle@jarthur.claremont.edu    |    more they remain the same      |
 --------------------------------------------------------------------------

mclek@dcatla.UUCP (Larry E. Kollar) (02/08/90)

In article <5480@vax1.tcd.ie> cbuckley@vax1.tcd.ie (Colm Buckley) writes:
     
> One of the most interesting [ optimizations ]
> suggests looking for a
>rectangle with all edge pixels the same colour, and filling it in, on the
>assumption that, if the edge is solid, then the interior is too.

One of the newest Amiga Mandelbrot generators goes beyond that.  It traces
around any "black" region it finds, fills it in, and moves on.  The program
doesn't bother tracing other colors, on the assumption that you wouldn't save
enough time doing other regions to make that worthwhile.

The normal unzoomed view takes only a few minutes to draw on a stock
68000-based Amiga using a 16-color screen.  Best of all, the source (JForth)
is available.  Amigans can find this on Fish disk 2?? (the one with the
JGoodies programs -- #237 seems to come to mind, although I think that's
too early), or the JGoodies #1 disk from Delta Research.

Send email if you want more info.  I'd rather not email 3000 copies of the
program, but I'd consider snail-mail if you send a disk and return postage.
-- 
Larry Kollar	UUCP ...!gatech!dcatla!mclek		VoiceMail:One  2670
Georgia 400:  the Southeast's premier 48-mile 4-lane dragstrip.