[comp.lang.c] random

jrv@sdimax2.mitre.org (VanZandt) (08/30/90)

random() in Turbo C is supposed to return a random integer in the range
0...n-1.  Actually, it will occasionally (once in 32768 calls) return n.

random() is implemented by a macro

     #define random(n) ((int)((long)rand()*(n))/RAND_MAX)

However, when rand() returns RAND_MAX (which is 32767), this evaluates 
to n.

To correct this, increment RAND_MAX (noting that it must thereby become
a long):

     #define random(n) ((int)((long)rand()*(n))/((long)RAND_MAX+1))
     
                   - Jim Van Zandt (jrv@mbunix.mitre.org)

p.s.  Note that the common device of using rand()%n is guaranteed to
yield a number in the range 0...n-1, but the probabilities are not uniform.
For n=30000, the probability of getting 9 is twice the probability
of getting 29000.

karl@haddock.ima.isc.com (Karl Heuer) (09/01/90)

In article <118662@linus.mitre.org> jrv@sdimax2.mitre.org (VanZandt) writes:
>     #define random(n) ((int)((long)rand()*(n))/((long)RAND_MAX+1))
>p.s.  Note that the common device of using rand()%n is guaranteed to
>yield a number in the range 0...n-1, but the probabilities are not uniform.

This is equally true of your version (proof: there are RAND_MAX+1 equally
likely outcomes but n possible results); only the particular "hot" values have
been redistributed.  If it's important, here is a correct algorithm.
	#include <stdlib.h>
	int random(int n) {
	    register int r;
	    do r = rand() / ((RAND_MAX + 1UL) / n); while (r >= n);
	    return (r);
	}

Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint

njk@diku.dk (Niels J|rgen Kruse) (09/09/90)

karl@haddock.ima.isc.com (Karl Heuer) writes:
> If it's important, here is a correct algorithm.
>       #include <stdlib.h>
>       int random(int n) {
>           register int r;
>           do r = rand() / ((RAND_MAX + 1UL) / n); while (r >= n);
>           return (r);
>       }

This require 2 divisions however.  Here is an algorithm, that
only require one division:

        #include <stdlib.h>
        int random (int n) {
          register int res, sor;

          do {
            res = (sor = rand()) % n;
            sor = RAND_MAX - (sor - res);
          } while (sor < n-1);
          return res;
        }

No conversions either, it's all plain integer arithmetic.
-- 
Niels J|rgen Kruse 	DIKU Graduate 	njk@diku.dk

daveg@near.cs.caltech.edu (Dave Gillespie) (09/09/90)

>>>>> On 8 Sep 90 20:15:20 GMT, njk@diku.dk (Niels J|rgen Kruse) said:

> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>      ... do r = rand() / ((RAND_MAX + 1UL) / n); while (r >= n); ...

> This require 2 divisions however.  Here is an algorithm, that
> only require one division:

>       ... do {
>             res = (sor = rand()) % n;
>             sor = RAND_MAX - (sor - res);
>           } while (sor < n-1); ...

Most rand() implementations use linear congruential random number
generators, which are notorious for having unpleasant properties
in their low bits.  Experts recommend always dividing instead of
mod'ing, so that you are using the high bits of the result of
rand().

A linear congruential generator uses the formula

	x = (a*x + c) % (RAND_MAX+1L)

to generate a sequence of pseudo-random numbers; the first "x" is
the seed you supply to srand().  The numbers "a" and "c" are chosen
according to a black art you can read about in Knuth or elsewhere.
"RAND_MAX" is chosen so that "RAND_MAX+1" is a power of two; this
makes the modulo operation cheap in binary.

Consider the least significant bit of "x", i.e., whether or not "x"
is even or odd, as this formula is applied to it:  The binary "%"
doesn't affect this bit; adding "c" either inverts this bit or leaves
it the same; multiplying by "a" either clears this bit or leaves it
the same.  So no matter what the values of "a", "c", and "RAND_MAX",
the low bit of your "random" sequence will be extremely non-random.
It turns out that only the most-significant bits of "x" really act
like good random numbers.

Sometimes rand() will tweak its results to be less blatantly bad
in the low bits, but every one I've seen was ultimately linear
congruential.

A couple of convenient references:  Knuth's _Art of Computer
Programming_, volume II; _Numerical Recipes_ by Press et al,
chapter 7.

								-- Dave
--
Dave Gillespie
  256-80 Caltech Pasadena CA USA 91125
  daveg@csvax.cs.caltech.edu, ...!cit-vax!daveg

njk@diku.dk (Niels J|rgen Kruse) (09/09/90)

daveg@near.cs.caltech.edu (Dave Gillespie) writes:

>>>>>> On 8 Sep 90 20:15:20 GMT, njk@diku.dk (Niels J|rgen Kruse) said:
>> This require 2 divisions however.  Here is an algorithm, that
>> only require one division:

I used rand() because Karl Heuer did and because it is the ANSI
standard interface for getting random numbers.  Nothing other
than inertia prevents rand() from being good.  Good random
number generators don't have to be slower than bad ones.  Just
plug a good random number generator into the code i gave instead
of rand(), if the one you are using is bad.

>Most rand() implementations use linear congruential random number
>generators, which are notorious for having unpleasant properties
>in their low bits.  Experts recommend always dividing instead of
>mod'ing, so that you are using the high bits of the result of
>rand().

Oh.  So experts use rand()?  ;-)
-- 
Niels J|rgen Kruse 	DIKU Graduate 	njk@diku.dk

henry@zoo.toronto.edu (Henry Spencer) (09/10/90)

In article <DAVEG.90Sep9012053@near.cs.caltech.edu> daveg@near.cs.caltech.edu (Dave Gillespie) writes:
>Sometimes rand() will tweak its results to be less blatantly bad
>in the low bits, but every one I've seen was ultimately linear
>congruential.

The original Unix rand() did a 32-bit calculation internally and returned
the HIGH-order 16 bits.  Unfortunately, certain idiots porting the system
to 32-bit machines wanted a 32-bit rand() and didn't bother understanding
why it was done the way it was...
-- 
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday|  henry@zoo.toronto.edu   utzoo!henry

manning@coil.caltech.edu (Evan Marshall Manning) (09/10/90)

henry@zoo.toronto.edu (Henry Spencer) writes:

>The original Unix rand() did a 32-bit calculation internally and returned
>the HIGH-order 16 bits.  Unfortunately, certain idiots porting the system
>to 32-bit machines wanted a 32-bit rand() and didn't bother understanding
>why it was done the way it was...

Modern coding practice might suggest the use of comments in the source
code to prevent such accidents :-(

***************************************************************************
Your eyes are weary from staring at the CRT for so | Evan M. Manning
long.  You feel sleepy.  Notice how restful it is  |      is
to watch the cursor blink.  Close your eyes.  The  |manning@gap.cco.caltech.edu
opinions stated above are yours.  You cannot       | manning@mars.jpl.nasa.gov
imagine why you ever felt otherwise.               | gleeper@tybalt.caltech.edu