[comp.theory] Help!

grimlok@hubcap.clemson.edu (Mike Percy) (11/24/90)

I am looking for a solution for this problem, 
 
I have a black box with an unknown number (k) of balls with colors 1..k

While the box is not empty, I will pull out a ball.  I will either
discard the ball I had and discard the ball I just withdrew, or vice
versa.
 
At the end, I wish there to be an equal probability that I am holding
ball colored i.  What function can I use to decide which ball to discard at each
step?
 
For example
I first pull out a red ball.
I then pull out a green ball.  Based on some predicate f, I discard a
ball.
I then pull out a blue ball.  Again, I must discard one.
The box is now empty.
I want f() to be such that I have a 1/3 probability of the ball I am
holding to be red, blue, or green.  f() cannot depend on how many balls
are in the box (unknown), but may use the number of balls already
withdrawn.
 
Can anyone give me a decent f?  One that is only approximate woul be
good.  I am currently using f() = random(0,1) < 0.50, but this is no
good.

rolandh@tigger.Colorado.EDU (Roland Hubscher) (11/24/90)

In article <11801@hubcap.clemson.edu> grimlok@hubcap.clemson.edu (Mike Percy) writes:
>I am looking for a solution for this problem, 
> 
>I have a black box with an unknown number (k) of balls with colors 1..k
>
>While the box is not empty, I will pull out a ball.  I will either
>discard the ball I had and discard the ball I just withdrew, or vice
>versa.
> 
>At the end, I wish there to be an equal probability that I am holding
>ball colored i.  What function can I use to decide which ball to discard at each
>step?
> 
>For example
>I first pull out a red ball.
>I then pull out a green ball.  Based on some predicate f, I discard a
>ball.
>I then pull out a blue ball.  Again, I must discard one.
>The box is now empty.
>I want f() to be such that I have a 1/3 probability of the ball I am
>holding to be red, blue, or green.  f() cannot depend on how many balls
>are in the box (unknown), but may use the number of balls already
>withdrawn.
> 
>Can anyone give me a decent f?  One that is only approximate woul be
>good.  I am currently using f() = random(0,1) < 0.50, but this is no
>good.

Let f(i) = 1 - 1/i where i = 1, ..., k is the ith ball your drawing.
Discard the new ball with probability f(i). The first time you don't
discard the new ball since f(1) = 0, the second ball is destryed with
probability f(2) = 1/2, etc.

It's easy to show that yoy really get what you want.

Let p(i) = not f(i) = 1/i the probability that the ith ball is kept
and the old ball is discarded. P(i) is the probability that you end up
with the ball drawn in round i.

P(i) = p(i) * (1-p(i+1)) * (1-p(i+1)) * ... * (1-p(k))
     = 1/i * (1-1/(i+1)) * (1-1/(i+2)) * ... * (1-1/k)
     = 1/i * ((i+1)-1)/(i+1) * ((i+2)-1)/(i+2) * ... * (k-1)/k
     = [1 * i * (i+1) * ... * (k-2) * (k-1)] /
           [i * (i+1) * (i+2) * ... * (k-1) * k]
     = 1/k

I used this function to pick a random move without having to store all the
moves generated by the move generator.

roland

cokasaki@PROOF.ERGO.CS.CMU.EDU (Chris Okasaki) (11/25/90)

In article <11801@hubcap.clemson.edu> grimlok@hubcap.clemson.edu (Mike Percy) writes:
>I am looking for a solution for this problem, 
> 
>I have a black box with an unknown number (k) of balls with colors 1..k
>
>While the box is not empty, I will pull out a ball.  I will either
>discard the ball I had and discard the ball I just withdrew, or vice
>versa.
> 
>At the end, I wish there to be an equal probability that I am holding
>ball colored i.  What function can I use to decide which ball to discard at each
>step?
> 
>I want f() to be such that I have a 1/3 probability of the ball I am
>holding to be red, blue, or green.  f() cannot depend on how many balls
>are in the box (unknown), but may use the number of balls already
>withdrawn.

How about this?  Let f(i) be the probability of retaining at turn i
the ball you just drew.  After any turn i, we want the probability that
we're currently holding any ball that we've already seen to be 1/i.  (This
will assure that after the last ball,  the probability that we're
holding any of the balls is 1/k, which is the desired result.)

Now, say we've just picked up the i-th ball.  After this turn we want the
probability that we're holding any of balls we've encountered to be 1/i.
In particular, we want the probability that we're holding THIS ball to be
1/i.  Therefore, we must retain it with probability 1/i, i.e. f(i) = 1/i.

This gives us the correct result for THIS ball, but what about the other
balls?  Well by our (implicit) inductive assumption, we came into this turn
with a probability of holding any ball that we'd already seen of 1/(i-1).
Since we keep the i-th ball with probability 1/i, we keep the ball that
we ALREADY had with probability (i-1)/i.  So the probability that we're holding
any of the earlier balls is now 1/(i-1) * (i-1)/i = 1/i.

Of course we need a base case for our induction.  As usual with induction, the
base case is almost embarassingly simple.  f(1) = 1/1 = 1 so after the first
turn we're holding the first ball with probability 1.  That's it!

--
-------------------------------------------------------------------------------
| Chris Okasaki       |                          Life is NOT a zero-sum game! |
| cokasaki@cs.cmu.edu |                                                       |
-------------------------------------------------------------------------------
-- 
-------------------------------------------------------------------------------
| Chris Okasaki       |                          Life is NOT a zero-sum game! |
| cokasaki@cs.cmu.edu |                                                       |
-------------------------------------------------------------------------------

grimlok@hubcap.clemson.edu (Mike Percy) (11/26/90)

Thanks for the help everyone.  I suppose I should have just kept with
my real problem.  In my simulation, I want to select a random move.
My move generator is a deterministic search for legal moves, but there
is no way of telling how many legal moves there are until we've seen
them all.  I don't want to store them up, nor do I want to run through
the search more than once.  
 
In essence:
 
Start search.
If current move better than previous best, keep this move.
If current move equal in value to the previous best, randomly keep one
of the two moves.
When search is over, I want to have a random move selected from the
best moves.
Simply keeping first, last, or ith is no good as it tends to direct
the search, missing paths that might lead to better solutions.

A couple of people suggested 1/i, which is what I thought of originally,
but dismissed (idiot) as being to easy, and my induction proof kept
going wacky -- must have been at it too long.
 
Thanks to all who offered help.


"I don't know about your brain, but mine is really...bossy."
Mike Percy                    grimlok@hubcap.clemson.edu
ISD, Clemson University       mspercy@clemson.BITNET
(803)656-3780                 mspercy@clemson.clemson.edu