[net.math] Pistachio Probabilities

eklhad@ihnet.UUCP (K. A. Dahlke) (08/22/85)

< literally, munch! >

	You people are going to start thinking I am weird.
While eating pistachios last night, I noticed some of them were unopenable
(using the simple fingernail method).  These, I simply tossed back into the
bag, only to run across them later.
So, another useless irrelevant question was born!!!
Suppose you begin with a bag containing O openable pistachios,
and U unopenable pistachios.  A trial consists of selecting a nut at random,
and eating it (if possible), or returning it to the bag.
How many trials, on the average, are required to consume all the "openable"
pistachios?  Express the answer in terms of U and O.
Any ideas on the variance/distribution of trials(U,O)?
Although I have not given this problem a lot of thought (yet),
it looks surprisingly difficult.
The pistachios, by the way, were delicious.
-- 
	This .signature file intentionally left blank.
		Karl Dahlke    ihnp4!ihnet!eklhad

rjnoe@riccb.UUCP (Roger J. Noe) (08/23/85)

> Suppose you begin with a bag containing O openable pistachios,
> and U unopenable pistachios.  A trial consists of selecting a nut at random,
> and eating it (if possible), or returning it to the bag.
> How many trials, on the average, are required to consume all the "openable"
> pistachios?  Express the answer in terms of U and O.
> Any ideas on the variance/distribution of trials(U,O)?
> Although I have not given this problem a lot of thought (yet),
> it looks surprisingly difficult.
> The pistachios, by the way, were delicious.
> -- 
> 	This .signature file intentionally left blank.
> 		Karl Dahlke    ihnp4!ihnet!eklhad

Actually, it's not that hard.  Express the problem in terms of a recurrence
relation.  Let T(No,Nu) denote the expected number of trials given that there
are "No" openable nuts and "Nu" unopenable nuts in the bag.  Assume also that
we know No before we begin so that we don't keep searching for openable nuts
when there are none left.  Then for No=0, T(0,Nu)=0.  Now, what if No>0?
Clearly there is a No/(No+Nu) probability of selecting an openable nut, which
adds a trial and decrements No.  There is a Nu/(No+Nu) probability of picking
an unopenable nut, which adds a trial but does not change No nor Nu.  (Try
pronouncing "No" as "no" and "Nu" as "new" - these sentences can be a lot of
fun!)  Then the relation for No>0 is

	                No                   Nu
	T(No,Nu) = 1 + ----- * T(No-1,Nu) + ----- * T(No,Nu)
	               No+Nu                No+Nu

Rearranging,

		T(No,Nu) = 1 + Nu/No + T(No-1,Nu)

So the obvious solution is
		                     _No
		                     \
		T(No,Nu) = No + Nu *  >  1/k		(No > 0)
		                     /__
		                     k=1
--
Roger Noe			ihnp4!ihopa!riccb!rjnoe
Rockwell International

dclee@wateng.UUCP (David C. Lee) (08/23/85)

Subject: Re: Pistachio Probabilities
Newsgroups: net.math
Distribution: net
References: <285@ihnet.UUCP>

Here is the original question about Pistachio Probabilities:

>Suppose you begin with a bag containing O openable pistachios,
>and U unopenable pistachios.  A trial consists of selecting a nut at random,
>and eating it (if possible), or returning it to the bag.
>How many trials, on the average, are required to consume all the "openable"
>pistachios?  Express the answer in terms of U and O.
>Any ideas on the variance/distribution of trials(U,O)?

Rather than using Openable & Unopenable (O & 0 is confusing),
I will use Good and Bad.

Suppose we associate a stage of the experiment with the consumption of 
a good pistachio.  For example at the beginning of stage (1), 
we have (G) good ones and (B) bad ones; at stage (2), [G-1,B];
... ;at stage (i), [G-i+1,B]; ... ; and finally at stage (G), [1,B].

Let us analyze this experiment stage-by-stage.
At any stage (i) with [G-i+1,B], we have the following results:
      p = #good / (#good+#bad) = (G-i+1) / (G-i+1 + B)
          - probability of picking a good one at each trial
      q = 1-p
   P(k) = p q^(k-1)
          - probability of finding the first good one at the k-th trial
          - recall geometric distribution
              with expected value E(k) = 1/p
              and variance V(k) = q/(p^2)

So at stage (1), [G,B]   (found and consumed the 1st pistachio)
  the expected number of trials is E(k) = (G+B)/G = 1 + B/G
  with the variance of             V(k) = (B/(G+B)) / (G/(G+B))^2
                                        = (B/G) + (B/G)^2
At stage (2),    [G-1,B] (found and consumed the 2nd pistachio)
  the expected number of trials is E(k) =  1 + B/(G-1)
  with the variance of             V(k) =  (B/(G-1)) + (B/(G-1))^2
...
At stage (G),    [1,B]   (Ha-Ha, the last one!)
  the expected number of trials is E(k) =  1 + B
  with the variance of             V(k) =  (B) + (B)^2

The expected number of trials to consume all pistachio is:
        N = k(1) + k(2) + ... + k(G)
     E(N) = E(k(1))     + ... + E(k(G))
          = { 1 + B/G } + ... + { 1 + B }
          = G + B { 1/G + 1/(G-1) + ... + 1 }           (reducible?)

Since k(1), k(2), ..., k(G) are independent random variables,
(i.e. each stage of the experiment is independent of others)
the variance of N is:
     V(N) = V(k(1))       + V(k(2))     + ... + V(k(G))
          =   B { 1/G     + 1/(G-1)     + ... + 1 } + 
            B^2 { (1/G)^2 + (1/(G-1))^2 + ... + 1 }     (reducible?)

Further comments:

     No doubt about it!  The task of finding a good pistachio 
gets harder with each being consumed.  To get the (i+1)-th pistachio 
we would need an extra effort of B/{(G-i)(G-i+1)} trials 
compared to (i)-th pistachio, i.e. an increase of no more than 100/G% 
at the beginning for large values of G and B, but with the increase 
of as much as 200% at the end for a large B.

     Finally by "Central Limit Theorem", N tends to be normally 
distributed with mean E(N) and variance V(N) as dervied above,
for large values of B and G.

     David C. Lee @ University of Waterloo, Ontario