I recently came across the following problem:

given a “fair” coin, what is the expectation value for the number of flips you have to make before getting 2 heads in a row?

this is evidently a pretty well known, and you can google around and find lots of write ups on how to solve it, many of which will be more exhaustive than mine.

** SPOILER ALERT **

Im going to give the answer, so stop reading if you want to work it out yourself.

the answer is, I thought, a pretty elegant application of Markov chains. i.e., suppose I start in a state with 0 heads in a row (call this s0) and call my expectation value to achieve the goal (get 2 heads in a row) x0. then, starting from here, I can either get a heads (I transition to state s1) or a tails (I stay in state s0). in either case I’ve used up 1 flip so,

x0 = 1/2 (x0+1) + 1/2 (x1+1)

starting from s1, I have 1/2 probability to go back to s0 and 1/2 to go to s2, i.e. to complete the goal, so

x1 = 1/2 (x0+1) + 1/2 (1)

this is a (linear) system of 2 equations in 2 unknowns, which I can therefore solve. the solution is,

M**-1 x V

where M**-1 = inverse of M,

M_00 = 1/2

M_01 = M_10 = -1/2

M_11 = 1

and V = 1 (a column vector)

M**-1 has elements,

_00 = 4

_01 = _10 = 2

_11 = 2

and so the solution for the expectation value of the number of coin flips to get 2 heads in a row is,

4 + 2 = 6

now, I can generalize this to a weighted coin, i.e. the probability to get a heads is p and the probability to get a tails is q=(1-p). in this case I have,

x0 = q*(x0+1) + p*(x1+1)

x1 = q*(x0+1) + p*(1)

==>

M_00 = p

M_01 = -p

M_10 = -q

M_11 = 1

M**-1 =

_00 = 1/p**2

_01 = 1/p

_10 = 1/p**2 -1/p

_11 = 1/p

and x0 = 1/p**2 + 1/p

generalizing even more to a streak of n heads in a row, I get

M_00 = p

M_i0 = -q

M_i(i+1) = -p

M_ii = 1

M_ij = 0, otherwise

with solution

x0 = sum(x**i, i=1 to n)

= sum(x**i, i=0 to n) – 1

where x = 1/p.

this can be evaulated explicitly as,

(x**(n+1) – 1)/(x – 1) – 1 =

(x**(n+1) – x)/(x – 1)

this can be adapted to hitting streaks if p is the probability to get a hit in any given game. looking at historical values for hits per plate appearence I come up with ~0.25 as the average. for a really great season, i.e. Lajoie this might be 0.4. For a great season, i.e. Williams or DiMaggio 1941 it might be 0.35. suppose the mean plate appearences per game is 5, then the probability to get a hit is

1 – (1-p)**5

which for the numbers above is ~ 0.75 for the mean and ~ 0.9 for a great season.

ergo, the expectation value for the number of games one must play to get hits in 56 in a row ranges from

p = 0.75 ==> 39 million

p = 0.90 ==> 3641

interestingly, if a player with p=0.9 gets to game 55, then the expectation value, x55, is still 365, i.e. there is a 90% chance they will achieve game 56, but that 10% chance that they wont pulls the mean to much larger values. so, I think this is in some ways a good illustration of how the expectation value is really a very blunt measure of probability.