# hitting streaks and markov chains

I recently came across the following problem:

given a “fair” coin, what is the expectation value for the number of flips you have to make before getting 2 heads in a row?

this is evidently a pretty well known, and you can google around and find lots of write ups on how to solve it, many of which will be more exhaustive than mine.

Im going to give the answer, so stop reading if you want to work it out yourself.

the answer is, I thought, a pretty elegant application of Markov chains. i.e., suppose I start in a state with 0 heads in a row (call this s0) and call my expectation value to achieve the goal (get 2 heads in a row) x0. then, starting from here, I can either get a heads (I transition to state s1) or a tails (I stay in state s0). in either case I’ve used up 1 flip so,

x0 = 1/2 (x0+1) + 1/2 (x1+1)

starting from s1, I have 1/2 probability to go back to s0 and 1/2 to go to s2, i.e. to complete the goal, so

x1 = 1/2 (x0+1) + 1/2 (1)

this is a (linear) system of 2 equations in 2 unknowns, which I can therefore solve. the solution is,

M**-1 x V

where M**-1 = inverse of M,
M_00 = 1/2
M_01 = M_10 = -1/2
M_11 = 1

and V = 1 (a column vector)

M**-1 has elements,
_00 = 4
_01 = _10 = 2
_11 = 2

and so the solution for the expectation value of the number of coin flips to get 2 heads in a row is,
4 + 2 = 6

now, I can generalize this to a weighted coin, i.e. the probability to get a heads is p and the probability to get a tails is q=(1-p). in this case I have,

x0 = q*(x0+1) + p*(x1+1)
x1 = q*(x0+1) + p*(1)

==>
M_00 = p
M_01 = -p
M_10 = -q
M_11 = 1

M**-1 =
_00 = 1/p**2
_01 = 1/p
_10 = 1/p**2 -1/p
_11 = 1/p

and x0 = 1/p**2 + 1/p

generalizing even more to a streak of n heads in a row, I get

M_00 = p
M_i0 = -q
M_i(i+1) = -p
M_ii = 1
M_ij = 0, otherwise

with solution

x0 = sum(x**i, i=1 to n)
= sum(x**i, i=0 to n) – 1

where x = 1/p.

this can be evaulated explicitly as,
(x**(n+1) – 1)/(x – 1) – 1 =
(x**(n+1) – x)/(x – 1)

this can be adapted to hitting streaks if p is the probability to get a hit in any given game. looking at historical values for hits per plate appearence I come up with ~0.25 as the average. for a really great season, i.e. Lajoie this might be 0.4. For a great season, i.e. Williams or DiMaggio 1941 it might be 0.35. suppose the mean plate appearences per game is 5, then the probability to get a hit is

1 – (1-p)**5

which for the numbers above is ~ 0.75 for the mean and ~ 0.9 for a great season.

ergo, the expectation value for the number of games one must play to get hits in 56 in a row ranges from

p = 0.75 ==> 39 million
p = 0.90 ==> 3641

interestingly, if a player with p=0.9 gets to game 55, then the expectation value, x55, is still 365, i.e. there is a 90% chance they will achieve game 56, but that 10% chance that they wont pulls the mean to much larger values. so, I think this is in some ways a good illustration of how the expectation value is really a very blunt measure of probability.