In part 1 I talked about a simple random walk in 1-dimension, where the states all the way to the left and all the way to the right are sinks (or roach motels). The next step from that is to ask what is the expectation value associated with each state? I mentioned that we could think of the state as scoring a touchdown (+6 or +7 or +8 points) and as giving up a safety (-2 points), but to illustrate the problem, it is convenient to assume that landing at location +2 is worth +1 point and at location -2, -1 point.
As a concrete example, let me take p=0.6 and ask what the expectation values are, given that I start in state or . Iterating the transition matrix to convergence, as I did in part 1, I get,
This says that if I start at state , I end in state with probability 0.308 and with probability 0.692. If these are worth -1 and +1 points, respectively, then the expectation value for points will be . In a similar way I can see that the expectation values for states and are and , respectively. I will use as sort of a test case and refer back to these values later. So this gives me the answer using the first method.
The question I want to address now is, other than iterating to convergence, how do we go from the transition matrix (which is relatively easy to code up for football) to the expectation value (which is what we really want to know)?
Logically, the expectation value, , when starting from the state , should be,
In words, its the expectation value of the subsequent state in the chain (for example, ), multiplied by the probability to transition to that state (for example ). Inspection shows that this says (using Einstein notation for matrix/vector multiplication),
or, in vector notation,
, where is the identity matrix.
For the particular transition matrix we are discussing, we have,
Because , where is the transpose of , I can also write this as,
This way, with the matrix multiplying the vector from the left, is the more common way to see a set of equations represented, and some code for numerically solving such an equation expect this convention. The values and , however, aren’t really variables, i.e. quantities that need to take on particular values for the set of equations to hold; they are parameters, such that their values are arbitrary, and, once chosen, impact the expectation values for the non-sink states.
Let me denote by (s, t) states that are sinks, and (n,m) states that are NOT sinks. What I really want to know are the set of values , i.e. the expectation values for states that are not sinks. So I can write,
Rearranging this equation gives,
or in vector notation,
Taking the transpose of both sides I get,
Note that the superscripts here denote sink or non-sink and do not denote exponentiation. Also, let denote the number of sinks and the number of non-sinks; then the dimensions of , and are
, , , and , respectively.
For the particular problem we’ve been discussing, this can be expanded as,
Note that once I choose values for and , the the right hand side is just a constant vector.
The formal solution is,
so if I set and , then I get,
and then if I further set , then I get,
which is exactly what I got from iterating the transition matrix until it converges and then reading off the columns. One last thing to note, is that the equation,
from above implies that (if I multiply by from the right on both sides),
and it follows that,
which is exactly what we got by iterating the transition matrix to convergence.