true talent estimates from 2 correlated samples

This post came about while trying to answer Tom Tango’s question about making career and seasonal regressions add up:
http://tangotiger.com/index.php/site/comments/making-seasonal-and-career-level-regressions-add-up

Or, in other words: what true talent level do you estimate based on two correlated samples?

As a starting point and as a reminder, if you have one observation, x, and you assume true talent is Gaussian with mean t_0 and standard deviation \sigma_t, and statistical fluctuations are Gaussian with mean 0 and standard deviation \sigma_x, then the posterior distribution for true talent, t, is,

P(t) \propto e^{-\frac{(x-t)^2}{2 \sigma^2_x}} e^{-\frac{(t-t_0)^2}{2 \sigma^2_t}}

You can use this to compute the mean of the posterior distribution for true talent, but because it is Gaussian, the mean will be equal to the mode. You can determine the mode by solving,

\frac{\partial}{\partial t} P(t) = 0

or

\frac{\partial}{\partial t} (\frac{(x-t)^2}{2 \sigma^2_x} + \frac{(t-t_0)^2}{2 \sigma^2_t}) = 0

which is solved by
t = \frac{x/\sigma^2_x + t_0/\sigma^2_t}{1/\sigma^2_x + 1/\sigma^2_t}

and if \sigma_x = k/\sqrt{n}, with n the number of plate appearances, then

t = \frac{x n + t_0 k^2/\sigma^2_t}{n + k^2/\sigma^2_t}

which says: estimate true talent by regressing to the mean (t_0), adding k^2/\sigma^2_t plate appearances of average performance to the observed n plate appearances.
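
As a quick illustration of the formula above, here is a minimal Python sketch. The function names and the numeric inputs (a league mean of 0.330, a talent spread of 0.030, and k = 0.47) are hypothetical values picked only to show the behavior; they are not taken from any data set.

```python
import math


def regress_to_mean(x, n, t0, k, sigma_t):
    """Posterior mean of true talent for an observed rate x over n plate appearances.

    Assumes sigma_x = k / sqrt(n), as in the text.
    """
    ballast = k**2 / sigma_t**2  # plate appearances of average performance to add
    return (x * n + t0 * ballast) / (n + ballast)


def inverse_variance_form(x, sigma_x, t0, sigma_t):
    """The same estimate written with explicit inverse-variance weights."""
    wx = 1.0 / sigma_x**2
    wt = 1.0 / sigma_t**2
    return (x * wx + t0 * wt) / (wx + wt)


# Hypothetical inputs, for illustration only.
t0, sigma_t, k = 0.330, 0.030, 0.47
x = 0.400
for n in (50, 250, 600):
    by_pa = regress_to_mean(x, n, t0, k, sigma_t)
    by_weights = inverse_variance_form(x, k / math.sqrt(n), t0, sigma_t)
    print(n, round(by_pa, 4), round(by_weights, 4))  # the two forms should agree
```

The estimate sits at t_0 when n = 0 and moves toward x as n grows, which is the usual regression-to-the-mean behavior.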

When you have two correlated samples, say one from season 1 with true talent t1 and one from season 2 with true talent t2, the modification is that your prior for the true talent distribution includes a correlation, \rho, between t1 and t2 (with the same mean t_0 and standard deviation \sigma_t in both seasons). Specifically,

P(t1, t2) = \frac{1}{2 \pi \sqrt{|C|}} e^{-\frac{1}{2} (t-t_0) \cdot C^{-1} \cdot (t-t_0)}

where t is the vector (t1, t2), t_0 is the vector (t_0, t_0), and C is the covariance matrix (|C| denotes its determinant),

C = \left(\begin{array}{cc} \sigma_t^2 & \rho \sigma_t^2 \\ \rho \sigma_t^2 & \sigma_t^2 \end{array}\right)

C^{-1} = \frac{1}{\sigma_t^2} \left(\begin{array}{cc} \frac{1}{1-\rho^2} & -\frac{\rho}{1-\rho^2} \\ -\frac{\rho}{1-\rho^2} & \frac{1}{1-\rho^2} \end{array}\right)
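
The inverse is easy to get wrong by a factor, so here is a small sanity check in plain Python: multiply C by the stated C^{-1} and confirm the result is the identity. The helper names and the values of \sigma_t and \rho are assumptions made for this sketch.

```python
def covariance(sigma_t, rho):
    """Prior covariance matrix C for (t1, t2)."""
    v = sigma_t**2
    return [[v, rho * v], [rho * v, v]]


def inverse_covariance(sigma_t, rho):
    """C^{-1} as written in the text."""
    s = 1.0 / (sigma_t**2 * (1.0 - rho**2))
    return [[s, -rho * s], [-rho * s, s]]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


# Hypothetical values, for illustration only.
sigma_t, rho = 0.030, 0.8
product = matmul2(covariance(sigma_t, rho), inverse_covariance(sigma_t, rho))
print(product)  # should be the 2x2 identity up to rounding
```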

Multiplying this all out, and also including the probability distributions to observe performances x1 and x2, given true talents t1 and t2 and standard deviations of statistical fluctuations \sigma_1 and \sigma_2, gives

P(t1, t2) \propto e^{-g}

I am writing it this way because I am going to find the mode, as in the case with one observation. Since the exponential is always greater than 0, \frac{\partial}{\partial t} e^{-g} = 0 means that \frac{\partial}{\partial t} g = 0, so all I really need to know is the argument of the exponential.

The function g is,

g = \frac{1}{2} (\frac{(x_1 - t_1)^2}{\sigma_1^2} + \frac{(x_2 - t_2)^2}{\sigma_2^2} + \frac{(t_1 - t_0)^2 + (t_2 - t_0)^2 - 2 \rho (t_1 - t_0)(t_2 - t_0)}{\sigma_t^2 (1-\rho^2)})

To find the mode, I take the partial derivatives of g with respect to t1 and t2, set them both equal to 0, and then solve those two simultaneous equations for t1 and t2. The equations can be written in matrix form,

\left(\begin{array}{cc} \frac{1}{\sigma_1^2} + \frac{1}{\sigma_t^2 (1 - \rho^2)} & \frac{-\rho}{\sigma_t^2 (1-\rho^2)} \\ \frac{-\rho}{\sigma_t^2 (1-\rho^2)} & \frac{1}{\sigma_2^2} + \frac{1}{\sigma_t^2 (1 - \rho^2)} \end{array}\right) \left(\begin{array}{c} t_1 \\ t_2 \end{array}\right) = \left(\begin{array}{c} \frac{x_1}{\sigma_1^2} + \frac{t_0 (1-\rho)}{\sigma_t^2 (1-\rho^2)} \\ \frac{x_2}{\sigma_2^2} + \frac{t_0 (1-\rho)}{\sigma_t^2 (1-\rho^2)} \end{array}\right)
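
The 2x2 system above can also be solved numerically. The following is a rough sketch using Cramer's rule in plain Python; the function name `solve_mode` and all input values (observed rates, plate appearances, k, \sigma_t, \rho) are hypothetical and only meant to show the mechanics.

```python
import math


def solve_mode(x1, x2, sigma_1, sigma_2, t0, sigma_t, rho):
    """Solve the 2x2 mode equations for (t1, t2) by Cramer's rule.

    Requires |rho| < 1, because the matrix entries contain 1 / (1 - rho^2).
    """
    a = 1.0 / sigma_1**2
    b = 1.0 / sigma_2**2
    c = 1.0 / (sigma_t**2 * (1.0 - rho**2))
    # Matrix entries (the matrix is symmetric, so m21 == m12).
    m11, m12, m22 = a + c, -rho * c, b + c
    # Right-hand side.
    r1 = a * x1 + c * (1.0 - rho) * t0
    r2 = b * x2 + c * (1.0 - rho) * t0
    det = m11 * m22 - m12 * m12
    t1 = (r1 * m22 - m12 * r2) / det
    t2 = (m11 * r2 - m12 * r1) / det
    return t1, t2


# Hypothetical inputs: sigma_i = k / sqrt(n_i) with k = 0.47.
k = 0.47
sigma_1 = k / math.sqrt(600)  # season 1: 600 plate appearances
sigma_2 = k / math.sqrt(300)  # season 2: 300 plate appearances
t1, t2 = solve_mode(0.400, 0.350, sigma_1, sigma_2, 0.330, 0.030, 0.8)
print(round(t1, 4), round(t2, 4))
```

Because the two talents are correlated, each season's estimate is pulled toward the other season's observation as well as toward t_0.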

The solution for t1 is,

t_1 = \frac{x_1 (\frac{1}{\sigma_1^2} + \frac{(1-\rho^2) \sigma_t^2}{\sigma_1^2 \sigma_2^2})   +   x_2(\frac{\rho}{\sigma_2^2})  + t_0 (\frac{1}{\sigma_t^2} + \frac{(1-\rho)}{\sigma_2^2}) }  {\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} + \frac{1}{\sigma_t^2} + \frac{(1-\rho^2) \sigma_t^2}{\sigma_1^2 \sigma_2^2}}

and for t2,
t_2 = \frac{x_2 (\frac{1}{\sigma_2^2} + \frac{(1-\rho^2) \sigma_t^2}{\sigma_1^2 \sigma_2^2})   +   x_1(\frac{\rho}{\sigma_1^2})  + t_0 (\frac{1}{\sigma_t^2} + \frac{(1-\rho)}{\sigma_1^2}) }  {\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} + \frac{1}{\sigma_t^2} + \frac{(1-\rho^2) \sigma_t^2}{\sigma_1^2 \sigma_2^2}}
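
As a check on the closed-form expressions, here is a minimal Python sketch that evaluates them and looks at two limiting cases. The function name `closed_form` and the numeric inputs are assumptions made for illustration.

```python
def closed_form(x1, x2, sigma_1, sigma_2, t0, sigma_t, rho):
    """Evaluate the closed-form expressions for t1 and t2 given above."""
    a = 1.0 / sigma_1**2
    b = 1.0 / sigma_2**2
    p = 1.0 / sigma_t**2
    cross = (1.0 - rho**2) * sigma_t**2 * a * b
    den = a + b + p + cross
    t1 = (x1 * (a + cross) + x2 * rho * b + t0 * (p + (1.0 - rho) * b)) / den
    t2 = (x2 * (b + cross) + x1 * rho * a + t0 * (p + (1.0 - rho) * a)) / den
    return t1, t2


# Hypothetical inputs, for illustration only.
x1, x2, s1, s2, t0, st = 0.400, 0.350, 0.02, 0.03, 0.330, 0.030
for rho in (0.0, 0.5, 1.0):
    t1, t2 = closed_form(x1, x2, s1, s2, t0, st, rho)
    print(rho, round(t1, 4), round(t2, 4))
```

With \rho = 0 each season reduces to the single-observation estimate, since the other season carries no information. With \rho = 1 both seasons share one true talent, and t1 and t2 collapse to the same inverse-variance-weighted average of x1, x2, and t_0. The weights on x1, x2, and t_0 in each numerator sum to the denominator, so each estimate is a weighted average of the three.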
