### The (fat?) tails of true talent – Part 1: Preliminaries

I’m interested here in looking at the tails of the true talent distribution. My conjecture is that they are fat, i.e., the probability that a player has a very high level of true talent is greater than a Gaussian (or Normal) approximation would imply.

First, a few preliminaries.

The model says that true talent, in OBP, wOBA, FIP-, whatever, is distributed according to T, i.e., if I choose a player at random, the probability their true talent is between t and t+dt is T(t) dt.

The statistical error distribution is S. For a large enough number of plate appearances, this is well approximated by a Gaussian distribution. Specifically, this means that the probability of observing a value between x and x+dx, given a true talent t, is

$S(x|t) dx = \frac{1}{\sqrt{2 \pi} \sigma_x} e^{- \frac{(x-t)^2}{2 \sigma_x^2}} dx$

where $\sigma_x = \sqrt{t (1-t)/\mathrm{PA}}$ is the binomial standard deviation of a rate such as OBP observed over PA plate appearances. This approximation is justified in everything that I’ll be looking at.
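To get a feel for the size of the statistical error, here is a minimal Python sketch of the Gaussian error model above. The function names and the example numbers (a .330 true talent over 600 PA) are my own illustrative assumptions, not values taken from any data.

```python
import math

def sigma_x(t, pa):
    """Binomial standard deviation of an observed rate, given true talent t and pa plate appearances."""
    return math.sqrt(t * (1.0 - t) / pa)

def gaussian_error_density(x, t, pa):
    """Gaussian approximation to S(x|t): density of observing x given true talent t."""
    s = sigma_x(t, pa)
    return math.exp(-(x - t) ** 2 / (2.0 * s ** 2)) / (math.sqrt(2.0 * math.pi) * s)

# Hypothetical example: a true .330 OBP hitter over 600 PA.
spread = sigma_x(0.330, 600)  # roughly 0.019, i.e. about 19 points of OBP
print(f"sigma_x = {spread:.4f}")
print(f"S(0.350 | 0.330) = {gaussian_error_density(0.350, 0.330, 600):.2f}")
```

Even over a full season, the one-sigma statistical error is about 19 points of OBP, which is why the observed value alone is a noisy estimate of talent.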

Now, supposing that you know T (the true talent distribution), then, given an observation x, you can compute the “best” estimate of true talent in the Bayesian framework, i.e., the mean of the posterior probability distribution,

$\bar{t} = \frac{ \int dt' ~t' ~S(x|t') T(t')}{\int dt' S(x|t') T(t')}$
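For an arbitrary T, the posterior mean can be approximated numerically. The sketch below evaluates the two integrals on a midpoint grid; the function names, the grid limits, and the example prior (Gaussian with mean .330 and standard deviation .030) are hypothetical choices for illustration only.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a Gaussian with the given mean and standard deviation."""
    return math.exp(-(x - mean) ** 2 / (2.0 * sd ** 2)) / (math.sqrt(2.0 * math.pi) * sd)

def posterior_mean(x, pa, talent_pdf, lo=0.001, hi=0.999, n=20000):
    """Approximate the posterior mean of true talent on a midpoint grid over (lo, hi).

    talent_pdf is the assumed true talent density T(t); it need not be normalized,
    because the grid spacing and any constant factor cancel in the ratio.
    """
    dt = (hi - lo) / n
    num = 0.0
    den = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * dt
        sd = math.sqrt(t * (1.0 - t) / pa)           # sigma_x depends on t here
        weight = normal_pdf(x, t, sd) * talent_pdf(t)  # S(x|t) * T(t)
        num += t * weight
        den += weight
    return num / den

# Hypothetical prior: Gaussian true talent with mean .330 and sd .030.
def example_prior(t):
    return normal_pdf(t, 0.330, 0.030)

estimate = posterior_mean(0.400, 300, example_prior)
print(f"posterior mean = {estimate:.3f}")
```

Swapping `example_prior` for a fatter-tailed density is all it takes to see how the estimate for an extreme observation changes.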

If T is Gaussian,

$T(t) dt = \frac{1}{\sqrt{2 \pi} \sigma_t} e^{- \frac{(t-t_0)^2}{2 \sigma_t^2}} dt$

and if we approximate $\sigma_x$ as being independent of $t$, $\sigma_x = k/\sqrt{\mathrm{PA}}$ (where $k=0.5$ is a reasonable choice), then this can be written explicitly as

$\bar{t} = \frac{x/\sigma_x^2 + t_0/\sigma_t^2}{1/\sigma_x^2 + 1/\sigma_t^2}$

that is, the true talent estimate is the weighted average of the observed performance and the mean of the true talent distribution, where the weights are the inverse variances. If $\sigma_x = k/\sqrt{\mathrm{PA}}$, then multiplying the numerator and denominator by $k^2$ gives

$\bar{t} = \frac{x \cdot \mathrm{PA}+ t_0 \cdot k^2/\sigma_t^2}{\mathrm{PA} + k^2/\sigma_t^2}$,

in other words, to regress to the mean, take the observed performance over the actual PA plate appearances and add $k^2/\sigma_t^2$ plate appearances of average performance, $t_0$.
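The "add some PA of average performance" recipe is short enough to sketch in Python. The function names and the example inputs (a .400 observed rate over 300 PA, $t_0 = .330$, $\sigma_t = .030$) are hypothetical and chosen only to illustrate the formula.

```python
def regress_to_mean(x, pa, t0, sigma_t, k=0.5):
    """Gaussian-prior true talent estimate: add k^2/sigma_t^2 PA of average performance."""
    prior_pa = k ** 2 / sigma_t ** 2
    return (x * pa + t0 * prior_pa) / (pa + prior_pa)

def inverse_variance_average(x, pa, t0, sigma_t, k=0.5):
    """Same estimate, written as the inverse-variance weighted average."""
    var_x = k ** 2 / pa
    var_t = sigma_t ** 2
    return (x / var_x + t0 / var_t) / (1.0 / var_x + 1.0 / var_t)

# Hypothetical inputs: .400 observed over 300 PA, league mean .330, talent spread .030.
print(f"PA of average to add: {0.5 ** 2 / 0.030 ** 2:.0f}")
print(f"regressed estimate:   {regress_to_mean(0.400, 300, 0.330, 0.030):.3f}")
```

The two functions are the same formula in its two forms, so they should agree to floating-point precision for any inputs.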