I’m interested here in looking at the tails of the true talent distribution. My conjecture is that they are fat, i.e., the probability of having a very good level of true talent is greater than a Gaussian (or Normal) approximation implies.
First, a few preliminaries.
The model says that true talent, in OBP, wOBA, FIP-, whatever, is distributed according to T, i.e., if I choose a player at random, the probability their true talent is between t and t+dt is T(t) dt.
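The statistical error distribution is S. For a large enough number of plate appearances, this is well approximated by a Gaussian distribution. Specifically, this means that the probability density of observing x, given a true talent t, is

$$S(x \mid t) = \frac{1}{\sqrt{2\pi\sigma_S^2}} \exp\left(-\frac{(x - t)^2}{2\sigma_S^2}\right),$$

where $\sigma_S^2$ is the variance of the statistical error, which in general depends on t and shrinks as the number of plate appearances grows. This approximation is justified in everything that I’ll be looking at.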
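As a rough check on the Gaussian approximation, the sketch below compares an exact binomial distribution with its Gaussian stand-in. It assumes an OBP-like statistic where each plate appearance is an independent success with probability t, so the count of successes has variance PA·t(1−t); that binomial model, the function names, and the numbers are illustrative assumptions, not something taken from the text.

```python
import math


def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def gaussian_pdf(x, mean, sd):
    """Density of a Normal(mean, sd^2) distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def max_abs_error(n, p):
    """Largest gap between the binomial pmf and the matching Gaussian density.

    The Gaussian uses the binomial mean n*p and standard deviation
    sqrt(n*p*(1-p)); with a bin width of 1 the density at k is compared
    directly with the probability of k successes.
    """
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(k, n, p) - gaussian_pdf(k, mean, sd))
        for k in range(n + 1)
    )


if __name__ == "__main__":
    # Hypothetical true talent of .330 at two sample sizes.
    for pa in (50, 600):
        print(pa, max_abs_error(pa, 0.33))
```

The gap shrinks as the number of plate appearances grows, which is the sense in which the approximation holds "for large enough" samples.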
Now, supposing that you know T (the true talent distribution), then, given an observation x, you can compute the “best” estimate of true talent in the Bayesian framework, i.e., the mean of the posterior probability distribution,

$$\hat{t}(x) = \frac{\int t \, T(t) \, S(x \mid t) \, dt}{\int T(t) \, S(x \mid t) \, dt}.$$
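The posterior mean described above can be evaluated numerically for any choice of T, which matters if T turns out not to be Gaussian. The following is a minimal sketch using a midpoint sum on a grid of talent values; the function names, the grid limits, and the example numbers are assumptions made for illustration, and the statistical error is taken to be Gaussian with a fixed standard deviation.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a Normal(mean, sd^2) distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def posterior_mean(x, prior_pdf, sigma_s, t_min=0.0, t_max=1.0, steps=20000):
    """Approximate the posterior mean of true talent given an observation x.

    Computes sum(t * T(t) * S(x|t)) / sum(T(t) * S(x|t)) over grid midpoints,
    where prior_pdf plays the role of T and S(x|t) is Gaussian with standard
    deviation sigma_s (assumed here not to depend on t).
    """
    dt = (t_max - t_min) / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        t = t_min + (i + 0.5) * dt
        weight = prior_pdf(t) * gaussian_pdf(x, t, sigma_s)
        numerator += t * weight
        denominator += weight
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical numbers: talent centered at .320 with spread .030,
    # statistical error of .020, observed performance of .400.
    prior = lambda t: gaussian_pdf(t, 0.320, 0.030)
    print(posterior_mean(0.400, prior, 0.020))
```

Because `prior_pdf` is just a function, a fat-tailed candidate for T can be swapped in without changing anything else.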
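If T is Gaussian, with mean $\mu$ and variance $\sigma_T^2$,

$$T(t) = \frac{1}{\sqrt{2\pi\sigma_T^2}} \exp\left(-\frac{(t - \mu)^2}{2\sigma_T^2}\right),$$

and if we approximate the statistical variance $\sigma_S^2$ as being independent of $t$ (where evaluating it at $t = \mu$ is a reasonable choice), then this estimate, $\hat{t}(x)$, can be written explicitly as

$$\hat{t}(x) = \frac{x/\sigma_S^2 + \mu/\sigma_T^2}{1/\sigma_S^2 + 1/\sigma_T^2},$$

that is, the true talent estimate is the weighted average of the observed performance and the mean of the true talent distribution, where the weights are the inverses of the variances. If $\sigma_S^2 = \sigma_1^2/\mathrm{PA}$, where $\sigma_1^2$ is the statistical variance of a single plate appearance, then this can be written

$$\hat{t}(x) = \frac{\mathrm{PA} \, x + (\sigma_1^2/\sigma_T^2) \, \mu}{\mathrm{PA} + \sigma_1^2/\sigma_T^2},$$

in other words, to regress to the mean, take the observed performance, and add $\sigma_1^2/\sigma_T^2$ PA of average performance.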
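The two ways of writing the Gaussian estimate, as an inverse-variance weighted average and as "observed performance plus some number of PA of average performance", can be checked against each other with a short sketch. The function names and all the numbers are hypothetical; in particular, using mean·(1 − mean) as the per-PA variance is a binomial assumption that only suits an OBP-like statistic.

```python
def weighted_average_estimate(observed, mean, stat_var, talent_var):
    """Inverse-variance weighted average of the observation and the mean."""
    w_observed = 1.0 / stat_var
    w_mean = 1.0 / talent_var
    return (w_observed * observed + w_mean * mean) / (w_observed + w_mean)


def added_pa_estimate(observed, pa, mean, var_per_pa, talent_var):
    """Regress to the mean by adding 'ballast' PA of average performance.

    The ballast is var_per_pa / talent_var, the number of PA at which the
    statistical variance equals the variance of true talent.
    """
    ballast_pa = var_per_pa / talent_var
    return (pa * observed + ballast_pa * mean) / (pa + ballast_pa)


if __name__ == "__main__":
    # Hypothetical inputs: league mean .320, talent spread .030,
    # and a player observed at .400 over 600 PA.
    mean, talent_var = 0.320, 0.030**2
    var_per_pa = mean * (1 - mean)  # binomial assumption, labeled above
    pa, observed = 600, 0.400

    print("ballast PA:", var_per_pa / talent_var)
    print(weighted_average_estimate(observed, mean, var_per_pa / pa, talent_var))
    print(added_pa_estimate(observed, pa, mean, var_per_pa, talent_var))
```

Both functions return the same estimate whenever the statistical variance is the per-PA variance divided by PA, which is the condition under which the second form applies.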