### The (fat?) tails of true talent – Part 4 - Results for non-Gaussian True Talent

In parts 3.1 and 3.2 I discussed approximating true talent distributions with Gaussians, and comparing these to data. In this post I will talk about my attempts to model true talent with non-Gaussian distributions.

As a reminder, the probability of making an observation x is the convolution of the noise (or luck) distribution, S(x), with the true talent distribution, T(t):

$p(x) = \int dt' S(x-t') T(t')$

So far I’ve limited T to be Gaussian, which is convenient because the convolution can be done analytically and all we have to do is add up the negative log-likelihoods and minimize.
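To make the convolution concrete, here is a minimal numerical sketch. It assumes a Gaussian noise distribution S and evaluates the integral as a simple sum on a grid, then compares that to the closed form for two Gaussians (a Gaussian whose variance is the sum of the two variances). The function names and all numbers (`t0`, `sigma_t`, `sigma_s`, `x`) are placeholders I made up for illustration, not fitted values.

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

def observed_density(x, noise_pdf, talent_pdf, t_grid):
    """p(x) = integral dt' S(x - t') T(t'), approximated by a sum on a uniform grid."""
    dt = t_grid[1] - t_grid[0]
    return np.sum(noise_pdf(x - t_grid) * talent_pdf(t_grid)) * dt

# Placeholder values on an OBP-like scale (assumptions, not results).
t0, sigma_t, sigma_s = 0.330, 0.030, 0.040
t_grid = np.linspace(t0 - 10 * sigma_t, t0 + 10 * sigma_t, 4001)
x = 0.400

p_numeric = observed_density(
    x,
    lambda d: gaussian(d, 0.0, sigma_s),   # S: Gaussian noise (assumed)
    lambda t: gaussian(t, t0, sigma_t),    # T: Gaussian true talent
    t_grid,
)
# Closed form for Gaussian S and T: the variances add.
p_analytic = gaussian(x, t0, np.hypot(sigma_t, sigma_s))
print(p_numeric, p_analytic)
```

The same `observed_density` works for any `talent_pdf`, which is what makes the numerical route useful once T is no longer Gaussian.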

As an alternative to a Gaussian distribution, I tried using a Cauchy and a Student’s t distribution, but both of these are way too fat. More precisely, they give you huge outliers (like FIP- = 1000 or OBP = 2) too often to be useful approximations for what I’m doing. The comparisons I showed in 3.1 and 3.2 suggest the true talent distribution may be a little wider in the tails than a Gaussian, but not that dramatically.

So, instead I cooked up an ad-hoc model that looks a lot like a Gaussian, but basically says the standard deviation gets a bit bigger as the values get further away from the mean; specifically,

$T(t) = \frac{1}{\sqrt{2 \pi} \sigma_t}e^{-\frac{1}{2} \frac{(t-t_0)^2}{\sigma^2_t (1+[f (t-t_0)/t_0]^2)}}$

in other words, almost a Gaussian, but f (the fat-parameter) stretches out the standard deviation when $t-t_0$ gets large (relative to $t_0$).
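Here is a rough sketch of that density in code. One caveat that I am fairly sure of from the formula itself: the Gaussian prefactor is only exact at f = 0, and for f > 0 the exponent levels off to a constant very far from $t_0$, so this sketch normalizes numerically over a finite grid instead of relying on the prefactor. The function names and the example values (`t0 = 100`, `sigma_t = 10`, `f = 0.5`) are placeholders, not fitted parameters.

```python
import numpy as np

def fat_gaussian_shape(t, t0, sigma_t, f):
    """Unnormalized fat-Gaussian: variance is scaled by 1 + [f (t - t0) / t0]^2."""
    d = t - t0
    stretch = 1.0 + (f * d / t0) ** 2
    return np.exp(-0.5 * d ** 2 / (sigma_t ** 2 * stretch))

def fat_gaussian_pdf(t_grid, t0, sigma_t, f):
    """Density normalized numerically on a uniform grid (finite range only)."""
    shape = fat_gaussian_shape(t_grid, t0, sigma_t, f)
    dt = t_grid[1] - t_grid[0]
    return shape / (shape.sum() * dt)

# Placeholder values on a FIP- like scale (assumptions, not results).
t0, sigma_t = 100.0, 10.0
t_grid = np.linspace(t0 - 8 * sigma_t, t0 + 8 * sigma_t, 1601)

plain = fat_gaussian_pdf(t_grid, t0, sigma_t, f=0.0)  # ordinary Gaussian
fat = fat_gaussian_pdf(t_grid, t0, sigma_t, f=0.5)    # stretched tails

# Compare the two densities four standard deviations below the mean.
i = np.argmin(np.abs(t_grid - (t0 - 4 * sigma_t)))
tail_ratio = fat[i] / plain[i]
print(tail_ratio)  # greater than 1: more weight in the tail when f > 0
```

Setting `f=0.0` recovers the plain Gaussian, which is a handy sanity check before fitting.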

If I fit this model for the 3 parameters, $t_0, \sigma_t, f$, here are the results

Let me look at the quantity and time frame with the largest value of the fat-parameter, f, which is OBP for 1993-2005. During this time, Barry Bonds had a 0.559 OBP in 2443 PA. Using the Gaussian approximation, I would regress by about 230 PA and estimate his true talent as,

$t = \frac{0.559 \times 2443 + 0.347 \times 230}{2443 + 230} = 0.541$
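The arithmetic above is just a weighted average of the observed rate and the population mean, which is easy to check in a couple of lines. The function name `regress_to_mean` is mine; the numbers are the ones quoted in the text.

```python
def regress_to_mean(observed, n_obs, population_mean, n_regress):
    """Weighted average of the observed rate and the population mean.

    n_regress is the number of PA of population-average performance to add.
    """
    return (observed * n_obs + population_mean * n_regress) / (n_obs + n_regress)

# Bonds: 0.559 OBP in 2443 PA, regressed by 230 PA toward 0.347.
estimate = regress_to_mean(0.559, 2443, 0.347, 230)
print(round(estimate, 3))  # 0.541
```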

on the other hand, if I use the fat-Gaussian as my true talent distribution and compute the mean of the posterior probability distribution, I get,

$t = 0.553$
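For anyone who wants to try this kind of posterior-mean calculation, here is a sketch on a grid. It rests on several assumptions of mine: a binomial likelihood for times on base, a $\sigma_t$ backed out from the 230 PA regression amount via $\sigma_t^2 \approx t_0 (1 - t_0) / 230$, and a placeholder value `f=1.0` that is not the fitted fat-parameter. So it illustrates the mechanics and is not meant to reproduce the number above.

```python
import numpy as np

def posterior_mean(on_base, pa, t0, sigma_t, f, n_grid=20001):
    """Posterior mean of true talent with a fat-Gaussian prior, on a grid in (0, 1)."""
    t = np.linspace(0.0, 1.0, n_grid)[1:-1]  # drop endpoints to avoid log(0)
    d = t - t0
    log_prior = -0.5 * d ** 2 / (sigma_t ** 2 * (1.0 + (f * d / t0) ** 2))
    # Binomial log-likelihood (assumed noise model), constants dropped.
    log_like = on_base * np.log(t) + (pa - on_base) * np.log1p(-t)
    log_post = log_prior + log_like
    weights = np.exp(log_post - log_post.max())  # subtract max for stability
    return np.sum(t * weights) / np.sum(weights)

t0, pa = 0.347, 2443
on_base = round(0.559 * pa)                  # times on base implied by the OBP
sigma_t = np.sqrt(t0 * (1.0 - t0) / 230)     # assumption: backed out from 230 PA

gaussian_estimate = posterior_mean(on_base, pa, t0, sigma_t, f=0.0)
fat_estimate = posterior_mean(on_base, pa, t0, sigma_t, f=1.0)  # placeholder f
print(gaussian_estimate, fat_estimate)
```

With `f=0.0` this should land close to the simple regressed estimate, and any `f > 0` pulls the answer less strongly toward the mean, since the prior penalizes extreme values less.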

So that’s about as extreme a difference as the “fat” tail of OBP can make. I started off this project thinking about Pedro’s 5-year peak of FIP-, which I’m taking to be 1999-2003, during which he had a 43 FIP- and 3644 batters faced. Based on the value of $\sigma_t$, the number of PAs to regress is $\approx 220$ and the regressed answer would be,

$\mathrm{FIP-} = 46.1177$

on the other hand, with f at 2 times the standard deviation of the maximum likelihood value ($f = 2 \times 0.019$) it would be,

$\mathrm{FIP-} = 46.1153$.

Here is how Pedro’s true talent estimate would vary as a function of the fat-parameter, f.

So in conclusion, this work shows no substantive impact due to fat tails on FIP-. For OBP and wOBA, it may make a difference on the order of 10 points.