The (fat?) tails of true talent – Part 4- Results for non-Gaussian True Talent

In parts 3.1 and 3.2 I discussed approximating true talent distributions with Gaussians, and comparing these to data. In this post I will talk about my attempts to model true talent with non-Gaussian distributions.

As a reminder, the probability of making an observation x is the convolution of the noise (or luck) distribution, S(x), with the true talent distribution, T(t):

p(x) = \int dt' S(x-t') T(t')

So far I’ve limited T to be Gaussian, which is convenient because the convolution can be done analytically and all we have to do is add up the negative log-likelihoods and minimize.
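To make the Gaussian case concrete, here is a minimal sketch of that likelihood. It assumes the noise S is also Gaussian with a known width sigma_s (in practice the noise width depends on each player's sample size), and the function names are my own for illustration. The brute-force integral is only there as a check that the closed form, where the variances simply add, agrees with the convolution written above.

```python
import numpy as np

def gaussian_pdf(x, mean, sigma):
    """Normal density with the given mean and standard deviation."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

def neg_log_likelihood(x, sigma_s, t0, sigma_t):
    """Negative log-likelihood of observations x for Gaussian S and Gaussian T.

    Assumption: S is Gaussian with width sigma_s, so the convolution is
    Gaussian with variance sigma_s^2 + sigma_t^2, centered on t0.
    """
    total_sigma = np.sqrt(sigma_s ** 2 + sigma_t ** 2)
    return -np.sum(np.log(gaussian_pdf(x, t0, total_sigma)))

def convolved_pdf_numeric(x, sigma_s, t0, sigma_t, n_grid=4001):
    """Brute-force p(x) = int dt' S(x - t') T(t') on a grid, as a cross-check."""
    t = np.linspace(t0 - 10.0 * sigma_t, t0 + 10.0 * sigma_t, n_grid)
    integrand = gaussian_pdf(x - t, 0.0, sigma_s) * gaussian_pdf(t, t0, sigma_t)
    return np.sum(integrand) * (t[1] - t[0])
```

The negative log-likelihood can then be handed to any general-purpose minimizer to fit t_0 and \sigma_t.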

As an alternative to a Gaussian distribution, I tried using a Cauchy and a Student’s t distribution, but both of these have tails that are far too fat: they produce huge outliers (like FIP- = 1000 or OBP = 2) too often to be useful approximations for what I’m doing. The comparisons I showed in 3.1 and 3.2 suggest the true talent distribution may be a little wider than a Gaussian, but not that dramatically.

So, instead I cooked up an ad-hoc model that looks a lot like a Gaussian, but basically says the standard deviation gets a bit bigger as the values get further away from the mean; specifically,

T(t) = \frac{1}{\sqrt{2 \pi} \sigma_t}e^{-\frac{1}{2} \frac{(t-t_0)^2}{\sigma^2_t (1+[f (t-t_0)/t_0]^2)}}

in other words, almost a Gaussian, but f (the fat-parameter) stretches out the standard deviation when t-t_0 gets large (relative to t_0).
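As a rough sketch, the fat-Gaussian can be written as a small function (the name fat_gaussian is mine). One caveat worth flagging: the Gaussian prefactor is only an approximate normalization once f is non-zero, because the exponent levels off at -t_0^2 / (2 \sigma_t^2 f^2) far from the mean instead of falling without bound. For small f that plateau is negligible, but any calculation that needs a properly normalized density should renormalize numerically over the range of interest.

```python
import numpy as np

def fat_gaussian(t, t0, sigma_t, f):
    """Gaussian-like density whose effective width grows away from t0.

    f = 0 recovers an ordinary Gaussian. For f != 0 the prefactor is only
    approximately normalizing, so renormalize on a grid if that matters.
    """
    stretch = 1.0 + (f * (t - t0) / t0) ** 2
    z2 = (t - t0) ** 2 / (sigma_t ** 2 * stretch)
    return np.exp(-0.5 * z2) / (np.sqrt(2.0 * np.pi) * sigma_t)
```

Setting f = 0 gives back the ordinary Gaussian, and any f > 0 leaves the peak unchanged while raising the density in the tails.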

If I fit this model for the 3 parameters, t_0, \sigma_t, f, here are the results

[Table: tailsTable]

Let me look at the quantity and time frame with the largest value of the fat-parameter, f, which is OBP for 1993-2005. During this time, Barry Bonds had a 0.559 OBP in 2443 PA. Using the Gaussian approximation, I would regress by about 230 PA and estimate his true talent as,

t = \frac{0.559 \times 2443 + 0.347 \times 230}{2443 + 230} = 0.541
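The same arithmetic as a tiny helper, in case it is useful for plugging in other players (the function name is just for illustration):

```python
def regress_to_mean(observed, n_obs, prior_mean, n_regress):
    """Weighted average of the observed rate and the population mean.

    n_regress acts like a number of extra league-average plate appearances.
    """
    return (observed * n_obs + prior_mean * n_regress) / (n_obs + n_regress)

bonds = regress_to_mean(0.559, 2443, 0.347, 230)
print(f"{bonds:.3f}")  # prints 0.541
```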

on the other hand, if I use the fat-Gaussian as my true talent distribution and compute the mean of the posterior probability distribution, I get

t = 0.553
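For anyone who wants to try this kind of posterior-mean calculation, here is a minimal sketch on a grid. Several things in it are assumptions of the sketch rather than the setup behind the numbers above: the noise is modeled as a binomial likelihood in OBP, \sigma_t is backed out of the 230 PA figure using n_regress = t_0 (1 - t_0) / \sigma_t^2, and f = 0.5 is only a placeholder, not a fitted value. It should therefore not be expected to reproduce 0.553; it only shows the mechanics and the direction of the effect.

```python
import numpy as np

def posterior_mean(obs_rate, n, t0, sigma_t, f, n_grid=20001):
    """Posterior mean of true talent t on a grid over (0, 1).

    Assumptions: binomial likelihood for the observed rate, and the
    fat-Gaussian (unnormalized) as the prior. Normalization cancels
    in the ratio, so the prior's prefactor is not needed.
    """
    t = np.linspace(1e-4, 1.0 - 1e-4, n_grid)
    k = obs_rate * n  # expected number of successes (times on base)
    log_like = k * np.log(t) + (n - k) * np.log1p(-t)
    stretch = 1.0 + (f * (t - t0) / t0) ** 2
    log_prior = -0.5 * (t - t0) ** 2 / (sigma_t ** 2 * stretch)
    log_post = log_like + log_prior
    w = np.exp(log_post - log_post.max())  # subtract max to avoid underflow
    return np.sum(t * w) / np.sum(w)

t0 = 0.347
sigma_t = np.sqrt(t0 * (1.0 - t0) / 230.0)  # assumed link to the 230 PA figure
gauss_est = posterior_mean(0.559, 2443, t0, sigma_t, f=0.0)
fat_est = posterior_mean(0.559, 2443, t0, sigma_t, f=0.5)  # f is a placeholder
print(f"Gaussian prior: {gauss_est:.3f}, fat-Gaussian prior: {fat_est:.3f}")
```

With f > 0 the prior pulls less strongly on an extreme observation, so the estimate lands closer to the observed rate than in the pure Gaussian case.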

So that’s about as extreme a difference as the “fat” tail of OBP can make. I started off this project thinking about Pedro’s 5-year peak of FIP-, which I’m taking to be 1999-2003, during which he had a 43 FIP- and 3644 batters faced. Based on the value of \sigma_t, the number of PAs to regress is \approx 220 and the regressed answer would be,

\mathrm{FIP-} = 46.1177

on the other hand, with f at 2 times the standard deviation of the maximum likelihood value (f = 2 \times 0.019) it would be,

\mathrm{FIP-} = 46.1153.

Here is how Pedro’s true talent estimate would vary as a function of the fat-parameter, f.

[Figure: Pedro’s true talent estimate (FIP-) as a function of the fat-parameter, f]

So in conclusion, this work shows no substantive impact due to fat tails on FIP-. For OBP and wOBA, it may make a difference on the order of 10 points.
