FanPost

Come and get your random samples! (just a slice of SV% deviation data)

Ok. I did some stuff and now I'm going to attempt to make a coherent claim out of some of it. Pointing out obvious and maybe not so obvious mistakes: more than welcome.

The question. How much confidence do we have in the claim that some goalies are truly consistent, while others are truly inconsistent? Is something called "consistency" a characteristic of a goalie?

Holiday park, in "your goalies will vary", last tackled this by suggesting we look at the variance of SV%. While HP went somewhere else with standard errors and a t-test, I'm going to stay descriptive for the moment. As Snark suggested, I think we can substantiate the claim that consistency is measurable if a goalie's standard deviation of SV% is repeatable over a certain number of games.

Put another way. Imagine goalie X. Every game X plays produces a SV% (caveat: which is an incomplete and imperfect indicator of goalie talent and performance, but it's what we got right now). Imagine the spread, distribution, or range of values this produces over time. (I know, right?)

We want to know whether this distribution is a characteristic of goalie X. We can substantiate this with evidence that the distribution is repeatable, non-random. Potentially, if we see a wide distribution of SV% values game-to-game, and find this relatively repeatable, then we may have found an inconsistent goalie. Conversely, if we observe a narrow distribution of SV% values game-to-game, and find this relatively repeatable, then we may have found a consistent goalie.

In a previous post, I had a brief glance at the 2012-2013 seasons of two goalies: Antti Niemi and Jonathan Quick. By looking at the game logs, it was straightforward to calculate the standard deviation of the game-to-game SV% for both.

Niemi = 0.049

Quick = 0.109

This suggests that Niemi was a consistent goaltender, and Quick was not. But better evidence could be gained if one started to draw random samples from the 2012-2013 season to see if the standard deviation would hold in these samples. Put it this way: if consistency really is characteristic of the goalie, our samples should reflect the population. How closely do random samples conform to the spread of the data as a whole?
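For anyone who wants to reproduce the per-game calculation, here is a minimal Python sketch. The game log in it is made up for illustration (it is not Niemi's or Quick's), and it assumes SV% is saves divided by shots against and that an aggregate SV% pools all shots across the games. Whether the sample (n - 1) or population (n) formula was used for the numbers above isn't stated, so both are shown.

```python
import statistics

# Hypothetical game log: (saves, shots_against) per game. Not real data.
game_log = [(28, 30), (19, 22), (33, 35), (24, 25), (17, 21), (30, 31)]

# Per-game SV% = saves / shots against (assumed definition).
per_game_sv = [saves / shots for saves, shots in game_log]

# Aggregate SV% pools every shot, so busy games carry more weight.
agg_sv = sum(s for s, _ in game_log) / sum(n for _, n in game_log)

# Game-to-game mean and spread treat every game equally.
gtg_mean = statistics.mean(per_game_sv)
gtg_stdev = statistics.stdev(per_game_sv)    # sample formula (n - 1)
gtg_pstdev = statistics.pstdev(per_game_sv)  # population formula (n)

print(f"agg SV%       {agg_sv:.3f}")
print(f"gtg mean SV%  {gtg_mean:.3f}")
print(f"gtg stdev     {gtg_stdev:.3f} (sample), {gtg_pstdev:.3f} (population)")
```

Because the aggregate pools shots while the game-to-game mean weights each game equally, the two can differ, which is one plausible reason an aggregate SV% and a game-to-game mean SV% for the same goalie don't match.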

Procedure

1) Formatted the game log I already had, giving each game a number ID.
2) I decided to start off using five (5) game sample sets.
3) Generated 100 random sets of 5 numbers (I used www.randomizer.org to generate them; their form is easy to use and results downloadable). Linked these to the game log via VLOOKUP, and presto, 100 random sets of 5 SV%s. (Note: I decided not to have repeat games. I don't think this matters, but FYI for the stat heads.)
4) Fuck it. Made another 100 random sets of 5 numbers. Just in case.
5) Wtf, why not? Generated 100 random sets of 20 numbers.
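The same procedure can be sketched in Python instead of randomizer.org plus VLOOKUP. This is only an illustration under assumptions: the 40-game season below is synthetic (drawn from a normal distribution and clamped to 0-1), the function name `draw_sample_sets` is made up, and "no repeat games" is taken to mean no repeats within a single set.

```python
import random
import statistics

def draw_sample_sets(per_game_sv, n_sets=100, set_size=5, seed=None):
    """Draw n_sets random sets of set_size games, no repeats within a set."""
    rng = random.Random(seed)
    # random.sample picks by position, so a game is never drawn twice in a set.
    return [rng.sample(per_game_sv, set_size) for _ in range(n_sets)]

# Synthetic 40-game season of per-game SV% values. Not real data.
season_rng = random.Random(0)
season = [min(1.0, max(0.0, season_rng.gauss(0.92, 0.05))) for _ in range(40)]

five_a = draw_sample_sets(season, 100, 5, seed=1)   # step 3
five_b = draw_sample_sets(season, 100, 5, seed=2)   # step 4
twenty = draw_sample_sets(season, 100, 20, seed=3)  # step 5

# Each sample set yields one standard deviation.
stdevs_five_a = [statistics.stdev(s) for s in five_a]
print(len(five_a), len(five_a[0]), len(twenty[0]), len(stdevs_five_a))
```

Fixing the seeds makes the draws reproducible, which a one-off download from a web form does not.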

Results

I first looked at the spread of the 5 game samples. Each 5 game sample produced a standard deviation value. What were the minimums and maximums of the standard deviations drawn from each set of samples?

Niemi first 100 five-game samples = 0.0138 - 0.0895
Niemi second 100 five-game samples = 0.0155 - 0.0965
Niemi 100 twenty-game samples = 0.0308 - 0.0651

Here, we get a sense of the range of deviation for Niemi (recall that the actual standard deviation was 0.049). And for Quick (actual deviation was 0.109):

Quick first 100 five-game samples = 0.0095 - 0.268
Quick second 100 five-game samples = 0.0117 - 0.2721
Quick 100 twenty-game samples = 0.0383 - 0.143

Obviously the 20-game samples produce a tighter concentration around the actual deviation. That should happen given the sample size. Notice also that the spread for Quick was much greater than for Niemi.
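To see why the 20-game ranges come out tighter, here is a small sketch that computes the minimum and maximum sample standard deviation for 5-game and 20-game sets. The season is synthetic (not either goalie's game log), and the helper name `stdev_range` is made up for the example.

```python
import random
import statistics

def stdev_range(values, n_sets, set_size, rng):
    """Return (min, max) of the standard deviations across n_sets random sets."""
    stdevs = [
        statistics.stdev(rng.sample(values, set_size)) for _ in range(n_sets)
    ]
    return min(stdevs), max(stdevs)

rng = random.Random(42)
# Synthetic 40-game season of per-game SV% values, clamped to 0-1. Not real data.
season = [min(1.0, max(0.0, rng.gauss(0.92, 0.05))) for _ in range(40)]

lo5, hi5 = stdev_range(season, 100, 5, rng)
lo20, hi20 = stdev_range(season, 100, 20, rng)

print(f"full season stdev: {statistics.stdev(season):.4f}")
print(f"5-game sets:  {lo5:.4f} - {hi5:.4f}")
print(f"20-game sets: {lo20:.4f} - {hi20:.4f}")
```

A 20-game set drawn without repeats covers a large share of a short season, so its standard deviation has much less room to wander than a 5-game set's.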

For Niemi, in the first set of 100 five-game samples, the standard deviation for game-to-game SV% was 0.045. In the second set of 100 five-game samples, the standard deviation was 0.046. In the set of 100 twenty-game samples, the standard deviation was 0.049. (Breaking down the 100 samples further into sets of 10 discloses a pretty tight concentration around 0.049.)

For Quick, it's a different story. First set of 100 five-game samples = 0.0915. Second set = 0.0808. Twenty-game samples = 0.1018. These are relatively far from the actual deviation of 0.109. Here you go. Tables.

niemi                2012-2013 (rs)   5 game (ss)   5 game2 (ss)   20 game (ss)
agg SV%              0.924
gtg mean SV%         0.922            0.923         0.919          0.922
gtg mean SV% stdev   0.049            0.045         0.046          0.049

Table 1. Niemi results.

quick                2012-2013 (rs)   5 game (ss)   5 game2 (ss)   20 game (ss)
agg SV%              0.902
gtg mean SV%         0.883            0.876898      0.885922       0.8821935
gtg mean SV% stdev   0.109            0.091451503   0.080837331    0.101756477

Table 2. Quick results.

Now, my interpretation of this is that this is weird, and maybe really problematic. If a goalie is inconsistent, his inconsistency should be repeatable.

Consider the set of 100 twenty-game samples I created for Quick. Each sample consisted of 20 random games from the 2012-2013 season. If he's inconsistent, a random half-season of games should give a result that reflects this inconsistency (measured by a standard deviation). I did this 100 times to make sure that this was repeatable. After 100 samples, the deviation was still off by about 0.007. That may seem insignificant, but for Niemi, the 20-game samples hit his deviation right on the nose. It could mean that inconsistency is harder to identify than consistency, but I don't know about that.
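To put the "off by about 0.007" in context, here is a short sketch that computes the absolute and relative gap between each sample-set standard deviation and the full-season value. The numbers are copied from Tables 1 and 2; the variable names are just for the example, and Niemi's values are only available to three decimals, so his gaps are rough.

```python
# Full-season game-to-game stdev, from Tables 1 and 2.
full = {"Niemi": 0.049, "Quick": 0.109}

# Sample-set stdevs, from Tables 1 and 2.
samples = {
    "Niemi": {"5 game": 0.045, "5 game2": 0.046, "20 game": 0.049},
    "Quick": {"5 game": 0.091451503, "5 game2": 0.080837331, "20 game": 0.101756477},
}

gaps = {}
for goalie, sets in samples.items():
    for label, value in sets.items():
        abs_gap = full[goalie] - value     # how far below the full-season value
        rel_gap = abs_gap / full[goalie]   # same gap as a share of that value
        gaps[(goalie, label)] = (abs_gap, rel_gap)
        print(f"{goalie:5s} {label:8s} gap = {abs_gap:.4f} ({rel_gap:.1%})")
```

On a relative basis, Quick's twenty-game gap is under 7 percent of his full-season deviation, and Niemi's five-game sets miss by roughly 6 to 8 percent of his, so the two goalies may be less different on this measure than the raw gaps suggest.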

I could keep going by grabbing more goalies from 2012-2013 (and then grabbing more years of data). But am I reading this properly? Did I produce a "repeatable" deviation result with Niemi? If so, it appeared to work less well with Quick. I'd be really interested to hear your diagnoses. More sets of 100 twenty-game samples for Quick? Increase the sample size? I figured 100 would be good enough, but I don't know the burgeoning conventions for hockey statistics.

This item was created by a member of this blog's community and is not necessarily endorsed by Fear The Fin.
