A couple of ideas from the recent posts (Mike Chen, Preds numbers) have spurred some thoughts. (I started writing two comments and then decided oh god sunk costs so heavy do fan post.) One of the big questions that hit me is a simple one: what’s a streak? How do we know when a team is on a streak? Statistics (by my eye!) helps us organize and describe (and potentially explain, though not always) what we are seeing. At first, I thought that we only really know a streak in retrospect, not while it’s happening. But maybe I’m wrong about this.
This led me to wonder further about Corsi for (Corsi+) and scoring chances for (scoring+). Perhaps it is obvious to say that these two statistics are highly correlated (seems to hold, see here), but I wondered what the “happy” ratio would be for a winning team, and for the Sharks in particular. The aforementioned link looks at the comparative utility of tracking shot differential versus scoring chance differential for game outcomes (bottom line: no statistically significant difference, shots tell us enough), but I haven't seen anything that uses the ratio of shots toward the net (Corsi for events) to scoring chances as a measure of efficiency (one could also calculate the ratio of Corsi for to shots on goal (SOG), but I’m not sure this is as good).
(Note: Underlying this proposition is the idea that Corsi counts a lot of events, some of which are more helpful in leading to a goal than others. A scoring chance is the closest thing we have to a goal besides an actual goal. So, I’m assuming the ratio to be a way to estimate Corsi event quality in hockey practice. Now, there is a live debate over “shot quality” (see here and here), but those arguments are typically over whether or not it is a “repeatable” team skill, and they look at shooting percentages. I’d be happy to know if someone else has already profitably tilled this rather arid patch of soil/panned this barely flowing creek for golden (but not really) flakes; or, if this really is an effect of some other force. Also, this exercise does not control for possible “score effects,” which is the tendency/behavio(u)ral pattern of leading teams to allow more shots and trailing teams to take more shots. Statistics drawn only from when the score is tied can be used to manage this, but I haven’t run the numbers. Ok. Enough qualifying.)
So why calculate this ratio of Corsi+/scoring+ (where a “1” is ultimate efficiency, as each Corsi+ event is a scoring chance, and where “INSERT big number here” is crap)? Well, here’s a 2x2 table to guide us:
| ||Efficient (low number)||Inefficient (high number)|
|Win||Bifurcated into two scenarios: 1) if team wins by wide margin, then team is awesome, go all-in/bet the farm for the Cup, or 2) if team wins by low margin, ohshitohshit not really as good as we thought <hold butt>||Ha! Poopin’ gold, suckers! Just wait ‘til we all don’t have the flu!|
|Lose||Team not talented enough. <sigh>. <curse>. Scout WHL, watch Beanpot, etc.||Ohmanohmanohman. Team sucks real bad. Rant on FTF and demand: 1) DW be fired, 2) Larry Robinson to replace TMac, and 3) trade Marleau.|
Figure 1. 2x2 = science. Yup.
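For anyone who wants to play along at home, here is a rough sketch of how the ratio and the 2x2 could be coded up. The function names and the `boundary` parameter are my own inventions for illustration, and where exactly to draw the efficient/inefficient line is an open question, so treat the cutoff as an assumption rather than a finding.

```python
def corsi_per_chance(corsi_for: int, chances_for: int) -> float:
    """Corsi+ events per scoring chance: 1.0 is the ideal, bigger is worse."""
    if chances_for <= 0:
        raise ValueError("need at least one scoring chance to form the ratio")
    return corsi_for / chances_for


def quadrant(won: bool, ratio: float, boundary: float) -> str:
    """Place one game in the 2x2: outcome (Win/Lose) by efficiency.

    `boundary` is a hypothetical cutoff; ratios at or below it count
    as efficient, ratios above it as inefficient.
    """
    outcome = "Win" if won else "Lose"
    efficiency = "Efficient" if ratio <= boundary else "Inefficient"
    return f"{outcome} / {efficiency}"
```

With made-up numbers, a game with 44 Corsi+ events and 10 scoring chances gives `corsi_per_chance(44, 10)`, i.e. 4.4 pucks toward the net per chance.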
Awesome. So, what data and what results? To do this properly, we need to know game outcomes and calculate Corsi+/scoring+ (via the req’d reading). Would be best to do this for all teams (always in need of bigger N), but for the purpose of preliminary analysis (spit-balling) let’s just look at the Sharks on their two streaks. In addition (for the sake of total bull-shitting), let’s look only at regulation wins and losses. This also helps us understand if there’s something really wrong with the team or not (see lower right-hand box of the 2x2). That results in a very small sample (n=8!), but one that is maybe, just maybe, indicative of some pattern.
Descriptives for the ratio:
|Mode||5.5 (unhelpful in this instance)|
|Min/Max||3.33 - 5.5|
(Note: Assuming a normal distribution, which is a big assumption with a sample this small, about 68% of values will fall within one standard deviation (0.89) of the mean (4.33), i.e. between 3.44 and 5.22. So we can typify most results as falling within this range(...ish), and perhaps draw the high-low efficiency boundary along these two numbers.)
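The arithmetic behind that range is just mean plus or minus one standard deviation. A minimal sketch, assuming the per-game ratios sit in a plain list; `one_sd_band` is a name I made up, and I'm using the sample standard deviation here, which may or may not match what the spreadsheet did.

```python
from statistics import mean, stdev


def one_sd_band(ratios):
    """Return (low, high) = mean minus/plus one sample standard deviation."""
    m = mean(ratios)
    s = stdev(ratios)  # sample SD (n - 1 in the denominator); an assumption
    return m - s, m + s


# Reproducing the bounds quoted above from the implied mean and the stated spread.
IMPLIED_MEAN = 4.33  # midpoint of 3.44 and 5.22
SPREAD = 0.89        # the "within 0.89 of the mean" figure
low = round(IMPLIED_MEAN - SPREAD, 2)
high = round(IMPLIED_MEAN + SPREAD, 2)
```

Here `low` and `high` come out to the 3.44 and 5.22 used as the efficiency boundaries.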
Right. Check out the bottom two ratios. These, perhaps in being more than one standard deviation from the mean, suggest the Sharks were straight-up gold-poopin’. Check out the top two. These, again more than one standard deviation from the mean, suggest we should bet the farm on the Sharks. Middle four ratios give mixed results.
Bottom line: ahm... no real pattern. When they win, sometimes it looks like the Sharks are talented. Other times, perhaps more luck than skill. One notable point, though, is that their regulation losses do not appear to be the result of inability to convert pucks toward the net into actual scoring chances. This is evidence (assuming this is a repeatable team skill) that there is probably nothing fundamentally flawed with this team (other than the obvious... you know... too streaky, too soft, etc.).
Final note: just for fun, one may predict that when Corsi+/scoring+ < 3.44 we’ll see a WIN, and when it is > 5.22, we’ll see a LOSS. I am willing to admit that this could be stupid. Thoughts?
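If anyone wants to keep score on that prediction, the rule is simple enough to write down. A sketch only: `predict` is a hypothetical helper, the thresholds are the one-standard-deviation bounds from above, and with n=8 behind them this is a toy, not a model.

```python
def predict(ratio, low=3.44, high=5.22):
    """Toy rule: below `low` call a WIN, above `high` call a LOSS.

    Returns None for anything in between, where the rule makes no call.
    """
    if ratio < low:
        return "WIN"
    if ratio > high:
        return "LOSS"
    return None
```

Fed the sample's own extremes, `predict(3.33)` calls a WIN and `predict(5.5)` calls a LOSS; whether the actual games agree is exactly the thing to check.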