April 16, 2012; San Jose, CA, USA; San Jose Sharks center Joe Thornton (19) skates with the puck against the St. Louis Blues during the first period of game three of the 2012 Western Conference quarterfinals at HP Pavilion. St. Louis defeated San Jose 4-3. Mandatory Credit: Jason O. Watson-US PRESSWIRE
P+CORP = (P/60 + Avg NHL Replacement Level P/60) + (Corsi Rel + Avg NHL Replacement Level Corsi Rel) / [coefficient] + (Corsi On + Avg NHL Replacement Level Corsi Rel) / [coefficient]
P+CORP (I leave the pronunciation to your discretion) wasn't something I went looking for; it was really a consequence of defining NHL replacement level talent. It's a holistic stat forged under KISS principles that bloomed before me as an inevitable consequence of my previous post. Following that post I began to lament the fact that hockey doesn't have an equivalent of baseball's OBP+SLG. The concept seemed so perfect to me. OBP represents a player's ability to set up his team to score, as getting on base preserves an out and puts a runner in position to score. SLG represents a player's ability to produce, that is, to generate runs if teammates are in scoring position. At that moment it occurred to me (and I honestly feel pretty stupid for not putting this together sooner) that we have two stats with a similar scope.
A player's Corsi tells us about possession. If a skater has great possession numbers, by inductive reasoning we can also say that he's putting his team in position to generate scoring chances and, in all likelihood, limiting the opposition's scoring chances. P/60 represents a player's ability to produce. Although a little confounded by PDO, it is still a good measure of a player's ability to convert on those opportunities. I think many of us had unconsciously connected these two stats when evaluating players, but now I present them under one heading, while fine-tuning the coefficients to give us the best results. At its heart it's a simple stat that serves as a good substitute or comparison piece to the current gold standard of player evaluation, GVT. The beauty is in its ease of calculation, which requires looking up only 3 (or 4 for defensemen) stats from the main Behind The Net page. My underlying assumption is that ice time increases with line status, i.e. top lines are given the most ice time, but everything else falls out from detailed data collection. I didn't fine-tune these numbers so that the best players end up at the top of the list; this is evidence-based hockey analysis.
P+CORP for Forwards
Forward P+CORP = P/60 + (Corsi Rel + 7) / 20 + (Corsi On + 7) / 20
P+CORP is really built for forwards. It's an amalgam of offensive production and possession. The objective for most forwards is to score, and we directly measure that. With the addition of Corsi we measure a forward's ability to push play in the right direction, while adding reliability along the way. I used both Corsi Rel and Corsi On in an attempt to utilize their strengths and neutralize their weaknesses. Corsi Rel tends to overstate players on weak teams, while Corsi On overstates players on good teams. By using both, the cream that rises to the top is players who are either good players on average teams or average players on good teams. The elite are those with a strong Corsi On and Corsi Rel, that is, good players on good teams.
What I love the most about this stat is how accessible it is. I still consider GVT to be currently the best tool to evaluate player contribution, but I think this creates a widely available substitute that correlates very well with GVT. In addition, it is well integrated with replacement level. A score of 0 will always be our baseline replacement level talent. As P+CORP increases (or, theoretically, decreases), talent level rises (or falls) relative to replacement level.
We'll start our analysis with some simple year-over-year correlations in comparison to GVT. (I know, not the most rigorous, but sufficient because I don't have game-by-game GVT.) What we're basically doing is comparing one year's results to the following year's, and seeing whether the earlier year predicts the later one accurately. My dataset (N=1077) covers 2007-2012 (5 seasons) and includes skaters with over 450 total TOI; all stats (save GVT) are 5v5 numbers.
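As a quick illustration, the forward formula above can be written as a small function. This is only a sketch: the function name, argument names, and the example stat line are my own and hypothetical, not anything from Behind The Net.

```python
def forward_p_corp(p60: float, corsi_rel: float, corsi_on: float) -> float:
    """Forward P+CORP: production (P/60) plus two scaled possession terms."""
    # The +7 and /20 constants are taken directly from the forward formula;
    # the 7 appears to stand in for the replacement-level term of the
    # general formula, and 20 is the scaling coefficient.
    return p60 + (corsi_rel + 7) / 20 + (corsi_on + 7) / 20


# Hypothetical stat line, not a real player.
example = forward_p_corp(p60=2.0, corsi_rel=5.0, corsi_on=3.0)
print(round(example, 2))  # 2.0 + 12/20 + 10/20 = 3.1
```

Because the two Corsi terms share the same offset and coefficient, a player who is strong in only one of them still gets partial credit, which matches the "good player on an average team, or average player on a good team" reading.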
Forward Year To Year Correlations
|TOI/60||P/60||Corsi Relative||Corsi On||P+CORP||GVT|
The two most important numbers come from the r(self) and r(GVT) rows. These represent the ability of each stat (in the columns) to predict next year's results (in the rows). TOI/60 is the most repeatable (highest r(self)), which is understandable, as coaches are likely to play their top players no matter what kind of season they are having. GVT and P+CORP are equally repeatable, a good indicator of their respective reliability. We can see from the r(GVT) row that P+CORP does a reasonable job of tracking along with GVT. At this level I would say the difference between TOI/60 and P+CORP is negligible, and both predict future GVT with equal ability.
Another way to parse the data (N=281) is to look only at players who ended the season on different teams. BTN doesn't provide season splits, so it's possible and likely that some players appear on other teams due to a trade deadline acquisition but really played the majority of their time on the same team. This could disrupt the results, but we'll keep that in mind as we analyze the data.
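For readers who want to reproduce this kind of comparison, here is a minimal sketch of how r(self) and r(GVT) could be computed. It assumes the correlations are plain Pearson coefficients between a stat in one season and a target in the next; the record layout, the helper names, and all the numbers are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def year_to_year_pairs(records, predictor, target):
    """Pair each player's `predictor` in season t with `target` in season t+1.

    `records` maps (player, season_start_year) -> dict of stat values
    (a hypothetical layout chosen for this sketch).
    """
    xs, ys = [], []
    for (player, year), stats in records.items():
        following = records.get((player, year + 1))
        if following is not None:
            xs.append(stats[predictor])
            ys.append(following[target])
    return xs, ys


# Made-up numbers purely to show the mechanics.
records = {
    ("A", 2009): {"p_corp": 4.0, "gvt": 12.0},
    ("A", 2010): {"p_corp": 3.6, "gvt": 10.0},
    ("B", 2009): {"p_corp": 2.0, "gvt": 4.0},
    ("B", 2010): {"p_corp": 2.4, "gvt": 5.0},
    ("C", 2009): {"p_corp": 3.0, "gvt": 9.0},
    ("C", 2010): {"p_corp": 3.1, "gvt": 7.0},
}
r_self = pearson_r(*year_to_year_pairs(records, "p_corp", "p_corp"))
r_gvt = pearson_r(*year_to_year_pairs(records, "p_corp", "gvt"))
print(round(r_self, 3), round(r_gvt, 3))
```

With real data you would also apply the 450 TOI cutoff before pairing seasons, and for the different-team split you would keep only pairs where the player's team changed.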
Forwards Year To Year Different Team Correlations
|TOI/60||P/60||Corsi Rel||Corsi On||P+CORP||GVT|
What grabbed me immediately was how repeatable Corsi Rel was year to year when skating for a different team. I hadn't seen this before, and it really shows the importance of that stat. P+CORP uses this to its advantage and vaults over GVT in this set of data.
Lastly we'll look at % variance explained by a stat intra year. This tells us which stats give us the most information about our GVT and P+CORP results, and really unearths my biggest criticism of GVT (and PC, for that matter).
% of Variance Explained
|On-Ice Sh%||On-Ice Sv%||PDO||P/60||Corsi Rel||Corsi On||P+CORP||gvt|
The columns represent the % of variance explained by the stat in the row
As you can see, nearly 2/3 of GVT can be accounted for by TOI/60 and PDO. This is my biggest problem with attributing all of a team's successes and failures to specific players: ultimately the random (non-repeatable) elements get assigned as well. P+CORP is a rate stat, so it doesn't consider contribution over the entire year. Small samples may be a little out of whack with reality, and would require adjusting (regressing) to the mean. There is utility in the accounting approach of GVT and PC. With P+CORP we now have a reliable rate stat for comparison.
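A note on mechanics: the sketch below assumes "% of variance explained" means the squared Pearson correlation (r squared) between two stats within the same season, which is the usual reading but is my assumption here. The function name and sample values are hypothetical.

```python
from math import sqrt


def variance_explained(xs, ys):
    """Squared Pearson correlation: share of variance in ys explained by xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / sqrt(var_x * var_y)
    return r * r


# Made-up values: a perfectly linear relationship and an unrelated one.
perfect = variance_explained([1, 2, 3, 4], [2, 4, 6, 8])
unrelated = variance_explained([1, 2, 3, 4], [1, -1, -1, 1])
print(round(perfect, 2), round(unrelated, 2))
```

Under this reading, a stat that leans heavily on PDO inherits PDO's lack of repeatability, which is the concern raised above about GVT.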
Here is the obligatory table of top players that really doesn't tell you anything you don't already know. I usually prefer to filter through stats like zone% and Corsi Rel QOC, but here I adjusted P+CORP for both.
Forward Zone and Competition Adjusted P+CORP = P/60 + (Corsi Rel + 7) / 20 + (Corsi On + 7) / 20 + (Corsi Rel QOC + 1)/5 + (49 - Ozone%)/12
Career Zone and Competition Adjusted P+CORP Leaders
|NAME||TOI (>2000)||GP||P/60||Corsi Rel||Corsi On||Corsi Rel QOC||OZ%||P+CORP||Rank||Adj P+CORP||Adj rank|
I was caught in a "damned if you do, damned if you don't" scenario when adjusting for ozone% and Corsi Rel QOC. The Sedins are at the top of the unadjusted results, but after adjusting, players like Kesler end up in the top 10 despite barely cracking the top 50 unadjusted. As I mentioned before, I usually prefer filtering, but I presented it this way for brevity.
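To make the size of the adjustment concrete, here is a sketch of the zone and competition adjusted forward formula given above. The function and argument names are hypothetical, and the two stat lines are invented to show how the same on-ice numbers move with zone starts.

```python
def adjusted_forward_p_corp(p60, corsi_rel, corsi_on, corsi_rel_qoc, ozone_pct):
    """Forward P+CORP with the competition and zone-start terms added."""
    base = p60 + (corsi_rel + 7) / 20 + (corsi_on + 7) / 20
    competition = (corsi_rel_qoc + 1) / 5  # tougher opponents add value
    zone = (49 - ozone_pct) / 12           # fewer offensive-zone starts add value
    return base + competition + zone


# Same hypothetical on-ice numbers, different deployment.
sheltered = adjusted_forward_p_corp(2.0, 5.0, 3.0, 1.0, 55.0)
buried = adjusted_forward_p_corp(2.0, 5.0, 3.0, 1.0, 43.0)
print(round(sheltered, 2), round(buried, 2))  # 3.0 4.0
```

In this made-up case a 12-point swing in ozone% is worth a full point of P+CORP, which is the kind of shift that can move a heavily defensive-zone forward well up the list.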
Single Season P+CORP Leaders
|Rank||Player (NHL)||Season||P+CORP||Player (San Jose)||Season||P+CORP|
|1||ALEXOVECHKIN||2009-2010 Season||6.03||JOETHORNTON||2011-2012 Season||5.04|
|2||HENRIKZETTERBERG||2007-2008 Season||5.85||JOEPAVELSKI||2010-2011 Season||4.80|
|3||SIDNEYCROSBY||2010-2011 Season||5.70||RYANECLOWE||2010-2011 Season||4.75|
|4||PATRICEBERGERON||2011-2012 Season||5.62||PATRICKMARLEAU||2009-2010 Season||4.40|
|5||HENRIKSEDIN||2009-2010 Season||5.47||JOEPAVELSKI||2011-2012 Season||4.35|
|6||DANIELSEDIN||2009-2010 Season||5.41||KYLEWELLWOOD||2010-2011 Season||4.26|
|7||PAVELDATSYUK||2007-2008 Season||5.38||PATRICKMARLEAU||2008-2009 Season||4.18|
|8||PAVELDATSYUK||2008-2009 Season||5.37||JOETHORNTON||2007-2008 Season||4.15|
|9||ANDREWLADD||2008-2009 Season||5.37||LOGANCOUTURE||2010-2011 Season||4.15|
|10||ZACHPARISE||2008-2009 Season||5.33||JOETHORNTON||2009-2010 Season||4.12|
|11||JUSTINWILLIAMS||2007-2008 Season||5.33||JOEPAVELSKI||2009-2010 Season||4.10|
|12||MARIANHOSSA||2008-2009 Season||5.26||TORREYMITCHELL||2010-2011 Season||4.04|
|13||BRADMARCHAND||2011-2012 Season||5.22||JOEPAVELSKI||2008-2009 Season||3.71|
|14||PATRICEBERGERON||2010-2011 Season||5.18||JOETHORNTON||2008-2009 Season||3.71|
|15||ALEXANDERSTEEN||2011-2012 Season||5.16||MANNYMALHOTRA||2009-2010 Season||3.64|
|16||PAVELDATSYUK||2010-2011 Season||5.09||RYANECLOWE||2009-2010 Season||3.63|
|17||TYLERSEGUIN||2011-2012 Season||5.09||MILANMICHALEK||2008-2009 Season||3.58|
|18||SIDNEYCROSBY||2007-2008 Season||5.09||LOGANCOUTURE||2011-2012 Season||3.55|
|19||JOETHORNTON||2011-2012 Season||5.04||PATRICKMARLEAU||2010-2011 Season||3.49|
|20||ALEXBURROWS||2008-2009 Season||5.02||PATRICKMARLEAU||2011-2012 Season||3.49|
|21||SIDNEYCROSBY||2009-2010 Season||4.96||DANYHEATLEY||2009-2010 Season||3.34|
|22||JUSTINWILLIAMS||2010-2011 Season||4.92||DEVINSETOGUCHI||2008-2009 Season||3.33|
|23||JOHANFRANZEN||2008-2009 Season||4.90||JOETHORNTON||2010-2011 Season||3.05|
|24||DANIELSEDIN||2008-2009 Season||4.89||RYANECLOWE||2008-2009 Season||3.04|
|25||CLAUDEGIROUX||2010-2011 Season||4.85||MILANMICHALEK||2007-2008 Season||2.84|
No surprises for me when it comes to either column. That 2010-2011 season for the Sharks saw a lot of production from a lot of players. The fact that JT's 2011-2012 campaign outshines every other year tells you something about the role he was given last year. Wellwood's number is really interesting. He only played in 35 games, a small sample, and posted a fantastic 4.26. A year later, playing top-six minutes with the Jets, Wellwood's P+CORP? A 4.27. How's that for reliable? Both of those years are well above his career average, but maybe he has finally found a good role.
P+CORP for Defensemen
Defensemen P+CORP = P/60 + (Corsi Rel + 2) / 10 + (Corsi On + 2) / 10 + (Corsi Rel QOC + 0.5) / 2
The difference between forwards and defensemen is so substantial that nearly every stat is affected by the difference in roles, and P+CORP is no exception. A defenseman's responsibility is to prevent the opposition from producing goals. The unfortunate thing is that we don't have anything like a negative Points/60, because we can't as easily assign the value of a goal against to a specific player. However, defensemen still contribute a fair amount of offense, just not nearly as much as forwards. Consequently, we have to add another stat to P+CORP for defensemen: Corsi Rel QOC, which is our best understanding of the level of competition a defenseman faces. Normally I advocate filtering on contextual stats rather than adjusting for them outright, but the level of competition a defenseman faces is so important to the role that its incorporation is inescapable.
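The defenseman version can be sketched the same way as the forward one. Again, the function and argument names are hypothetical and the example stat line is invented; only the constants come from the formula above.

```python
def defenseman_p_corp(p60, corsi_rel, corsi_on, corsi_rel_qoc):
    """Defenseman P+CORP: production, possession, and quality of competition."""
    possession = (corsi_rel + 2) / 10 + (corsi_on + 2) / 10
    competition = (corsi_rel_qoc + 0.5) / 2
    return p60 + possession + competition


# Hypothetical stat line, not a real player.
example_d = defenseman_p_corp(p60=1.0, corsi_rel=3.0, corsi_on=1.0, corsi_rel_qoc=0.5)
print(round(example_d, 2))  # 1.0 + 0.5 + 0.3 + 0.5 = 2.3
```

Note that the Corsi coefficient is 10 here rather than 20, so possession carries twice the weight per Corsi point that it does for forwards, and competition is built in rather than applied as an optional adjustment.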
Defensemen Year to Year Correlations
|TOI/60||P/60||Corsi Relative||Corsi On||Corsi Rel QoC||P+CORP||GVT||Corsi Rel QOC adj P+CORP|
|Defensemen Adj P+CORP||1.36||1.50|
As I mentioned before, one of the strengths of holistic stats is that we can always assess value above replacement level talent. I haven't tried to standardize defensemen and forward P+CORP values. That's not to say I won't try to rectify that later, but for now I'll leave it as it is. Of note from the table above is that replacement level for defensemen was significantly higher than for forwards. Assuming that the recent data is correct, replacement level talent sat less than a standard deviation from the Adj P+CORP mean, suggesting that a few teams carry replacement level players as their 5th and 6th defensemen, and many teams as their 7th defenseman. For forwards, by contrast, replacement level is largely 4th line and role players.
What I've presented here is certainly an introduction to a working statistic. At best it is still very raw and needs much more refinement. It's built on a solid foundation of what makes hockey players successful, tracks very nicely with the gold standard of player evaluation, GVT, and is relatively easy to calculate. At the very least I hope you, loyal Fear The Fin reader, find it interesting and useful.