Playoff Probabilities: Applying Score-Adjusted Fenwick%

OTTAWA, ON - JANUARY 26:  S.J. Sharkie, mascot for the San Jose Sharks, poses for a portrait during 2012 NHL All-Star Weekend at Ottawa Convention Centre on January 26, 2012 in Ottawa, Canada.  (Photo by Matt Zambonin/Freestyle Photo/Getty Images)

Before we get rolling with the post, I want to introduce myself. Recently I've fan-posted a few stats-related articles and also spent some time writing for BTN (now AIC). I plan to contribute some advanced stats analysis of the Sharks here at FTF, and some material on league-wide metrics as well. I'll try to bring highly relevant, interesting stats-related material regarding this year's team. If you want to see something specific, feel free to let me know. I hope you enjoy it!

A few weeks ago I started compiling a few different models that would look at playoff probabilities, in part because I think there is still considerable debate going on in the advanced stats community about what exactly is the best metric for predicting future success (Point% specifically), and also because I wanted to see what the potential playoff match-ups would look like. We're basically killing two birds with one stone: setting up an "experiment" to see which variable predicts future success the best, while also looking at current team rankings through playoff probabilities. Let's look at the data through the lens of the currently most predictive model, score-adjusted Fenwick%.



Western Predicted Final Standings

Final Standings  Team                   Score Adj Fenwick%  Playoff Probability  Mean Points  Mean Wins  Mean Ties  Mean Losses
1                Detroit Red Wings      56.02               99.9%                109.68       52.45      4.79       24.77
2                Vancouver Canucks      52.79               99.1%                104.46       48.22      8.03       25.75
3                San Jose Sharks        52.72               97.9%                102.66       46.21      10.25      25.55
4                St. Louis Blues        55.08               99.9%                107.89       48.47      10.94      22.58
5                Chicago Blackhawks     52.81               98.3%                102.49       46.29      9.91       25.80
6                Los Angeles Kings      51.00               78.3%                94.93        40.50      13.92      27.58
7                Nashville Predators    46.93               74.9%                94.56        43.36      7.84       30.80
8                Dallas Stars           50.22               48.7%                90.27        42.04      6.18       33.78
9                Colorado Avalanche     50.55               42.0%                89.67        41.92      5.83       34.25
10               Phoenix Coyotes        49.38               23.6%                86.70        37.39      11.92      32.69
11               Minnesota Wild         45.32               19.4%                86.14        37.59      10.95      33.45
12               Calgary Flames         48.10               16.6%                85.38        37.73      9.93       34.35
13               Anaheim Ducks          47.08               0.9%                 76.96        32.92      11.12      37.96
14               Edmonton Oilers        48.37               0.5%                 75.63        33.29      9.05       39.66
15               Columbus Blue Jackets  49.45               0.0%                 67.19        28.59      10.02      43.40



Eastern Predicted Final Standings

Final Standings  Team                   Score Adj Fenwick%  Playoff Probability  Mean Points  Mean Wins  Mean Ties  Mean Losses
1                Boston Bruins          52.82               99.8%                107.15       50.44      6.27       25.29
2                Pittsburgh Penguins    54.90               99.1%                102.95       47.50      7.95       26.55
3                Florida Panthers       48.88               67.7%                92.37        38.58      15.20      28.21
4                New York Rangers       48.48               98.5%                102.76       47.24      8.28       26.48
5                Philadelphia Flyers    51.43               98.4%                102.40       46.61      9.18       26.21
6                Ottawa Senators        49.91               73.5%                93.44        41.87      9.69       30.43
7                New Jersey Devils      50.23               70.8%                92.85        42.84      7.18       31.99
8                Washington Capitals    48.67               63.0%                91.62        42.22      7.19       32.59
9                Toronto Maple Leafs    48.93               52.9%                90.31        40.63      9.06       32.31
10               Winnipeg Jets          51.15               35.2%                87.62        38.85      9.93       33.23
11               Montreal Canadiens     49.77               17.1%                84.48        35.70      13.09      33.22
12               New York Islanders     49.62               9.7%                 82.40        35.60      11.20      35.20
13               Tampa Bay Lightning    48.45               8.9%                 82.00        36.90      8.19       36.91
14               Buffalo Sabres         48.06               3.3%                 79.51        35.24      9.03       37.73
15               Carolina Hurricanes    48.43               2.0%                 78.44        32.81      12.83      36.37

Grab your super nerd goggles and join me at the end of the post for a detailed explanation of this and other models.

A full spreadsheet of this model can be found here.

For Sharks fans: San Jose comes out looking pretty good. They make the playoffs in about 98% of the simulations and win the Pacific Division in 79% of sims. They land the 2nd or 3rd seed in 71% of sims, while they probably won't take the Presidents' Trophy or the Western Conference crown, claiming those titles in only 5% and 8% of sims, respectively.

The West is boasting some great teams (again) this year, with a sharp drop-off outside of the big 5. It looks like DET, STL, VAN, CHI, and SJS are all but locks for the playoffs this year. After that it gets pretty interesting. LA, NSH, and DAL look like front-runners for the remaining spots, but it's open, with LA and NSH each missing the playoffs in roughly 1/4 of the simulations. Of note, NSH has a terrible adj-Fenwick%, but they've won so many games already that they have a significant shot at the playoffs. I'm sure most Sharks fans would love seeing NSH come to the Tank this April. LA is a bit the opposite. Despite a strong adj-Fenwick%, their early-season woes have left them fighting for a playoff spot, which may come down to the final week of the season. Also, it's interesting to see MIN's numbers. They carry such a low Fenwick% that the model predicts them to make the playoffs in only 19% of sims despite currently clinging to the 8th spot. Unfortunately for CLB fans, in the 100,000 sims I ran, they never made the playoffs, not even once.

The East similarly has 4 top teams, with a drop-off after that. WPG fans can take note that they make the playoffs in about 1/3 of sims. Their underlying numbers are really quite good; for whatever reason they just haven't gotten the bounces this year (kinda ironic in Arctic Ice Hockey's first year?).

The race for the Presidents' Trophy should get really heated toward the end of the season as well. The real front-runners are DET, STL, and BOS. They finish as the top team in 32%, 20%, and 19% of sims, respectively. That's nearly 3/4 of all sims. What else can you say about STL's turnaround year? They seem to be only getting better.

I'll try to update this in a week or so, and we can take a look at trends over that time. An obvious disclaimer here is that, as a whole, hockey just isn't that predictable a sport. Either we haven't found the key to measuring success yet, or it's just that random of a game. Take all these numbers with a grain of salt, as they will be changing throughout the rest of the season as teams get hot and cold.

***Model Specifics***

A few days ago Eric T over at BroadStreetHockey wrote a great piece about adjusting team fenwick% using score effects. This ignited a lot of articles covering the subject of score effects, and some debate over its validity (here). I decided to use Eric's data for the model I created. First, I calculated each team's score-adj fenwick% using Eric's formula

SAF = 0.089 * Fen_up_2 + 0.200 * Fen_up_1 + 0.424 * Fen_tied + 0.200 * Fen_down_1 + 0.089 * Fen_down_2 + .210

and then regressed that by 0.15 to the mean, as this is about the reliability of that stat over the average number of games played by teams this year. I then calculated the Goals/Game and Goals Against/Game for each team assuming a 0.7 Fenwick Sh%. I then ran all that through a Monte Carlo Simulation of the rest of the season 100,000 times. At the end of each simulated season I recorded each team's standing so that probabilities could be calculated at the end. Although not shown here, I also created 2 other models in much the same way: one using Fenwick% Road Close, and the other Goal%. In April we can go back and see which of the models was the most accurate. We could theoretically choose any point in any of the past 4 seasons to do the exact same thing, which, if warranted (and time permitting), I may do.
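For readers who want to see the arithmetic, here is a minimal Python sketch of the weighting and regression steps. It is an illustration under assumptions, not the actual spreadsheet code: the function names, the dictionary keys, and the example Fenwick values are made up, "regressed by 0.15" is read as pulling each value 15% of the way toward a league mean of 50%, and the trailing constant is left as an explicit argument because the formula above includes + .210 while Eric T. notes in the comments that the constant should not be there.

```python
# Weights from the formula quoted in the post (score state -> weight).
WEIGHTS = {
    "up_2": 0.089,
    "up_1": 0.200,
    "tied": 0.424,
    "down_1": 0.200,
    "down_2": 0.089,
}


def score_adjusted_fenwick(fen_by_state, constant):
    """Weighted sum of a team's Fenwick% in each score state, plus a constant.

    `constant` is 0.210 in the formula as posted and 0.0 in the corrected
    version given in the comments, so the caller has to choose.
    """
    return sum(WEIGHTS[state] * fen_by_state[state] for state in WEIGHTS) + constant


def regress_to_mean(value, amount=0.15, league_mean=50.0):
    """Pull `value` a fraction `amount` of the way toward `league_mean`.

    Assumption: this is one reading of "regressed by 0.15 to the mean".
    """
    return league_mean + (1.0 - amount) * (value - league_mean)


# Hypothetical team: Fenwick% by score state (illustrative numbers only).
example_team = {"up_2": 48.0, "up_1": 50.0, "tied": 52.0, "down_1": 54.0, "down_2": 56.0}
example_saf = score_adjusted_fenwick(example_team, constant=0.0)
example_regressed = regress_to_mean(example_saf)
print(round(example_saf, 3), round(example_regressed, 3))
```

Because the five weights sum to 1.002, the uncorrected and corrected versions differ only by the constant, which shifts every team by the same amount.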

Comments

Really cool stuff

You might have some bookie friends in the near future

"You can't polish a turd." -- George Carlin

Co-Manager at Arctic Ice Hockey

by Bettman's Nightmare on Jan 28, 2012 9:10 AM PST

Only a 97.9% playoff probability??

That 2.1% is going to keep us Sharks fans awake at night.

Leslie Knope: "...but it has a lot of heart."
April Ludgate: "That's what people always say when something sucks."
Fear The Fin's Fifth-String Moderator and Recap Specialist !!

by idunno723 on Jan 28, 2012 9:12 AM PST

Why only even strength?

Basic question here: it seems like most of these metrics only take into account even strength situations (at least I think this is the case). Doesn’t this leave out potentially important distinctions, especially if we’re trying to measure the relative strengths of some otherwise pretty evenly matched teams (top 5 in west; top 4 in east)? Teams do differ in the number of penalties they take, as well as in the quality of the special teams. Is it enough to make a difference?

BTW, great stuff. Looking forward to more.

by craigtyle on Jan 28, 2012 9:18 AM PST

The generic advanced stat reply is

sample size, and EV has consistently been shown to be the most predictive, but I think a few other things are at play as well. Some rankings (here) take special teams into account, but it turns out that teams that are good at EV are usually also good on the PP and PK, so you don't really add all that much predictive power despite the impact special teams have on individual games. Also, how your team performs when everything is equal (strength, score, etc.) is a lot more telling of your team than when you're given an advantage.

by SnarkSD on Jan 28, 2012 10:18 AM PST

Nice stuff. Now can you give us some Stanley Cup odds? I may…

by Hawerchuk on Jan 28, 2012 9:26 AM PST

Awesome work again

Surprised by Nashville’s poor Fenwick numbers, they’ve usually looked pretty good the past few years. Maybe Marcel Goc was the key to their success? (Kidding. Well, mostly).

by The Neutral on Jan 28, 2012 9:34 AM PST

It might be interesting to look back on this at the end of the season

Explore how closely things played out to these projections, what factors might have influenced changes, etc.

Can’t say I would love the Sharks having to go through the Kings then the Canucks in the 2nd round, but that obviously can all change. Top 6 in the west seems likely to be these teams, though—which means getting even the 2nd seed could be pretty huge.

Winter. Time to eat fat and watch hockey. -- Margaret Atwood

by Timorous Me on Jan 28, 2012 10:16 AM PST

That's definitely the plan

As a Sharks fan I’m hoping we see NSH. LA is similar to (although not nearly as dominant as) CHI last year. Getting out of the first round cleanly would be a huge boost, as it looks like the 2nd round is going to be brutal no matter who you face.

by SnarkSD on Jan 28, 2012 10:24 AM PST

The relative weakness of the lower seeds (who knows what Nashville does with Suter now) makes long-term playoff success definitely more likely for the 1 and 2 seeds. Unfortunately, the Sharks’ upcoming schedule makes it harder to see them catching Detroit/Chi/StL or Vancouver.

But yes, please, bring me Nashville, bring me Dallas, bring me Colorado again in the first round!

Winter. Time to eat fat and watch hockey. -- Margaret Atwood

by Timorous Me on Jan 28, 2012 10:34 AM PST

I’ve said the same thing this season: the West as a whole isn’t as deep as it has been the last five or so years, which makes those 1 and 2 seeds pretty important if you’re a top team, especially with LA beginning to get where many expected them to be. That 4-5 matchup is going to be an absolute bear, and whoever matches up against the Kings is going to have their hands full as well.

"The world is a comedy to those that think; a tragedy to those that feel." - Horace Walpole
Fear The Fin: Where Selling Your Soul Is The Likely Solution

by Mr. Plank on Jan 28, 2012 2:07 PM PST via iPhone app

Absolutely phenomenal stuff.

"The world is a comedy to those that think; a tragedy to those that feel." - Horace Walpole
Fear The Fin: Where Selling Your Soul Is The Likely Solution

by Mr. Plank on Jan 28, 2012 2:08 PM PST via iPhone app

Good stuff.

I then calculated the Goals/Game and Goals Against/Game for each team assuming a 0.7 Fenwick Sh%. I then ran all that through a Monte Carlo Simulation of the rest of the season 100,000 times.

Did you use the Poisson distribution or something? How did you go from the four GF/GA per game numbers to a probability of each result for the individual games in the simulation?

Driving Play - The Blog with Three First Lines

by JaredL on Jan 28, 2012 3:06 PM PST

Thanks Jared, good question

After determining GF and GA numbers for each team, I then determine the Win, Tie, and SO win probability of every possible matchup. Way back when, Alan Ryder came up with some WP metrics for individual games using Poisson distributions based on G/g and GA/g. I use a modified version of that formula to come up with win probabilities for every matchup. This sets up all the data I will need to run the Monte Carlo sim. During the sim, for each simulated game I generate a random number and, based on the win probability of that matchup, assign the appropriate number of points. This is done for the remaining games on the schedule. Results are recorded, and the season is repeated however many times.

I can modify each team's GF and GA, which in turn changes win probabilities for every matchup, which will in turn cause different results at the end of the Monte Carlo sim. This way each predicting variable (e.g. FRC%, G%, SAF%) can be used to determine the end result, which is playoff probabilities (and more important here, mean points).

by SnarkSD on Jan 28, 2012 10:12 PM PST
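The procedure described in this reply can be sketched roughly as follows. This is not SnarkSD's spreadsheet or Alan Ryder's formula, neither of which is given here. It swaps in a plain independent-Poisson model, and everything specific in it is an assumption made for illustration: averaging one team's GF with the opponent's GA to get an expected goal rate, settling games tied after regulation by coin flip, breaking ties in the standings at random, ignoring division seeding, and all of the names and numbers.

```python
import math
import random


def poisson_pmf(k, lam):
    """Probability of exactly k goals when goals arrive at mean rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def matchup_probs(gf_a, ga_a, gf_b, ga_b, max_goals=15):
    """Return (A wins in regulation, tied after regulation, B wins in regulation)."""
    lam_a = (gf_a + ga_b) / 2.0  # assumed blend of A's offence and B's defence
    lam_b = (gf_b + ga_a) / 2.0
    goals = range(max_goals + 1)
    pa = [poisson_pmf(k, lam_a) for k in goals]
    pb = [poisson_pmf(k, lam_b) for k in goals]
    a_win = sum(pa[i] * pb[j] for i in goals for j in range(i))
    tie = sum(pa[i] * pb[i] for i in goals)
    b_win = sum(pa[i] * pb[j] for i in goals for j in range(i + 1, max_goals + 1))
    total = a_win + tie + b_win  # renormalise for the truncated tail
    return a_win / total, tie / total, b_win / total


def simulate_season(teams, schedule, points, n_sims=10_000, playoff_spots=8, seed=0):
    """teams: name -> (GF/game, GA/game); schedule: remaining (a, b) games;
    points: name -> points banked so far.
    Returns (playoff probability by team, mean final points by team)."""
    rng = random.Random(seed)
    probs = {(a, b): matchup_probs(*teams[a], *teams[b]) for a, b in set(schedule)}
    made = {t: 0 for t in teams}
    total_points = {t: 0.0 for t in teams}
    for _ in range(n_sims):
        pts = dict(points)
        for a, b in schedule:
            a_win, tie, _ = probs[(a, b)]
            r = rng.random()
            if r < a_win:
                pts[a] += 2
            elif r < a_win + tie:
                # Tied after regulation: a point each, coin flip for the extra one.
                pts[a] += 1
                pts[b] += 1
                pts[a if rng.random() < 0.5 else b] += 1
            else:
                pts[b] += 2
        # Rank by points; equal point totals are ordered at random (assumption).
        ranked = sorted(teams, key=lambda t: (pts[t], rng.random()), reverse=True)
        for t in ranked[:playoff_spots]:
            made[t] += 1
        for t in teams:
            total_points[t] += pts[t]
    return ({t: made[t] / n_sims for t in teams},
            {t: total_points[t] / n_sims for t in teams})


# Tiny made-up league: three teams, one playoff spot, 20 games left per team.
demo_teams = {"A": (3.2, 2.4), "B": (2.6, 2.6), "C": (2.4, 3.2)}
demo_schedule = [("A", "B"), ("A", "C"), ("B", "C")] * 10
demo_probs, demo_points = simulate_season(
    demo_teams, demo_schedule, {t: 0 for t in demo_teams},
    n_sims=1000, playoff_spots=1)
print(demo_probs, demo_points)
```

Swapping in a different predictor only changes the GF/GA inputs; the matchup table and the simulation loop stay the same, which is what makes it possible to compare predictors on equal footing.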

I can’t find it written but how many seasons is the score adj fenwick based on?

by jeffgm on Jan 29, 2012 3:15 PM PST

I’ll ask the question when I’m approved on Broadstreet but how is the predictive reliability for 60 games calculated? I’d expect the weights to vary heavily season to season.

by jeffgm on Jan 29, 2012 4:46 PM PST

1) SnarkSD, this is very nice. Good job.

2) I answered this over on BSH, but will clarify here too since you’re probably not the only one I confused. The reliability calculations are based on the previous three seasons, not just a single season. The weights were determined empirically from the league average TOI, and then tested against three seasons’ results (30 teams x 3 years = 90 data points).

3) Unfortunately, the formula “SAF = 0.089 * Fen_up_2 + 0.200 * Fen_up_1 + 0.424 * Fen_tied + 0.200 * Fen_down_1 + 0.089 * Fen_down_2 + .210” is not quite right; that came from an arithmetic failure in the comments. The correct formula is 0.089 * Fen_up_2 + 0.200 * Fen_up_1 + 0.424 * Fen_tied + 0.200 * Fen_down_1 + 0.089 * Fen_down_2 (with no +0.210 at the end). I hope the spurious constant doesn’t affect your results.

4) I work in San Jose and go to Sharks games with some regularity. Anyone want to talk Fenwick at the Flyers game on 2/28?

by Eric T. on Jan 30, 2012 1:39 PM PST

TOI weighting over the past few seasons clears up my question.
This is great work.

by jeffgm on Jan 30, 2012 4:32 PM PST

Thanks, I appreciate it. And kudos to Snark for the slick simulation fun.

by Eric T. on Jan 30, 2012 5:01 PM PST

Man I'd love to go Eric

unfortunately I’m living in San Diego….

by SnarkSD on Jan 30, 2012 8:29 PM PST

There’s a lot of different ways to do it, but it basically comes down to auto-correlation [aka r(self)]. Most people do this by splitting every team’s games into 2 data sets, e.g. set A contains all even games, and set B contains all odd games. Your variable of interest is then correlated between the 2 data sets. Obviously, as you’ve pointed out, split-half would only get you to 41 games. To get around this you can use formulas that use all data points to calculate correlations, e.g. ICC or Cronbach’s alpha. Here and here are some good places to start for the theory behind reliability.

by SnarkSD on Jan 29, 2012 8:53 PM PST
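The odd/even split described in this reply can be sketched in a few lines of Python. The data layout (a list of per-game Fenwick events for and against for each team) and all function names are assumptions for illustration. The last function is the standard Spearman-Brown step-up, included as one textbook way to project a half-length correlation to full length; it is not the ICC or Cronbach's alpha approach mentioned above.

```python
import math


def fenwick_pct(games):
    """Fenwick% over a list of (events_for, events_against) game tuples."""
    ev_for = sum(g[0] for g in games)
    ev_against = sum(g[1] for g in games)
    return 100.0 * ev_for / (ev_for + ev_against)


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def split_half_reliability(games_by_team):
    """Correlate each team's Fenwick% in odd-numbered games with even-numbered games."""
    odd = [fenwick_pct(games[0::2]) for games in games_by_team.values()]   # games 1, 3, 5, ...
    even = [fenwick_pct(games[1::2]) for games in games_by_team.values()]  # games 2, 4, 6, ...
    return pearson_r(odd, even)


def spearman_brown(r_half):
    """Project a half-length correlation to the full number of games."""
    return 2.0 * r_half / (1.0 + r_half)
```

With real data, `games_by_team` would hold each team's game log in schedule order, and the result answers how much of the spread between teams shows up again in an independent set of games.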

Thanks I follow now.
I still don’t follow why the analysis is accepted if it is based on the current or single season?

The weights seem to be chosen to maximize the Rsquared/reliability in one given season. And I’d expect there will be sizeable variation in those weights applied to the various fenwicks in other seasons if we are maximizing Rsquared. Alternatively, we can optimize the weights for various seasons and see what the outcome is for Rsquared (at least this was my experience when working on multi-regression analysis with season to season data).

by jeffgm on Jan 30, 2012 12:46 PM PST

Yeah, Jeff, sorry if I didn't understand your question at first

The regression equation is based on 4 seasons' worth of data, what we believe the true r(self) after “x” games of score-adj Fenwick% to be. I thought you were talking about each team’s score-adj Fenwick% originally.

You're right that there would be sizable variation year to year. It’s tough to say how far back you should go. I think most people choose the 2007-2008 season, as that’s when BTN’s data collection began, and I believe there was also a change in how the NHL recorded games.

Interestingly, I found a few other sites that you might find helpful. What’s discussed there is the basis of much of my reliability estimates.

by SnarkSD on Jan 30, 2012 1:37 PM PST

great links…I’ve been thinking about if the tango regression can be used to determine if there is a “skill” contribution in addition to the assumed “luck” contribution in PDO. Sample size seems to be the big limitation with PDO (SH% and SV%).

by jeffgm on Jan 30, 2012 4:38 PM PST

You could. A big reminder with that is to always make sure your sample is (in theory at least) a binomial distribution. So, you could use Sh% as a surrogate. JaredL found that it’s about 50% luck after about a season and a half.

by SnarkSD on Jan 30, 2012 8:28 PM PST

When I looked at split-half reliability in the article, I just did a simple regression of Fenwick in the odd-numbered games against the even-numbered games.

But the majority of the effort, and the claims about predictive power, are based on a regression of Fenwick-up-to-game-x against points-after-game-x.

So I did a regression of Fen-in-the-first-20 against points-in-the-last-62 (for both Fen Tied and Score Adjusted Fen), and I did the same for Fen-first-30 vs points-last-52 and for Fen-first-41 vs points-last-41 and for Fen-first-60 vs points-last-22. In each of those regressions, the inputs were 90 teams’ (3 years x 30 teams) Fenwick and the outputs were those 90 teams’ points.

Sorry if that wasn’t clear. And if it still isn’t clear, keep asking, and I’ll try to figure out how to explain it better.

by Eric T. on Jan 30, 2012 1:45 PM PST
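A regression of this kind (Fenwick through game x against points after game x) can be sketched as an ordinary least-squares fit with one predictor. The function name and the four data points in the comment below are made up for illustration; the real inputs described above were 90 team-seasons.

```python
def linear_fit(xs, ys):
    """Least-squares line ys ~ slope * xs + intercept, plus R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical use: one entry per team-season.
# fen_first_41  = [Fenwick% through game 41 for each team-season]
# pts_last_41   = [points earned in games 42-82 for the same team-seasons]
# slope, intercept, r2 = linear_fit(fen_first_41, pts_last_41)
```

Running the same fit for Fen Tied and for Score Adjusted Fen at each cutoff, then comparing the R-squared values, is one way to read the comparison described above.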

Applying this to the SEL?

Hello, is it possible for you to do a similar playoff prediction for the teams within the Swedish Elite League (aka Elitserien)?

Here are two links to official stats sites.

http://estat.hockeyligan.se/c/LTable.aspx?LId=248 (only in Swedish, I would recommend Google Translate)

http://stats.swehockey.se/ (Click on “Division” → “Elitserien” → “2011/2012” → “Elitserien” and you should find all the stats you need in English.)

by hultas on Jan 30, 2012 12:18 PM PST

If you give me your email I can send you the Excel file with all the code

You’d have to input the schedule, and whatever predictive stat you want to use, but it’s all coded after that.

by SnarkSD on Jan 30, 2012 1:37 PM PST

Email

My email is hultas7@gmail.com. Thanks in advance! Great job with this article!

by hultas on Jan 31, 2012 1:29 AM PST

The Official SB Nation blog of the San Jose Sharks.