So I know The Neutral is a big stats guy... I am as well. I am a trained researcher in applied statistics and really enjoy this stuff. Reading about the Burish signing, I thought I could add a bit to The Neutral's comments about whether on-ice save percentage is driven by (1) randomness, (2) the quality of the players, or (3) a bit of both.
In my view, even without any empirical evidence, I would have close to 100% confidence that #3 has to be the correct choice. Now, I know that most studies by hockey stats people have found no evidence to reject the randomness hypothesis. But there is an issue with these studies that is often overlooked. The current studies either use Corsi/Fenwick (which assume all shots are created equal) or, in the more advanced studies, scoring chances (which basically eliminate poor shots and keep good ones). Both of these measures have terrible limitations.
Remove your "stats hat" for a moment and put on your "hockey fan hat". Goals occur in a few common ways over and over again: (1) a skilled player makes an amazing play/pass/shot, (2) a goalie misses an easy save, (3) a defensive player makes a bad mistake or misses an assignment, (4) a fluky bounce leads to a goal out of nowhere, (5) the offensive team scores because constant pressure wears out the defense, or (6) a team gets too aggressive and gives up a counter-attack/breakaway/odd-man rush. Now, those six options (and maybe there are more that I am missing) are all qualitatively different statistical events. Until they are measured properly, you cannot draw reasonable statistical conclusions. Quite simply, the save percentage on a clear breakaway is probably more like 70% (think shootouts), while shots from the perimeter probably lead to save percentages above 95%. A missed defensive assignment that leaves the backdoor wide open leads to an almost sure-thing tap-in goal. Shot quality measurement has to matter!
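To put rough numbers on that, here is a quick Python sketch. It only shows the arithmetic: the 70% and 95% figures are my guesses from above, and the two skaters and their shot mixes are made up for illustration, not measured data.

```python
# Assumed (illustrative) save percentages per shot type; guesses, not measured values.
SAVE_PCT = {"breakaway": 0.70, "perimeter": 0.95}

def blended_save_pct(shot_counts, save_pct=SAVE_PCT):
    """Expected on-ice save percentage for a given mix of shot types."""
    total = sum(shot_counts.values())
    expected_saves = sum(n * save_pct[kind] for kind, n in shot_counts.items())
    return expected_saves / total

# Two hypothetical skaters, each on the ice for 100 shots against.
careful = {"breakaway": 5, "perimeter": 95}
risky = {"breakaway": 20, "perimeter": 80}

print(round(blended_save_pct(careful), 4))
print(round(blended_save_pct(risky), 4))
```

With these made-up mixes, the first skater ends up around 93.8% and the second at 90%, even though the goalie is equally good on every individual shot type. A measure that counts all shots as equal cannot see that difference in shot mix.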
Things like fluky goals are probably entirely driven by randomness. Things like sustained pressure are probably well measured by player Corsi rates. Things like missed defensive assignments and giving up counter-attacks are probably the only types of scoring opportunities where a skater has a real impact on the on-ice save percentage. So, by treating all shots (or scoring chances) as equal without accounting for the situation, you are adding noise to your measures. Basically, some goals are mostly driven by randomness and some are mostly driven by the defense of the on-ice skaters. Now, you have five skaters on the ice, which further complicates measurement. If you have ever played a sport, you will know that some players are very adept at communicating with teammates to maintain defensive assignments. Again, difficult to measure. But if scoring chance data could somehow be grouped into my 6 bins (or, even better, a more comprehensive list), I think you could get a better picture of which players are allowing quality scoring chances and which are not. This would go much further than simply binning perimeter vs. non-perimeter shots.
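If someone did tag scoring chances this way, the bookkeeping could look something like the Python sketch below. Everything in it is hypothetical: the bin labels are just my six categories turned into names, the input format is an assumption, and the sample chances are invented to show the mechanics.

```python
from collections import defaultdict

# Hypothetical labels for the six goal types listed earlier in the post.
BINS = ("skill_play", "goalie_error", "defensive_error",
        "fluky_bounce", "sustained_pressure", "counter_attack")

def on_ice_save_pct_by_bin(chances):
    """chances: iterable of (skaters_on_ice, bin_label, was_goal) tuples.
    Returns {skater: {bin_label: (chances_against, save_pct)}}."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [chances, goals]
    for skaters, label, was_goal in chances:
        if label not in BINS:
            raise ValueError(f"unknown bin: {label}")
        # Every defending skater on the ice is charged with the chance.
        for skater in skaters:
            tally = counts[skater][label]
            tally[0] += 1
            tally[1] += int(was_goal)
    return {
        skater: {label: (n, 1 - goals / n)
                 for label, (n, goals) in by_bin.items()}
        for skater, by_bin in counts.items()
    }

# Invented sample: four chances against, with defending skaters A, B and C.
sample = [
    (("A", "B"), "defensive_error", True),
    (("A", "C"), "defensive_error", False),
    (("B", "C"), "fluky_bounce", True),
    (("A", "B"), "sustained_pressure", False),
]
result = on_ice_save_pct_by_bin(sample)
print(result["A"]["defensive_error"])
```

Charging every skater on the ice with each chance is the simplest possible choice and does nothing to untangle the five-skater problem. It is only meant to show how per-bin numbers would separate the "defensive_error" and "counter_attack" chances, where I think skaters matter, from the "fluky_bounce" ones, where they probably do not.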
I hope this spurs some debate, and maybe we could see someone actually attempt to measure this type of scoring chance data. Hockey is a very tough sport for statistical analysis. Baseball is very conducive to it because events are very well isolated. An interesting parallel from baseball is the stat BABIP, which means batting average on balls in play. Until recently, everyone assumed BABIP was uncontrollable. Basically, this was the randomness component in batting averages.... Then, with historical data, some studies showed that elite pitchers maintained very low BABIPs over their full careers (their pitches forced hitters to make weak contact)... More recently, it has been shown that a few elite hitters have done the opposite; these hitters managed to get on base when they put balls in play at a statistically higher clip. I think the same thing goes on in hockey, with some players (snipers) sustaining above-normal shooting percentages and some players (defensive aces) holding opponents to below-normal shooting percentages. We just need the correct measures to base our statistical conclusions upon.