Analytics and PDO

For the most part, I’ve focused this site on the past.  This will represent a departure by getting into what’s been a controversial, and decidedly modern, topic of late: “analytics”.

My qualifications are as follows:

  • I’ve embraced contrarian, or at least skeptical, thought for as long as I can remember
  • I’ve used sabermetrics to great effect, although primarily in fantasy baseball.  Then again, the leagues that I’ve played in aren’t exactly havens for casual players; I only play in leagues with hardcore fans who know their stuff.  This has allowed me the opportunity to see up close what can and can’t be used with any sort of predictive value, which is pretty damned important.
  • I’ve devised my own metrics and algorithms, some over periods of years, and thus am very familiar with the ins and outs of exactly what it means to do so.  This has been limited more to football than to other sports, which requires a different type of thinking than either baseball or hockey.

I don’t know how many people reading this have actually devised their own metrics or will attempt to do so, but believe me when I say that creating good metrics results in experiencing every type of emotion that a human possibly can.  There’s the frustration of spending a few dozen hours compiling something and then realizing that it fails the most basic test of analysis, and the bemusement of having some throwaway thought while taking a shower that turns out to be the breakthrough you need.

The big knock against me according to some, the anti-qualification if you will, is that I’m an Ohioan, therefore a Midwesterner, and there are some people who seem to think that this makes me a rube who is unworthy to sit at the table with the real smart ones.  After all, if I knew the first thing about numbers or analysis or data, surely I would be at the forefront of a movement to embrace certain analytics in hockey; to not do so simply brands me as an unrefined person, dazzled by the “ancient” numbers like goals for and goals against that all thinking people must reject.

It’s funny to think that I can go absolutely all-in with a brand new baseball metric in 2002 and be regarded as some type of anti-traditionalist weirdo, then largely roll my eyes at what passes for publicly-available analytics in hockey and be regarded as some type of backwards rube.  It’s not that I particularly care about the labels, but rather the inconsistent manner in which these things get thrown around.  If someone wants to call me a rube, be my guest.  A few days ago I spent a chunk of my morning on a tractor in the biting cold Ohio winter, and then came in to some hot coffee and compiled the research and the writing for my Petr Buzek All-Star write-up.  If my morning activities or my geographic location are perceived by some as inherently detrimental to my ability to look at hockey (or anything else) objectively, that’s the fault of their bigoted selves.  Perhaps I should have come in, looked at my diploma from an Ohio high school and my degree from an Ohio college and gone back to dipping my own candles or whatever the hell it is that people think we do here.

Now that that’s out of the way, let’s get down to business.

What are analytics?

In theory, analytics in hockey refers to the use of statistics-based analysis in an attempt to see “the game beyond the numbers”.  The inherent irony is in using different numbers, or at least different computations, to reject other numbers or computations as insufficient.

“Regular” hockey stats for players are limited to things like goals, assists, +/-, penalty minutes, and shots taken.  “Regular” team stats will include those, and add in things like power plays, power play goals, power play percentage, times shorthanded…you get the idea.

In recent years, there has been a greater emphasis on other metrics like PDO, Corsi, and Fenwick.  These represent different computations of regular stats that try to build a more accurate picture of what takes place during a game and to identify players who may be more or less valuable than generally thought.

In practice, analytics is more often used by people who are unfamiliar with the nuances and, more importantly, the inherent limitations of these numbers.  I can speak from a particular type of experience because I’ve devised and used my own metrics.  Again, this should be tempered with the knowledge that, as an Ohioan/Midwesterner, I’m probably some rube who’s intrinsically less important.

Where and when did analytics in hockey begin?

Good question, and I don’t know for sure.  Coming up with and using different metrics isn’t anything new.  I do remember subscribing to The Hockey News for many years when I was younger, and they liked to tout their IQ ratings.  I think it was short for Intensity Quotient, and the formula was [(Goals x 3) + (Penalty Minutes - Misconduct Minutes)].  It was used to rank the power forwards in the game at the time, which was seeing a bumper crop with the arrival and emergence of Roenick, Tkachuk, Lindros, and others in addition to veterans like Neely and Shanahan.
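Since the formula is right there, the entire “metric” fits in a few lines of Python (the stat lines below are invented for illustration, not anyone’s real season totals):

```python
def thn_iq(goals: int, penalty_minutes: int, misconduct_minutes: int) -> int:
    """The Hockey News' old Intensity Quotient, as I remember it:
    (goals x 3) plus penalty minutes, with misconduct minutes backed out."""
    return (goals * 3) + (penalty_minutes - misconduct_minutes)

# Invented stat lines: a 50-goal power forward vs. a low-scoring enforcer.
print(thn_iq(goals=50, penalty_minutes=250, misconduct_minutes=20))  # 380
print(thn_iq(goals=10, penalty_minutes=350, misconduct_minutes=50))  # 330
```

Note how close the enforcer lands to the star; keep that in mind for the discussion of this metric’s problems further down.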

My theory is that the groundwork for actual hockey analytics was laid in the aftermath of the publication of Moneyball by Michael Lewis, which focused on sabermetrics in baseball and more specifically how it enabled the small-market Oakland A’s to remain competitive year after year despite the extreme limitations on their ability to add and retain MLB talent.  (NOTE: “Sabermetrics” is the term used for baseball analytics; it was coined by analyst Bill James in honor of SABR, a historical baseball research society that at first wanted little if anything to do with numbers.  This led to the second sabermetric wave, which [unlike the first] came in the Internet Age and thus with an explosion in the availability of numbers, in the number of platforms to host and disseminate information, and in the speed and ease with which previously tedious tasks of data compilation and calculation could be done.)

Hockey, for an extremely long period of time, had nothing for a skater beyond goals, assists, +/-, penalty minutes, and shots taken.  Time on ice was only added in 1998-99, around the same time that other things like hits, blocked shots, and faceoff data were also being added.  With more data becoming available, and the realization that maybe there was a hidden game of hockey beyond the “traditional” numbers, hockey analytics was born.

I want to emphasize that analytic thought is nothing new.  For as long as hockey has been around, meaningful thought has been given to concepts like the interactions between defensive partners, linemates, and whether certain players should start their shifts in the offensive or defensive zone.  What is new is the availability of such numbers to the general public.

Perhaps you glossed over this, so now it’s time for a quick test.  Above, I said “sabermetrics in baseball and more specifically how it enabled the small-market Oakland A’s to remain competitive”.  If you saw the word “enabled” and kept going, we have much work to do.  If you saw the word “enabled” and thought, “Hey, wait a minute.  This hasn’t enabled anyone to do a damned thing!”, then we’re on the same wavelength.

What’s wrong with “enabled”?

In anything, there is a gulf between theory and practice.  And there’s also a gulf between practice and outcome.

Sabermetrics did not “enable” the Oakland A’s to remain competitive for years; it was simply the cornerstone of an organizational philosophy that was put into practice.  When put into actual practice, sometimes it worked out extremely well, sometimes it did not, and sometimes it landed in between.  That’s the reality of life, and especially the reality of sports.  Neither life nor sports is a table game or a computer simulation, where players and events are merely lines of numbers that can be plugged in and interchanged.  Players do not decline in predictable patterns, prospects don’t always develop, and things like shifts in locker room chemistry can have enormous consequences.

When I used to play NHLPA 93 and NHL 94 all those years ago, it was with line changes off and with the best players (as determined by their overall rating) on the ice.  Sometimes this would be an actual line or an actual defensive pairing; most times it was not.  In reality, the majority of those cobbled-together lines would have been a disaster on the ice, because there wouldn’t be much chemistry and because I liked to set lines for one-timers.  Having a line with three prime goal scorers can work fine in a player-controlled simulation because there’s a person controlling it like an omniscient deity, able to view the situation from above and turn Brett Hull into a prime setup man if a situation warrants it.  Putting Pavel Bure at center for the sole purpose of creating one-timers would have been a disaster on the ice, but was a great move on a Sega Genesis.

I must point out that some very basic arguments have been analyzed time and time again over the last 40 years within baseball’s sabermetrics community.  Sometimes a greater understanding of a particular concept is gained, sometimes one is discovered, and sometimes it’s determined to be random clustering that needs further study…and then, when studied further, results in a different type of random clustering.  The existence of the clutch hitter or pitcher has been hotly debated for decades, and dozens of studies have so far proved fruitless.  That someone may hit or pitch well in certain situations is one thing, but whether someone displays non-random patterns of sustained excellence in those types of situations is quite another.

What’s wrong with analytics?

In theory, nothing.  If something helps us gain a greater understanding of a fluid game like hockey, which can appear to be largely random, then it should be studied further.  However, the big issue is that it’s impossible to pare down things as complicated as human beings with human personalities combined with the interactions of eleven other human beings on the ice into simple events and lines of numbers.

A sport like baseball is a series of mostly static events.  A batter steps up to the plate in a given situation, he does something, and it gets recorded as such.  And there are still things that get missed, or at least don’t get recorded.  Did the infield come in, expecting this weak hitter to simply slap a grounder toward second?  Did the pitcher adjust what he was throwing to account for that hitter’s weakness?  Did the catcher change his normal target because he noticed the hitter had slightly altered his stance in an attempt to send a hit to the opposite field?  Did the first baseman hug the line a bit closer, knowing that the runner on first had pulled up earlier with what looked like a minor muscle injury, could only run at about 80% speed, and was no longer a real threat to steal second?

These are all very real interactions that take place on every single pitch, none of which gets recorded as such.  Now, the effects of these cumulative actions may result in a particular play that gets recorded, or they may not.  Perhaps the hitter correctly guesses that a curveball is coming but the pitcher hasn’t had it working all day, and he adjusts and crushes a hanging curve over the left field fence for a home run.  Perhaps the first pitch is inside, but it backs the hitter off the plate because he was hit by a fastball in the ribs two weeks prior and doesn’t want to relive that.  Perhaps the guy on first tries to steal second anyway and is thrown out by a mile.  Perhaps the signals between the pitcher and catcher get crossed, and the catcher sets up for a fastball down the heart of the plate and instead watches a slider zip past him to the backstop.

Baseball is mostly a series of static events.  A hit is credited to the hitter, so a hit against is charged to the pitcher.  Grounding into a double play is recorded as such against the hitter, and recorded as a double play turned by the defense.  Hockey is extremely fluid and dynamic; the interactions that take place are much faster, involve many more and different people, and contain countless variables.  (Example: In hockey, a player may end up with random linemates as a result of game action; a center’s usual right wing may be stuck on the bench because it’s the second period and the slow right wing on the ice can’t complete the long change, while in baseball something like that simply cannot happen.)  This limits what can be analyzed, how it can be analyzed, and what the ultimate conclusions can be.

The nature of analysis is that it has to lead somewhere and toward some type of a conclusion: yes to the question, no to the question, that it needs more analysis, or that it’s a stupid question that doesn’t warrant further study.  I emphasize that there are very few questions that fall into that last category.

Here are examples of all four types of questions; since I have Petr Buzek on my mind, that’s the theme you’ll get:

  • Would Petr Buzek have had a better NHL career if not for the fact that he suffered catastrophic injuries before he was drafted, including serious knee injuries that negatively affected his speed? (Upon analysis, I’ll say with 99.9% certainty that the answer is “Yes”)
  • Was the loss of speed advantageous to Buzek’s career? (Considering the time period in which he played, including a relative scarcity of fast defensemen who could carry the puck, I’ll say that the answer is 99.9% “No”)
  • Would Buzek have been more productive if he’d had a regular defensive partner in Atlanta and a more clearly-defined role? (Needs more analysis; who did he pair with, what was his role, how and why did each change regularly?)
  • Would Buzek have been a Hall of Fame-caliber player if he were on the ice not against other NHLers, but against a handful of slow-moving spiders? (Stupid question, doesn’t warrant further study.  Might be good for a laugh if you’re at a party in which alcohol is served, since it may lead down a rabbit hole of discussing whether he would play better or worse against these spiders if he also had borderline crippling arachnophobia.)

What else?

One important point that I cannot stress enough is that teams themselves use all sorts of analysis internally that is not publicly available.  To give an example, Scotty Bowman used to maintain a list of pluses and minuses for his players, which was completely arbitrary and – as some of his old charges have said – used primarily to boost players he liked and tear down those he didn’t.  If a defenseman he liked bailed out in the offensive zone when the puck got loose, he might get a plus for recognizing the play and transitioning toward defense.  If it was a defenseman he didn’t like, he might get a minus for not making an aggressive play to maintain possession in the offensive zone.

Of course, this is one coach who needed to do everything he could to drive his supremely talented team to dizzying heights.  Bowman’s plus/minus sheet is admittedly an extreme example and falls into the realm of “probably a motivational tactic” more than anything.  Teams at the time, just as today, use systems that are substantially more insightful for everything from scouting to personnel moves to contract negotiations.  But since none of this is public knowledge, it’s impossible to dive any further into it.  Our own limitation here is in what is actually publicly available.

What was wrong with THN’s IQ, as mentioned above?

Name every player in NHL history who has scored 300 career goals and has 3,000 penalty minutes.  Just one: Dale Hunter.

Neat, huh?  It also means absolutely nothing; it’s simply interesting.

There is a difference between something that is interesting and something that is meaningful.  Generally speaking, the more arbitrary a particular breakdown is, the less meaningful it becomes.  If I felt like wasting a bunch of time, I could come up with a bunch of different categories that each produce a list of 15 players in which only one is not in the Hall of Fame.  That doesn’t make the 15th player Hall of Fame-worthy, but it may be interesting…at least until someone seizes onto it as proof of a HOF argument.

The issue with THN’s IQ is that it didn’t mean anything.  A premise that equated penalty minutes with toughness, eliminated assists completely, and assigned a multiplier of 3 to goals scored is as arbitrary as it gets.

In addition, it completely ignores the question of whether taking penalties is a good thing at all.  By using penalty minutes as part of the equation in the first place, it automatically assumes that taking penalties is a positive, or at least not a negative.  Personally, if I’m coaching against one of these types of guys, I’m using my fourth line to mix it up and take both my own guy and their guy off the ice.  I lose a fourth-liner for two minutes, or possibly five, and the other team loses a vastly superior player for the same period of time.  Their player gets a bump in THN’s IQ; my player has given my team a better chance to win the game.  Someone like Clark Gillies, who was legendary for his skill and toughness in all facets of the game, never hit 100 penalty minutes in a season (misconducts included).

This is an example of what a metric says, or at least strongly implies.  Don’t get me wrong; I don’t think THN was looking to do anything groundbreaking.  I’d guess they were simply looking for an interesting cover story as a bevy of young power forwards entered the league as the older ones were passing the torch, and used a bad metric as part of the story.  But metrics do speak; when things get weighed and extrapolated, importance is assigned to them as a matter of practice.

What is the most basic rule of analytics that you referred to above?

The first basic rule is: “Does it state the obvious?”

Keep in mind that I’m not talking about buttressing a preconceived notion per se, but if a metric doesn’t mesh with an obvious reality, then there’s an issue.  If you devise a metric that shows Cam Neely as the worst goal-scoring forward of 1985-1994, or Dominik Hasek circa 1998 as an average goalie, or Rick Bowness as a better coach historically than Scotty Bowman, then there’s a real problem.  On the other hand, if it states 90% of the obvious and 10% that looks unusual, then perhaps there’s something worth exploring.

Of course, this leads into the offshoot question of “what is the obvious?”, and that’s been argued for as long as analysis of any type has been sought and used.

To what extent can analytics be used in a practical manner?

This, my friends, is the big question.  Considering the value of franchises, and the value of contracts in free agency that may be significantly based on a player’s underlying skills that elude the “traditional” numbers, it’s really a multi-million dollar question.

The analytics collective, by its nature, needs to be able to do two things:

  • Provide a new, accurate window into the past to the greatest extent possible, and/or
  • Have some level of predictive usage

I cannot emphasize enough that a single metric does not need to do both of these things.  For one thing, it may be impossible.  For another, it may have unusual effects, which I’ll get to in a minute.

Who can analyze?

Everyone.  This is the beauty of bringing up any type of math or science: if the data shows or reflects something, it’s worth looking at.  In the world of baseball, men like Bill James, John Thorn, Dick Cramer, and Pete Palmer formed the backbone of early sabermetrics in the 1970s and 1980s.  James, arguably the most influential analyst of the last 50 years, was working as a security guard at a pork and beans plant in Kansas when he started his rise.  More recently, one of the most important breakthroughs came from an underemployed paralegal named Voros McCracken.

(I use baseball and sabermetrics extensively in these discussions because that’s the gold standard.  If baseball information is a kettle full of soup, it’s being distributed to the masses by ladles.  If hockey information is a kettle of soup, it’s being distributed by a plastic spoon.  A very, very small plastic spoon.)

The breakthrough I mention from McCracken is something called Defense Independent Pitching Stats (DIPS).  His writing, this one right here, is perhaps the most important thing written about baseball in the last 20 years.

Excerpted from the middle of it, with emphasis mine:

Then, I looked at the behavior of Hits Per Balls in Play [(H-HR)/(BFP-HR-BB-SO-HB)]. That’s where the trouble really started. I swear to you that I did everything within my power to come to a different conclusion than the one I did. I ran every test, checked every stat, divided this by that and multiplied one thing by another. Whatever I did, it kept leading back to the same conclusion:

There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play.

It is a controversial statement, one that counters a significant portion of 110 years of pitcher evaluation. Let’s go over the facts that led me to this conclusion:

This is the very essence of good analysis.  Ask a question, seek an answer, find a possibility, run the data.  Upon reaching an unlikely conclusion, re-run it.  Verify everything is plugged in correctly, run it again.  And again.  And again.  McCracken’s theory, that whether a batted ball in play turns into a hit or an out is largely random, did in fact counter roughly 110 years of pitching evaluation.
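To make that bracketed formula concrete, here is a direct translation into Python (the stat line is invented purely for illustration):

```python
def hits_per_balls_in_play(h, hr, bfp, bb, so, hb):
    """McCracken's Hits Per Balls in Play, exactly as he wrote it:
    (H - HR) / (BFP - HR - BB - SO - HB).  Home runs leave the park and
    never test the defense, so they come out of numerator and denominator."""
    balls_in_play = bfp - hr - bb - so - hb
    return (h - hr) / balls_in_play

# Invented pitcher line, not a real season.
print(round(hits_per_balls_in_play(h=180, hr=20, bfp=850, bb=60, so=170, hb=5), 3))  # 0.269
```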

For his part, Bill James responded to it thusly:

This argument has caused some stir in the world of sabermetrics.  Without commenting on the nuts and bolts of McCracken’s method, my two cents worth:

  1. Like most things, McCracken’s argument can be taken too literally.  A pitcher does have some input into the hits/innings ratio behind him, other than that which is reflected in the home run and strikeout columns.

  2. With that qualification, I am quite certain that McCracken is correct.

  3. This knowledge is significant, very useful.

  4. I feel stupid for not having realized this 30 years ago.

He then went off into an analysis of a handful of pitchers (ten, to be exact) who were “hit lucky” one year by allowing at least 20 fewer hits than expected, then looked at the next season’s outcomes.  The “hit lucky” seasons averaged a 16-10 record with a 2.75 ERA; the next year they averaged an 11-10 record with a 3.61 ERA.  In each case, the ERA jumped by a statistically significant amount.

I went ahead and glanced through the old game logs of the ten pitchers and their “hit lucky” seasons.  To a man, there was no real regression toward a mean that would be expected if BABIP/DIPS had in-season predictive value.  From season to season is one thing, but within a given season it did not happen.
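Here is a sketch of how a “hit lucky” screen along James’ lines could be run; note that using a flat league-average rate on balls in play as the expected-hits baseline is my assumption about how to operationalize it, not necessarily his exact method:

```python
def expected_hits(balls_in_play, home_runs, league_babip=0.300):
    """Hits we'd expect if balls in play fell at the league-average rate;
    home runs are added back since they're hits that never touch the defense."""
    return league_babip * balls_in_play + home_runs

def hit_lucky(actual_hits, balls_in_play, home_runs, margin=20):
    """The screen described above: at least 20 fewer hits allowed than expected."""
    return expected_hits(balls_in_play, home_runs) - actual_hits >= margin

# Invented line: 600 balls in play and 18 HR implies about 198 expected hits,
# so allowing only 170 comes in 28 under -- flagged as "hit lucky."
print(hit_lucky(actual_hits=170, balls_in_play=600, home_runs=18))  # True
```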

One significant issue, especially for those who aren’t familiar with analytics, is that a metric may have a limited time frame of effectiveness.  McCracken’s DIPS gave way to Tom Tango’s FIP (Fielding Independent Pitching), which had some predictive value as well.  FIP generally ran pretty close to the time-tested ERA (Earned Run Average), but as time went on, a gulf between FIP and ERA began to emerge.  In 2009, Kansas City Royals pitcher and onetime top prospect Zack Greinke had a breakthrough season that culminated in him winning the Cy Young Award, and he credited a large part of it to his season-long focus on lowering his FIP.  Success breeds imitation, and before long a lot more pitchers were doing the same thing.  By 2015, some very strange pitching stat lines were beginning to emerge.  For example, Jose Fernandez had a game in which he struck out 13 opposing hitters and walked just one…but allowed five earned runs in a 7-3 loss.  CC Sabathia had a game in which he struck out 12 hitters and walked one in six innings, but allowed seven runs in a 7-4 loss.  For the most part, gaudy strikeout-to-walk ratios like these would be a sure path to victory, but there were 13 games between 2015 and 2016 in which a pitcher had 10 or more strikeouts, 2 or fewer walks, and 5 or more earned runs in a loss.  There were four more such games in which the pitcher’s team won anyway.  This compared to just seven such games combined in 2013 and 2014, and six between 2011 and 2012.

What does this mean?  There is the possibility that FIP is losing some of its value; the shift from what appear to be random occurrences from 2011-2014 toward a much greater number in 2015 and 2016 is absolutely notable.  Is it that pitchers have changed their approach to hitters, doing things like giving up sure hits by throwing fastballs on 3-0 counts so as not to give up a walk, but in the process giving up hits to batters who are looking for exactly that?  That’s a possibility, and I’d argue a strong one.  But more analysis is needed.  Maybe it’s a two-year anomaly, and next year will return to normal.
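Since FIP carried a lot of weight in that stretch, for reference here is the commonly published formula: only homers, walks, hit batsmen, and strikeouts count, and a constant (recalculated each season so that leaguewide FIP matches leaguewide ERA) is added on.  The ~3.10 below is a typical modern value, not any specific season’s:

```python
def fip(hr, bb, hbp, so, ip, constant=3.10):
    """Fielding Independent Pitching: built only from outcomes the
    pitcher controls without his defense.  Runs allowed never enter it."""
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant

# Invented line: big strikeouts and few walks produce a sparkling FIP,
# regardless of how many balls in play actually fell for hits.
print(round(fip(hr=15, bb=40, hbp=5, so=220, ip=200.0), 2))  # 2.55
```

A pitcher chasing that number has every incentive to avoid the walk even at the cost of a hittable pitch, which is exactly the behavioral shift floated above.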

The idea of regression toward the mean is the basis of BABIP/DIPS in baseball, and of PDO in hockey.

What is PDO?

In short, PDO is the sum of a team’s shooting percentage and its save percentage.  Because every shot on goal that is not a save is a goal, the overall leaguewide PDO in any given year is exactly 1000 (or 100.0%, depending on the scale; I’ll use the percentage scale from here on).
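A minimal sketch of the computation (the totals are invented for illustration):

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO = team shooting percentage + team save percentage, on the
    100.0 scale (multiply by 10 for the 1000 scale).  Leaguewide, every
    shot is either a goal or a save, so the league always sums to 100.0."""
    shooting_pct = goals_for / shots_for
    save_pct = 1 - (goals_against / shots_against)
    return (shooting_pct + save_pct) * 100

# Invented totals: shooting 9.0% with .920 goaltending puts a team at 101.0.
print(round(pdo(goals_for=225, shots_for=2500, goals_against=200, shots_against=2500), 1))
```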

PDO is extremely similar to BABIP/DIPS in the sense that it has minimal actual predictive value, although it can be used to guess at future performances.  It is dissimilar in that leaguewide PDO in any given year is exactly 100%, while leaguewide BABIP can fall anywhere between .000 and 1.000, but usually settles in around .300.

I jumped in big-time with the baseball metric in a fantasy league I was playing in back in 2002, filling the roster with hitters and waiting until the very end to stack up on pitchers who had suffered through extremely high BABIP seasons in 2001 and thus figured to improve dramatically in 2002.  Funny thing…it didn’t work like I’d hoped, although I did have Derek Lowe all year for his huge breakout season.

One of the pitfalls of using BABIP and DIPS for predictive value is that although they may normalize toward the mean, they also may not.  Sometimes pitchers have truly lost their stuff, sometimes they can’t get along on the field with their new catcher, sometimes their role changes and it affects their situational usage.  I did make several transactions about a month or two into the season, acquiring pitchers with high BABIPs to that point in the season for cheap.  I noticed that the other managers in the league would give significantly more leeway to established pitchers than to unestablished ones; an established pitcher might be given until July before it was realized he’d lost it, while an unestablished one might be on the waiver wire in April after a couple of middling starts that were still better than the big-name pitcher’s.  This may or may not mirror what teams do in real life, which itself changes with circumstances and on a team-to-team basis.

So to quickly recap, I was using the baseball equivalent of this metric back in 2002 to determine if it had predictive value.  I used it throughout multiple ensuing seasons to better figure this out.  And I determined that when you account for the small sample sizes that exist early in the season, the predictive value is only marginally better than flipping a coin.  The peripheral indicators to DIPS (walks, strikeouts, and home runs allowed) may continue to be terrific and yet the on-field results might not be there.

This exact same issue exists with PDO.  A team with an enormously high PDO in October can be expected to regress toward the mean, and a team with an extremely low PDO can be expected to climb back toward it.  This in itself does not necessarily have predictive usage either; teams have made the playoffs with a sub-100 PDO, have missed with a PDO over 100, and plenty have won games in which their single-game PDO looked atrocious.  Within a season, though, most teams don’t have these ridiculous outliers, even after the first month of the season.  By December 1, as we get clear of the first quarter of the NHL season, the extremes have largely disappeared.  PDO has little predictive value until December 1, and even after that it’s extremely iffy, nothing more than a slightly more educated guess.  In predictive value, it doesn’t push a 50:50 outcome toward 90:10; it might be 55:45 on a good day.  If you were to use PDO as a predictor to gamble on hockey games, you’d be cleaned out in a week or less.
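To illustrate why those extremes melt away, here’s a toy simulation under a deliberately extreme assumption: every team is exactly league-average and every shot at either end goes in at the same fixed rate, so all PDO variation is pure luck.  The shot volume and shooting rate are round-number assumptions of mine:

```python
import random

def simulated_pdo(games, shots_per_side=30, true_sh_pct=0.09):
    """One luck-only team-season: every shot, for and against, scores with
    the same league-average probability, so the 'true' PDO is exactly 100.0."""
    shots = games * shots_per_side
    goals_for = sum(random.random() < true_sh_pct for _ in range(shots))
    goals_against = sum(random.random() < true_sh_pct for _ in range(shots))
    return (goals_for / shots + (1 - goals_against / shots)) * 100

random.seed(1)  # fixed seed so the run is repeatable
for games in (10, 30, 82):
    league = sorted(simulated_pdo(games) for _ in range(30))  # a 30-team league
    print(f"{games:2d} games: lowest {league[0]:.1f}, highest {league[-1]:.1f}")
```

Even with 30 identical teams, the 10-game spread looks like a real standings story; by 82 games it has squeezed substantially toward 100.0.  Real teams aren’t identical, which is why the outliers that do survive a full season are worth a look.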

In addition, there is the significant question of whether PDO has any type of meaningful predictive value for the playoffs.  2016-17 is the 10th season in which PDO has been available in the NHL, and the results are as follows:

(NOTE: The following is for PDO in even strength situations.)

In 2007-08, the top six teams in PDO made the playoffs, as did nine of the top ten and ten of the top twelve.  Three playoff teams were below 100% in PDO: the Devils, Capitals, and Sharks.  The entire leaguewide range of PDO in the regular season was between 98.0% (Tampa Bay, with 71 points) and 101.9% (Pittsburgh, with 102 points); only five out of thirty NHL teams had a PDO that was outside the 99.0-101.0 range.  If PDO had meaningful predictive value, particularly after an 82-game sample size combined with the culling of the non-playoff teams, the smart money would have bet on the lowest PDO teams going into the playoffs.

Out of eight first-round playoff series, the lower PDO team won just two; six were won by the team with the higher PDO.  In the second round, it was split between two higher and two lower PDO teams.  The conference finals were a 1/1 split, and the Stanley Cup was won by the lower-PDO Red Wings.  They’d finished the season with a PDO of 100.0, and had also won the Presidents’ Trophy with 115 regular season points.  All told, the lower PDO team won six out of fifteen playoff series.  (Higher PDO had a playoff series record of 9-6.)

In 2008-09, PDO was distributed in a more scattershot manner, with 10 out of 30 teams falling outside the 99.0-101.0 range.  Only six of the league’s top 10 in regular season PDO made the playoffs, and half of the bottom ten made it. The leaguewide range went from 98.0% (Colorado with 69 points) to 102.5% (Boston with 116 points).

When we get into the playoffs, the lower PDO team won just two of eight series in the first round; the higher PDO team won six.  The lower PDO team would win three of four second-round series, the conference finals were a 1/1 split, and the higher PDO team won the Stanley Cup.  Once again, the lower PDO team won six of fifteen playoff series.  (Higher PDO had a playoff series record of 9-6.)

In 2009-10, the lower PDO team won five of eight first-round series, then just one of four second-round series, but won both conference finals and the Stanley Cup Final; that’s nine of fifteen playoff series.  (Higher PDO had a playoff series record of 6-9.)

In 2010-11, the first round was a 4/4 split.  The second round was a 2/2 split.  The conference finals went 2-0 for the higher PDO team, and the Stanley Cup was not only won by the higher PDO team (Boston), but by the team that had the highest PDO in the league.  Lower PDO was 6-9 in the playoffs overall.  (Higher PDO had a playoff series record of 9-6.)

In 2011-12, the higher PDO team won 5/8 first-round series, but the lower team won 3/4 second-round, then both conference finals and the Stanley Cup Final.  One year after the highest PDO team in the league won the Stanley Cup, it was won by the team that finished 28th (Los Angeles).  However, considering the way that the Kings caught fire right around the trade deadline and carried it through the playoffs, I’d be curious to see how their game-to-game PDO shifted over the course of a season.  (Higher PDO had a playoff series record of 6-9.)

In 2012-13, the lower PDO team won 5/8 first-round series, 2/4 second-round, 1/2 conference finals…and then lost in the Stanley Cup Final to the team with the 3rd-highest PDO in the league (Chicago).  (Higher PDO had a playoff series record of 7-8.)

In 2013-14, the lower PDO team won 5/8 first-round series, all four second-round series, one conference final, and then lost in the Stanley Cup Final. (Higher PDO had a playoff series record of 5-10, but won the Stanley Cup.)

In 2014-15, the higher PDO team won 5/8 first-round series, 2/4 second-round series, then lost all remaining series.  (Higher PDO had a playoff series record of 7-8.)

In 2015-16, the higher PDO team won 5/8 first-round series, 3/4 second-round, lost both conference finals, but won the Stanley Cup.  (Higher PDO had a playoff series record of 9-6.)

In nine seasons, the higher even strength PDO team has:

  • A first-round playoff series record of 40-32
  • A second-round playoff series record of 16-20
  • A conference finals record of 6-12
  • A Stanley Cup Final record of 5-4
  • An overall series record of 67-68

There do not appear to be any general patterns or trends, except that leaguewide PDO distribution almost exclusively falls between 98.0 and 102.0 in any given year and that outliers in either direction are fairly uncommon.  2012-13, with the lockout-shortened 48-game season, had three teams below 98.0 and one (Florida) below 97.0.  This may indicate the point at which PDO starts strongly drifting toward 100.0, or it could be that there were simply three putrid teams who would have stayed at that level over a full season.

Out of 270 team-seasons, measured in even-strength PDO (a quick bucketing sketch follows this list):

  • 4 teams have had a PDO of 103.0 or higher
  • 11 teams have had a PDO of 102.0 to 102.9
  • 34 teams have had a PDO of 101.0 to 101.9
  • 95 teams have had a PDO of 100.0 to 100.9
  • 83 teams have had a PDO of 99.0 to 99.9
  • 33 teams have had a PDO of 98.0 to 98.9
  • 7 teams have had a PDO of 97.0 to 97.9
  • 3 teams have had a PDO of 96.9 or below
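The bucketing itself is trivial to reproduce (the sample values below are invented, not real team-seasons):

```python
from collections import Counter

def pdo_bucket(pdo):
    """Assign a team-season to the same one-point bands used in the list above."""
    if pdo >= 103.0:
        return "103.0 and up"
    if pdo <= 96.9:
        return "96.9 and below"
    whole = int(pdo)  # e.g. 100.4 falls in the 100.0-100.9 band
    return f"{whole}.0-{whole}.9"

# Invented sample values only.
sample = [101.9, 100.4, 99.2, 98.0, 100.0, 96.3, 103.3]
print(Counter(pdo_bucket(p) for p in sample))
```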

Playoff appearances by distribution:

  • All 4 teams with a PDO of 103.0 or higher made the playoffs; none won the Stanley Cup
  • All 11 teams with a PDO of 102.0 to 102.9 made the playoffs; three (2008-09 Pittsburgh, 2010-11 Boston, and 2012-13 Chicago) won the Stanley Cup
  • Of the 34 teams with a PDO of between 101.0 and 101.9, 27 made the playoffs while 7 missed out.  Of those seven who missed, one of them (2012-13 Columbus) only missed due to a tiebreaker while another one (2010-11 Dallas) had 95 points.
  • The 95 teams with a PDO of between 100.0 and 100.9 are expanded upon below
  • As are the 83 teams with a PDO of between 99.0 and 99.9
  • Of the 33 teams with a PDO of between 98.0 and 98.9, six made the playoffs while 27 missed.  The 2010-11 Lightning is in here, the team that went seven games in the conference finals against Boston.
  • All 7 teams with a PDO of 97.0 to 97.9 missed the playoffs
  • All 3 teams with a PDO of 96.9 or below missed the playoffs

Of the 95 teams with a PDO of between 100.0 and 100.9:

  • 13 were at 100.8 or 100.9.  1 missed the playoffs, 6 lost in the first round, 3 lost in the second round, 2 lost in the conference finals, and 1 lost in the Stanley Cup Final.
  • 19 were at 100.6 or 100.7.  6 missed the playoffs, 4 lost in the first round, 3 lost in the second round, 4 lost in the conference finals, 1 lost in the Stanley Cup Final, and 1 (2014-15 Chicago) won the Stanley Cup.
  • 28 were at 100.4 or 100.5.  7 missed the playoffs, 17 lost in the first round, 2 lost in the second round, and 2 (2013-14 Los Angeles and 2015-16 Pittsburgh) won the Stanley Cup.
  • 15 were at 100.2 or 100.3.  8 missed the playoffs, 4 lost in the first round, 2 lost in the second round, 1 lost in the conference finals.
  • 20 were at 100.0 or 100.1.  10 missed the playoffs, 5 lost in the first round, 1 lost in the second round, 2 lost in the conference finals, 1 lost in the Stanley Cup Final, and 1 (2007-08 Detroit) won the Stanley Cup.

Out of 95 teams in this range, 32 (33.7%) missed the playoffs, 36 (37.9%) lost in the first round, 11 (11.6%) lost in the second round, 9 (9.5%) lost in the conference finals, 3 (3.2%) lost in the Stanley Cup Final, and 4 (4.2%) won the Stanley Cup.

Then, of the 83 teams with a PDO between 99.0 and 99.9:

  • 11 were at 99.8 or 99.9.  8 missed the playoffs, 2 lost in the first round, 1 lost in the second round.
  • 19 were at 99.6 or 99.7.  10 missed the playoffs, 6 lost in the first round, 2 lost in the second round, 1 lost in the Stanley Cup Final.
  • 17 were at 99.4 or 99.5.  7 missed the playoffs, 5 lost in the first round, 2 lost in the second round, 1 lost in the conference finals, 2 lost in the Stanley Cup Final.
  • 20 were at 99.2 or 99.3.  14 missed the playoffs, 3 lost in the first round, 1 lost in the second round, 1 lost in the conference finals, 1 lost in the Stanley Cup Final.
  • 16 were at 99.0 or 99.1.  11 missed the playoffs, 3 lost in the second round, and 2 (2009-10 Chicago and 2011-12 Los Angeles) won the Stanley Cup.

Out of these 83 teams, 50 (60.2%) missed the playoffs, 16 (19.3%) lost in the first round, 9 (10.8%) lost in the second round, 2 (2.4%) lost in the conference finals, 4 (4.8%) lost in the Stanley Cup Final, and 2 (2.4%) won the Stanley Cup.

To break everything down in wider ranges, a team with an even strength PDO of 101.0 or higher has an 85.7% chance of making the playoffs (42 of 49 such teams).  If this is expanded down to 100.8, the odds of making the playoffs actually increase slightly to 87.1% (54 out of 62 teams).

A team with an even strength PDO of below 99.0 has an 86% chance of missing the playoffs (37 of 43 such teams).

A team with a PDO of between 99.0 and 100.9 has a 53.9% chance of making the playoffs, which is about the same as a coin flip.
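Restating those cutoffs as arithmetic, using the tallies from above:

```python
# (teams in range, teams that made the playoffs), per the counts above.
buckets = {
    "101.0 and up":  (49, 42),   # 85.7% made it
    "100.8 and up":  (62, 54),   # 87.1% made it
    "below 99.0":    (43, 6),    # i.e. 37 of 43 (86.0%) missed
    "99.0 to 100.9": (178, 96),  # 53.9% made it -- the coin-flip zone
}
for label, (teams, made) in buckets.items():
    print(f"{label:>14}: {made}/{teams} made the playoffs ({made / teams:.1%})")
```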

Does this change if we look at all-situation PDO rather than simply even strength?

Out of 270 team-seasons:

  • Two teams have had a PDO of over 103.0: 2008-09 Boston and 2012-13 Toronto
  • 11 teams have had a PDO of between 102.0 and 102.9
  • 41 teams have had a PDO of between 101.0 and 101.9
  • 91 teams have had a PDO between 100.0 and 100.9
  • 82 teams have had a PDO between 99.0 and 99.9
  • 29 teams have had a PDO between 98.0 and 98.9
  • 12 teams have had a PDO between 97.0 and 97.9
  • Two teams have had a PDO at or below 96.9, both in 2012-13

The highest PDO recorded is 2008-09 Boston, with 103.3; the lowest is 2012-13 Florida with 96.3.  The lowest that isn’t 2012-13 is 2014-15 Edmonton, with 97.0.

Playoff appearances by distribution:

  • Both teams with a PDO of over 103.0 made the playoffs
  • All 11 teams with a PDO of between 102.0 and 102.9 made the playoffs.  Of these, three lost in the first round, three lost in the second round, one lost in the conference finals, and four played for the Stanley Cup with a 2-2 record in that series.  The 2010-11 SCF between Boston and Vancouver had both teams with a PDO in this range.
  • Of 41 teams with a PDO of between 101.0 and 101.9, 35 made the playoffs while 6 missed.  Of the 35 playoff teams, 14 lost in the first round, 12 lost in the second round, 7 lost in the conference finals, 1 (2014-15 Tampa Bay) lost in the Stanley Cup Final, and 1 (2008-09 Pittsburgh) won the Stanley Cup.
  • 91 teams have had a PDO between 100.0 and 100.9, which is expanded upon below
  • 82 teams have had a PDO between 99.0 and 99.9, which is also expanded upon below
  • Of the 29 teams with a PDO between 98.0 and 98.9, 27 missed the playoffs while just two made it.  Both of those lost in the first round.
  • None of the 12 teams with a PDO between 97.0 and 97.9 made the playoffs
  • Neither of the two teams with a PDO at or below 96.9 made the playoffs

Of the 54 teams with a PDO of 101.0 or higher in a season, 48 made the playoffs.  Of the 43 teams with a PDO at or below 98.9, 2 made it while 41 missed.

Given the number of additional teams and the proximity to the mean, I’ll break the remaining 173 teams down further.

Of the 91 teams with a PDO between 100.0 and 100.9:

  • 13 fell between 100.8 and 100.9.  Of these, 7 missed the playoffs, 3 lost in the first round, 2 lost in the conference finals, and one (2015-16 Pittsburgh) won the Stanley Cup.
  • 12 fell between 100.6 and 100.7.  Of these, 4 missed the playoffs, 6 lost in the first round, 2 lost in the conference finals.
  • 19 fell between 100.4 and 100.5.  Of these, 6 missed the playoffs, 6 lost in the first round, 5 lost in the second round, 1 lost in the Stanley Cup Final, and 1 (2014-15 Chicago) won the Stanley Cup.
  • 30 fell between 100.2 and 100.3.  Of these, 10 missed the playoffs, 13 lost in the first round, 4 lost in the second round, and the remaining 3 (11-12 NJ, 13-14 NYR, 15-16 SJ) all lost in the Stanley Cup Final.
  • 17 fell between 100.0 and 100.1.  Of these, 6 missed the playoffs, 7 lost in the first round, 2 lost in the second round, and 2 lost in the conference finals.

Out of 91 overall teams within that range, 33 missed the playoffs (36.3%), 35 lost in the first round (38.5%), 11 lost in the second round (12.1%), 6 lost in the conference finals (6.6%), 4 lost in the Stanley Cup Final (4.4%), and 2 won the Stanley Cup (2.2%).

There were also 82 teams with a PDO between 99.0 and 99.9.  Of these 82 teams:

  • 13 fell between 99.8 and 99.9.  Of these, 3 missed the playoffs, 4 lost in the first round, 3 lost in the second round, 1 lost in the conference final, and 2 (2011-12 and 2013-14 Los Angeles) won the Stanley Cup.
  • 18 fell between 99.6 and 99.7.  Of these, 10 missed the playoffs, 2 lost in the first round, 2 lost in the second round, 2 lost in the conference finals, and 2 (2007-08 Detroit and 2009-10 Chicago) won the Stanley Cup.
  • 19 fell between 99.4 and 99.5.  Of these, 10 missed the playoffs, 7 lost in the first round, 1 lost in the second round, and 1 (2009-10 Philadelphia) lost in the Stanley Cup Final.
  • 14 fell between 99.2 and 99.3.  Of these, 8 missed the playoffs, 2 lost in the first round, 2 lost in the second round, 1 lost in the conference finals, and 1 lost in the Stanley Cup Final.
  • 18 fell between 99.0 and 99.1.  Of these, 15 missed the playoffs, 2 lost in the first round, and 1 lost in the second round.

Out of these 82 teams, 46 missed the playoffs, 17 lost in the first round, 9 lost in the second round, 4 lost in the conference finals, 2 lost in the Stanley Cup Final, and 4 won the Stanley Cup.

What’s perhaps most interesting is the significant drop in playoff appearance percentage from the 99.2-99.3 group (42.8% made it) to the 99.0-99.1 group (16.6% made it).

Based on all of this, it appears that a team with a season-long all-situations PDO at or above 101.0% is extremely likely to make the playoffs (an 88.9% make percentage), while one at or below 99.1% is extremely likely to miss (a 91.8% miss percentage).

And in the playoffs?

Once in the playoffs, all bets are off.  As shown above, PDO does not carry meaningful predictive value in the postseason, and an individual team’s regular season PDO does not carry significant weight there either.

Conclusions?

For all-situation PDO, the distribution of teams outside of the 99.2% to 100.9% range, and the extremely significant correlation of those outlying numbers with regular season success and at least a playoff appearance, is important.  For teams within that 99.2% to 100.9% range, it does not appear to be significant to any real extent.

For even strength PDO, the meaningful range is what falls outside of 99.0% to 100.7%: there’s about an 87% chance of making the playoffs at 100.8% or above, and an 86% chance of missing the playoffs below 99.0%.

In other words, PDO may be useful for approximately one-third of teams (97/270 in all situations, 105/270 in even strength situations), and not useful for the other two-thirds.  Its limited practical application within a given season, whether predicting a particular game or the likelihood of a playoff run, limits how meaningful PDO can be as a standalone metric.