An important note before I begin this: I am not claiming to have invented this metric at all. I’m reasonably certain that someone, somewhere, has used it before. However, extensive Googling turned up nothing. I’m still not claiming to have invented this; I’ve been using it as kind of a “huh, neat” thing for close to 20 years though.
What I’m referring to as “OPP” is simply short for “overall power play”, and I went with this instead of “adjusted power play” for two reasons. The first is that referring to something as “APP” seemed unwieldy and the word “app” was still years away from entering the lexicon. The second is that Naughty by Nature was on in the background.
[There’s actually a third reason, which is that at the time, “adjusted” stats were becoming commonplace and seemed to cloud the picture rather than clearing it up. “Adjusted” at the time meant things like “adjusted goals”, where you could take a player’s goals per game, scale it to the current number of games in a season’s schedule, and then move it up or down based on the leaguewide goals per game relative to the historical average. That’s how you get weird stuff like “Wayne Gretzky’s 92 goals in a season is more like 71 because of the offensive inflation of the time, but Babe Dye in 1924-25 is more like 226 because they played a much shorter schedule”. And then these figures would fluctuate further as the historical average changed year to year.]
Again, I’m not claiming to have invented this. All I’m doing here is explaining it a bit, then providing the information that I don’t believe anyone else has compiled and released.
“OPP”, as “overall power play” would suggest, is simply a measure of overall power play efficiency, measured as (power play goals for) – (shorthanded goals against). That net figure then gets fed back into the number of power play opportunities, with the result being a slight drop in a team’s net power play efficiency. The regular power play percentage metric is (power play goals)/(power play opportunities); this is (power play goals minus shorthanded goals against)/(power play opportunities).
“OPK”, short for “overall penalty kill”, is similar. That’s just (power play goals against) – (shorthanded goals for), run back through the standard penalty kill formula. This will result in a slight percentage boost to a team’s penalty kill.
“OST” is short for “overall special teams”, and is simply adding a team’s OPP and OPK together. That’s all.
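To make the three definitions concrete, here’s a minimal sketch in Python (the function names and arguments are my own shorthand, not anything standard):

```python
def opp(pp_goals_for, sh_goals_against, pp_opportunities):
    """Overall power play: net power play goals per opportunity, as a percentage."""
    return 100.0 * (pp_goals_for - sh_goals_against) / pp_opportunities

def opk(pp_goals_against, sh_goals_for, times_shorthanded):
    """Overall penalty kill: kills credited with shorthanded goals, per time shorthanded."""
    return 100.0 * (times_shorthanded - (pp_goals_against - sh_goals_for)) / times_shorthanded

def ost(opp_pct, opk_pct):
    """Overall special teams: just OPP plus OPK."""
    return opp_pct + opk_pct
```

For example, a team that goes 50-for-300 on the power play while allowing 10 shorthanded goals drops from a raw 16.7% to an OPP of about 13.3%.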
Frequently Asked (or Wondered) Questions:
Does the math work?
Because of the zero-sum nature of power play and shorthanded goals, the balance does not change in terms of total goals in either column. A power play goal in one column is a power play goal against in another, and a shorthanded goal for is also a shorthanded goal against.
To use the 1966-67 season as an example, there were 253 power play goals scored that year. There were, therefore, also 253 power play goals against. And there were 33 shorthanded goals for, therefore 33 shorthanded goals against. And with OPP and OPK, there’s a leaguewide net of 220 power play goals for (253 PPG minus 33 SHG)…and therefore 220 net against.
The percentages balance out as well. The leaguewide PP percentage that year was 18.123% (253 PPG on 1,396 power plays), opposing a leaguewide PK percentage of 81.877% (1,143 penalties killed out of 1,396 power plays against). 18.123 plus 81.877 is 100.00.
In OPP, OPK, and OST, the same still applies. 253 PPG minus 33 SHGA equals 220; 220 is 15.759% of 1,396. And on the kill side, 220 net goals against on 1,396 opportunities works out to an OPK of 84.241%. 15.759 plus 84.241 is 100.00.
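That balance check is easy to verify in a few lines of Python, using the 1966-67 figures above:

```python
PPG, SHG, OPPS = 253, 33, 1396   # 1966-67 leaguewide totals

# Regular special teams: PP% and PK% sum to 100 across the league.
pp_pct = 100 * PPG / OPPS                 # 18.123
pk_pct = 100 * (OPPS - PPG) / OPPS        # 81.877
assert round(pp_pct + pk_pct, 2) == 100.00

# Overall special teams: the same zero-sum balance holds with net goals.
net = PPG - SHG                           # 220 net goals, for and against
opp_pct = 100 * net / OPPS                # 15.759
opk_pct = 100 * (OPPS - net) / OPPS       # 84.241
assert round(opp_pct + opk_pct, 2) == 100.00
```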
Why have shorthanded goals against included at all?
Ordinarily, a power play has two jobs: hockey’s version of the Hippocratic Oath, if you will (do good, or at least do no harm). Those two jobs are to either score a power play goal, or at least prevent a shorthanded goal against. Scoring a goal is optimal, not scoring is neutral, giving up a shorthanded goal is negative. Naturally the priority can vary based on situation; if a team is down by a goal and has a power play with 45 seconds left in a game, failing to score in that situation is obviously negative, and no coach would ever hang his hat on the idea of “at least we didn’t give up a shortie!”
However, since information on situational power plays is spotty from days gone by, this creates two issues. One is that OPP will simply treat all power plays as being equal, where one that takes place with a 5-0 lead in the second period is the same as one with less than a minute to go and facing a one-goal deficit. (Of course, this is no different than how power plays are currently compiled anyway.) The other is that there is no way to actually adjust for this factor without getting into the one thing I can’t stand with metrics: arbitrary numbers.
Yes, the latter power play situation is significantly more important than the former. There is also no possible way to assign a numerical value to certain situations. And if there’s one thing that two decades of messing around with metrics and algorithms has taught me from firsthand experience, plus my experience of all types in sports, it’s that screwing around with these things can create an enormous cascade of situations that make a metric say something that you don’t intend it to. Yes, some situations are unquestionably more important than others. No, it is not possible to quantify this effect.
Why not attach a multiplier to shorthanded goals?
It’s arbitrary and leads to unforeseen consequences. If a team scores 100 power play goals but allows 50 shorthanded goals against, it’s still a +50 on the scoreboard over the span of a season. On the flip side, scoring 50 shorthanded goals while allowing 100 may be extremely impressive, and a historic achievement, but it’s still -50 on the scoreboard.
Attaching multipliers would unbalance the ledger as it relates to the actual scoreboard. At minimum, that’s undesirable.
Why add OPP and OPK together as percentages, rather than multiply or run as a total average?
This requires a historical footnote. In baseball, sabermetrics gurus Pete Palmer and John Thorn popularized OPS (on-base percentage plus slugging) in their book The Hidden Game of Baseball. Although it’s dangerous to go simply off memory, I believe Palmer was the one who later rued this, suggesting that it should have been OTS (on-base times slugging) since that would provide a greater emphasis on balance – balance for a hitter being more important and more indicative of a player’s value than mashing the two numbers together by addition.
In the case of OPP and OPK, let’s take 11 theoretical teams that all have a flat 105% OST. This will show how it would look if the OPP and OPK were multiplied, rather than added. I’ll list the final number as OmST (Overall multiplied Special Teams):
- 10% OPP and 95% OPK – 950 OmST
- 11% OPP and 94% OPK – 1034 OmST
- 12% OPP and 93% OPK – 1116 OmST
- 13% OPP and 92% OPK – 1196 OmST
- 14% OPP and 91% OPK – 1274 OmST
- 15% OPP and 90% OPK – 1350 OmST
- 16% OPP and 89% OPK – 1424 OmST
- 17% OPP and 88% OPK – 1496 OmST
- 18% OPP and 87% OPK – 1566 OmST
- 19% OPP and 86% OPK – 1634 OmST
- 20% OPP and 85% OPK – 1700 OmST
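The table above is nothing more than straight multiplication; a quick sketch to reproduce it:

```python
# Eleven theoretical teams, each with a flat 105 OST, with OPP and OPK
# multiplied (OmST) instead of added.
for pp in range(10, 21):
    pk = 105 - pp
    print(f"{pp}% OPP and {pk}% OPK - {pp * pk} OmST")
```

The further the two numbers drift apart, the lower the product, even though the sum never moves; that’s the “balance is rewarded” assumption baked into multiplication.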
The problems are simple:
- There’s no guarantee that a greater balance is necessarily a positive. We may intuitively think that it is, and a coach may say that it is, but the empirical proof is simply not there.
- There is a strong correlation between combined OPP and OPK (OST) and playoff appearances, while one does not appear to exist for OmST.
In addition to this, the changing conditions of the game over spans of time inherently create natural imbalances. The highest OST of all-time is the 1975-76 NY Islanders (30.345 OPP plus 88.329 OPK, equaling 118.674 OST). Second is the 1970-71 Boston Bruins, whose 117.964 OST is a result of a 26.298 OPP plus 91.667 OPK. The Islanders’ OmST would be 2680.344, the Bruins’ 2410.659. But there was also a gulf between what the leaguewide OmST looked like: 1975-76 was 1479.198 (18.05 OPP times 81.95 OPK), while 1970-71 was 1357.492 (16.199 OPP times 83.801 OPK). The Islanders OmST would be 1.812 times the league average, the Bruins’ 1.776 times the league average.
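For anyone who wants to check those figures, here they are run through Python (all numbers copied from the text above):

```python
# (team OPP, team OPK, leaguewide OPP, leaguewide OPK)
teams = {
    "1975-76 Islanders": (30.345, 88.329, 18.05, 81.95),
    "1970-71 Bruins": (26.298, 91.667, 16.199, 83.801),
}
for name, (t_opp, t_opk, lg_opp, lg_opk) in teams.items():
    omst = t_opp * t_opk          # team's multiplied special teams
    lg_omst = lg_opp * lg_opk     # leaguewide multiplied special teams
    print(f"{name}: OST {t_opp + t_opk:.3f}, "
          f"OmST {omst:.3f} ({omst / lg_omst:.3f}x league average)")
```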
What does all of this mean? Damned if I know; it sure looks like we’re just creating metrics for the sake of creating metrics. Leaguewide OST balances out to 100.00 every year, leaguewide OmST can swing wildly from year to year as the conditions change. Adjusting OmST to normalize back toward a certain benchmark has two additional problems:
- What is the benchmark?
- How do you arrive there? Is it by simple multiplication, do we start bringing standard deviations into it, or do we start using more and more complicated formulas to arrive at that point?
I remember in fifth grade science, we had a small group project in which the groups had to build a structure that could support a class textbook for 15 seconds, and the only allowable materials were plastic straws, paperclips, and pins. The first attempt with my group was successful for a few seconds before it collapsed under the weight of the book. We made some minor adjustments, and it didn’t change the results. We made some more changes, but were not successful for the full 15 seconds. By the time that we had the final test on it a week later, our structure looked impressive but still couldn’t hold the book for the whole time. Looking around, no one’s could either despite some very impressive-looking designs…with one exception.
The one group that passed the test had a very basic structure, with a flat gridwork of straws, then another grid on top of that at a 45-degree angle, then one more at the original angle. The whole thing was probably 10″ by 10″, and barely one inch tall if even that. It looked simple, and simply ugly, but it did the job to an extent that none of the intricate shapes and towering structures ever could.
I don’t actually think of this specific incident when I devise metrics; the lesson learned simply became innate. This is truly the first time I’ve even recalled this test since I was in 8th grade and we had a similar test. (In that test, which was different but also involved the idea of “simple is better”, our very simple design was beaten out by one that was even simpler. Go figure.)
Where are you going with this?
Actions speak, and metrics speak. OmST says that balance between the power play and penalty kill is optimal, regardless of whether “balanced is better” is even a true statement. OmST changes leaguewide from year to year, throwing everything completely out of whack, and the only way to compensate for this is by adding another calculation. And that might not even be enough, so another calculation gets added in. By the time it’s all done, it would require a small book to explain all the computations, and it might still be insufficient.
In ancient astronomy, it was noted that stars tended to move across the sky in certain patterns. But a handful of other celestial bodies did not, a single one of which would be referred to as planetes (meaning “wanderer”). They seemed to follow a general pattern, but not a specific one that could be pinpointed and tracked.
The assumption was made that, although the planets were still orbiting the earth in circles, each also had an epicycle: a smaller rotation carried out while still following the larger orbit. To see an example of this, put a coin flat on your desk. Now take another coin, stand it on its side, and flick it. Notice that it will orbit around the stationary one, but it also makes smaller loops as part of the larger orbit. That’s an epicycle.
The existence of epicycles for planetary orbits was regarded as an absolute truth, but it didn’t quite mesh with the observable evidence. To compensate for this, another epicycle might be added; it would bring the math and the observable closer, but not quite there. Another epicycle might be added…you get where this is going. It wasn’t until Johannes Kepler determined (twenty centuries later) that the planets followed elliptical orbits, rather than circular ones, that what appeared to be the unusual random motion of the planets was finally solved.
Adding epicycles didn’t work fully in astronomy because the idea of circular planetary orbits was an inherently flawed one. Today, “adding epicycles” is a term that’s used to refer to attempting to further refine an inherently and fatally flawed system of some type, either not realizing or not caring that it will simply not work.
Get to the point…
OmST, created by multiplying OPP and OPK, is fatally flawed, and attempts to make it less flawed do not change the fact that it is fatally flawed. I can make all the additional calculations and add all the epicycles I want, and it does not change one bit the fact that it does not work to a meaningful extent.
And what would the point be? I said on the brief PDO write-up that metrics need to either provide a meaningful look at the past or have some type of predictive value. This isn’t the last time I’ll bring up this point either. OmST is a metric that’s created simply for the sake of creating a metric; I do not believe it has any predictive value, and I don’t think that it can provide a meaningful look at the past.
Further exploration of OmST is therefore pointless.
Also, “OmST” looks like something I’d find on an electrical schematic. Imagine seeing a sentence that reads “The Canucks lead the league with 1690 OmST”, rather than “The Canucks lead the league with an overall special teams percentage of 108.2”.
OST, the simple addition of OPP and OPK, equals 100.00. Every year. Period. No adjustments needed, no further calculations, no epicycles.
That’s the beauty of OPP, and OPK, and OST. The closer to 100% in OPP and OPK, the better. OST, since it’s tuned to a league average of 100%, is equally simple: above 100 is above average, below 100 is below average.
While you’re rambling, are there any other random school memories you’d like to bring into it?
Of course. In 12th grade physics, we had an egg drop from various heights, culminating in a drop from the top of the press box at the football stadium. My design, which was a box slightly larger than an egg and filled with finely chopped packing peanuts, passed with ease. Easiest design ever.
Once the egg was secured in the box, I wrapped the whole thing in duct tape. It was then that I was told that the safety of the egg had to be verified after each height to see if it was still intact. Since this was impossible with my design, I suggested to my teacher – one of the most interesting individuals I’ve ever met, by the way – that if the box was opened after the final drop and the egg was found to be less than intact, I could be graded as if it couldn’t survive the first drop (shoulder height); this would mean I could only get an F or an A on the project and nothing in between. The egg made it, and I got my A.
What about negatives?
Okay, I haven’t gotten into this yet. It is possible to exceed 100% in OPK, and it’s possible to be negative in OPP. Neither has ever happened over the span of a full season, although I’m quite certain both have happened in several individual games.
Since the numerator of OPP is (power play goals for) – (shorthanded goals against), a negative OPP can be achieved by allowing more shorthanded goals against than scoring power play goals for. And on the flip side, since the numerator of OPK is (power play goals against) – (shorthanded goals for), the same team with such an outburst would have an OPK of over 100%.
Say Team A goes 1-for-5 on the power play against Team B, but allows 3 shorthanded goals in the process. Team A will have an OPP of -40%, while Team B will have an OPK of 140%.
And yet when you add those together, you’ll still get an overall OST of 100%. Such is the beauty in the simplicity. (OmST, by the way, would show up as -5,600, thus exposing yet one more flaw in that system which demonstrates that it’s simply irreparably broken.)
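A quick check of that Team A/Team B scenario in Python:

```python
opps = 5
ppgf, shga = 1, 3   # Team A: 1-for-5 with 3 shorthanded goals allowed

opp_a = 100 * (ppgf - shga) / opps            # Team A's OPP: -40.0
opk_b = 100 * (opps - (ppgf - shga)) / opps   # Team B's OPK: 140.0

assert opp_a == -40.0
assert opk_b == 140.0
assert opp_a + opk_b == 100.0     # OST still balances
assert opp_a * opk_b == -5600.0   # OmST goes haywire
```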
Who holds the league records?
Without further ado, here are the various leaderboards. Since power play and shorthanded stats have only been compiled since 1963-64, these only go back that far.
Top 10 Single-Season OPP%, since 1963-64
Top 10 Single-Season OPK%, since 1963-64
Top 10 Single-Season OST%, since 1963-64
Bottom 10 Single-Season OPP%, since 1963-64
Bottom 10 Single-Season OPK%, since 1963-64
Bottom 10 Single-Season OST%, since 1963-64
Notice that the majority of teams seem to cluster within the 1970s and early 1980s. I’m merely speculating here, which is always dangerous, but I would venture to say that the spread of OPP and OPK away from the league norms, and of OST away from 100%, is the result of one of two things:
- A team of historic levels, whether in extremely high or extremely low quality
- An extremely imbalanced league from a competitive standpoint
Imbalanced leagues, quite simply, are very bad for business. A significant spread of teams away from the average means that you’re simply watching the strong pummel the weak, game after game after game. After a while, fan interest begins to wane because no one with good taste wants to become strongly emotionally invested in a team that cannot compete and shows no signs of competing in the near future. Ask a Cleveland Browns fan what it’s like. Trust me; I am one.
Now multiply that across an entire league, where there’s a very clear and historically bad bottom handful of teams getting hammered by the upper echelon. Of the bottom 50 teams in single-season OST, exactly three made the playoffs. And of the top 50, every single one of them made the playoffs. A very large number of the top 50 played for the Stanley Cup with quite a few winners, while the bottom 50 had as many teams with single-digit wins as .500 records.
There are a couple more leaderboards that I think are interesting. Shorthanded goals against kills a team’s OPP, and shorthanded goals for will boost a team’s OPK. So who had the biggest deviations based on factoring shorthanded goals in?
Bottom 10 teams in OPP compared to regular PP%, since 1963-64
Top 10 teams in OPK compared to regular PK, since 1963-64
What’s interesting is that the 1995-96 Avalanche, whose power play was hammered by allowing 22 shorthanded goals against, still recovered enough to win the Stanley Cup that year. On the other side, five of the top ten teams who got a boost from a devastating shorthanded attack also won the Cup.
With additional research, stronger correlations may be found between OPP, OPK, and OST and success compared to that involving “regular” PP, PK, and combined special teams percentage.
That may be the next analysis….