On “Talent Dilution” – re-assessing one of history’s greatest foregone conclusions – PART 4 (Discrepancies in aging, or “Who is Johnny Bower?”)

Previous: Part 3 (“The Devil and J.J. Daigneault”)

As mentioned previously, there are discrepancies in some birthdates of NHL players – this is mostly in the earlier days of the league.  Johnny Bower is a notable exception.

This is not about that type of age discrepancy.

Turn on any NHL game, and you might see a reference to how old or how young a particular team is.  I believe this is nothing more than simply averaging the age of the current roster at that moment in time, but there could be some different methodology.

So let’s go through some possibilities first, and then get into the heart of it.

The Common Way

For the sake of nice round math, let’s say that our NHL team has 20 players on it.  It doesn’t, but we’ll say it does.  And let’s say that ten of them are 20 years old, and ten of them are 30 years old.

fakeroster1

What’s the average age?  It’s 25, right? I mean, it’s very clearly in that green box in the very bottom right.

Or…maybe not.  After all, “age” is a fluid thing – two people who are 25 years old may be almost a year apart in terms of actual age.

Have you ever heard that riddle: “The day before yesterday, I was 25 years old, and next year I will turn 28. How?”  The answer is simple: the person has a birthday of December 31, and today is January 1.  Two days ago (December 30), they were 25 years old.  On December 31 they turn 26.  On January 1, the year rolls over…they will turn 27 on the last day of this year.  Next year, they turn 28 (on the last day).

The NHL has a disproportionate number of players who were born early in any given year.  The reason is that youth hockey uses birth years to determine the eligible age of its players, and those born early in a given calendar year have an advantage over those born later in the same year.  A kid born January 1 has a two-month physical and developmental advantage over someone born March 1, six months over someone born July 1, and 364 days (or 365, in a leap year) over someone born December 31.

The average age of our sample NHL team above lacks important information.  A player who turns 20 today is essentially a different age than someone who is 20 years old today and will be 21 tomorrow.

Marleau and Seguin

In the run-up to the 1997 Entry Draft, there was discussion over how close Joe Thornton and Patrick Marleau were in the race to be the first overall selection.  Thornton, born July 2, 1979, wouldn’t turn 18 until after the draft.  Marleau, born September 15, 1979, was born on the cutoff date for the 1997 draft.  In other words, he was as young as a first-time draft eligible player could be.  (Yes, I know about the old NCAA opt-in rule; it doesn’t bear discussion here.  I’m not that pedantic.)

Marleau was taken second overall by the San Jose Sharks, and went right to the NHL.  He played the 1997-98 season with the Sharks, and never played a game in the minors (or back in junior hockey) after being drafted.

Marleau also shows the discrepancy between the two major online player stats sites.  His 1997-98 season, in which he played his first games after turning 18, is listed as a 17-year-old season on hockeydb.com.  On hockey-reference.com, it’s listed as his 18-year-old season.

Less age-related controversy existed in the 2010 draft, with Tyler Seguin going second overall behind Taylor Hall.  Seguin, born on January 31, 1992, played his first NHL games as an 18-year-old that year.  On hockeydb.com, the 2010-11 season is thus listed as his 18-year-old season.  But on hockey-reference.com, it’s listed as his 19-year-old season.

The reason is that when determining player ages on both sites, the player’s age as of February 1 is what is used.  On hockeydb.com, it’s February 1 of the year that the season begins.  But on hockey-reference.com, it’s February 1 of the year that the season ends.  In both cases though, it’s a far cry from the way that the NHL calculates things – which is September 15 immediately preceding the season that is about to be played.
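As a rough sketch of how the three cutoffs described above produce different listed ages for the same season, here is a small Python example.  The function names and the dictionary keys are made up for illustration; they are not taken from either site or from the league, and the cutoff rules are only as stated in the paragraph above.

```python
from datetime import date


def age_on(birthdate, on_date):
    """Age in whole years (rounded down) on a given date."""
    years = on_date.year - birthdate.year
    # Subtract one if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


def season_ages(birthdate, start_year):
    """Listed age for the season starting in start_year, per convention.

    Assumed cutoffs (from the text above):
      hockeydb          - February 1 of the year the season begins
      hockey-reference  - February 1 of the year the season ends
      nhl               - September 15 preceding the season
    """
    return {
        "hockeydb": age_on(birthdate, date(start_year, 2, 1)),
        "hockey-reference": age_on(birthdate, date(start_year + 1, 2, 1)),
        "nhl": age_on(birthdate, date(start_year, 9, 15)),
    }


# Tyler Seguin, born January 31, 1992, in the 2010-11 season.
seguin = season_ages(date(1992, 1, 31), 2010)
print(seguin)
```

For Seguin this gives 18 under the hockeydb convention and 19 under the hockey-reference convention, which matches the listings described above.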

In the player database, I ended up having to key in every player’s birthday, and the league’s cutoff for that particular year, in order to determine how the league would regard them.

Who plays the games?

This is where the re-averaging mentioned in the last part (“The Devil and J.J. Daigneault”) comes into play.

Let’s say that our imaginary NHL team with 20 players, ten of whom are 20 years old and ten of whom are 30, has a series of roster changes due to injury, ineffectiveness, or just shaking things up.  Besides the fact that this on its own will shift the average age in some way, it raises the further question of who plays the games.

fakeroster2

We can see that, besides the fact that I am horribly uncreative when it comes to creating fake names, the hockey-reference age (“H-R AGE”) is 25 since everyone’s birthday is four and a half months short of the H-R cutoff.  And because no weight is given to who is actually playing the game, the regular average is still exactly at 25 even though things start shifting a bit.

The big issue here is that it’s a perfect example of either stunning coincidence, or of the birthday paradox run completely amok.  (If you don’t recall the birthday paradox, it has to do with probability theory – if you put 23 people in a room, there is a greater than 50% chance that two of them have the same birthday.  If you put 70 people in a room, this goes up to 99.9%.  The likelihood of 23 randomly-assembled people – connected by nothing other than their hockey skill and a color-themed surname – all having the same birthday is astronomically small.)
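The birthday paradox figures can be checked with a few lines of Python.  This is a sketch under the usual textbook assumptions (365 equally likely birthdays, no leap years), which real NHL rosters do not satisfy given the early-in-the-year skew mentioned earlier.  The function names are illustrative.

```python
def shared_birthday_probability(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def all_same_birthday_probability(n, days=365):
    """Probability that all n people share one single birthday."""
    # The first person can have any birthday; the other n - 1 must match it.
    return (1 / days) ** (n - 1)


print(shared_birthday_probability(23))
print(shared_birthday_probability(70))
print(all_same_birthday_probability(23))
```

The first two values come out just above 0.5 and just above 0.999.  The third, for all 23 players sharing one birthday, is on the order of 10^-57.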

So we’ll throw some random birthdays in there, and notice how the ages start to diverge.

fakeroster3

This is using the hockey-reference.com cutoff of February 1 in the season in question.  You can see some age discrepancies: anyone with a birthday between September 15 and January 31 (inclusive) will have a difference in their NHL age and their H-R age.  On this roster, this includes Mr. Amber, Mr. Blue, Mr. Honeydew, Mr. Ice Blue, Mr. Jade, Mr. Mauve, Mr. Orange, and Mr. Purple.

In any case, we can see how much the actual age (in days) and the listed (rounded down) age can start to push apart.  This is the important factor, that a player’s age is given only in years which are rounded down – someone who is 24 years and 4 days is simply “24”, just as much as someone who is 24 years and 160 days, or 24 years and 332 days.
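To make the rounding-down concrete, here is a minimal Python sketch that splits an age into whole years plus days since the last birthday.  The birthdates and the 2015 cutoff date are hypothetical, chosen only to reproduce the three "24-year-olds" in the paragraph above.

```python
from datetime import date


def years_and_days(birthdate, on_date):
    """Return (whole years, days since last birthday) as of on_date."""
    years = on_date.year - birthdate.year
    if (on_date.month, on_date.day) < (birthdate.month, birthdate.day):
        years -= 1
    try:
        last_birthday = birthdate.replace(year=birthdate.year + years)
    except ValueError:
        # February 29 birthdate in a non-leap year: treat March 1 as the birthday.
        last_birthday = date(birthdate.year + years, 3, 1)
    return years, (on_date - last_birthday).days


cutoff = date(2015, 9, 15)  # hypothetical September 15 cutoff
players = {
    "A": date(1991, 9, 11),
    "B": date(1991, 4, 8),
    "C": date(1990, 10, 18),
}
for name, born in players.items():
    print(name, years_and_days(born, cutoff))
```

All three hypothetical players are listed as "24", yet they are 4, 160, and 332 days past their 24th birthdays respectively.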

So here’s this very colorful roster, with the NHL’s age cutoff of September 15 and with the games played multiplier.

fakeroster4

The green box on the left (under “AGE”) is the average player’s actual age as of the NHL’s September 15 cutoff date of the season in question.  And the green box on the right is what happens when the games played multiplier is applied, and the age then re-averaged out.
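Here is a small Python sketch of the difference between the two green boxes.  It assumes the "games played multiplier" means weighting each player's age by his games played before re-averaging; the four-player roster is invented purely for illustration.

```python
def raw_average_age(roster):
    """Plain average: every player counts equally."""
    return sum(age for age, _ in roster) / len(roster)


def games_weighted_average_age(roster):
    """Average age weighted by games played (assumed meaning of the multiplier)."""
    total_games = sum(games for _, games in roster)
    return sum(age * games for age, games in roster) / total_games


# Hypothetical roster of (age, games played) pairs.
roster = [
    (20, 82),  # young regular
    (20, 10),  # young call-up with a short stint
    (30, 82),  # veteran regular
    (30, 82),  # veteran regular
]

raw = raw_average_age(roster)
weighted = games_weighted_average_age(roster)
print(raw, weighted)
```

The raw average here is 25, while the weighted figure is pulled upward to about 26.4 because the veterans account for more of the games actually played.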

So what?

Statistics and numbers will usually be interesting, meaningful, or illuminating.  Very rarely are they none of these, and those represent the absolute bottom of the barrel.  Occasionally we find statistics and numbers that are all three: interesting, meaningful, and illuminating.

Using the games played multiplier on the player’s actual age may or may not be meaningful or interesting, but it’s a bit more illuminating.

What I look for in expansion-related metrics across history is what I call “shocks and ratchets”.  Economists refer to something called “the ratchet effect” – this has to do with the response to a hard stimulus or shock of some type: levels jump, and afterward they never fully return to where they started.

For example, American military spending before World War II was fairly low.  After Pearl Harbor was attacked, military spending skyrocketed.  After the war ended in 1945, military spending plummeted…but it didn’t return to pre-war levels.  When the Korean War broke out a few years later, spending increased again – and once that war ended, it fell back but, again, not to the levels of before that war.  Vietnam caused…you get the idea.

Expansion is a shock to a closed system – it causes an immediate surge in the number of teams and roster spots, it forces scheduling and possibly rule changes, and it sets off a myriad of other things.  The evidence of this is everywhere.  Whether it causes a ratcheting of any of these remains to be seen; I’m not very far into the scope of this project right now, and there’s plenty more to go.

What am I looking for?

What I am looking for, in this very specific instance of average player age over time, is to see whether there is a large divergence between who is actually playing the games (as seen with the multiplier applied) and the raw average age.

More specifically:

  • Do we see changes, either a divergence or a contraction, in expansion years or in the immediate aftermath?
  • Does expansion act as the catalyst for a shift (in either direction) in player age trends over time?

The answer is here.  Blue represents the raw player age, and orange has the games played multiplier.

weighted-age

In years where orange diverges significantly from blue, it’s a sign of a veteran league: yes, young players are entering the league, but the veterans are getting the bulk of the games played.  If the orange and blue are fairly similar, it’s a league where the young players are getting a decent share of games played and (presumably) ice time.

The average annual discrepancy across NHL history, by the way, is a +0.5706 difference between the weighted average age and the raw average age.

Set against a backdrop of this league average over time, we can see how the gap between the raw player age and the multiplied/re-averaged age shifts over time.

age-mountain

The problem with this graph is that, as cool as it looks, it doesn’t have the capability of showing the expansion years.  This basic line graph does.

The league average age discrepancy across all seasons is the maroon line.  Each season’s discrepancy is in blue, and the expansion years are in orange.

age-line-graph

In the mountain-shaped graph, you can see two of the blue data points that are at 0.  In the line graph directly above this sentence, these points are completely absent.

The reason for it is simple: the discrepancy actually shifts the other way – the games-weighted league average is below the raw average player age.  All it means is that the veteran players in the league were being overtaken at that point in time by younger ones, who were drawing the majority of the games played.

Short Conclusions

Based on the evidence listed so far in this section, it does not appear that expansion in the NHL has had any impact on average player age in any manner.  It does not seem to act as a catalyst to push one way or another, it does not appear to accelerate any trends, and crediting or blaming expansion for anything related to this appears to be without merit.

What warrants further research are the 1992/1993 and 1998/1999/2000 expansions.  The latter group in particular saw a quick drop in the age discrepancy, which barely moved until the 2001-02 season.  Could expansion have had something to do with it?  Could retirements of several aging players have been the primary factor?

As always, more research is needed.

NEXT: Part 5 (Intuition, and what is “talent”?)