Introducing Player Approximate Value (PAV)

One of the oldest questions in global team sport is: what is a player really worth?  To come up with a workable answer for this, we have leant heavily on work undertaken by Bill James, Doug Drinen and Chase Stuart, and looked at several different sporting codes and how they attribute player value within the team environment.

This post will describe in detail the player valuations we’ve derived under a method we’re calling Player Approximate Value (PAV). We’ve given hints of these valuations in past posts such as this one about recent retirees and this one running through statistical “awards”. We are planning to use the values we’ve derived here to replace earlier methods of trade and draft valuations, and will continue running other PAV-based analysis, so you’ll see a lot more of it in future.

Valuing players

Much of modern advanced sport analysis can be traced back to one man: Bill James. From the publication of the first The Bill James Baseball Abstract in 1977, James has created a language to describe the sport beyond its base components, and has emphasised using statistics to support other obvious judgements.

In 1982 James introduced a concept called the value approximation method, a tool to produce something he called Approximate Value. He did so by stating:

“The value approximation method is a tool that is used to make judgements not about individual seasons, but about groups of seasons. The key word is approximation, as this is the one tool in our assortment which makes no attempt to measure anything precisely. The purpose of the value approximation method is to render things large and obvious in a mathematical statement, and thus capable of being put to use so as to reach other conclusions.”

The resultant product produced by James was inexact, but able to generally differentiate bad seasons from good seasons, and good seasons from great. James used basic achievements to apportion value, based on traditional baseball statistics. Over the years James experimented with a series of different player value measures, but he revisited Approximate Value several times, most notably in 2001. However, much of James’s later efforts focused around other methods of player valuation, and Approximate Value remains an often overlooked part of his prior work.

In 2008 Doug Drinen, of Pro-Football Reference, decided to adapt James’s original formula to evaluate which individual college postseason award was most predictive of future NFL success, but was confronted by a lack of comparable data for football players. This initial effort, while a noble attempt, was criticised for using very basic statistics – games played, games started and Pro Bowls played. Whilst the results largely conformed with logic, notable outliers existed – ordinary players that saw out lengthy careers on poor teams.

Unwittingly, we created a similar method to both the original 1982 James formula and the first Drinen formula, which we used to create a Draft Pick Value chart. The method created a common currency that could be used to value the output of players drafted from 1993 to 2004, and to also predict the future output of players (1993 is considered by most to be the first true draft, as it comes two years after the cessation of the traditional under 19 competition and after the various AFL zones were wound back).

That method produced the linked chart.

The most common criticism of the chart, as with the original Drinen analysis, was that it was too narrow: it ignored the quality of games in favour of the quantity of games played. For most players, the relationship between games played and the quality of the player is relatively linear – bad players tend not to play a lot of football before they are delisted. Due to the strict limitations placed on AFL lists, and the mandatory turnover of about 7% of each side each season, players who fail to perform tend not to stay in the AFL. A small modification we made in 2016 was to add a component of quality – namely a weighting by Brownlow Medal votes, which applied a weighting for Brownlow-implied value of players selected at each draft position above and beyond just games played.

However, the original formula still had the issue of valuing Doug Hawkins as having a better career than Michael Voss – which is patently ridiculous. And the modified formula, though doing a better job of valuation, still felt slightly incomplete.

Later in 2008 Drinen came up with the measure we know today as Approximate Value, by splitting contributions into positions and determining positional impact on overall success. Whilst it still is an approximate value measure, it was far more accurate than any other NFL value measure to date. Approximate Value is still used as a historical comparison tool of player value, worth and contribution across a variety of applications, not limited to draft pick value charts, trade evaluation and the relative worth of players across careers.

What have we done

Player Approximate Value, or PAV for short, is a partial application of the final Drinen version of AV, but applied to the AFL after a range of testing. In the vein of CARMELO and PECOTA, it is unashamedly named after Matthew Pavlich, who happens to be one of the most valuable performers in recent years under the PAV measurement now proudly bearing his name.

Basic AFL statistics are very good at determining a player’s involvement and interaction with play, but relatively poor in evaluating how effective that interaction was. On the other hand, basic statistics are reasonably effective at determining how good a team is both across a season and within each individual game. Drinen’s AV, and now PAV, both combine these two elements.

PAV consists of two components – Team Value and Player Contribution.

Team Value

When developing AV, PFR recognised that the team is the ultimate in a team sport, an approach that we fundamentally agree with. PFR split up an NFL team’s ability into two components – offence and defence. Both were evaluated on points per drive adjusted for league average.

Luckily, we accidentally stumbled on a similar approach in 2014 when trying to determine team strength; however, we split strength into three categories corresponding with areas of the field – offence, midfield and defence. Unlike American Football, possession in the AFL does not alternate after a score, and turnovers aren’t always captured in basic statistics. However, after learning from Tony Corke that inside-50s are one of the stats which correlate most strongly with wins, we landed on an approach of utilising them to approximate the “drive” of the NFL.

The formulas, similar to those used in the HPN Team Ratings, are all ratios measured as a percentage of league average:

  • Team Offence: (Team Points/Team Inside-50s) / League Average
  • Team Midfield: (Team Inside-50s/Opposition Inside-50s)
  • Team Defence: This is a little more complex.
    • Defence Number (DN) = (Team Points Conceded/Team Inside-50s Conceded)/ League Average
    • Team Defence = (100*((2*DN-DN^2)/(2*DN)))*2
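As a minimal sketch, the three team formulas above could be written in Python as follows. The function and parameter names are illustrative only, and two assumptions are made that the text does not spell out: DN is treated as a ratio in which 1.0 equals league average, and the offence and midfield ratios are multiplied by 100 so that a league-average side scores 100.

```python
def team_offence(points, inside_50s, league_avg_points_per_i50):
    """Points per inside-50, as a percentage of league average (assumed scale)."""
    return 100 * (points / inside_50s) / league_avg_points_per_i50


def team_midfield(inside_50s, opp_inside_50s):
    """Inside-50s won versus conceded, as a percentage (assumed scale, 100 = parity)."""
    return 100 * inside_50s / opp_inside_50s


def team_defence(points_conceded, inside_50s_conceded, league_avg_points_per_i50):
    """Defence rating from the Defence Number (DN); conceding less gives a higher rating."""
    # Assumption: DN is a ratio where 1.0 means exactly league average.
    dn = (points_conceded / inside_50s_conceded) / league_avg_points_per_i50
    # Written exactly as in the formula above; algebraically this is 100 * (2 - DN).
    return (100 * ((2 * dn - dn ** 2) / (2 * dn))) * 2
```

Under these assumptions a side conceding exactly the league-average points per inside-50 rates 100 for defence, and a side conceding 10% more than average rates 90.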

All three categories are inherently pace-adjusted, and as such there is no advantage to quick or slow teams racking up or denying opposition stat counts.

Each season is apportioned a total number of PAV points (we’re just saying “PAVs”) in each category, at a rate of 100 * the number of teams in the competition. For example in 2017 there were 1800 Offence PAVs, 1800 Defence PAVs and 1800 Midfield PAVs, or 5400 PAVs overall. This ensures that individual seasons are comparable over time, regardless of the number of teams in the competition at any time.
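The season pool arithmetic can be sketched in a few lines of Python. The names here are hypothetical, and the sketch only covers the size of the pool, not how it is shared between teams.

```python
PAVS_PER_TEAM = 100  # PAVs per team in each category, as stated above
CATEGORIES = ("offence", "midfield", "defence")


def season_pav_pool(n_teams):
    """Return the PAVs available in each category, plus the overall total."""
    per_category = PAVS_PER_TEAM * n_teams
    pool = {category: per_category for category in CATEGORIES}
    pool["total"] = per_category * len(CATEGORIES)
    return pool
```

For the 18-team 2017 season, `season_pav_pool(18)` gives 1800 PAVs in each category and 5400 overall, matching the figures above.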

Unfortunately, inside-50s have only been tracked since the 1998 season. For seasons before then, we have utilised points per disposal, which roughly approximates the team strengths of the inside 50 approach. There are some differences but they are relatively marginal overall – with very few club seasons moving by more than 3%.

We feel that these three basic statistics can articulate the strength of a team better than any other approach we have seen, and it happens to match the approach taken when creating AV.

Player Involvement

This is the part where HPN has deviated from the approach of Drinen and James. As positions are not strictly defined and recorded as tightly in Australian Rules as in the NFL, it would be impractical at best to use positions as a starting point for developing a player value system.

Instead, we considered that the best way for us as amateurs from the general public to identify a player’s involvement was through those same basic and public statistics. Whereas the team value as calculated above used a relatively small number of statistical categories, player involvement can be much more complicated.

To allocate value, we relied on a number of intuitive decisions, statistical comparisons and peer testing, refining until the results were satisfactory.

We made the first attempt with the guidance of Tony Corke’s statistical factors that correlate with winning margin, then made some subjective decisions from there. This attempt produced “sensible” results and also correlated reasonably with Brownlow medal votes.

The formulae were then fine-tuned by testing subjective player rankings on a group of peers. The formulas were also tested further against Brownlow Medal votes, All Australian selections, selected best and fairest results and Champion Data’s Official AFL Player Ratings.

Although no source is perfect, PAV was largely able to replicate the judgements of these other sources, especially that of the Official Player Ratings. Generally, if a player has a higher PAV across a season, they will receive more Brownlow Medal votes:

BV v PAV

In the end, PAV and its results were tested on a wider scale via blind testing on the internet (stealing the approach taken by Drinen when he created AV), and the results largely confirmed the valuations taken by PAV. The formulae for each line are:

  • Offensive Score = Total Points + 0.25 x Hit Outs + 3 x Goal Assists + Inside 50s + Marks Inside 50 + Free Kick Differential
  • Defensive Score = 20 x Rebound 50s + 12 x One Percenters + (Marks – 4 x Marks Inside 50 + 2 x Free Kick Differential) – 2/3 x Hit Outs
  • Midfield Score = 15 x Inside 50s + 20 x Clearances + 3 x Tackles + 1.5 x Hit Outs + Free Kick Differential
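A rough Python sketch of the three score formulas above, taking a player's season totals as a dictionary. The dictionary keys are hypothetical names chosen for illustration. It also assumes that "Total Points" means the points the player scored and that "Free Kick Differential" means frees for minus frees against; neither is defined explicitly in the text.

```python
def offensive_score(s):
    """Offensive Score from a dict of season totals (key names are illustrative)."""
    return (s["points"] + 0.25 * s["hit_outs"] + 3 * s["goal_assists"]
            + s["inside_50s"] + s["marks_inside_50"] + s["free_kick_diff"])


def defensive_score(s):
    """Defensive Score; marks inside 50 and hit outs count against a player here."""
    return (20 * s["rebound_50s"] + 12 * s["one_percenters"]
            + (s["marks"] - 4 * s["marks_inside_50"] + 2 * s["free_kick_diff"])
            - (2 / 3) * s["hit_outs"])


def midfield_score(s):
    """Midfield Score, weighted most heavily towards clearances and inside-50s."""
    return (15 * s["inside_50s"] + 20 * s["clearances"] + 3 * s["tackles"]
            + 1.5 * s["hit_outs"] + s["free_kick_diff"])


# Made-up season totals for a hypothetical player, not real data.
example = {
    "points": 60, "hit_outs": 12, "goal_assists": 10, "inside_50s": 80,
    "marks_inside_50": 8, "free_kick_diff": 5, "rebound_50s": 30,
    "one_percenters": 25, "marks": 100, "clearances": 90, "tackles": 70,
}
```

These raw scores only mean something relative to team-mates' scores, since each is later converted to a share of the team total.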

The weightings and multipliers used in each component formula will necessarily look a bit arbitrary, but are the results of adjustment and tweaking until the results lined up with other methods of ranking and evaluating players as described above.

As the collection of several of these measures only commenced in 1998, we have also adapted another formula for the pre-1998 seasons which correlates extremely strongly with the newer formula. Whilst we feel it is less accurate than the newer formula, it still largely conforms to the findings of the newer formula. This formula was created by trying to minimise the standard deviation for each player’s PAV across the last five seasons of AFL football. Around 5% of players have a difference in value of more than one PAV between the new and old formulas.

We will publish the pre-1998 formula in the not-too-distant future.

Putting It Together

The final step combines individual player scores and team strength calculations to produce the final PAV for each player. This is done in two steps.

Firstly, the individual component scores for each team are compiled. Each player’s individual player score is converted to a proportion of total team score, telling us the proportion of value they contributed to that area of the ground.

Secondly, the team value (i.e. team strength as outlined above) is multiplied by the proportion of the component score for each player.

An example will help illustrate this.

In 2016 the Blues midfield earned 96.71 Midfield PAVs across the whole side (being below league average). Bryce Gibbs accrued a Midfield Score of 3984, and the team tallied up 37702 in midfield score in total. As a result, Gibbs contributed 10.567% of the total Midfield Score for Carlton, and receives that part of the 96.71 Midfield PAVs that Carlton had gained – or 10.22 MidPAVs.

These calculations are done for every player in the league for every side. The overall PAV value for each player is merely the three component values added together. For Gibbs in 2016, this is his 10.22 MidPAVs with 6.86 OffPAVs and 3.21 DefPAVs for a total of 20.29 PAVs all up. Which is pretty good.
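The Gibbs example can be reproduced with a short Python sketch. The function name is hypothetical; the numbers are the ones quoted above.

```python
def player_component_pav(player_score, team_total_score, team_component_pavs):
    """Share of the team's component PAVs earned by one player."""
    share = player_score / team_total_score  # proportion of the team's score
    return share * team_component_pavs


# Bryce Gibbs, Carlton midfield, 2016 (figures quoted in the text above).
gibbs_share = 3984 / 37702
gibbs_mid_pav = player_component_pav(3984, 37702, 96.71)

# Overall PAV is the sum of the three components; the offence and defence
# values are taken directly from the text rather than recomputed.
gibbs_total_pav = gibbs_mid_pav + 6.86 + 3.21

print(f"{gibbs_share:.3%} of midfield score")  # share of Carlton's Midfield Score
print(f"{gibbs_mid_pav:.2f} MidPAVs, {gibbs_total_pav:.2f} PAVs overall")
```

Because each player's value is a share of the team's component PAVs, the same raw score is worth more PAVs in a side whose component rating is higher.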

What PAV should be able to tell you

Two key advantages of PAV, we feel, are that it can be replicated based entirely on publicly available statistics, and that by using a pre-1998 method, we have derived a fairly long set of historical values.

While HPN intends to publish PAV to a finer degree than PFR, there still remains a great deal of approximation in the approach. This is especially the case for pre-1998 values, which rely on a far smaller statistical base. We cannot definitively state that these are the exact values of each player relative to other players; however we feel that the approximation is closer than any other method that has as long a time series made with publicly available data. It is possible, and indeed likely, that some lower-ranked players are better players than those above them in certain years.

What we are more confident of is that the values are indicative of player performance relative to others across a longer period of time. Or, for a given year, that player X was likely more valuable than player Y, or at least more valuable to their team.

As it draws its fundamental values from team rankings, it is much harder to draw a higher value from a bad team than it is a good team. This scales player value to more highly rate performance in a good side, and specifically it highly rates players in good parts of the field in good sides.

As a group, players with a PAV of 18 should be better (or have had a better year) than those who are ranked with 16. As a rule of thumb a season with a PAV of over 20 should be considered to be a great season, and any PAV over 25 should be considered exceptional. This varies slightly for different positions – an All Australian key position defender may have a lower overall PAV than a non-All Australian midfielder, but with an extremely high rating in the defence component.

Below is a list of the players with the highest season long PAVs between 1988 and 2016:

TopPAVSeasons.JPG

2017 isn’t finalised yet, but the top end of the list to date is populated with Brownlow Medallist years and players considered to be the absolute elite of the league over the past two decades. While there are some year-on-year PAVs that conflict with common opinion, these top-end player-years do not contain any. Yes, that Stynes year was that good.

On a career basis, the top rated players should be fairly uncontroversial:

TopPAVSCareers.JPG

The top ten players on this list had careers that were not only successful but also incredibly long. Note that this is current to 2016, so Gary Ablett Jr has more value to come.

Every player made multiple All Australian teams, and a majority were considered at different points of time to be the “best player in the game”. As such, PAV ends up being a measure of not only quantity of effort but also of quality.

What are the weaknesses of PAV?

Like almost any rating system, PAV has blind spots – especially in its early phases of development. As with rating systems in most sports, there appears to be a slight blind spot in valuing truly pure negating defenders. Consider Darren Glass, possibly the finest shutdown KPD of the AFL era. He is somewhat overlooked from an overall perspective by PAV:

Glass.PNG

Glass’s Defence PAV still remains elite during this era, but he provided little to no value to any other part of the Eagles’ performance across the period. It’s worthwhile to compare Glass to the namesake of PAV:

Pavlich.PNG

This is a lesson to sometimes look beyond the headline figure to the components that make it up. It’s worthwhile to look beyond the Overall PAV figure to the relevant component figure for the player’s specific role, especially for specialist players. We can also see with a player like Pavlich that his shifting role over his career is revealed by PAV. Generally, a component PAV of more than 10 for a specialist player will place them in contention for an All Australian Squad selection (cf. Glass above), if not selection in the side itself.

Occasionally a season pops up that defies conventional wisdom, such as Shane Tuck’s highly rated 2005 season, or Adem Yze, who rates so highly via PAV as to suggest he was under-recognised throughout his career.

However, Insight Lane brought a very interesting observation to our attention this week, from Bill James himself:

As noted at the top, we’ll be applying this system throughout the draft and trade period to evaluate trades and draft picks, and probably in a lot of other analysis from here on out, as well. Stay tuned in the coming days for an All-Australian team based on PAV.

In our time developing and testing PAV, it has usually confirmed our conventional thinking, but occasionally surprised us. Which makes us think we might be on the right track. With system comes the ability to analyse, so the goal for us in developing this approach is to emulate and augment subjective judgments with a systematic valuation, rather than to create a value system alien to an actual “eye test”.

If you have any comments or questions about PAV, please feel free to contact us via twitter (@hurlingpeople), or email us at hurlingpeoplenow [at] gmail [dot] com. We are more than willing to take any feedback on board, and if you want to use or modify the formulas yourself, feel free to do so (just credit us).

Thanks to all that provided help, assistance and the reason for the development of PAV, namely Rob Younger, Matt Cowgill, Ryan Buckland, Tony Corke, James Coventry, Daniel Hoevanaars… and everyone we are forgetting here. We will add more when we remember who we have forgotten.
