After three years and at least as many requests, we have finally come up with a tipping model. Perhaps the longest part of the process was coming up with a name, like all good statistical models. We’d like to introduce our new friend PERT (Player Estimated Result Tipping), named in honour of Brian Pert’s son Gary, who got a biscuit stuck in his oesophagus so badly it required surgery, and who also ran Collingwood for a decade.
PERT is a model that takes our new mPAV measure (Marginal Player Approximate Value) as employed in our preseason previews (here, here, and here), derives expected team forward, defence and midfield strengths from the specific squads named, and estimates a result based on them.
In short, the model predicts games from the selected 22 and how each player has performed in the past, rather than using past results or estimated strengths for the club as a whole as with Elo and related methods.
Essentially, mPAV takes our basic PAV measure (detailed here and provided on a year-by-year basis from the menu bar at the top of the page) and creates a standardised per-game measure. The per-game value is then expressed as a plus or minus figure relative to the value of the average individual game.
Why the extra steps from PAV itself? They are necessary because basic PAV is a season value measure, developed with a view to long-term historical comparisons, such as the career player valuations useful in draft and trade analysis. PAV is therefore constrained to a fixed pool of points for each season. On average there are 100 PAV per club for each of the forward, back and mid zones, a total of 300 PAV per club, distributed according to team strengths: the stronger clubs end up with more than 300 PAV in total, the weaker with fewer.
That means we can’t just divide PAV by number of games to get a viable per-game value for a player. Early in the year, a straight PAV-per-game calculation can see a player averaging 30 PAV per game, while later on the same level of performance might yield just 1.5. Cross-year comparisons to the current year are therefore impossible. Crucially, PAV also doesn’t discriminate between home-and-away games and finals: some sides play up to 26 games while others play 22, which creates an inherent inequity when comparing players from different clubs by dividing PAV by number of games.
To account for all this, mPAV creates an average per-game value for each on-field component (def/mid/off) to use as a baseline against which to evaluate each player’s performance. This is simply the 100 PAV for a season divided by the games played:
mPAV Team Component (TC) = 100/Total Player-Team Games Played
Then, a player component is created:
mPAV Player Component (PC) = ((Component Player PAV / Player Games Played) – (TC))/TC
For example, Dyson Heppell had 5.2 DefPAV, 8.56 OffPAV and 12.55 MidPAV after round 1 this year. To convert his DefPAV into a per-game defensive marginal PAV, the equations look like:
mPAV Team Component (TC) = 100/22 = 4.545
mPAV Player Component (PC) = ((5.2 / 1) – (4.545))/4.545 ≈ 0.14 Def mPAV
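As a worked sketch, the two formulas above can be expressed in a few lines of Python. The function names are ours for illustration, not HPN's actual code; the figures are the Heppell example from above.

```python
def team_component(total_team_games: int) -> float:
    """Average per-game PAV for one zone: 100 PAV spread over the team's games."""
    return 100 / total_team_games

def player_component(component_pav: float, games_played: int, tc: float) -> float:
    """A player's per-game value in a zone, relative to the league-average game."""
    return ((component_pav / games_played) - tc) / tc

# Heppell after round 1, against a 22-game season baseline:
tc = team_component(22)                      # 100 / 22 ≈ 4.545
def_mpav = player_component(5.2, 1, tc)      # ≈ 0.14, slightly above an average game
off_mpav = player_component(8.56, 1, tc)     # ≈ 0.88 above an average game
```

A value of 0 falls out naturally for a player whose per-game PAV exactly matches the team-component baseline.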
For this measure, 0 represents the mean value of a game played by the most average player on the most average team imaginable (in practice more players end up below the mean of 0 than above it, due to the nature of value distribution across players).
Each player has three component mPAVs – Off, Def and Mid. These are summed to give a total, but for most players the three components end up very different from each other – defenders often have very negative Off and Mid mPAVs, reflecting that their value in those areas is well below that of the league-average player.
But the tips! How does this feed into PERT?
For each week, the mPAVs of a team’s selected players are summed and divided by 22, and the result is added to 1 to give a team rating (the added ‘1’ just turns the marginal +/- measure back into a figure relative to an average team strength of 100%). The resulting team rating is directly analogous to the HPN Team Ratings, but instead of a team-based measure representing the strength of that club for the year in total, it only considers the selected players. Essentially we generate team strengths for a list of selected players on the fly, in order to tip a result from them.
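That on-the-fly rating step is small enough to sketch directly. This assumes each selected player's component mPAVs have already been summed into a single per-player figure; the function name is ours, not HPN's.

```python
def team_rating(player_mpavs: list[float]) -> float:
    """Rating for one zone from the selected side's mPAVs; 1.0 = league average."""
    return sum(player_mpavs) / 22 + 1

# A side of 22 perfectly average players rates exactly 1.0 (100%):
avg_side = team_rating([0.0] * 22)   # 1.0

# A side carrying a few strong contributors rates above average:
strong_side = team_rating([0.5] * 6 + [0.0] * 16)
```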
Team strengths are expressed as we’ve been doing for a couple of years, by looking at three zones of the field in isolation. Here are the final whole-list team strength ratings for 2017:
These strengths answer three simple questions: what is the balance of the inside-50s a team gets in its games, how well does it score from each inside-50, and how well does it defend its opponents’ inside-50s?
The strength evaluations directly determine PAV apportionment – Adelaide’s 2017 list, for example, has 110 offensive PAV and 114 midfield PAV apportioned to players according to their contributions, and likewise for defence. That means we can also use each player’s mPAV to back out the same sorts of team strengths, above or below 100% of league average, for the selected teams.
Each side’s team strength for each line is adjusted against the opposition line in PERT:
Home Off v Away Def
Away Off v Home Def
Home Mid v Away Mid
By doing this we can create a prediction of how many points per inside-50 each team will score in a game, and how many inside-50s each team will be able to create.
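The post doesn't give the exact combination formula, but a hedged sketch of how the three matchups listed above could feed a predicted score might look like the following. The constants and the simple ratio form are our assumptions for demonstration, not PERT's actual parameters.

```python
AVG_I50S = 52.0          # assumed league-average inside-50s per team per game
AVG_PTS_PER_I50 = 1.65   # assumed league-average points per inside-50

def predict_score(off_rating: float, opp_def_rating: float,
                  mid_rating: float, opp_mid_rating: float) -> float:
    # Mid v Mid: the midfield matchup scales how many inside-50s a side generates...
    i50s = AVG_I50S * mid_rating / opp_mid_rating
    # Off v Def: the forward-vs-defence matchup scales scoring per entry
    # (here a defence rated above 1.0 is assumed to concede less).
    pts_per_i50 = AVG_PTS_PER_I50 * off_rating / opp_def_rating
    return i50s * pts_per_i50

# Two perfectly average sides play out a draw at the assumed league-average score:
home = predict_score(1.0, 1.0, 1.0, 1.0)   # 85.8 points each
```

Any monotone combination of the three matchups would fit the description in the text; the multiplicative form above is just the simplest one.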
What about Home Ground Advantage (HGA)?
At this stage we’re not rolling in a home ground advantage factor. We haven’t done the research, unlike the above sources, into how much it might be worth in this system. We may run a shadow model using a fixed margin adjustment for interstate or intercity games in order to test this. If you were to throw the standard 10-point advantage to home sides playing interstate opponents, it might produce a more accurate result. Who knows?
Results so far
We also honestly don’t know how well the model will do. We’ll be tracking the results from week to week.
Last night we predicted Richmond by 30 (106–76), which obviously did not eventuate. Last week we tweeted out the following table, with a beta version of PERT running the predictions. It was not the final version we ended up with, but the model still had a MAE of 31 and predicted six games correctly.
The win probability is a calculation provided by Matt at The Arc (thanks Matt!), which lets us enter the famous Monash probabilistic tipping competition. These tips were not quite the results produced by our finalised model for deriving results from the calculated strengths, but they’re what we put on the record, so here they are again anyway.
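The post doesn't describe The Arc's conversion, but a common approach for margin-based footy models is to pass the predicted margin through a normal CDF with a historically fitted spread. A sketch under that assumption (SIGMA is an assumed value, not Matt's):

```python
from math import erf, sqrt

# Assumed standard deviation of actual margins around predictions; a real
# value would be fitted from historical results.
SIGMA = 36.0

def win_probability(predicted_margin: float) -> float:
    """P(win) for the side tipped by `predicted_margin` points, via a normal CDF."""
    return 0.5 * (1 + erf(predicted_margin / (SIGMA * sqrt(2))))

# A predicted 30-point favourite lands around an 80% chance under these assumptions.
p = win_probability(30)
```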
As we can only calculate the predictions after final teams are announced, we will not be able to release all our predictions at one time each week (damn you Thursday Night Football). Each week we will tweet out the tips, and try to keep a tally on our website for transparency.
Week 2 tips
As discussed above, we predicted Richmond to beat Adelaide by 30 last night which of course didn’t happen (Adelaide did quite a lot better in the midfield than our forecast, winning 64:49 instead of losing 43:62). For the rest of the round we have the following predictions, complete with a draw tip between Carlton and Gold Coast:
The tips will be updated as Sunday and Monday final teams are announced. Follow us on Twitter for the most up-to-date tips.