# Updated Pythagorean Tables

With the Premier League season drawing to a close I thought I’d go back and see what the ‘Pythagorean Expectation’ for the current league table currently looks like. Using the same values as in my previous post of $\gamma_1 =1.18 \text{ and } \gamma_2 = 1.23$ means we can see if the same teams seem to under or over performing.

This is the league table from earlier in the year.

Old League Table

Below is the current league table I’m working from with about 5 or 6 games to go.

Current League Table

Observations

• Leicester seem to be unstoppable in real life and despite my prediction of them hitting a blip nothing of the sort has happened to them yet. This is reflected in them having out performed the average teams points total for their goals scored and conceded by a whopping 9.29 pts. This has almost doubled in the last 10 games (they have only lost 1 league game in the last 10 and that was by a single goal)!
• The other prediction I made was that Tottenham seemed to be under-achieving earlier in the season and that they could challenge for the title. Despite not substantially improving their residual -5.64 to -4.81, they have maintained their good run of form and look like the most likely title challengers to Leicester.
• Other significant improvements have been made by the likes of Bournemouth and Stoke who have turned their form around to pick up more points than they might have expected to.
• Everton continue to be having a shocking season based on pythagorean expectation which matches the conventional wisdom about them. In a more typical season based on their goals scored and conceded they should have almost 9 more points.
• Finally despite my current worries about Watford the maths seems to suggest that they will still be fine in the league this year.

# Coupon Collecting and Football Stickers

Although Euro 2016 is still months away from officially starting, another footballing tradition has just started: collecting football stickers. For children and adults alike the quest to complete the entire sticker album is something that takes time and money but the excitement and joy makes it worth it.

As the number of stickers in each album have dramatically increased from the first few albums produced around 40 years ago, the task of collecting all of the stickers has become much harder. Mathematically this problem happens to be a classic problem in combinatorial optimisation: the coupon-collector problem. This problem is when there is a finite number of different coupons any of which can be given to a person one at a time. In a statistical sense this is the problem of sampling with replacement. The question is then how many times do you need to sample to obtain a copy of every single sticker?

# Expected Goals in Football

Previously I talked about one way to measure the success of a football team over a year through `Pythagorean Expectation‘. Whilst this is a pretty good metric for predicting success, it can only be applied over a certain number of games and can’t tell us anything about a particular match. Since being able to determine how well a team performed in a particular match is the ultimate goal of analysing games, many ideas have been developed to try and do this with increasing accuracy.

#### A Short List of (Increasingly Better) Football Metrics

1. Goals Scored & Goals Conceded
2. Shots
3. Shots on Target (SoT)
4. Total Shots Ratio (TSR)
5. Expected Goals (xG)
6. Expected Goals with tracking data

At the most basic level that you see in the overall league table is the goals scored and conceded for each team. Teams that tend to score lots of goals and concede few goals win more matches and hence finish higher up in the table at the end of league.

# Pythagorean Expectation in Football

The use of data and statistics has grown rapidly in most sports in recent years. Nowadays many metrics and formulas exist for trying to measure and predict performances of players and teams. One of my favourite tools because of its simplicity is something called ‘Pythagorean Expectation‘.

The most famous early pioneer of baseball statistics, Bill James, came up with the original formula to predict the number of wins for a team over a baseball season, based on the number of runs they scored and conceded.

### $Win\% = \cfrac{RS^2}{RS^2 +RC^2}$

To work out the number of wins in a season you simply have to multiply the win percentage by the number of games played. The reason James called it Pythagorean is because of the occurrence of all the squared terms. Whoever said Pythagoras Theorem was boring!