I previously talked about the use of the classic Bradley-Terry model and its applicability to a wide variety of situations from ranking in machine learning algorithms through to modelling sports teams. In this post I will briefly outline some of the main modifications to the model over the last 60 years, extending its use into a wider range of situations.

### 1. Home advantage

In many sports played in front of a partisan crowd there is often a benefit to the team being supported. This is the concept of “Home Advantage”, where the local team tends to perform better than the visiting team. This is not just down to crowd support; it may also reflect the home team being more experienced at playing in those conditions – think of England’s cricketing struggles in the Indian subcontinent!

A mathematical form for this was suggested by Agresti (1990) and is given below,

$P(i \text{ beats } j | i \text{ at home }) = \cfrac{\theta\lambda_i}{\theta\lambda_i + \lambda_j}$

$P(i \text{ beats } j | j \text{ at home }) = \cfrac{\lambda_i}{\lambda_i + \theta\lambda_j}$

Here the parameter $\theta > 1$ represents the size of the home-field advantage: the larger its value, the more likely the home team is to win.
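As a quick numerical sketch of the formulas above (a hypothetical helper in Python; the function name is my own):

```python
def home_win_prob(lam_home, lam_away, theta):
    """P(home team beats away team) under the home-advantage model.

    lam_home and lam_away are Bradley-Terry skill parameters;
    theta > 1 inflates the home team's effective skill.
    """
    return theta * lam_home / (theta * lam_home + lam_away)

# Two equally skilled teams: with theta = 1.5 the home side
# wins 1.5 / 2.5 = 60% of the time rather than 50%.
p = home_win_prob(1.0, 1.0, theta=1.5)
```

Note that $\theta = 1$ switches the advantage off and recovers the basic model.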

### 2. Dealing with draws

A key drawback of the basic Bradley-Terry model is that it doesn’t account for the possibility of draws. This is traditionally handled by extending the model with an extra threshold parameter $\theta$ that controls the likelihood of ties. One such model is that of Rao and Kupper (1967), where $\theta \geq 1$,

$P(i \text{ beats } j) = \cfrac{\lambda_i}{\lambda_i + \theta\lambda_j}$

$P(j \text{ beats } i) = \cfrac{\lambda_j}{\theta\lambda_i + \lambda_j}$

$P(i \text{ draws with } j) = \cfrac{(\theta^2-1)\lambda_i\lambda_j}{(\lambda_i + \theta\lambda_j)(\lambda_j + \theta\lambda_i)}$
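A small sketch in Python makes it easy to check that the three probabilities sum to one (the function name is my own):

```python
def rao_kupper_probs(lam_i, lam_j, theta):
    """Return (P(i wins), P(j wins), P(draw)) under the Rao-Kupper model.

    theta >= 1; theta = 1 gives zero draw probability and recovers
    the basic Bradley-Terry model.
    """
    p_i = lam_i / (lam_i + theta * lam_j)
    p_j = lam_j / (theta * lam_i + lam_j)
    # The remainder is exactly the (theta^2 - 1) draw term above.
    return p_i, p_j, 1.0 - p_i - p_j
```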

Another way of including ties in the model was suggested by Davidson (1970).

$P(i \text{ beats } j) = \cfrac{\lambda_i}{\lambda_i + \lambda_j + \theta\sqrt{\lambda_i\lambda_j}}$

$P(j \text{ beats } i) = \cfrac{\lambda_j}{\lambda_i + \lambda_j + \theta\sqrt{\lambda_i\lambda_j}}$

$P(i \text{ draws with } j) = \cfrac{\theta\sqrt{\lambda_i\lambda_j}}{\lambda_i + \lambda_j + \theta\sqrt{\lambda_i\lambda_j}}$

In this instance the probability of a draw is proportional to the geometric mean of the two players’ skill parameters. If $\theta = 0$ then there is no chance of a draw and we recover the basic Bradley-Terry model.
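Davidson’s version is equally easy to sketch (again, the function name is mine):

```python
import math

def davidson_probs(lam_i, lam_j, theta):
    """Return (P(i wins), P(j wins), P(draw)) under Davidson's model.

    The draw probability is proportional to the geometric mean of the
    two skills; theta = 0 recovers the basic Bradley-Terry model.
    """
    tie_term = theta * math.sqrt(lam_i * lam_j)
    denom = lam_i + lam_j + tie_term
    return lam_i / denom, lam_j / denom, tie_term / denom
```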

### 3. Multi-Person Matchups

One generalisation of the basic model allows for a single winner in a game with more than two players. This could be a family playing a board game, or finding the best chocolate in a selection box.

$P(i \text{ wins}) = \cfrac{\lambda_i}{\lambda_1 + \lambda_2 + \ldots + \lambda_n}$
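In code this is just a normalisation of the skill parameters (a minimal sketch; the function name is my own):

```python
def win_prob(skills, i):
    """P(player i wins a free-for-all among everyone in `skills`)."""
    return skills[i] / sum(skills)

# A player twice as skilled as each of three rivals:
skills = [2.0, 1.0, 1.0, 1.0]
p = win_prob(skills, 0)  # 2 / 5 = 0.4
```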

You can even model games made up of teams of different sizes playing against each other by building each player’s individual skill into the model. This could be used for team sports such as doubles tennis or track cycling.

$P( \text{ 1-2-5 wins versus 3-4 and 2-5 }) = \cfrac{\lambda_1 \lambda_2 \lambda_5}{\lambda_1 \lambda_2 \lambda_5 + \lambda_3 \lambda_4 + \lambda_2 \lambda_5}$
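The team version only needs each side’s skills multiplied together first. A minimal sketch assuming that product form (the names are my own; `math.prod` needs Python 3.8+):

```python
import math

def team_win_prob(teams, winner):
    """P(team `winner` wins), where a team's strength is the product
    of its members' individual skill parameters.

    `teams` maps a team label to the list of its members' skills.
    """
    strengths = {label: math.prod(skills) for label, skills in teams.items()}
    return strengths[winner] / sum(strengths.values())
```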

Obviously this approach has a fairly large assumption built in: that the quality of a team is simply the combination of its parts. Some sports, such as baseball, might fit this well, since each individual’s contribution to the team is largely independent of their team-mates’. I would argue that more fluid games such as football are better summarised as: you’re only as good as your weakest link. Compare a pair of “average” centre midfielders with a pair made up of a “star” and a “rubbish” centre midfielder – the second pair will probably be worse. Remember that team-work and cohesion play an important part in football.

It is even possible to account for full rankings of more than two players in a game. This is better known as the Plackett-Luce model. If we have three players then the model becomes,

$P(i \text{ comes 1st}, j \text{ comes 2nd}, k \text{ comes 3rd}) = \cfrac{\lambda_i\lambda_j}{(\lambda_i + \lambda_j + \lambda_k)(\lambda_j+\lambda_k)}$

You can think about this with the analogy of picking balls out of a vase. Imagine there are three colours of ball: red, green and blue. Suppose that the proportions of balls in this infinite vase are $n_r, n_g \text{ and } n_b$, which play the role of our skill parameters. Now suppose we want the probability of the balls being drawn in some order of colours, say green-red-blue. Remembering that we redraw if we get a colour we have already seen, we can write the probability of this ordering as,

$P(\text{green, red, blue}) = \cfrac{n_g}{(n_r + n_g + n_b)} \times \cfrac{n_r}{(n_r+n_b)} \times \cfrac{n_b}{n_b}$

$P(\text{green, red, blue}) = \cfrac{n_g n_r}{(n_r + n_g + n_b)(n_r+n_b)}$

This takes the form of the general Plackett-Luce model for comparisons seen above.
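The vase analogy translates directly into code: each draw is a Bradley-Terry-style choice among the players still in the running (a sketch; the function name is my own):

```python
def plackett_luce_prob(skills_in_finish_order):
    """P of observing a complete ranking under the Plackett-Luce model.

    `skills_in_finish_order` lists skill parameters from 1st place to
    last; at each stage the next finisher is chosen from those left.
    """
    prob = 1.0
    remaining = list(skills_in_finish_order)
    while remaining:
        prob *= remaining[0] / sum(remaining)  # next finisher vs the rest
        remaining.pop(0)
    return prob

# Three equally skilled players: each of the 6 orderings has
# probability 1/3 * 1/2 * 1 = 1/6.
p = plackett_luce_prob([1.0, 1.0, 1.0])
```

With two players the loop collapses to a single Bradley-Terry comparison, as you would hope.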