Why Lawyers Need Statistics

I previously talked about Bayes’ theorem and its often misunderstood applications. Normally these mistakes aren’t particularly costly or harmful within the world of statistics, but when they feed into decisions that affect the real world, getting things wrong can be extremely costly.

One place where statistics can be called upon to influence important matters is the court. Throughout the last 50 years there has been an increase in the use of statistics in court matters and it is important that everybody involved understands them and their use. If any or all of the prosecution, defence or jury misinterpret the information given to them then the chances of a miscarriage of justice will greatly increase.

\text{Prob of matching a description} \neq \text{Prob of matching a description and being guilty}

The classic mistake made in the past by many prosecuting teams is the ‘prosecutor’s fallacy’. This is when the prosecution (or occasionally the defence) presents the jury with a probability that has been calculated or interpreted incorrectly, typically by treating the chance of the evidence arising if the defendant is innocent as though it were the chance that the defendant is innocent, and yet manages to convince the jury to accept it as true.
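To see why the two probabilities in the inequality above differ, here is a minimal sketch. Every number in it is hypothetical and chosen only for illustration (the population size and the match rate are assumptions, not figures from any real case).

```python
# Hypothetical numbers, chosen only to illustrate the fallacy.
population = 1_000_000      # assumed number of people who could have done it
match_rate = 1 / 10_000     # assumed chance an innocent person matches the description

# Expected number of innocent people who match the description anyway.
innocent_matches = (population - 1) * match_rate
# The one guilty person matches by definition.
guilty_matches = 1

# Probability that a person who matches the description is actually guilty.
p_guilty_given_match = guilty_matches / (guilty_matches + innocent_matches)

print(f"P(match | innocent) = {match_rate:.4%}")
print(f"P(guilty | match)   = {p_guilty_given_match:.2%}")
```

Under these made-up assumptions a match is very unlikely for any given innocent person, yet a person who matches is still far more likely to be innocent than guilty, because so many innocent people are in the pool.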

[Image: Scales of justice]

The Sally Clark Case

One of the most famous cases of the misuse of statistics was the Sally Clark court case in the late 1990s.

Sally Clark was a solicitor whose first-born child died at just under three months old. Initially people were supportive and sympathetic about the sudden loss of the child, but when her second child died a year later at just two months old, attitudes changed. This time the death was treated as suspicious and her first child’s death was also re-investigated. Eventually Sally Clark was charged with the murder of both her children.

A lack of conclusive forensic or anecdotal evidence meant that the prosecution needed some other way of trying to prove her guilt – so they turned to statistics.

The case centred on the statement given by Sir Roy Meadow, who appeared for the prosecution and estimated that the chance of two siblings both dying of “cot death” was 1 in 73 million. In other words, as the birth rate in the UK was about 650,000 per year, this would be expected to happen only once in every 112 years. This statement was seized on by both the media and the jury, which led to Sally Clark being found guilty. There was only one problem: it was fundamentally wrong.

The statistic of 1 in 73 million came from a detailed study of the deaths of babies between 1993 and 1996. It found that the probability of a randomly chosen baby dying a cot death was 1 in 1300. This probability fell substantially, to around 1 in 8500, once you account for the family being non-smoking, well-off and with a mother over 26.

What Sir Roy Meadow did was to say that if the probability of one child dying from cot death is 1 in 8500, then to find the probability of two children dying of cot death in the same family all we need to do is square 1 in 8500. This leads us to the headline statistic of 1 in 73 million. This would be fine if the deaths were actually independent, but they aren’t.

\cfrac{1}{8500} \times \cfrac{1}{8500} \approx \cfrac{1}{73{,}000{,}000}
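The squaring step is easy to reproduce. The sketch below is only an illustration of the flawed independence calculation, using the rounded figures quoted in this post; the function and variable names are my own.

```python
# Figures quoted in the post (rounded).
p_single = 1 / 8500          # cot-death risk for one child in a low-risk family
births_per_year = 650_000    # approximate UK births per year

def naive_double_probability(p):
    """Probability of two deaths if they are (wrongly) assumed independent."""
    return p * p

p_double_naive = naive_double_probability(p_single)
one_in = 1 / p_double_naive                 # express as "1 in N"
years_between = one_in / births_per_year    # expected years between such events

print(f"1 in {one_in:,.0f}")
print(f"about once every {years_between:.0f} years")
```

With the rounded 1 in 8500 input this comes out slightly below the quoted 73 million and 112 years; the small gap presumably reflects rounding in the quoted per-child figure. Either way, the number is only meaningful if the two deaths really are independent.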

Another study of the same data set by Ray Hill found that two cot deaths in the same family were not independent at all. In fact, if there had been one cot death in the family then the probability of a second cot death was much higher, at about 1 in 100. Using this we find that for the general population the chance of a double cot death is 1/1300 * 1/100 = 1/130,000. If 650,000 babies are born a year, then around 5 families a year should experience a double cot death, which was backed up by the data.

\cfrac{1}{1300} \times \cfrac{1}{100} = \cfrac{1}{130{,}000}
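The same arithmetic with the dependence taken into account can be sketched as follows. The figures are the ones quoted above; the variable names are illustrative, and, like the post, the sketch simply multiplies the annual number of births by the double-death probability.

```python
# Figures quoted in the post.
p_first = 1 / 1300               # cot-death risk for a child in the general population
p_second_given_first = 1 / 100   # risk of a second cot death after a first one
births_per_year = 650_000        # approximate UK births per year

# Chain rule: P(both) = P(first) * P(second | first).
p_double = p_first * p_second_given_first
expected_per_year = births_per_year * p_double

print(f"1 in {1 / p_double:,.0f}")
print(f"expected double cot deaths per year: {expected_per_year:.1f}")
```

The key difference from the squaring approach is that the second factor is a conditional probability rather than a repeat of the first.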

It must be remembered that the point of using statistics in this manner is not to decide whether someone is innocent or guilty, but rather whether there is sufficient evidence to outweigh what the statistics suggest. As a result we need to compare the relative likelihoods of the competing explanations. The correct way to do this mathematically is to compare the two hypotheses through the use of Bayes’ Theorem, as seen below.

P(H|D) = \cfrac{P(D|H)P(H)}{P(D|H)P(H) + P(D|A)P(A)}

  • H is the hypothesis that both of the children died of cot death.
  • A is the alternative hypothesis that both the children did not die of cot death.
  • D is some data: that both children are dead.

Explaining the Probabilities

  • P(H) = 0.00001

The probability that a random pair of siblings die a cot death. This was estimated at the time as around 1 in 100,000.

  • P(A) = 1 – P(H) = 0.99999

The probability that a random pair of siblings do not die a cot death.

  • P(D|H) = 1

Given that both children died of cot death, the probability that they are dead is one.

  • P(D|A) = 0.0000046

The probability that both children are dead given that they did not die of cot death. Since the only alternative considered here is murder, this is in effect the probability that a random pair of siblings will be murdered. This is quite difficult to estimate due to the rarity of double child murders. The Home Office said that around 30 children were known to be murdered by their mother each year, so the probability of a single murder will be about 30/650,000 = 0.000046. However, since double murders are less common, it makes sense that the actual probability would be even smaller, say 10 times smaller, at 0.0000046.

  • P(H|D) = ?

Given the children are dead, what’s the probability that they both died of cot death?
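The inputs listed above can be reproduced with a few lines of arithmetic. This is only a sketch: the factor of 10 for double murders is the guess made in this post, not a measured quantity, and the variable names are my own.

```python
# Figures quoted in the post.
births_per_year = 650_000
maternal_murders_per_year = 30      # Home Office figure quoted in the post
double_murder_discount = 10         # assumption from the post: double murders are 10x rarer

p_h = 1 / 100_000                   # P(H): double cot death, as estimated at the time
p_a = 1 - p_h                       # P(A): anything other than a double cot death
p_d_given_h = 1.0                   # P(D|H): if both died of cot death, both are dead

p_single_murder = maternal_murders_per_year / births_per_year
p_d_given_a = p_single_murder / double_murder_discount   # P(D|A)

print(f"P(single murder) = {p_single_murder:.6f}")
print(f"P(D|A)           = {p_d_given_a:.7f}")
```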

There are obviously lots of other factors that in an ideal world we would like to take into account, such as the income level of the family or the number of interventions by social services. These are excluded for simplicity and because of the lack of available data on their effects.

Then, by a standard application of Bayes’ Theorem, we can write down the equation for P(H|D) and substitute in the values.

P(H|D) = \cfrac{P(D|H)P(H)}{P(D|H)P(H) + P(D|A)P(A)}

P(H|D) = \cfrac{1*0.00001}{1*0.00001 + 0.0000046 *0.99999} > \cfrac{2}{3}

So we get a rough estimate of about 0.68, a little over two thirds, for the probability that both children died of cot death given that both are dead. This means that rather than the statistics saying that Sally Clark needed to prove her innocence, they actually said that the prosecution needed to prove her guilt.
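The calculation can be checked with a short script. The `posterior` function below is just the Bayes formula from this post written out in code; the loop that varies the double-murder factor is my own addition, to show how much the answer depends on the guess of "10 times smaller".

```python
def posterior(p_d_given_h, p_h, p_d_given_a):
    """P(H|D) for two exhaustive hypotheses H and A, with P(A) = 1 - P(H)."""
    p_a = 1 - p_h
    numerator = p_d_given_h * p_h
    return numerator / (numerator + p_d_given_a * p_a)

# Values used in the post.
p_cot_death = posterior(p_d_given_h=1.0, p_h=0.00001, p_d_given_a=0.0000046)
print(f"P(H|D) = {p_cot_death:.3f}")

# Sensitivity check: vary the assumed rarity of double murders
# relative to single murders (the post assumes a factor of 10).
p_single_murder = 30 / 650_000
for factor in (1, 5, 10, 20):
    p = posterior(1.0, 0.00001, p_single_murder / factor)
    print(f"double murders {factor:>2}x rarer -> P(H|D) = {p:.2f}")
```

The result moves noticeably as the assumed factor changes, which is why the figure above is best read as a rough estimate rather than a precise probability.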

See the full article on plus.maths
