What is Bayes’ Theorem?

Bayes’ Theorem is one of the most famous theorems in statistics and probability, and it first appeared around 250 years ago. It allows us to calculate reverse probabilities and use new evidence to update our beliefs. For example, the probability of a hypothesis given a set of evidence can be found from the probability of that evidence given the hypothesis.

To understand Bayes’ Theorem it is important to have a basic understanding of conditional probability. This is the probability of something happening given that some other event has already happened. Some examples of conditional probabilities are given below:

  • Given that Watford scored a goal, what was the probability that Odion Ighalo scored?
  • Given that it rained yesterday, what is the probability that it will rain tomorrow?
  • Given a sports centre has a swimming pool, what is the probability it also has a gym?

Bayes’ Theorem

Bayes’ famous theorem relates the conditional probabilities of two events A and B to each other.

P(A|B) = \cfrac{P(B|A)P(A)}{P(B)}

  • P(A|B) is the probability of event A happening given that event B has happened
  • P(B|A) is the probability of event B happening given that event A has happened
  • P(A) is the probability of event A happening
  • P(B) is the probability of event B happening

Whilst this might appear to be a relatively simple idea, conditional probabilities are often misunderstood. In general P(A|B) and P(B|A) are different: the probability that I have an umbrella when it is raining is not the same as the probability that it is raining when I have an umbrella.
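
As a rough sketch of this asymmetry, the Python snippet below uses invented counts over 100 days. The numbers are illustrative assumptions only, not real data. It computes both conditional probabilities directly from the counts and then checks that Bayes’ Theorem turns one into the other.

```python
# Hypothetical counts over 100 days (invented purely for illustration).
days = 100
rainy_days = 20            # days on which it rained
umbrella_days = 50         # days on which I carried an umbrella
rainy_and_umbrella = 18    # days with both rain and an umbrella

p_rain = rainy_days / days
p_umbrella = umbrella_days / days

# Conditional probabilities computed directly from the counts.
p_umbrella_given_rain = rainy_and_umbrella / rainy_days      # 18 / 20
p_rain_given_umbrella = rainy_and_umbrella / umbrella_days   # 18 / 50

# Bayes' Theorem: P(rain | umbrella) = P(umbrella | rain) P(rain) / P(umbrella)
via_bayes = p_umbrella_given_rain * p_rain / p_umbrella

print(p_umbrella_given_rain)
print(p_rain_given_umbrella)
print(via_bayes)
```

With these made-up counts I nearly always have an umbrella when it rains, yet most of the days on which I carry one are dry, so the two conditional probabilities differ a lot.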

A Medical Example

To illustrate this I will describe the classic example of testing people for a disease. You might find the results quite surprising! Let event A be the event that you have the disease and event B be the event that you test positive for the disease.

Notation Explained

  • P(A) = probability you have the disease
  • P(not A) = probability that you don’t have the disease = 1 – P(A)
  • P(B) = probability that you test positive
  • P(A|B) = probability that you have the disease, given you test positive
  • P(B|A) = probability that you test positive, given you have the disease
  • P(B| not A) = probability that you test positive, given you don’t have the disease

Suppose a medical test detects the disease 95% of the time when you have it, so P(B|A) = 0.95. If we also assume that 5% of healthy people get a positive result, we have P(B| not A) = 0.05. Finally, if the probability of having the disease in the first place is 1 in 1000, we get P(A) = 0.001 and P(not A) = 0.999.

First we need to find the probability of a positive test by conditioning on whether we do or do not have the disease.

P(B) = P(B|A)P(A) + P(B|\text{not} A)P(\text{not} A)

P(B) = 0.95*0.001 + 0.05*0.999 = 0.0509
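
The same step can be sketched in a few lines of Python. The function name `total_probability` and its parameter names are my own choices for illustration; the numbers are the ones assumed in the example above.

```python
def total_probability(p_b_given_a, p_b_given_not_a, p_a):
    """Return P(B) by conditioning on whether A did or did not happen."""
    p_not_a = 1 - p_a
    return p_b_given_a * p_a + p_b_given_not_a * p_not_a


# P(B|A) = 0.95, P(B| not A) = 0.05, P(A) = 0.001
p_positive = total_probability(0.95, 0.05, 0.001)
print(round(p_positive, 4))  # 0.0509
```

Notice that almost all of P(B) comes from the second term: 0.05*0.999 = 0.04995, compared with only 0.95*0.001 = 0.00095 from people who actually have the disease.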

We can then use this probability of a positive test, about 5.1%, as the denominator in Bayes’ Theorem below.

P(A|B) = \cfrac{P(B|A)P(A)}{P(B)} = \cfrac{0.95*0.001}{0.0509} \approx 0.0187

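This result of 0.0187 is 1.87%, so we see that the probability of having the disease given that you’ve tested positive is actually under 2%, much lower than most people’s original guess of 95%! The disease is so rare that the false positives from the large healthy population far outnumber the true positives.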
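
One way to sanity-check this figure is to count people instead of multiplying probabilities. The sketch below assumes an imaginary population of 100,000 (the population size is an arbitrary choice of mine, picked so that every count is a whole number) and applies the same rates as in the example.

```python
# Natural-frequency check, using integer counts to avoid rounding noise.
population = 100_000                   # illustrative population size (assumption)
sick = population // 1000              # P(A) = 1 in 1000  -> 100 people
healthy = population - sick            # 99,900 people

true_positives = sick * 95 // 100      # P(B|A) = 0.95     -> 95 people
false_positives = healthy * 5 // 100   # P(B| not A) = 0.05 -> 4,995 people

all_positives = true_positives + false_positives
p_disease_given_positive = true_positives / all_positives

print(true_positives, false_positives, all_positives)  # 95 4995 5090
print(round(p_disease_given_positive, 4))              # 0.0187
```

Of the 5,090 people who test positive, only 95 actually have the disease, which matches the 0.0187 obtained from Bayes’ Theorem.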

This blog post was mainly based on this recent article on plus.maths. Other plus.maths articles that explain interesting applications of Bayes’ Theorem are:

  • Was Paul the octopus really psychic? link
  • Misuse of statistics in court, link