The role of statistics in understanding the impacts of vaccines and disease
Learning outcomes & key terms
Students will learn how statistics can be used to understand how diseases move through a population.
Specifically, they will learn about:
1. The reproduction number
2. The immunity threshold
3. The vaccination paradox
4. Relative risk
Incidence: the measure of the probability of occurrence of a particular medical condition in a population within a specified period of time.
Reproduction number: a figure expressing the average number of cases of an infectious disease arising by transmission from a single infected individual.
The “immunity threshold” or herd or community threshold refers to the percentage of the population that needs to become immune to an infectious disease so that people without immunity aren’t likely to interact with an infected person and become infected.
Describing how the requirements for life (for example oxygen, nutrients, water and removal of waste) are provided through the coordinated function of body systems such as the respiratory, circulatory, digestive, nervous and excretory systems.
Explaining how body systems work together to maintain a functioning body using models, flow diagrams or simulations.
Investigating the response of the body to changes as a result of the presence of micro-organisms.
Science as a human endeavour
Scientific understanding, including models and theories, is contestable and is refined over time through a process of review by the scientific community.
Advances in scientific understanding often rely on technological advances and are often linked to scientific discoveries.
People use scientific knowledge to evaluate whether they accept claims, explanations or predictions, and advances in science can affect people’s lives, including generating new career opportunities.
Values and needs of contemporary society can influence the focus of scientific research.
Science enquiry skills
Formulate questions or hypotheses that can be investigated scientifically.
Analyse patterns and trends in data, including describing relationships between variables and identifying inconsistencies.
Using statistics to understand how diseases move through the population
Some germs that cause disease only in people usually have to be transmitted from a sick person to another susceptible (not immune) person for the germ to survive
Germs usually make you unwell for days or weeks, during which time you may infect other people. Then you recover and your symptoms, like a cough, go away. You stop spreading infection and you become immune
If each infected person with symptoms passes their germ onto at least one other person, the germ will survive
This relates to what epidemiologists call the reproduction number for the germ
What is “incidence”?
It is the measure of the probability of occurrence of a particular medical condition in a population within a specified period of time
In the 2010s, about 1 in every 100,000 Australians developed meningococcal disease every year, i.e. 1 meningococcal case per 100,000 people per year (in 25 million people, that’s 250 cases every year)
But, in Australia, meningococcal disease is 5 times more common in Aboriginal children so this group deserves improved access to immunisation (see the Australian National Immunisation Program website)
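The arithmetic behind the "1 per 100,000 per year" figure can be sketched in a few lines of Python. This is only an illustration: the function names `incidence_per_100k` and `expected_cases` are made up for this example, and the numbers are the rounded ones quoted above.

```python
def incidence_per_100k(new_cases: int, population: int) -> float:
    """New cases per 100,000 people over the period being measured."""
    return new_cases * 100_000 / population


def expected_cases(rate_per_100k: float, population: int) -> float:
    """Expected number of new cases for a given incidence rate."""
    return rate_per_100k * population / 100_000


# Figures from the text: 1 case per 100,000 people per year, 25 million people.
print(expected_cases(1, 25_000_000))           # 250.0 cases per year
print(incidence_per_100k(250, 25_000_000))     # 1.0 case per 100,000 per year
```

Expressing incidence "per 100,000" makes it easy to compare groups of very different sizes.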
Reproduction number and immunity threshold
If, on average, one sick person causes 10 others to get sick (and the whole population lacks immunity), then the basic reproduction number is 10. Those 10 people will each cause 10 others to fall ill, so 10 × 10 = 100 more people become ill, and now we have an epidemic
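To see how quickly this multiplies, here is a minimal Python sketch. It assumes, as in the example above, that nobody is immune and that every case infects exactly the average number of people; the function name `new_cases_by_generation` is invented for this illustration.

```python
def new_cases_by_generation(reproduction_number: float, generations: int) -> list:
    """Average number of new cases in each generation, starting from one case.

    Assumes nobody is immune and every case infects the same number of people.
    """
    return [reproduction_number ** g for g in range(1, generations + 1)]


# Reproduction number of 10: the outbreak grows tenfold each generation.
print(new_cases_by_generation(10, 3))    # [10, 100, 1000]

# Reproduction number below 1: each generation is smaller and the chain dies out.
print(new_cases_by_generation(0.5, 3))   # [0.5, 0.25, 0.125]
```

The second example shows why the goal is a reproduction number under 1: the number of new cases shrinks with every generation.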
There are some very cool and simple statistics that use the reproduction number to determine the proportion of the human population that needs to be immune to a particular infection, like measles, to stop it spreading
You can be immune because of vaccination or by having recovered from a bout of measles. What proportion needs to be immune to stop a particular infection from spreading?
The aim is to get to the stage where the reproduction number is under 1. That means that not every case causes another case and the chain of infections eventually stops
Immunisation is the safest way to achieve this
Here’s an example
Let’s say a case of measles is caught overseas and brought back on a plane by an ill child to Australia
If the reproduction number of that germ is 10, there could soon be an epidemic
We need to get the reproduction number down to less than one, where not every case causes another case and the chain of infection stops, causing the disease to burn out
Well, if 90% (9 in 10) of the population are vaccinated against measles and immune, only 1 person in 10 on average can catch measles from the imported case
This is called the “immunity threshold” – scientists use the immunity threshold to determine if herd immunity can be achieved
There’s a simple mathematical equation to find the immunity threshold:
Immunity threshold = 1 minus (1 divided by ‘reproduction number’)
= 1 minus (1/10)
= 1 minus 0.1
= 0.9, otherwise known as 90%
So having at least 90% immune can stop an infection from spreading!
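The same calculation can be written as a short Python sketch. The function names are invented for this illustration, and `effective_reproduction_number` relies on a simplifying assumption that is not spelled out in the lesson: immune people are spread evenly through the population, so each case infects proportionally fewer people.

```python
def immunity_threshold(reproduction_number: float) -> float:
    """Fraction of the population that must be immune: 1 - 1/R."""
    if reproduction_number <= 1:
        # Assumption: a germ with R of 1 or less fades out without extra immunity.
        return 0.0
    return 1 - 1 / reproduction_number


def effective_reproduction_number(reproduction_number: float,
                                  immune_fraction: float) -> float:
    """Average new cases per case once some of the population is immune."""
    return reproduction_number * (1 - immune_fraction)


threshold = immunity_threshold(10)
print(f"{threshold:.0%}")                                   # 90%

# At exactly 90% immune each case causes about 1 new case;
# above the threshold it causes fewer than 1 and the chain stops.
print(f"{effective_reproduction_number(10, 0.90):.2f}")     # 1.00
print(f"{effective_reproduction_number(10, 0.95):.2f}")     # 0.50
```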
Why not work out the immunity threshold for a different virus?
What percentage (threshold) of people needs to be immune to stop the spread of a virus that has a reproduction number of 2?
Answer: 1 minus 1/2 = 0.5 (50%)
Different thresholds for different diseases
Different vaccines for different diseases have different levels of effectiveness. For example, measles and the Omicron variant of COVID-19 can both spread easily, each with a basic reproduction number of about 10, but the vaccines for each of these diseases work differently
Vaccination can control and eliminate measles infection and disease in a population, whereas COVID-19 is different: the vaccine can lessen the chance of someone getting very sick, but some people who are vaccinated can still get a very mild infection (called a breakthrough infection)
A key statistic for COVID-19 is not the proportion of mild breakthrough infections affecting vaccinated people, but the level of protection that vaccines offer against severe disease and death
The vaccines against COVID-19 may only prevent 50% of infections, but they prevent 90% of serious disease. The immune response to the vaccines is strong, but they don’t completely prevent infection like the measles vaccine does. Vaccinated people may get a mild infection and may transmit it, but less often than unvaccinated people do
So, herd immunity to COVID-19 is considered by many experts to be a real challenge and perhaps not achievable – even with 100% vaccination uptake, some people can still transmit the virus – vaccines do not achieve “sterilising immunity”
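A rough Python sketch shows why this is so hard. It uses a common simplification that is not part of the lesson itself: the share of people who must be vaccinated is taken to be the immunity threshold divided by the vaccine's effectiveness against infection. The function name `required_coverage` is invented, and the model ignores immunity from past infection and uneven mixing.

```python
from typing import Optional


def required_coverage(reproduction_number: float,
                      effectiveness: float) -> Optional[float]:
    """Share of people who would need vaccinating to bring the effective
    reproduction number down to 1, or None if even 100% is not enough.

    Simplifying assumptions: everyone mixes evenly, immunity comes only
    from vaccination, and `effectiveness` is the fraction of infections
    the vaccine prevents.
    """
    if not 0 < effectiveness <= 1:
        raise ValueError("effectiveness must be above 0 and at most 1")
    threshold = max(0.0, 1 - 1 / reproduction_number)   # immunity threshold
    coverage = threshold / effectiveness
    return coverage if coverage <= 1 else None


# Reproduction number of about 10, vaccine preventing 50% of infections:
# 0.9 / 0.5 = 1.8, i.e. more than 100% of people, so the result is None.
print(required_coverage(10, 0.5))   # None

# The same germ with a hypothetical vaccine that prevents every infection.
print(required_coverage(10, 1.0))   # 0.9
```

Under these assumptions, a vaccine that prevents only half of infections cannot reach the 90% threshold even if everyone is vaccinated, which matches the experts' concern described above.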
There are members of our community, like babies, the elderly and cancer sufferers, who can’t be vaccinated and they rely on the protection of vaccinated people
The first version of SARS (a related coronavirus) was discovered nearly 20 years ago (SARS-CoV-1) and research on vaccines and antibodies has been going for years. COVID-19 is caused by SARS-CoV-2. The vaccine developed in 2020 was based on decades of research, not an overnight exercise
Without vaccination, COVID-19 would have caused far more damage to society. It is estimated that millions of lives have been saved by COVID-19 vaccination. Community immunity has been produced because vaccinated people are less likely to transmit infection. The AstraZeneca vaccine has been used in 175 countries and in billions of people. The Pfizer vaccine and Chinese vaccines are the next most commonly used, in billions more
The Black Death killed 25 million people over 4 years in Europe during the 1300s
That’s similar to the entire population of Australia. There was no vaccine, only public health measures like quarantine
Vaccination saves lives and creates the safest path to possible herd immunity
What is the vaccination paradox?
If you observe, in a particular week, that the number of vaccinated people getting a serious infection is the same as the number of unvaccinated people getting it, does this mean that vaccination is not working?
To answer this question, you need some facts:
- When Australia went through the COVID-19 pandemic, the original aim was to get 80% of people vaccinated
- Let’s say that the vaccine was 75% effective at protecting against serious disease (hospitalisation and death)
Let’s look at the numbers and proportions:
- Take 1,000 people of which 80% are vaccinated – 800 are vaccinated but 200 are not
- Of the 800, 75% are vaccine protected, so 600 can’t catch the disease but 200 can
- Of the 200 unvaccinated people, none are protected, so 200 can catch disease
So, if there was a drastic exposure week where everyone was exposed to infection, you could see 200 vaccinated people with disease and 200 unvaccinated people with disease
It could look like vaccination wasn’t working – with the same number of cases in each group
BUT the vaccine is actually working! It is 75% effective and is preventing 3 out of 4 infections
It’s just that when a large majority of people are vaccinated, like 80%, the paradox arises where the number of disease cases may appear similar in both groups
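The numbers in this example can be checked with a small Python sketch. It makes the same worst-case assumption as above, that everyone is exposed and every unprotected person gets the disease; the function name `cases_by_group` is invented for this illustration.

```python
def cases_by_group(population: int, coverage: float, effectiveness: float):
    """Return (vaccinated, unvaccinated, vaccinated cases, unvaccinated cases),
    assuming everyone is exposed and every unprotected person gets the disease."""
    vaccinated = population * coverage
    unvaccinated = population - vaccinated
    vaccinated_cases = vaccinated * (1 - effectiveness)   # the vaccine fails for these
    unvaccinated_cases = unvaccinated                     # nobody here is protected
    return vaccinated, unvaccinated, vaccinated_cases, unvaccinated_cases


vaccinated, unvaccinated, vax_cases, unvax_cases = cases_by_group(1_000, 0.80, 0.75)

# Same number of cases in each group...
print(vax_cases, unvax_cases)           # 200.0 200.0

# ...but a very different risk for each person in the group.
print(vax_cases / vaccinated)           # 0.25 (1 in 4 vaccinated people)
print(unvax_cases / unvaccinated)       # 1.0  (every unvaccinated person)
```

Comparing the risk within each group, rather than the raw number of cases, is what resolves the paradox.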
Here’s another example for you to consider, where hospitalisations due to COVID-19 may appear to be more common in the vaccinated. Why is that not the case?
We know that some behaviours can increase our risk
For example, not wearing sunscreen increases our risk of skin cancer, smoking increases our risk of lung cancer, and poor diet and not exercising increases our risk of health problems like diabetes
Have you ever wondered how that risk is calculated?
Epidemiologists call this relative risk: the risk of an outcome in one group of people exposed to something, compared to another group of people who were not exposed
Let’s use Cholera as an example
Cholera is an acute diarrhoeal disease caused by infection of the gut with the bacterium Vibrio cholerae
A person is usually infected by ingesting water or food contaminated with faeces
The disease is common in parts of the world with poor sanitation, and can cause severe dehydration from diarrhoea and vomiting
There are vaccines available these days to protect against it, but this wasn’t always the case
In 1854, there was an outbreak of cholera in Soho, London. A man named John Snow methodically investigated the cases and concluded that the water pump in Broad Street was the source of the contamination
He further investigated the two water supply companies at the time: The Lambeth company which sourced water from a clean part of the Thames river, and the Southwark and Vauxhall company, which sourced water from more polluted parts of the river
Let’s say the Southwark & Vauxhall company provided water to pumps servicing 40,046 households, of which 286 households reported a cholera death (i.e. the incidence of cholera deaths in this group of households was 286 out of 40,046)
The Lambeth company supplied water to pumps servicing 26,107 households of which 14 households reported a cholera death (i.e. the incidence of cholera deaths in this group was 14 out of 26,107)
We know the households receiving water via the Southwark & Vauxhall company were getting water from the polluted areas of the Thames river, so we’ll call them “exposed” to the pollution
The households receiving water from the Lambeth company were receiving cleaner water, so we’ll call them “unexposed” to the polluted water
If we put all this into a table:
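|Cholera deaths|Southwark & Vauxhall (exposed)|Lambeth (unexposed)|TOTAL|
|Households with a cholera death|286|14|300|
|Households without a cholera death|39,760|26,093|65,853|
|Total households|40,046|26,107|66,153|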
Epidemiologists call this a 2 by 2 table
Relative Risk = Incidence in exposed ÷ Incidence in unexposed
Relative risk of cholera deaths in households supplied by Southwark & Vauxhall (compared to households supplied by Lambeth)
= (286÷40,046) ÷ (14÷26,107)
= 0.00714 ÷ 0.000536
≈ 13.3
That is, people in households with water supplied by Southwark & Vauxhall were about 13 times more likely to die of cholera than people in households supplied by Lambeth
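The relative risk formula is easy to turn into a reusable Python sketch. The function name `relative_risk` and its parameter names are invented for this illustration; the numbers are the household figures given above.

```python
def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return incidence_exposed / incidence_unexposed


# Southwark & Vauxhall (exposed): 286 of 40,046 households reported a cholera death.
# Lambeth (unexposed): 14 of 26,107 households reported a cholera death.
rr = relative_risk(286, 40_046, 14, 26_107)
print(round(rr, 1))   # 13.3
```

A relative risk of 1 would mean the two groups had the same risk; a value well above 1, as here, points to the exposure being linked to the outcome.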
1: Define a “paradox”
The phrase “Peto’s paradox” is named after a famous statistician, Richard Peto
What is Peto’s Paradox?
(By the way, Prof Peto helped Prof Booy with his first statistical project 30 years ago in Oxford, England)
2: Let’s explore another issue regarding relative risk
You are an epidemiologist living in London in the 1950s. You have noticed an increase in lung cancer patients presenting to hospital in London
There have also been recent weather events that resulted in a heavy smog (you can read about the Great Smog of London here)
On your way to work you notice advertising in which doctors and dentists are endorsing new brands of cigarettes (almost everyone smokes these days). You begin to wonder whether the pollution is causing the increase in lung cancer deaths, or whether it could be smoking (or both)
(Of course we now know smoking causes cancer and is extremely unhealthy!)
You undertake a study to investigate lung cancer (Professor Richard Peto has also been a key researcher in this area)
You send a questionnaire to all the lung cancer patients in the hospitals in the city, asking questions about where they usually live (to find if they are exposed to pollution through living in the city or not), and their smoking habits
You send the same questionnaire to a control group of non-cancer patients who were matched to the age and gender of the cancer group
The results of your study are in the 2×2 table below:
|Disease group (all living in the city and exposed to smog)|Number of smokers (exposed)|Number of non-smokers (not exposed)|
|Lung cancer patients|647|2|
|Control patients (no lung cancer)|322|27|
|Total|969|29|
Calculate the relative risk of lung cancer for smokers, compared to non-smokers
Relative Risk = Incidence of lung cancer in exposed ÷ Incidence of lung cancer in unexposed
Answer: (647÷969) ÷ (2÷29) = 0.6677 ÷ 0.0690 ≈ 9.7, i.e. smokers were about 10 times more likely to get lung cancer than non-smokers
In this lesson we have learned about the role of statistics in immunisation
- The reproduction number
- The immunity threshold
- The vaccination paradox
- Relative risk
1) Incidence is
a) A measure of how many accidents there are in a population over a specified period of time
b) The probability of the occurrence of a particular medical condition in a population over a specified period of time
c) The probability of the occurrence of a particular medical condition in a population
d) A measure of how many people are immune in the population
2) The reproduction number is
a) The number of people a person infected with a disease will infect in a susceptible population
b) The number of babies born every year
c) The number of people who are vaccinated
d) The number of people who are infected every year
3) If vaccinated people are getting sick and going to hospital with the disease they’re vaccinated against,
a) It means the vaccine has stopped working
b) It means the vaccine is working, it’s just that there’s always a proportion of people for whom the vaccine doesn’t quite work. When most of a population is vaccinated you will see more people for whom the vaccine didn’t quite work than you would when most people are unvaccinated.
c) It means there are lots of unvaccinated people
d) It means there are lots of people being exposed to the disease
4) Relative risk is
a) The risk of an outcome in a population
b) The risk of an outcome in one group of people exposed to something, relative to another group of people who were not exposed.
c) The risk of an outcome
d) The risk of an exposure in one group of people, relative to another group of people who were not exposed