A statistician’s guide to coronavirus numbers
The Royal Statistical Society’s Statistical Ambassadors have collated an essential guide for understanding statistics about Covid-19. Here, they list definitions, things to look out for, and what you should do about the numbers you are seeing.
During this Covid-19 pandemic, you will hear or read about many different numbers. The Royal Statistical Society exists to help the public better understand statistics. We have prepared this short guide to help you at this difficult and uncertain time.
- The number of confirmed cases will be less than the number of actual cases.
- Comparisons of case and death numbers between countries may not be meaningful.
- Models produce estimates with plausible ranges. These models can help us understand the likely effects of policies.
Words you may see and hear
A Covid-19 confirmed case means a person with a positive test result for the virus. A confirmed case is active if the person is still infected: they have not recovered or died. A Covid-19 death means the death of a person who was a confirmed case.
The case fatality rate is the number of deaths divided by the number of confirmed cases. This is also called the case fatality ratio.
The transmission rate is the expected number of direct infections from one case in one unit of time (such as a day). The basic reproductive number is the expected number of direct infections from one case over its whole infectious period.
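To show how the two definitions relate, here is a minimal Python sketch. It assumes the transmission rate stays constant for the whole infectious period, which is a simplification. The function name and the values are made up for illustration.

```python
def basic_reproductive_number(transmission_rate_per_day, infectious_days):
    """Expected direct infections from one case over its infectious period.

    Assumes the transmission rate is the same on every infectious day.
    """
    return transmission_rate_per_day * infectious_days


# Hypothetical values, not estimates for Covid-19.
r0 = basic_reproductive_number(transmission_rate_per_day=0.5, infectious_days=5)
print(f"Basic reproductive number: {r0}")
```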
The crude mortality rate is the number of deaths as a proportion of the whole population. Researchers calculate this rate for different regions and countries.
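The case fatality rate and the crude mortality rate use the same count of deaths but different denominators. The Python sketch below shows the difference. All counts are invented for illustration and the function names are our own.

```python
def case_fatality_rate(deaths, confirmed_cases):
    """Deaths as a proportion of confirmed cases."""
    return deaths / confirmed_cases


def crude_mortality_rate(deaths, population):
    """Deaths as a proportion of the whole population."""
    return deaths / population


# Hypothetical counts for one region.
deaths = 50
confirmed_cases = 2_000
population = 1_000_000

cfr = case_fatality_rate(deaths, confirmed_cases)
cmr = crude_mortality_rate(deaths, population)

print(f"Case fatality rate: {cfr:.2%}")
print(f"Crude mortality rate: {cmr * 100_000:.1f} deaths per 100,000 people")
```

The same 50 deaths give very different figures, so it matters which rate a report is quoting.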
What you need to consider
The number of confirmed cases will be less than the actual number of cases. Some infected people will not experience symptoms. Having symptoms consistent with Covid-19 is not enough for a confirmed diagnosis. People with such symptoms may not get tested for the virus.
Testing is not perfect. Countries differ in their rules around who gets tested for Covid-19, and which tests they use. Sometimes, a test will say an infected person does not have the virus. Testing rules, capacity and quality all affect numbers of confirmed cases.
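A rough Python sketch shows how testing rules and test quality can hold confirmed cases below actual cases. The share of infected people who get tested and the chance that a test detects an infection are both invented values. The sketch ignores tests that wrongly report an infection.

```python
def expected_confirmed_cases(actual_cases, share_tested, test_sensitivity):
    """Expected number of infected people who receive a positive test.

    share_tested: proportion of infected people who are tested.
    test_sensitivity: chance that a test detects an infected person.
    """
    return actual_cases * share_tested * test_sensitivity


# Hypothetical values, for illustration only.
actual_cases = 1_000
confirmed = expected_confirmed_cases(
    actual_cases, share_tested=0.6, test_sensitivity=0.8
)
print(f"Actual cases: {actual_cases}, expected confirmed cases: {confirmed:.0f}")
```

Changing either input changes the confirmed count, even though the number of actual cases stays the same.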
Comparisons of confirmed cases between countries are challenging. Different countries have different demography, health policy, social structures, and cultures. Countries have different testing regimes, which can change over time. Countries may also be at different phases of the epidemic. Disparities between countries in cases and deaths may partly result from these differences.
Treat the case fatality rate with caution. There is uncertainty about the number of cases and deaths. Mild cases and cases without symptoms can go undetected. New cases may not reach recovery or death for several days or weeks. Healthcare systems could record a death from Covid-19 as pneumonia or another cause. Future deaths of those already infected are not included in the current calculation.
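The Python sketch below illustrates two of these effects with invented numbers. Undetected cases make the current figure higher than the rate among everyone infected. Deaths that have not yet happened make it lower than the eventual figure.

```python
# Hypothetical counts, for illustration only.
confirmed_cases = 2_000
deaths_so_far = 50

# The figure as it would be reported today.
current_cfr = deaths_so_far / confirmed_cases

# Suppose many mild cases went undetected (an assumed total).
actual_cases = 10_000
rate_among_all_infected = deaths_so_far / actual_cases

# Suppose some people who are already confirmed cases die later (assumed count).
later_deaths = 30
eventual_cfr = (deaths_so_far + later_deaths) / confirmed_cases

print(f"Current case fatality rate: {current_cfr:.1%}")
print(f"Rate among all infected people: {rate_among_all_infected:.1%}")
print(f"Case fatality rate once later deaths are counted: {eventual_cfr:.1%}")
```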
Transmission rates describe averages. The transmission rate measures how fast a disease spreads. Some people may infect more people than average, and some may infect fewer. This depends on behaviour, contact frequency, biology, chance, and other factors.
Exponential growth cannot continue forever. As an example, one person has the virus. They pass it to three people, who between them pass it to nine other people. Now, there are 13 cases. This is exponential growth in cases. Exponential growth is a phase in an epidemic. Continuing this simple trend is inappropriate for long-term forecasting. As more people recover or die, fewer people can get the virus. At some point, the number of new cases must start to decrease.
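The example can be sketched in Python. The code reproduces the 13 cases, then continues the same trend to show why it cannot hold for long. The population size is an invented figure.

```python
def cumulative_cases(generations, infections_per_case=3):
    """Total cases after the first case plus `generations` rounds of spread."""
    new_cases = 1
    total = 1
    for _ in range(generations):
        new_cases *= infections_per_case
        total += new_cases
    return total


print(cumulative_cases(2))  # 1 + 3 + 9 = 13

# Continue the trend until it predicts more cases than there are people.
population = 1_000_000  # hypothetical population size
generations = 0
while cumulative_cases(generations) <= population:
    generations += 1

print(f"The simple trend exceeds the population after {generations} rounds of spread")
```

No epidemic can infect more people than exist, so the trend must bend before that point.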
Models transform input values into estimates. To use a simple example, a model takes a transmission rate and produces an estimate of total deaths. A scientist may update the model to reflect new public health advice to stay at home. They do this by lowering the transmission rate in their model. That lower transmission rate leads to a lower death estimate from the model. The two transmission rates represent two different scenarios.
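As a minimal sketch of this idea, the Python code below uses a very simple day-by-day epidemic model. It is not any model used by scientists advising on Covid-19. The function name and every parameter value, including the population size, the infectious period and the proportion of infected people who die, are assumptions chosen for illustration.

```python
def estimate_total_deaths(transmission_rate, days=365, population=1_000_000,
                          initial_infected=10, infectious_days=5,
                          fatality_proportion=0.01):
    """Estimate total deaths from a simple day-by-day epidemic model."""
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    removed = 0.0  # people who have recovered or died

    for _ in range(days):
        # New infections shrink as fewer people remain who can get the virus.
        new_infections = transmission_rate * infected * susceptible / population
        new_infections = min(new_infections, susceptible)
        new_removals = infected / infectious_days

        susceptible -= new_infections
        infected += new_infections - new_removals
        removed += new_removals

    return fatality_proportion * removed


# Two scenarios: before and after advice to stay at home (hypothetical rates).
deaths_baseline = estimate_total_deaths(transmission_rate=0.5)
deaths_stay_home = estimate_total_deaths(transmission_rate=0.3)

print(f"Baseline scenario: about {deaths_baseline:,.0f} deaths")
print(f"Stay-at-home scenario: about {deaths_stay_home:,.0f} deaths")
```

The only input that changes between the two runs is the transmission rate. The gap between the two outputs is the model's estimate of the policy's effect under these assumptions.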
Different models have different purposes. Following new data and discussions, scientists update their assumptions and models. Models which produce different estimates can be consistent with each other.
Modelling involves many layers of uncertainty. There is uncertainty about how fast the virus spreads, and how many people recover. Uncertainty flows through these models. As scientists observe more information, estimates become more precise.
We should focus on plausible ranges of values, rather than a single number.
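One common way to produce a plausible range is to run a calculation many times with inputs drawn from their uncertain ranges. The Python sketch below does this for a deliberately simple calculation. The input ranges, the number of runs and the choice of a 5% to 95% range are all assumptions for illustration.

```python
import random
import statistics

rng = random.Random(2020)  # fixed seed so the sketch gives the same result each run

estimates = []
for _ in range(10_000):
    # Hypothetical uncertain inputs, each drawn from an assumed range.
    infections = rng.uniform(20_000, 60_000)
    fatality_proportion = rng.uniform(0.005, 0.015)
    estimates.append(infections * fatality_proportion)

# Cut the sorted estimates into 20 equal groups: 5%, 10%, ..., 95%.
cut_points = statistics.quantiles(estimates, n=20)
low, middle, high = cut_points[0], cut_points[9], cut_points[18]

print(f"Central estimate: about {middle:,.0f} deaths")
print(f"Plausible range: about {low:,.0f} to {high:,.0f} deaths")
```

Reporting the range alongside the central estimate shows how much the answer depends on what is still unknown.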
What we need you to do
Trust in public statistics is vital. Misinformation can damage people’s health.
- Please check your sources before sharing stats.
- If a number does not look correct, see if a trustworthy organisation has reported that number.
- One statistic may not tell the whole story. News items and social media posts could present numbers out of context.
- Graphs can mislead. Check for clear labels and a note of where the data comes from. Graphs should represent the numbers in a proportionate way.
Data builds evidence. Evidence informs decisions. Accuracy matters: in print, on television and radio, and online. Share statistics, not misinformation.
On 15 April 2020, we updated the article to correct the definitions of transmission rate and basic reproductive number.