
What is a Confidence Interval?

A quick online search for “confidence interval” returns this definition from the Oxford Languages dictionary:

con·fi·dence in·ter·val
/ˈkänfəd(ə)ns ˈin(t)ərvəl/

noun

STATISTICS
  1. a range of values so defined that there is a specified probability that the value of a parameter lies within it.

The search also brings up the following formula:

[latex]CI = \bar{x} \pm z\frac{s}{\sqrt{n}}[/latex]
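
As a quick illustration with made-up numbers (not taken from any of the studies cited in this chapter), suppose a sample of n = 100 has a mean of 50 and a standard deviation of 10, and we use z = 1.96, the critical value commonly paired with a 95% confidence level. Plugging these into the formula gives:

[latex]CI = 50 \pm 1.96\cdot\frac{10}{\sqrt{100}} = 50 \pm 1.96[/latex]

so the interval runs from 48.04 to 51.96.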

The Basic Idea

Confidence intervals show up in many areas of everyday life, most commonly in the news and very heavily in politics. A confidence interval is a range of values, built around a statistic computed from a sample, that estimates the “true” value of a population parameter. The examples that follow show how confidence intervals appear in journals and other publications, and how those results are presented to the public.

Some Examples of Confidence Intervals

Example 1: Asking for Identification and Retail Tobacco Sales to Minors

Stores asked minors for identification in 79.6% (95% confidence interval: 79.3%–80.8%) of compliance checks (N = 17 276). Violations after identification requests constituted 22.8% (95% confidence interval: 20.0%–25.6%; interstate range, 1.7%–66.2%) of all violations and were nearly 3 times as likely when minors were required to carry identification in compliance checks.

Source: https://pediatrics.aappublications.org/content/pediatrics/early/2020/04/21/peds.2019-3253.full.pdf

Example 2: Estimation of incubation period and serial interval of COVID-19: analysis of 178 cases and 131 transmission chains in Hubei province, China

Our estimated median incubation period of COVID-19 is 5.4 days (bootstrapped 95% confidence interval (CI) 4.8–6.0), and the 2.5th and 97.5th percentiles are 1 and 15 days, respectively…Ninety-five per cent of symptomatic cases showed symptoms by 13.7 days (95% CI 12.5–14.9).

Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7324649/

Example 3: Long-term blood pressure variation and risk of dementia

After adjustment for age, sex and other factors that could affect the findings, at 15 years people in the highest quintile, who exhibited an increase in systolic blood pressure, had a hazard ratio of 3.31 (95% Confidence Interval 2.11-5.18) for risk of dementia as compared with those in the quintile with the least change in blood pressure.

Source: https://www.sciencedaily.com/releases/2019/11/191113095243.htm

Example 4: Pfizer-BioNTech COVID-19 Vaccine, Grading of Recommendations, Assessment, Development, and Evaluation

Compared with no vaccination, vaccination with Pfizer-BioNTech COVID-19 vaccine was associated with a decreased risk of symptomatic laboratory-confirmed COVID-19 (RR 0.07, 95% CI 0.05–0.13; evidence type 2), hospitalization (RR 0.06, 95% CI 0.03–0.12; evidence type 2), and death due to COVID-19 (RR 0.04, 95% CI 0.02–0.09; evidence type 2).

Source: https://www.cdc.gov/vaccines/acip/recs/grade/covid-19-pfizer-biontech-vaccine.html

Example 5: Most U.S. Latinos say global climate change and other environmental issues impact their local communities

About seven-in-ten Hispanic adults (71%) say climate change is affecting their local community at least some, a higher share than among non-Hispanic adults (54%), the April survey of U.S. adults found. The margin of sampling error for the full sample of 13,749 respondents is plus or minus 1.4 percentage points.

Source: https://www.pewresearch.org/fact-tank/2021/10/04/most-u-s-latinos-say-global-climate-change-and-other-environmental-issues-impact-their-local-communities/

Example 6: Diabetes Mellitus and COVID-19: Associations and Possible Mechanisms

… most prevalent comorbidities in people with COVID-19 were hypertension (17 ± 7%, 95% confidence interval (CI) 14–22%) and diabetes (8 ± 6%, 95% CI 6–11%), followed by cardiovascular diseases (5 ± 4%, 95% CI 4–7%) and respiratory system disease (2 ± 0%, 95% CI 1–3%) [4]… diabetes mellitus is thought to increase the risk of COVID-19 infection [5].

Source: https://www.hindawi.com/journals/ije/2021/7394378/

But what does it all mean?

That’s what comes next. In this chapter we will look at some of the terminology, formulas, and concepts related to confidence intervals.

Key Terminology and Formulas

Point Estimate: A single value (statistic) used to estimate a population parameter. We will use [latex]\hat{p}[/latex] for proportions and [latex]\bar{x}[/latex] for the mean.

The point estimate is also known as a statistic or estimate, and can be a value such as a mean, proportion (%), standard deviation, or variance. We will be focusing primarily on the mean and proportion.

Sample Size: The number of items or individuals used in the sample (for example, the number of people who were surveyed, or the number of trees used in a study).

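Confidence Level: The probability that the method used to build the CI produces an interval that actually contains the population parameter. In other words, if we took many samples and built a CI from each one, the confidence level is the share of those intervals we would expect to contain the parameter. This is also known as the degree of confidence or the confidence coefficient. The confidence level will almost always be 90%, 95%, or 99% (written as decimals: 0.90, 0.95, or 0.99). I like to think of this as the “level of sureness,” or how sure we are that our range of values actually contains the “real” parameter.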
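
One way to get a feel for the confidence level is to simulate it. The sketch below is only an illustration under assumed conditions: the population (normal, mean 100, standard deviation 15), the sample size, the number of trials, and the function name coverage are all invented for the example. It draws many samples, builds a 95% interval from each with the z-based formula, and counts how often the interval contains the true mean.

```python
import random
from statistics import mean, stdev

def coverage(true_mean=100.0, true_sd=15.0, n=50, trials=2000, z=1.96, seed=1):
    """Fraction of simulated intervals that contain the true mean.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample from the assumed normal population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = mean(sample)
        # Margin of error: z * s / sqrt(n), using the sample standard deviation.
        margin = z * stdev(sample) / n ** 0.5
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials

print(f"Share of intervals containing the true mean: {coverage():.3f}")
```

The printed share should land close to 0.95, though not exactly on it: the simulation is random, and using a z critical value with an estimated standard deviation makes the intervals very slightly too narrow.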

Significance Level: For now, we will define this as what is “left over” from the confidence level, and we will use the symbol alpha, α. For example, a 95% confidence level leaves α = 1 – 0.95 = 0.05. The symbol looks like a lowercase “a,” or a drawing of a little fish. Equivalently, the confidence level is often written as 1 – α.

Critical Value: This is a score (a z-score, for now) based on alpha. It is the boundary that separates sample statistics that are very high or very low from those that are not. The symbol for our critical value will take one of two forms, depending on the situation. The first form will be [latex]z_{\frac{\alpha}{2}}[/latex], when we assume a normal distribution. The second form will be [latex]t_{\frac{\alpha}{2}}[/latex], when we deal with the t-distribution. For now, we will use the three most common critical values according to this chart:

Confidence Level, 1 – α | Critical Value, [latex]z_{\frac{\alpha}{2}}[/latex]
90% or 0.90 | 1.645
95% or 0.95 | 1.96
99% or 0.99 | 2.575
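
For readers who want to see where these numbers come from, here is a minimal Python sketch that computes the z critical value from a confidence level using the standard library's NormalDist class. The helper name critical_z is made up for this example. Note that the computed value for 99% rounds to 2.576; the 2.575 in the chart is a common textbook rounding.

```python
from statistics import NormalDist

def critical_z(confidence_level: float) -> float:
    """Return z_{alpha/2} for a two-sided interval (hypothetical helper)."""
    alpha = 1 - confidence_level  # significance level: what is "left over"
    # Half of alpha sits in each tail, so look up the 1 - alpha/2 quantile
    # of the standard normal distribution.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_z(level):.3f}")
# prints:
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```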

Confidence Interval: A range of values calculated around a statistic, abbreviated as CI. The CI is meant to estimate the “true value” of the population parameter (proportion, mean, etc.). The confidence interval is generally written in one of three forms:

point estimate [latex]\pm[/latex] margin of error

point estimate – margin of error [latex]\le[/latex] parameter [latex]\le[/latex] point estimate + margin of error

(point estimate – margin of error, point estimate + margin of error)

These will get more specific as we venture into the different types of confidence intervals.

Margin of Error: This is the “plus or minus” value that gets added to and subtracted from the point estimate in order to obtain the CI. The margin of error is also called the maximum error of the estimate, usually denoted by the symbol E.
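
As a rough sketch of how these pieces fit together, the Python below computes a point estimate and a margin of error for a mean, using the z-based formula quoted at the start of the chapter, and then prints the interval in three common forms. The data values and the function name mean_confidence_interval are invented for illustration. The default z of 1.96 assumes a 95% confidence level; with a sample this small, a t critical value would normally be the better choice.

```python
from statistics import mean, stdev

def mean_confidence_interval(sample, z=1.96):
    """Return (point estimate, margin of error) for a mean.

    Hypothetical helper: uses E = z * s / sqrt(n) with the sample
    standard deviation s, and assumes a z critical value is appropriate.
    """
    x_bar = mean(sample)                              # point estimate
    margin = z * stdev(sample) / len(sample) ** 0.5   # margin of error, E
    return x_bar, margin

# Made-up measurements, for illustration only.
data = [12.1, 11.8, 12.6, 12.0, 11.5, 12.3, 12.9, 11.7, 12.2, 12.4]
x_bar, E = mean_confidence_interval(data)

print(f"{x_bar:.2f} ± {E:.2f}")                              # estimate ± margin
print(f"{x_bar - E:.2f} <= parameter <= {x_bar + E:.2f}")    # inequality form
print(f"({x_bar - E:.2f}, {x_bar + E:.2f})")                 # interval form
```

Returning the point estimate and the margin of error separately, rather than the two endpoints, keeps all three written forms easy to produce from the same pair of numbers.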

License

Basic Statistics Copyright © by Allyn Leon is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.