
Estimating a population proportion (CI for 1 proportion)

Now we get to the good stuff! We will need to know how to compute the margin of error, and then how to use that along with other pieces of information to compute or construct the confidence interval.

The Margin of Error for 1 Proportion:

[latex]E = z_{\frac{\alpha}{2}} \times \sqrt{\frac{\hat{p}\hat{q}}{n}}[/latex]

The Confidence Interval for 1 Proportion:

Basic Form: [latex]\hat{p} \pm E[/latex]

Inequalities: [latex]\hat{p} - E \le p \le \hat{p} + E[/latex]

Interval Notation: [latex](\hat{p} - E, \hat{p} + E)[/latex]

What the different symbols mean:

[latex]x[/latex] is the number of successes or observations (not always needed)

[latex]n[/latex] is the sample size (number of people, items, etc… in the study)

[latex]p[/latex] is the population proportion

[latex]\hat{p}[/latex] is the sample proportion (or percentage), given by [latex]\hat{p} = \frac{x}{n}[/latex]

[latex]\hat{q}[/latex] is what is left over from the sample proportion (or percentage), given by [latex]\hat{q} = 1 - \hat{p}[/latex]

[latex]E[/latex] is the margin of error

[latex]z_{\frac{\alpha}{2}}[/latex] is the critical value; we use the most common ones here based on the given confidence level

Confidence Level, 1 – α | Critical Value, [latex]z_{\frac{\alpha}{2}}[/latex]
90% or 0.90 | 1.645
95% or 0.95 | 1.96
99% or 0.99 | 2.575
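
The table values are rounded critical values of the standard normal distribution. As a rough check, and assuming Python 3.8 or later with its standard `statistics` module (a tool this chapter does not otherwise use), they can be recomputed from the confidence level. The helper name `critical_value` is made up for this illustration.

```python
from statistics import NormalDist

def critical_value(confidence_level):
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence_level
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail, so it is
    # the (1 - alpha/2) quantile of the standard normal distribution.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(critical_value(level), 3))
```

The 99% value computes to about 2.576; the 2.575 in the table is a common textbook rounding, and the examples in this chapter keep using the table value.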

Assumptions when estimating 1 Proportion:

  • We have a simple random sample
  • The conditions for a binomial distribution are met:
    • Fixed number of trials
    • Trials are independent
    • Two categories for outcomes
    • Probabilities constant for each trial
  • There are at least 5 successes and at least 5 failures
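
Only the last assumption can be checked from the numbers alone; whether the sample is random and the trials independent depends on how the data were collected. A minimal sketch of that count check, with a made-up function name and a made-up second data set, might look like this:

```python
def enough_successes_and_failures(x, n, minimum=5):
    """Check the 'at least 5 successes and at least 5 failures' condition."""
    successes = x
    failures = n - x  # everything that is not a success counts as a failure
    return successes >= minimum and failures >= minimum

# 790 successes out of 1000 (the counts from Example 1 in this chapter)
print(enough_successes_and_failures(790, 1000))

# A hypothetical small study with only 3 successes out of 40
print(enough_successes_and_failures(3, 40))
```

The first call reports that the condition holds and the second that it does not, since 3 successes is fewer than 5.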

Steps to construct the Confidence Interval for 1 Proportion:

  • Identify all the symbols listed above (all the stuff that will go into the formulas). This includes [latex]x[/latex] (if necessary), [latex]n[/latex], [latex]\hat{p}[/latex], [latex]\hat{q}[/latex], the confidence level, and [latex]z_{\frac{\alpha}{2}}[/latex].
  • Substitute values into the formula for [latex]E[/latex] and simplify to get the margin of error. Keep 4 or more decimal places for now.
  • Substitute values into the formula for the CI and simplify. Round your CI limits to three significant digits (usually this means three decimal places). Sometimes the specific instructions for an exercise will tell you to round differently; if they ask for fewer digits, that’s ok, but make sure you apply that rounding AT THE END.
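
The three steps above can be sketched in Python. This is one possible arrangement, not a required method: the function name `proportion_ci`, the dictionary `CRITICAL_VALUES`, and the sample numbers in the usage lines are all made up for illustration.

```python
from math import sqrt

# Critical values copied from the table in this chapter.
CRITICAL_VALUES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.575}

def proportion_ci(x, n, confidence_level):
    """Return (p_hat, E, lower, upper) for a one-proportion confidence interval."""
    # Step 1: identify the pieces that go into the formulas.
    p_hat = x / n
    q_hat = 1 - p_hat
    z = CRITICAL_VALUES[confidence_level]
    # Step 2: margin of error, kept at full precision for now.
    E = z * sqrt(p_hat * q_hat / n)
    # Step 3: build the interval and round only at the end.
    lower = round(p_hat - E, 3)
    upper = round(p_hat + E, 3)
    return p_hat, E, lower, upper

# Hypothetical data: 120 successes in a sample of 300, 95% confidence.
p_hat, E, lower, upper = proportion_ci(120, 300, 0.95)
print(p_hat, round(E, 6), lower, upper)
```

Keeping `E` unrounded until the limits are computed mirrors the advice to carry 4 or more decimal places through the intermediate step.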

Example 1: Family planning[1]

The website for UN Gender Statistics contains data on 52 quantitative and 11 qualitative indicators addressing relevant issues related to gender equality and/or women’s empowerment.

Suppose that in a random sample of 1000 women ages 15 – 49, 790 have access to family planning resources. Construct the 95% Confidence Interval for the true proportion of women, ages 15-49, who have access to family planning resources.

Solution

In this example, women either have access or they do not; this puts us in a binomial distribution (2 outcomes, each independent). Here we are not given the value of [latex]\hat{p}[/latex] so we have to compute it using x and n. Let’s go through and identify the values we need.

  • [latex]x = 790[/latex]
  • [latex]n = 1000[/latex]
  • [latex]\hat{p} = \frac{x}{n} = \frac{790}{1000} = 0.79[/latex]
  • [latex]\hat{q} = 1 - \hat{p} = 1 - 0.79 = 0.21[/latex]
  • [latex]z_{\frac{\alpha}{2}} = 1.96[/latex] since the confidence level is 95%
  • [latex]E = z_{\frac{\alpha}{2}} \times \sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.96 \times \sqrt{\frac{0.79 \times 0.21}{1000}} = 0.025245[/latex]

Here’s the confidence interval, in the three different forms:

Basic Form: [latex]0.79 \pm 0.025245[/latex], or as a percent, [latex]79\% \pm 2.52\%[/latex]

Inequalities: [latex]0.79 - 0.025245 \le p \le 0.79 + 0.025245[/latex], which simplifies to [latex]0.765 \le p \le 0.815[/latex], or as a percent, [latex]76.5\% \le p \le 81.5\%[/latex]

Interval Notation: [latex](0.79 - 0.025245, 0.79 + 0.025245)[/latex], which simplifies to [latex](0.765, 0.815)[/latex], or as a percent, [latex](76.5\%, 81.5\%)[/latex]

Notice that we started off with the decimal form but at the end we converted to percent. This isn’t completely necessary, but in many cases, when dealing with proportions (%), it makes more sense to report the answers as percents at the end.

So what does this mean, what did we find out? We are 95% confident that the true population proportion of women ages 15 to 49 years old who have access to family planning resources is between 76.5% and 81.5%.
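
For readers who want to check the arithmetic in Example 1 by machine, here is a short Python sketch. It uses only the numbers given in the example; the variable names are arbitrary.

```python
from math import sqrt

x, n = 790, 1000
z = 1.96  # table value for 95% confidence

p_hat = x / n
q_hat = 1 - p_hat
E = z * sqrt(p_hat * q_hat / n)  # margin of error at full precision

lower, upper = p_hat - E, p_hat + E
print(f"E = {E:.6f}")
print(f"{lower:.3f} <= p <= {upper:.3f}")   # decimal form
print(f"{lower:.1%} <= p <= {upper:.1%}")   # percent form
```

The `.1%` format multiplies by 100 and appends the percent sign, which matches the conversion done by hand at the end of the example.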

Example 2: Smokers

Suppose that in a random sample of 2000 adults in Imperial County, 350 are smokers.  Construct a 99% confidence interval for the true proportion of smokers.

Solution

In this example, Imperial County adults either smoke or they don’t; this puts us in a binomial distribution (2 outcomes, each independent). Here we are not given the value of [latex]\hat{p}[/latex] so we have to compute it using x and n. Let’s go through and identify the values we need.

  • [latex]x = 350[/latex]
  • [latex]n = 2000[/latex]
  • [latex]\hat{p} = \frac{x}{n} = \frac{350}{2000} = 0.175[/latex]
  • [latex]\hat{q} = 1 - \hat{p} = 1 - 0.175 = 0.825[/latex]
  • [latex]z_{\frac{\alpha}{2}} = 2.575[/latex] since the confidence level is 99%
  • [latex]E = z_{\frac{\alpha}{2}} \times \sqrt{\frac{\hat{p}\hat{q}}{n}} = 2.575 \times \sqrt{\frac{0.175 \times 0.825}{2000}} = 0.021878[/latex]

Here’s the confidence interval, in the three different forms:

Basic Form: [latex]0.175 \pm 0.021878[/latex], or as a percent, [latex]17.5\% \pm 2.19\%[/latex]

Inequalities: [latex]0.175 - 0.021878 \le p \le 0.175 + 0.021878[/latex], which simplifies to [latex]0.153 \le p \le 0.197[/latex], or as a percent, [latex]15.3\% \le p \le 19.7\%[/latex]

Interval Notation: [latex](0.175 - 0.021878, 0.175 + 0.021878)[/latex], which simplifies to [latex](0.153, 0.197)[/latex], or as a percent, [latex](15.3\%, 19.7\%)[/latex]

Notice that we started off with the decimal form but at the end we converted to percent. This isn’t completely necessary, but in many cases, when dealing with proportions (%), it makes more sense to report the answers as percents at the end.

So what does this mean, what did we find out? We are 99% confident that the true population proportion of adults in Imperial County who smoke is between 15.3% and 19.7%.
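
One might wonder whether the rounded table value 2.575 affects the answer in Example 2. The sketch below, which assumes Python 3.8 or later, recomputes the interval with both the table value and a critical value computed from the standard normal distribution. The name `standard_error` is just a label chosen here for the square-root part of the margin of error formula.

```python
from math import sqrt
from statistics import NormalDist

x, n = 350, 2000
p_hat = x / n
standard_error = sqrt(p_hat * (1 - p_hat) / n)

z_table = 2.575                          # value used in this chapter
z_computed = NormalDist().inv_cdf(0.995)  # 99% confidence leaves 0.005 in each tail

for label, z in (("table", z_table), ("computed", z_computed)):
    E = z * standard_error
    print(label, round(p_hat - E, 3), round(p_hat + E, 3))
```

Both versions give the same limits once rounded to three decimal places, so the choice of critical value does not change the reported interval in this example.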

Example 3: Tablets[2]

Suppose 400 randomly selected students at IVC are surveyed to determine if they own a tablet (iPad, Samsung, Fire, etc…). Of the 400 surveyed, 80 reported owning a tablet. Using a 90% confidence level, compute a confidence interval estimate for the true proportion of IVC students who own tablets.

Solution

In this example, IVC students either have a tablet or they do not; this puts us in a binomial distribution (2 outcomes, each independent). Here we are not given the value of [latex]\hat{p}[/latex] so we have to compute it using x and n. Let’s go through and identify the values we need.

  • [latex]x = 80[/latex]
  • [latex]n = 400[/latex]
  • [latex]\hat{p} = \frac{x}{n} = \frac{80}{400} = 0.2[/latex]
  • [latex]\hat{q} = 1 - \hat{p} = 1 - 0.2 = 0.8[/latex]
  • [latex]z_{\frac{\alpha}{2}} = 1.645[/latex] since the confidence level is 90%
  • [latex]E = z_{\frac{\alpha}{2}} \times \sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.645 \times \sqrt{\frac{0.2 \times 0.8}{400}} = 0.0329[/latex]

Here’s the confidence interval, in the three different forms:

Basic Form: [latex]0.2 \pm 0.0329[/latex], or as a percent, [latex]20\% \pm 3.29\%[/latex]

Inequalities: [latex]0.2 - 0.0329 \le p \le 0.2 + 0.0329[/latex], which simplifies to [latex]0.167 \le p \le 0.233[/latex], or as a percent, [latex]16.7\% \le p \le 23.3\%[/latex]

Interval Notation: [latex](0.2 - 0.0329, 0.2 + 0.0329)[/latex], which simplifies to [latex](0.167, 0.233)[/latex], or as a percent, [latex](16.7\%, 23.3\%)[/latex]

Notice that we started off with the decimal form but at the end we converted to percent. This isn’t completely necessary, but in many cases, when dealing with proportions (%), it makes more sense to report the answers as percents at the end.

So what does this mean, what did we find out? We are 90% confident that the true population proportion of IVC students who have a tablet is between 16.7% and 23.3%.
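
The three ways of writing a confidence interval all come from the same two numbers, the sample proportion and the margin of error. As an illustration using the Example 3 values, a small helper (the name `three_forms` is invented here) could produce all three as text:

```python
from math import sqrt

def three_forms(p_hat, E, digits=3):
    """Return the basic form, inequality form, and interval notation as strings."""
    lower = round(p_hat - E, digits)
    upper = round(p_hat + E, digits)
    basic = f"{p_hat} ± {E}"
    inequality = f"{lower} <= p <= {upper}"
    interval = f"({lower}, {upper})"
    return basic, inequality, interval

# Example 3: p_hat = 0.2, n = 400, 90% confidence (z = 1.645)
E = round(1.645 * sqrt(0.2 * 0.8 / 400), 4)
for form in three_forms(0.2, E):
    print(form)
```

The limits are rounded inside the helper, after the subtraction and addition, so the rounding still happens at the end.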


  1. Adapted from Social Justice Statistics by Bonnie Blustein et al., licensed under CC BY-NC-SA 4.0
  2. Adapted from OpenStax Statistics by Barbara Illowsky & Susan Dean, licensed under CC BY 4.0

License

Basic Statistics Copyright © by Allyn Leon is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.