Inference for a Population Proportion (HT for 1 Proportion)

Now we get to the good stuff! We will need to know how to state the null and alternative hypotheses, calculate the test statistic, and then reach a conclusion using the critical value method or the p-value method.

The Test Statistic for a 1 Proportion Test:

[latex]z = \displaystyle \frac{\hat{p} - p}{\sqrt{\frac{p \times q}{n}}}[/latex]

What the different symbols mean:

[latex]x[/latex] is the number of successes or observations (not always needed)

[latex]n[/latex] is the sample size (number of people, items, etc… in the study)

[latex]p[/latex] is the population proportion; this value also appears in the null and alternative hypotheses

[latex]\hat{p}[/latex] is the sample proportion (or percentage), given by [latex]\hat{p} = \frac{x}{n}[/latex]

[latex]q[/latex] is what is left over from the population proportion (or percentage), given by [latex]q = 1 - p[/latex]

[latex]\alpha[/latex] is the significance level, usually given within the problem, or if not given, we assume it to be 5% or 0.05
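
As a minimal sketch of how these symbols fit together, the Python snippet below computes [latex]\hat{p}[/latex], [latex]q[/latex], and [latex]z[/latex] from [latex]x[/latex], [latex]n[/latex], and [latex]p[/latex]. The function name `z_statistic` and its argument names are illustrative choices, not something defined in this book. The numbers are the counts used in Example 2 later in this section.

```python
import math

def z_statistic(x, n, p):
    """Return (p_hat, q, z) for a 1 Proportion Test (illustrative helper)."""
    p_hat = x / n                           # sample proportion
    q = 1 - p                               # what is left over from p
    standard_error = math.sqrt(p * q / n)   # denominator of the z formula
    z = (p_hat - p) / standard_error        # test statistic
    return p_hat, q, z

# Counts from Example 2: 53 successes out of 100, hypothesized p = 0.50
p_hat, q, z = z_statistic(53, 100, 0.50)
print(f"p_hat = {p_hat:.2f}, q = {q:.2f}, z = {z:.2f}")
# prints: p_hat = 0.53, q = 0.50, z = 0.60
```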

Assumptions when conducting a 1 Proportion Test:

  • We have a simple random sample
  • The binomial distribution is in effect
    • Fixed number of trials
    • Trials are independent
    • Two categories for outcomes
    • Probabilities constant for each trial
  • There are at least 5 successes and at least 5 failures
  • [latex]np \ge 5[/latex] and [latex]nq \ge 5[/latex]
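
The last condition in the list is the only one that can be checked with arithmetic; the others depend on how the study was designed. A small sketch of that check is shown here. The helper name `large_sample_ok` is hypothetical, the first call uses the sample size and proportion from Example 1, and the second call uses a made-up sample size of 20 purely to show a failing case.

```python
def large_sample_ok(n, p, minimum=5):
    """Check the conditions n*p >= 5 and n*q >= 5 (illustrative helper)."""
    q = 1 - p
    return n * p >= minimum and n * q >= minimum

print(large_sample_ok(252, 0.10))  # n*p = 25.2 and n*q = 226.8, prints True
print(large_sample_ok(20, 0.10))   # n*p = 2.0 is below 5, prints False
```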

Steps to conduct the 1 Proportion Test:

  • Identify all the symbols listed above (all the stuff that will go into the formulas). This includes [latex]x[/latex] (if necessary), [latex]n[/latex], [latex]\hat{p}[/latex], [latex]p[/latex], [latex]q[/latex], and [latex]\alpha[/latex]
  • Identify the null and alternative hypotheses
  • Calculate the test statistic, [latex]z = \displaystyle \frac{\hat{p} - p}{\sqrt{\frac{p \times q}{n}}}[/latex]
  • Find the critical value(s) OR the p-value OR both
  • Apply the Decision Rule
  • Write up a conclusion for the test
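
The steps above can be sketched as one short Python function. This is only one possible arrangement, under some assumptions: the function name `one_proportion_test`, the `alternative` labels `"greater"` and `"less"`, and the sample counts in the usage line are all invented for illustration, and the standard library's `statistics.NormalDist` stands in for the normal distribution table.

```python
import math
from statistics import NormalDist

def one_proportion_test(x, n, p0, alpha=0.05, alternative="greater"):
    """Run a 1 Proportion Test with the p-value method (illustrative sketch)."""
    # Step 1: identify the symbols
    p_hat = x / n
    q0 = 1 - p0
    # Step 3: calculate the test statistic
    z = (p_hat - p0) / math.sqrt(p0 * q0 / n)
    # Step 4: find the p-value; cdf(z) is the area to the left of z
    left_area = NormalDist().cdf(z)
    if alternative == "greater":
        p_value = 1 - left_area
    elif alternative == "less":
        p_value = left_area
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    # Step 5: apply the decision rule
    reject_h0 = p_value <= alpha
    return {"p_hat": p_hat, "z": z, "p_value": p_value, "reject_h0": reject_h0}

# Hypothetical data: 70 successes in 200 trials, testing H_A: p > 0.25
result = one_proportion_test(70, 200, 0.25, alpha=0.05, alternative="greater")
print(result)
```

Writing the conclusion (the last step) is left to the reader, since it has to be phrased in the context of the problem.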

Example 1: Flint River Water Safety

Adapted from the Skew The Script curriculum (skewthescript.org), licensed under CC BY-NC-SA 4.0

A few years back, there was a lot of news surrounding the use of the Flint River as a water source. Researchers claimed the new water from the Flint River was responsible for corroded pipes, allowing lead to get into the water. Among many other side effects, ingesting lead can hurt children’s brain development. According to EPA regulations, if more than 10% of homes in a city have high lead content in their water (>15 parts per billion), then the city’s water supply is deemed unsafe. A Virginia Tech study from 2015 randomly sampled water from 252 homes in Flint. Of those, 42 had high lead content. Is there convincing statistical evidence that the Flint water system is unsafe?

Solution

Since we are being asked for convincing statistical evidence, a hypothesis test should be conducted. In this case, we are dealing with rates or percents from one sample or group (the homes in Flint), so we will conduct a 1 Proportion Test.

  • [latex]x = 42[/latex]
  • [latex]n = 252[/latex]
  • [latex]\hat{p} = \frac{x}{n} = \frac{42}{252} = 0.17[/latex]
  • [latex]\alpha = 0.05[/latex] (we were not told a specific value in the problem, so we are assuming it is 5%)
  • Null and Alternative Hypothesis: Since the EPA regulations state that more than 10% would be concerning, and the researcher claims that the water is bad, the claim that goes along with the alternative hypothesis is that [latex]p[/latex] is greater than 10% or 0.10. In our example here, the idea that “nothing is different” would be equivalent to saying that [latex]p[/latex] is the same as (equal to) 10% or 0.10.
    • [latex]H_{0}: p = 0.10[/latex]
    • [latex]H_{A}: p > 0.10[/latex]
  • [latex]p = 0.10[/latex] (from the null hypothesis)
  • [latex]q = 1 - p = 1 - 0.10 = 0.90[/latex]
  • Test Statistic
    • [latex]z = \displaystyle \frac{\hat{p} - p}{\sqrt{\frac{p \times q}{n}}} = \displaystyle \frac{0.17 - 0.10}{\sqrt{\frac{0.10 \times 0.90}{252}}} = 3.70[/latex]  (remember we round z-scores to 2 places)
  • P-Value: The p-value is found by looking up the calculated test statistic (in this case [latex]z = 3.70[/latex]) in the normal distribution table. We find that this corresponds to a value of [latex]0.9999[/latex]. Since this is a “greater than” test, we subtract from one, and get [latex]\text{p-value} = 1 - 0.9999 = 0.0001[/latex].
  • Applying the Decision Rule: We now compare this to our significance level, which is [latex]\alpha = 0.05[/latex]. If the p-value is smaller than or equal to the alpha level, we have enough evidence for our claim; otherwise we do not. Here, [latex]\text{p-value} = 0.0001[/latex], which is definitely smaller than [latex]\alpha = 0.05[/latex], so we have enough evidence for the claim…but what does this mean?
  • Conclusion: Because our p-value of [latex]0.0001[/latex] is less than our [latex]\alpha[/latex] level of [latex]0.05[/latex], we reject [latex]H_{0}[/latex]. We have convincing evidence that more than 10% of homes in Flint have high lead concentration in their water supply.
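
The worked solution rounds [latex]\hat{p}[/latex] to 0.17 before computing [latex]z[/latex]. As a hedged check on how much that rounding matters, the sketch below repeats the calculation with the rounded and the unrounded sample proportion. It assumes `statistics.NormalDist` as a stand-in for the normal distribution table; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n, p0 = 42, 252, 0.10                      # values from Example 1
standard_error = math.sqrt(p0 * (1 - p0) / n)

results = {}
for label, p_hat in [("rounded", round(x / n, 2)), ("unrounded", x / n)]:
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)         # right-tail area for a "greater than" test
    results[label] = (z, p_value)
    print(f"{label:>9}: p_hat = {p_hat:.4f}, z = {z:.2f}")
# prints z = 3.70 for the rounded value and z = 3.53 for the unrounded value
```

Either way the p-value is far below [latex]\alpha = 0.05[/latex], so the conclusion does not change.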

Example 2: Job Placement for Vigilantes

An opinion poll asks an SRS of 100 vigilantes (like the Green Arrow, Luke Cage, Batman, etc…) how they view their job prospects after retiring from crime-fighting. In all, 53 say “Good.” Does this poll give reason to conclude that more than half of all vigilantes think their job prospects after retiring from crime-fighting are good? Use a significance level of 10%.

Solution

Surveys that use data from opinions or categories can use hypothesis testing for one proportion. In this case, we are dealing with a category where individuals rate their prospects as “good,” so we will conduct a 1 Proportion Test.

  • [latex]x = 53[/latex]
  • [latex]n = 100[/latex]
  • [latex]\hat{p} = \frac{x}{n} = \frac{53}{100} = 0.53[/latex]
  • [latex]\alpha = 0.10[/latex] (we were told a specific value in the problem, so we use 10% here as indicated)
  • Null and Alternative Hypothesis: Since the question is whether more than half of all vigilantes think their prospects are good, the claim that goes along with the alternative hypothesis is that [latex]p[/latex] is greater than 50% or 0.50. In our example here, the idea that “nothing is different” would be equivalent to saying that [latex]p[/latex] is the same as (equal to) 50% or 0.50.
    • [latex]H_{0}: p = 0.50[/latex]
    • [latex]H_{A}: p > 0.50[/latex]
  • [latex]p = 0.50[/latex] (from the null hypothesis)
  • [latex]q = 1 - p = 1 - 0.5 = 0.50[/latex]
  • Test Statistic
    • [latex]z = \displaystyle \frac{\hat{p} - p}{\sqrt{\frac{p \times q}{n}}} = \displaystyle \frac{0.53 - 0.50}{\sqrt{\frac{0.50 \times 0.50}{100}}} = 0.60[/latex] (remember we round z-scores to 2 places)
  • P-Value: The p-value is found by looking up the calculated test statistic (in this case [latex]z = 0.60[/latex]) in the normal distribution table. We find that this corresponds to a value of [latex]0.7257[/latex]. Since this is a “greater than” test, we subtract from one, and get [latex]\text{p-value} = 1 - 0.7257 = 0.2743[/latex].
  • Applying the Decision Rule: We now compare this to our significance level, which is [latex]\alpha = 0.10[/latex]. If the p-value is smaller than or equal to the alpha level, we have enough evidence for our claim; otherwise we do not. Here, [latex]\text{p-value} = 0.2743[/latex], which is definitely larger than [latex]\alpha = 0.10[/latex], so we do not have enough evidence for the claim…but what does this mean?
  • Conclusion: Because our p-value of [latex]0.2743[/latex] is greater than our [latex]\alpha[/latex] level of [latex]0.10[/latex], we fail to reject [latex]H_{0}[/latex]. We do not have enough evidence that more than 50% of all vigilantes think their job prospects after retiring from crime-fighting are good.
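
The same decision can be reached with the critical value method mentioned at the start of this section. The sketch below is one way to do that for this example, assuming `statistics.NormalDist().inv_cdf` as a replacement for reading the critical value from a table. For a “greater than” test, the critical value cuts off an area of [latex]\alpha[/latex] in the right tail, and we reject [latex]H_{0}[/latex] only if [latex]z[/latex] lands beyond it.

```python
from statistics import NormalDist

alpha = 0.10
z = 0.60   # test statistic from Example 2

# Right-tailed test: the critical value has area alpha to its right
critical_value = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > critical_value

print(f"critical value = {critical_value:.2f}")   # prints: critical value = 1.28
print("reject H0" if reject_h0 else "fail to reject H0")   # prints: fail to reject H0
```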

Example 3: Pediatric Asthma in the Imperial County

I had a conversation with a friend recently, and the friend claimed the percentage of children with asthma in the Imperial County had to be under 20%. A recent study published in the International Journal of Environmental Research and Public Health reported results of a survey administered to the parents of 357 children in the Imperial County. Of those children, 80 were reported to have a diagnosis of asthma. Does the study support my friend’s claim that less than 20% of children in the Imperial County have asthma?

Solution

Surveys that use data from opinions, categories, or classifications can use hypothesis testing for one proportion. In this case, we are dealing with a category where children either have asthma or do not, and we are also given a statement involving a percentage (proportion), so we will conduct a 1 Proportion Test.

  • [latex]x = 80[/latex]
  • [latex]n = 357[/latex]
  • [latex]\hat{p} = \frac{x}{n} = \frac{80}{357} = 0.22[/latex]
  • [latex]\alpha = 0.05[/latex] (we were not told a specific value in the problem, so we use 5% here as an assumption)
  • Null and Alternative Hypothesis: Since the question is whether less than 20% of children in the Imperial County have asthma, the claim that goes along with the alternative hypothesis is that [latex]p[/latex] is less than 20% or 0.20. In our example here, the idea that “nothing is different” would be equivalent to saying that [latex]p[/latex] is the same as (equal to) 20% or 0.20.
    • [latex]H_{0}: p = 0.20[/latex]
    • [latex]H_{A}: p < 0.20[/latex]
  • [latex]p = 0.20[/latex] (from the null hypothesis)
  • [latex]q = 1 - p = 1 - 0.2 = 0.80[/latex]
  • Test Statistic
    • [latex]z = \displaystyle \frac{\hat{p} - p}{\sqrt{\frac{p \times q}{n}}} = \displaystyle \frac{0.22 - 0.20}{\sqrt{\frac{0.20 \times 0.80}{357}}} = 0.94[/latex] (remember we round z-scores to 2 places)
  • P-Value: The p-value is found by looking up the calculated test statistic (in this case [latex]z = 0.94[/latex]) in the normal distribution table. We find that this corresponds to a value of [latex]0.8264[/latex]. Since this is a “less than” test, we keep the value from the table, and get [latex]\text{p-value} = 0.8264[/latex].
  • Applying the Decision Rule: We now compare this to our significance level, which is [latex]\alpha = 0.05[/latex]. If the p-value is smaller than or equal to the alpha level, we have enough evidence for our claim; otherwise we do not. Here, [latex]\text{p-value} = 0.8264[/latex], which is definitely greater than [latex]\alpha = 0.05[/latex], so we do not have enough evidence for the claim…but what does this mean?
  • Conclusion: Because our p-value of [latex]0.8264[/latex] is greater than our [latex]\alpha[/latex] level of [latex]0.05[/latex], we fail to reject [latex]H_{0}[/latex]. We do not have enough evidence that less than 20% of children in the Imperial County have asthma.
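
A quick sanity check helps explain why the p-value is so large here. The sample proportion (0.22) is above the hypothesized 0.20, so [latex]z[/latex] is positive, and the area to the left of a positive [latex]z[/latex] is always at least 0.5. The sketch below illustrates this with `statistics.NormalDist` in place of the table; the variable names are illustrative.

```python
from statistics import NormalDist

p_hat, p0 = 0.22, 0.20   # sample and hypothesized proportions from Example 3
z = 0.94                 # test statistic from Example 3

left_tail = NormalDist().cdf(z)   # p-value for H_A: p < 0.20
right_tail = 1 - left_tail        # what a "greater than" test would have used

print(f"left tail = {left_tail:.4f}, right tail = {right_tail:.4f}")
# prints: left tail = 0.8264, right tail = 0.1736

if p_hat >= p0:
    # z >= 0 here, so the left-tail area cannot be smaller than 0.5
    print("Sample proportion is not below 0.20, so the left-tail p-value is at least 0.5")
```

In other words, when the sample points in the opposite direction from the alternative hypothesis, a one-sided test cannot reject [latex]H_{0}[/latex] at any of the usual significance levels.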

License

Basic Statistics Copyright © by Allyn Leon is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
