Hypothesis Testing for 1 Sample: An Introduction

sandals29; Allyn Leon

What is a Hypothesis Test?

A quick search for hypothesis tests online turns up several websites with short definitions. Here’s one from Stat Trek:

A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be true. Hypothesis testing refers to the formal procedures used by statisticians to accept or reject statistical hypotheses.

Source: “What is Hypothesis Testing?” from Stat Trek

Most websites will have a similar definition or introduction, followed by a number of components, notation, key terminology, and examples.

The Basic Idea

Hypothesis tests show up in many areas of our everyday lives, but they often go unnoticed. The basic structure of a hypothesis test is very much like a science project from elementary, middle, or high school. You have a problem, a hypothesis, data collection, some computations, and results or conclusions. What follows are a few examples of how hypothesis tests and their results look in journals or other publications, and how those results are presented to the public.

Some Examples of Hypothesis Tests

Example 1: Agility Testing in Youth Football (Soccer) Players; Evaluating Reliability, Validity, and Correlates of Newly Developed Testing Protocols

Reactive agility (RAG) and change of direction speed (CODS) were analyzed in U13 and U15 youth soccer players. “Independent samples t-test indicated significant differences between U13 and U15 in S10 (t-test: 3.57, p < 0.001), S20M (t-test: 3.13, p < 0.001), 20Y (t-test: 4.89, p < 0.001), FS_RAG (t-test: 3.96, p < 0.001), and FS_CODS (t-test: 6.42, p < 0.001), with better performance in U15. Starters outperformed non-starters in most capacities among U13, but only in FS_RAG among U15 (t-test: 1.56, p < 0.05).”

Most of this might seem like gibberish for now, but essentially the two groups were analyzed and compared, with significant differences observed between the groups.

Source: https://pubmed.ncbi.nlm.nih.gov/31906269/

Example 2: Manual therapy in the treatment of carpal tunnel syndrome in diabetic patients: A randomized clinical trial

Thirty diabetic patients with carpal tunnel syndrome were split up into two groups. One received physiotherapy modality and the other received manual therapy. “Paired t-test revealed that all of the outcome measures had a significant change in the manual therapy group, whereas only the VAS and SSS changed significantly in the modality group at the end of 4 weeks. Independent t-test showed that the variables of SSS, FSS and MNT in the manual therapy group improved significantly greater than the modality group.”

Source: https://pubmed.ncbi.nlm.nih.gov/30197774/

Example 3: Omega-3 fatty acids decreased irritability of patients with bipolar disorder in an add-on, open label study

“The initial mean was 63.51 (SD 34.17), indicating that on average, subjects were irritable for about six of the previous ten days. The mean for the last recorded percentage was less than half of the initial score: 30.27 (SD 34.03). The decrease was found to be statistically significant using a paired sample t-test (t = 4.36, 36 df, p < .001).”

Source: https://nutritionj.biomedcentral.com/articles/10.1186/1475-2891-4-6

Example 4: Evaluating the Efficacy of COVID-19 Vaccines

“We reduced all values of vaccine efficacy by 30% to reflect the waning of vaccine efficacy against each endpoint over time. We tested the null hypothesis that the vaccine efficacy is 0% versus the alternative hypothesis that the vaccine efficacy is greater than 0% at the nominal significance level of 2.5%.”

Source: https://www.medrxiv.org/content/10.1101/2020.10.02.20205906v2.full

Example 5: Social Isolation During COVID-19 Pandemic. Perceived Stress and Containment Measures Compliance Among Polish and Italian Residents

“The Polish group had a higher stress level than the Italian group (mean PSS-10 total score 22,14 vs 17,01, respectively; p < 0.01). There was a greater prevalence of chronic diseases among Polish respondents. Italian subjects expressed more concern about their health, as well as about their future employment. Italian subjects did not comply with suggested restrictions as much as Polish subjects and were less eager to restrain from their usual activities (social, physical, and religious), which were more often perceived as “most needed matters” in Italian than in Polish residents.”

Source: https://www.frontiersin.org/articles/10.3389/fpsyg.2021.673514/full

Example 6: A Comparative Analysis of Student Performance in an Online vs. Face-to-Face Environmental Science Course From 2009 to 2016

“The independent sample t-test showed no significant difference in student performance between online and F2F learners with respect to gender [t(145) = 1.42, p = 0.122].”

Source: https://www.frontiersin.org/articles/10.3389/fcomp.2019.00007/full

But what does it all mean?

That’s what comes next. The examples above span a variety of different types of hypothesis tests. Within this chapter we will take a look at some of the terminology, formulas, and concepts related to Hypothesis Testing for 1 Sample.

Key Terminology and Formulas

Hypothesis: This is a claim or statement about a population, usually focusing on a parameter such as a proportion (%), mean, standard deviation, or variance. We will be focusing primarily on the proportion and the mean.

Hypothesis Test: Also known as a Significance Test or Test of Significance, the hypothesis test is the collection of procedures we use to test a claim about a population.

Null Hypothesis: This is a statement that the population parameter (such as the proportion, mean, standard deviation, or variance) is equal to some value. In simpler terms, the Null Hypothesis is a statement that “nothing is different from what usually happens.” The Null Hypothesis is usually denoted by [latex]H_{0}[/latex], followed by other symbols and notation that describe how the parameter is the same as some value.

Alternative Hypothesis: This is a statement that the population parameter (such as the proportion, mean, standard deviation, or variance) is somehow different from the value involved in the Null Hypothesis. For our examples, “somehow different” will involve the use of [latex]<[/latex], [latex]>[/latex], or [latex]\neq[/latex]. In simpler terms, the Alternative Hypothesis is a statement that “something is different from what usually happens.” The Alternative Hypothesis is usually denoted by [latex]H_{1}[/latex], [latex]H_{A}[/latex], or [latex]H_{a}[/latex], followed by other symbols and notation that describe how the parameter is different from some value.

Significance Level: We previously learned about the significance level as the “left over” stuff from the confidence level. This is still true, but we will now focus more on the significance level as its own value, and we will use the symbol alpha, [latex]\alpha[/latex]. This looks like a lowercase “a,” or a drawing of a little fish. The significance level [latex]\alpha[/latex] is the probability of rejecting the null hypothesis when it is actually true (more on what this means in the next section). The common values are the same as before: 1%, 5%, and 10%. We commonly write these as decimals instead: 0.01, 0.05, and 0.10.

Test Statistic: One of the key components of a hypothesis test is what we call a test statistic. This is a calculation, sort of like a z-score, that is specific to the type of test being conducted. The idea behind a test statistic, relating it back to science projects, would be like calculations from measurements that were taken. In this chapter we will address the test statistic for 1 proportion, 1 mean when we know [latex]\sigma[/latex], and 1 mean with [latex]\sigma[/latex] unknown. In these formulas, [latex]p[/latex] and [latex]\mu[/latex] are the values stated in the null hypothesis, and [latex]q = 1 - p[/latex]. The formulas are listed in the table below:

Test	Population Parameter	Test Statistic
1 Proportion	[latex]p[/latex]	[latex]z = \displaystyle \frac{\hat{p} - p}{\sqrt{\frac{p \times q}{n}}}[/latex]
1 Mean, [latex]\sigma[/latex] Known	[latex]\mu[/latex]	[latex]z = \displaystyle \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}[/latex]
1 Mean, [latex]\sigma[/latex] Unknown	[latex]\mu[/latex]	[latex]t = \displaystyle \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}[/latex]
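
As a rough illustration, the three formulas in the table can be written as small Python functions. The function names and the sample numbers are hypothetical, chosen only to make the arithmetic easy to follow; `p0` and `mu0` stand for the values stated in the null hypothesis.

```python
import math

def z_stat_proportion(p_hat, p0, n):
    """z for 1 proportion: (p_hat - p) / sqrt(p * q / n), with q = 1 - p."""
    q0 = 1 - p0
    return (p_hat - p0) / math.sqrt(p0 * q0 / n)

def z_stat_mean(x_bar, mu0, sigma, n):
    """z for 1 mean when the population standard deviation sigma is known."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))

def t_stat_mean(x_bar, mu0, s, n):
    """t for 1 mean when sigma is unknown and the sample standard deviation s is used."""
    return (x_bar - mu0) / (s / math.sqrt(n))

# Hypothetical sample values (not taken from any study cited above).
print(round(z_stat_proportion(0.56, 0.50, 100), 2))  # 0.06 / 0.05
print(round(z_stat_mean(123, 120, 15, 100), 2))      # 3 / 1.5
print(round(t_stat_mean(112, 115, 9, 36), 2))        # -3 / 1.5
```

The last two functions are the same calculation; they differ only in which standard deviation goes in the denominator, and therefore in which distribution ([latex]z[/latex] or [latex]t[/latex]) the result is compared against.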

Critical Region: The critical region, also known as the rejection region, is the area in the normal (or other) distribution in which we reject the null hypothesis. Think of the critical region like a target area that you are aiming for. If our test statistic lands in this region, we have evidence against the null hypothesis and, when the claim is the alternative hypothesis, evidence for the claim.

Critical Value: These are like special z-scores for us; the critical value (or values, sometimes there are two) separates the critical region from the rest of the distribution. The rest of the distribution is the non-target part, or what we are not aiming for. If our test statistic is in that part, we do not have enough evidence to reject the null hypothesis.
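
One possible way to look up critical values for a [latex]z[/latex] test is with Python's standard library. This is a minimal sketch: the function name `z_critical_values` and the tail labels `"left"`, `"right"`, and `"two"` are assumptions made for illustration. It covers only the normal distribution; critical values for a [latex]t[/latex] test come from the t distribution instead (a table or a statistics library).

```python
from statistics import NormalDist

def z_critical_values(alpha, tail):
    """Return the z critical value(s) that cut off the critical region.

    tail is "left" (H_A uses <), "right" (H_A uses >), or "two" (H_A uses !=).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # alpha is split evenly between the two tails
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    raise ValueError("tail must be 'left', 'right', or 'two'")

for tail in ("left", "right", "two"):
    values = z_critical_values(0.05, tail)
    print(tail, [round(v, 3) for v in values])
```

A two-tailed test returns two values because the critical region is split between both ends of the distribution.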

P-Value: This is a special value that we compute. If we assume the null hypothesis is true, the p-value represents the probability that a test statistic is at least as extreme as the one we computed from our sample data; for us the test statistics would be either [latex]z[/latex] or [latex]t[/latex].
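
For a [latex]z[/latex] test statistic, the p-value can be sketched with the standard normal distribution from Python's standard library. The function name `z_p_value` and the tail labels are hypothetical; which tail to use depends on the symbol in the alternative hypothesis. A [latex]t[/latex] test statistic would need the t distribution instead, which this sketch does not cover.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """Probability, assuming H_0 is true, of a z at least as extreme as the one observed.

    tail is "left" (H_A uses <), "right" (H_A uses >), or "two" (H_A uses !=).
    """
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        # "At least as extreme" counts both tails, so the one-tail area is doubled.
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

# Hypothetical test statistic, z = 2.0
print(round(z_p_value(2.0, "right"), 4))
print(round(z_p_value(2.0, "two"), 4))
```

For the same test statistic, the two-tailed p-value is twice the one-tailed p-value.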

Decision Rule for Hypothesis Testing: There are a few ways we can arrive at our decision with a hypothesis test. We can use confidence intervals, critical values (also known as the traditional method), or p-values. Relating this to a science project, the decision rule would be what we take into consideration to arrive at our conclusion. When we make our decision, the wording will sound a little strange. We’ll say things like “we have enough evidence to reject the null hypothesis” or “there is insufficient evidence to reject the null hypothesis.”

Decision Rule with Critical Values: If the test statistic is in the critical region, we have enough evidence to reject the null hypothesis. When the claim is the alternative hypothesis, we can also say we have sufficient evidence to support the claim. If the test statistic is not in the critical region, we fail to reject the null hypothesis. In that case we can also say we do not have sufficient evidence to support the claim.

Decision Rule with P-Values: If the p-value is less than or equal to the significance level, we have enough evidence to reject the null hypothesis. When the claim is the alternative hypothesis, we can also say we have sufficient evidence to support the claim. If the p-value is greater than the significance level, we fail to reject the null hypothesis. In that case we can also say we do not have sufficient evidence to support the claim.
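
The two decision rules can be sketched side by side for a two-tailed [latex]z[/latex] test. The function names and the test statistics are hypothetical. The sketch shows both rules reaching the same decision for these values.

```python
from statistics import NormalDist

def reject_by_critical_value(z, alpha):
    """Two-tailed rule: reject H_0 when z lies in either tail beyond the critical values."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) >= z_crit

def reject_by_p_value(z, alpha):
    """Two-tailed rule: reject H_0 when the p-value is less than or equal to alpha."""
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value <= alpha

def conclusion(reject):
    """Word the decision the way the chapter does."""
    if reject:
        return "we have enough evidence to reject the null hypothesis"
    return "we fail to reject the null hypothesis"

# Hypothetical test statistics, checked at the 5% significance level.
for z in (2.0, 1.2):
    by_cv = reject_by_critical_value(z, 0.05)
    by_p = reject_by_p_value(z, 0.05)
    print(f"z = {z}: {conclusion(by_p)} (both rules agree: {by_cv == by_p})")
```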

More About Hypotheses

Writing the Null and Alternative Hypotheses can be tricky. Here are a few examples of claims followed by the respective hypotheses:

Test	The Claim	Null and Alternative Hypotheses
1 Proportion	Test the claim that vaccine effectiveness is greater than 0%	[latex]H_{0}: p = 0.00[/latex]; [latex]H_{A}: p > 0.00[/latex]
1 Mean, [latex]\sigma[/latex] Known	Test the claim that the mean systolic blood pressure differs from 120	[latex]H_{0}: \mu = 120[/latex]; [latex]H_{A}: \mu \neq 120[/latex]
1 Mean, [latex]\sigma[/latex] Unknown	Test the claim that the average fasting blood glucose level is below 115	[latex]H_{0}: \mu = 115[/latex]; [latex]H_{A}: \mu < 115[/latex]
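
The pattern in these examples is that the null hypothesis always uses an equals sign with the claimed value, while the direction of the claim picks the symbol in the alternative hypothesis. A small sketch of that pattern follows. The helper `write_hypotheses` and its direction labels are hypothetical, and `!=` stands in for [latex]\neq[/latex].

```python
def write_hypotheses(symbol, value, direction):
    """Build H0 and HA as text from a parameter symbol, a value, and the claim's direction.

    direction is "less" (<), "greater" (>), or "different" (!=).
    """
    alt_symbols = {"less": "<", "greater": ">", "different": "!="}
    if direction not in alt_symbols:
        raise ValueError("direction must be 'less', 'greater', or 'different'")
    null = f"H0: {symbol} = {value}"  # the null hypothesis always uses equality
    alternative = f"HA: {symbol} {alt_symbols[direction]} {value}"
    return null, alternative

# The three claims from the examples above.
print(write_hypotheses("p", 0, "greater"))
print(write_hypotheses("mu", 120, "different"))
print(write_hypotheses("mu", 115, "less"))
```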

License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License