"

What is a Hypothesis Test for 2 Samples?

An internet search for a definition of hypothesis testing for 2 samples returns many results, and most of them differ slightly from one another. The definitions you will find online are usually disjointed, covering hypothesis tests for independent means, paired means, and proportions separately. Instead of giving one uniform definition, we’ll take a look at the key components that are common to all of the tests, and then at some of the specific components and notation.

The Basic Idea

The appearance of these hypothesis tests (in the real world) will be very similar to the tests that we see with one sample. In fact, the examples of hypothesis tests in the previous introduction include tests for one sample as well as two samples. The basic structure of these hypothesis tests is very similar to the structure we saw before: you have a problem, a hypothesis, data collection, some computations, and results or conclusions. Some of the notation will be slightly different. The examples below are the same ones we presented in the previous introduction, but here we are highlighting the two-sample variations. The examples with bolded terms are the ones that use 2 samples.

Some Examples of Hypothesis Tests

Example 1: Agility Testing in Youth Football (Soccer) Players; Evaluating Reliability, Validity, and Correlates of Newly Developed Testing Protocols

Reactive agility (RAG) and change of direction speed (CODS) were analyzed in U13 and U15 youth soccer players. “Independent samples t-test indicated significant differences between U13 and U15 in S10 (t-test: 3.57, p < 0.001), S20M (t-test: 3.13, p < 0.001), 20Y (t-test: 4.89, p < 0.001), FS_RAG (t-test: 3.96, p < 0.001), and FS_CODS (t-test: 6.42, p < 0.001), with better performance in U15. Starters outperformed non-starters in most capacities among U13, but only in FS_RAG among U15 (t-test: 1.56, p < 0.05).”

Most of this might seem like gibberish for now, but essentially the two groups were analyzed and compared, with significant differences observed between the groups. This is a hypothesis test for 2 means, independent samples.

Source: https://pubmed.ncbi.nlm.nih.gov/31906269/

Example 2: Manual therapy in the treatment of carpal tunnel syndrome in diabetic patients: A randomized clinical trial

Thirty diabetic patients with carpal tunnel syndrome were split up into two groups. One received physiotherapy modality and the other received manual therapy. “Paired t-test revealed that all of the outcome measures had a significant change in the manual therapy group, whereas only the VAS and SSS changed significantly in the modality group at the end of 4 weeks. Independent t-test showed that the variables of SSS, FSS and MNT in the manual therapy group improved significantly greater than the modality group.”

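The paired t-test within each group is a hypothesis test for matched pairs, sometimes known as 2 means, dependent samples. The independent t-test that compares the two groups is a test for 2 means, independent samples.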

Source: https://pubmed.ncbi.nlm.nih.gov/30197774/

Example 3: Omega-3 fatty acids decreased irritability of patients with bipolar disorder in an add-on, open label study

“The initial mean was 63.51 (SD 34.17), indicating that on average, subjects were irritable for about six of the previous ten days. The mean for the last recorded percentage was less than half of the initial score: 30.27 (SD 34.03). The decrease was found to be statistically significant using a paired sample t-test (t = 4.36, 36 df, p < .001).”

This is a hypothesis test for matched pairs, sometimes known as 2 means, dependent samples.

Source: https://nutritionj.biomedcentral.com/articles/10.1186/1475-2891-4-6

Example 4: Evaluating the Efficacy of COVID-19 Vaccines

“We reduced all values of vaccine efficacy by 30% to reflect the waning of vaccine efficacy against each endpoint over time. We tested the null hypothesis that the vaccine efficacy is 0% versus the alternative hypothesis that the vaccine efficacy is greater than 0% at the nominal significance level of 2.5%.”

Source: https://www.medrxiv.org/content/10.1101/2020.10.02.20205906v2.full

Example 5: Social Isolation During COVID-19 Pandemic. Perceived Stress and Containment Measures Compliance Among Polish and Italian Residents

“The Polish group had a higher stress level than the Italian group (mean PSS-10 total score 22,14 vs 17,01, respectively; p < 0.01). There was a greater prevalence of chronic diseases among Polish respondents. Italian subjects expressed more concern about their health, as well as about their future employment. Italian subjects did not comply with suggested restrictions as much as Polish subjects and were less eager to restrain from their usual activities (social, physical, and religious), which were more often perceived as “most needed matters” in Italian than in Polish residents.”

Even though the wording of the excerpt does not explicitly name one of the tests we will study, this is a comparison of means from two different groups, so this is a test for two means, independent samples.

Source: https://www.frontiersin.org/articles/10.3389/fpsyg.2021.673514/full

Example 6: A Comparative Analysis of Student Performance in an Online vs. Face-to-Face Environmental Science Course From 2009 to 2016

“The independent sample t-test showed no significant difference in student performance between online and F2F learners with respect to gender [t(145) = 1.42, p = 0.122].”

Once again, a test of 2 means, independent samples.

Source: https://www.frontiersin.org/articles/10.3389/fcomp.2019.00007/full

But what does it all mean?

That’s what comes next. The examples above span a variety of different types of hypothesis tests. Within this chapter we will take a look at some of the terminology, formulas, and concepts related to Hypothesis Testing for 2 Samples.

Key Terminology and Formulas

Hypothesis: This is a claim or statement about a population, usually focusing on a parameter such as a proportion (%), mean, standard deviation, or variance. We will be focusing primarily on the proportion and the mean.

Hypothesis Test: Also known as a Significance Test or Test of Significance, the hypothesis test is the collection of procedures we use to test a claim about a population.

Null Hypothesis: This is a statement that the population parameter (such as the proportion, mean, standard deviation, or variance) is equal to some value. In simpler terms, the Null Hypothesis is a statement that “nothing is different from what usually happens.” The Null Hypothesis is usually denoted by [latex]H_{0}[/latex], followed by other symbols and notation that describe how the parameter from one population or group is the same as the parameter from another population or group.

Alternative Hypothesis: This is a statement that the population parameter (such as the proportion, mean, standard deviation, or variance) is somehow different from the value involved in the Null Hypothesis. For our examples, “somehow different” will involve the use of [latex]<[/latex], [latex]>[/latex], or [latex]\neq[/latex]. In simpler terms, the Alternative Hypothesis is a statement that “something is different from what usually happens.” The Alternative Hypothesis is usually denoted by [latex]H_{1}[/latex], [latex]H_{A}[/latex], or [latex]H_{a}[/latex], followed by other symbols and notation that describe how the parameter from one population or group is different from the parameter from another population or group.

Significance Level: We previously learned about the significance level as the “left over” stuff from the confidence level. This is still true, but we will now focus more on the significance level as its own value, and we will use the symbol alpha, [latex]\alpha[/latex]. This looks like a lowercase “a,” or a drawing of a little fish. The significance level [latex]\alpha[/latex] is the probability of rejecting the null hypothesis when it is actually true (more on what this means in the next section). The common values are still similar to what we had previously: 1%, 5%, and 10%. We commonly write these as decimals instead: 0.01, 0.05, and 0.10.

Test Statistic: One of the key components of a hypothesis test is what we call a test statistic. This is a calculation, sort of like a z-score, that is specific to the type of test being conducted. Relating it back to science projects, the test statistic is like the calculations made from the measurements that were taken. In this chapter we will address the test statistic for 2 proportions, 2 means (independent samples), and matched pairs (2 means from dependent samples). The formulas are listed in the table below:

What the different symbols mean:

| Test | Population Parameter(s) | Test Statistic |
| --- | --- | --- |
| 2 Proportions | [latex]p_1[/latex] and [latex]p_2[/latex] | [latex]z = \displaystyle \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\displaystyle \frac{\bar{p} \times \bar{q}}{n_1} + \displaystyle \frac{\bar{p} \times \bar{q}}{n_2}}}[/latex] |
| 2 Means (Independent Samples) | [latex]\mu_1[/latex] and [latex]\mu_2[/latex] | [latex]t = \displaystyle \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\displaystyle \frac{s_1^2}{n_1} + \displaystyle \frac{s_2^2}{n_2}}}[/latex] |
| Matched Pairs (2 Means, Dependent Samples) | [latex]\mu_d[/latex] | [latex]t = \displaystyle \frac{\bar{x} - 0}{\frac{s}{\sqrt{n}}}[/latex] |
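To see how the formulas in the table turn into arithmetic, here is a minimal Python sketch. It relies on the usual meanings of the symbols, which are assumptions here because the table does not spell them out: [latex]x_1[/latex] and [latex]x_2[/latex] are the numbers of successes in each sample, [latex]\bar{p} = (x_1 + x_2)/(n_1 + n_2)[/latex] is the pooled sample proportion with [latex]\bar{q} = 1 - \bar{p}[/latex], and the [latex]\bar{x}[/latex] and [latex]s[/latex] in the matched pairs row are the mean and standard deviation of the paired differences. The function names and the numbers in the usage lines are made up for illustration.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for 2 proportions (assumes a pooled proportion p_bar)."""
    p_hat1 = x1 / n1                      # sample proportion, group 1
    p_hat2 = x2 / n2                      # sample proportion, group 2
    p_bar = (x1 + x2) / (n1 + n2)         # pooled sample proportion (assumption)
    q_bar = 1 - p_bar
    standard_error = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return (p_hat1 - p_hat2) / standard_error


def two_mean_t(xbar1, s1, n1, xbar2, s2, n2, null_difference=0.0):
    """t statistic for 2 means, independent samples.

    null_difference is the value of mu_1 - mu_2 under the null hypothesis,
    which is 0 when the null hypothesis says the two means are equal.
    """
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((xbar1 - xbar2) - null_difference) / standard_error


def matched_pairs_t(d_bar, s_d, n):
    """t statistic for matched pairs, from the mean and SD of the differences."""
    return (d_bar - 0) / (s_d / sqrt(n))


# Hypothetical summary statistics, for illustration only.
z = two_proportion_z(x1=60, n1=100, x2=45, n2=100)
t_independent = two_mean_t(xbar1=10, s1=2, n1=16, xbar2=8, s2=2, n2=16)
t_paired = matched_pairs_t(d_bar=-5, s_d=10, n=25)
print(round(z, 3), round(t_independent, 3), round(t_paired, 3))
```

Each function takes summary statistics rather than raw data, which mirrors how the formulas are written in the table.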

Critical Region: The critical region, also known as the rejection region, is the area in the normal (or other) distribution in which we reject the null hypothesis. Think of the critical region like a target area that you are aiming for. If we are able to get a value in this region, it means we have evidence for the claim.

Critical Value: These are like special z-scores for us; the critical value (or values, sometimes there are two) separates the critical region from the rest of the distribution. The rest of the distribution is the non-target part, or what we are not aiming for. If our value is in that non-target part, we do not have evidence for the claim.
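As a rough illustration of where critical values come from, the sketch below looks them up for a test that uses the standard normal distribution (a [latex]z[/latex] test statistic). The function name and the `tail` labels are hypothetical, chosen to match the [latex]<[/latex], [latex]>[/latex], and [latex]\neq[/latex] forms of the alternative hypothesis. Critical values for a [latex]t[/latex] test statistic also depend on the degrees of freedom, so they would need a t table or a statistics library instead.

```python
from statistics import NormalDist


def z_critical_values(alpha, tail="two"):
    """Return the critical value(s) of the standard normal distribution.

    tail="left"  matches an alternative hypothesis that uses <
    tail="right" matches an alternative hypothesis that uses >
    tail="two"   matches an alternative hypothesis that uses "not equal"
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return (standard_normal.inv_cdf(alpha),)
    if tail == "right":
        return (standard_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # Split alpha evenly between the two tails.
        upper = standard_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")


lower, upper = z_critical_values(0.05, tail="two")
print(round(lower, 2), round(upper, 2))
```

With a significance level of 0.05 and a two-tailed test, the two critical values are approximately -1.96 and 1.96, so the critical region is everything at or beyond those values.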

P-Value: This is a special value that we compute. If we assume the null hypothesis is true, the p-value represents the probability that a test statistic is at least as extreme as the one we computed from our sample data; for us the test statistics would be either [latex]z[/latex] or [latex]t[/latex].
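The sketch below shows one way the p-value could be computed once we have a [latex]z[/latex] test statistic. It is only an illustration: the function name and the labels for the alternative hypothesis are hypothetical, and it covers the standard normal case only. A p-value for a [latex]t[/latex] test statistic would use the t distribution with the appropriate degrees of freedom, which needs a table or a statistics library.

```python
from statistics import NormalDist


def z_p_value(z, alternative="two-sided"):
    """P-value for a z test statistic, assuming the null hypothesis is true.

    alternative="less"      for an alternative hypothesis that uses <
    alternative="greater"   for an alternative hypothesis that uses >
    alternative="two-sided" for an alternative hypothesis that uses "not equal"
    """
    cdf = NormalDist().cdf  # area to the left of a z-score
    if alternative == "less":
        return cdf(z)
    if alternative == "greater":
        return 1 - cdf(z)
    if alternative == "two-sided":
        # "At least as extreme" counts both tails.
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


print(round(z_p_value(1.96, "two-sided"), 3))
```

A test statistic far from 0 in the direction of the alternative hypothesis gives a small p-value, and a test statistic near 0 gives a large one.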

Decision Rule for Hypothesis Testing: There are a few ways we can arrive at our decision with a hypothesis test: confidence intervals, critical values (also known as the traditional method), and p-values. Relating this to a science project, the decision rule is what we take into consideration to arrive at our conclusion. When we make our decision, the wording will sound a little strange. We’ll say things like “we have enough evidence to reject the null hypothesis” or “there is insufficient evidence to reject the null hypothesis.”

Decision Rule with Critical Values: If the test statistic is in the critical region, we have enough evidence to reject the null hypothesis. We can also say we have sufficient evidence to support the claim. If the test statistic is not in the critical region, we fail to reject the null hypothesis. We can also say we do not have sufficient evidence to support the claim.

Decision Rule with P-Values: If the p-value is less than or equal to the significance level, we have enough evidence to reject the null hypothesis. We can also say we have sufficient evidence to support the claim. If the p-value is greater than the significance level, we fail to reject the null hypothesis. We can also say we do not have sufficient evidence to support the claim.
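The two decision rules above can be sketched as a pair of small Python functions. The function names, the `tail` labels, and the numbers in the usage lines are hypothetical. Treating a test statistic that lands exactly on a critical value as inside the critical region is an assumption, made so that it matches the “less than or equal to” wording of the p-value rule.

```python
def decide_with_p_value(p_value, alpha):
    """Decision rule with p-values: reject when p-value <= significance level."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


def decide_with_critical_values(test_statistic, critical_values, tail="two"):
    """Decision rule with critical values: reject when the statistic is in the
    critical region. critical_values is a tuple with one value (left- or
    right-tailed test) or two values (two-tailed test)."""
    if tail == "left":
        in_critical_region = test_statistic <= critical_values[0]
    elif tail == "right":
        in_critical_region = test_statistic >= critical_values[0]
    elif tail == "two":
        lower, upper = critical_values
        in_critical_region = test_statistic <= lower or test_statistic >= upper
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")

    if in_critical_region:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Hypothetical two-tailed z test at the 0.05 significance level.
print(decide_with_critical_values(2.12, (-1.96, 1.96), tail="two"))
print(decide_with_p_value(0.034, alpha=0.05))
```

Both methods are built from the same significance level, so for the same test they lead to the same decision.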

More About Hypotheses

Writing the Null and Alternative Hypothesis can be tricky. Here are a few examples of claims followed by the respective hypotheses:

| Test | The Claim | Null Hypothesis | Alternative Hypothesis |
| --- | --- | --- | --- |
| 2 Proportions | Is there evidence that the proportion of IVC students employed during the summer differs from the proportion of IVC students employed during the winter? | [latex]H_{0}: p_1 = p_2[/latex] | [latex]H_{A}: p_1 \neq p_2[/latex] |
| 2 Means (Independent Samples) | Is there evidence that the price of carrots is different in July than it is in December? | [latex]H_{0}: \mu_1 = \mu_2[/latex] | [latex]H_{A}: \mu_1 \neq \mu_2[/latex] |
| Matched Pairs (2 Means, Dependent Samples) | Is there evidence that medication X lowers cholesterol? | [latex]H_{0}: \mu_d = 0[/latex] | [latex]H_{A}: \mu_d < 0[/latex] |
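To make the matched pairs hypotheses more concrete, here is a small sketch with made-up cholesterol readings for six patients, before and after taking medication X. The data are hypothetical and exist only to show the arithmetic. The sketch assumes each difference is computed as after minus before, so that “lowers cholesterol” corresponds to a negative mean difference, which is the direction in [latex]H_{A}: \mu_d < 0[/latex].

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical cholesterol readings (mg/dL) for the same six patients.
before = [220, 235, 210, 250, 242, 228]
after = [210, 230, 212, 238, 236, 220]

# One difference per patient: after minus before (assumed convention).
differences = [a - b for a, b in zip(after, before)]

d_bar = mean(differences)   # sample mean of the differences
s_d = stdev(differences)    # sample standard deviation of the differences
n = len(differences)        # number of pairs

# Matched pairs test statistic; the 0 comes from H0: mu_d = 0.
t = (d_bar - 0) / (s_d / sqrt(n))

print("differences:", differences)
print("mean difference:", d_bar)
print("t statistic:", round(t, 3))
```

A negative test statistic points in the direction of the alternative hypothesis. Whether it is negative enough to reject the null hypothesis still depends on the critical value or p-value for a t distribution with [latex]n - 1[/latex] degrees of freedom.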

 

License

Basic Statistics Copyright © by Allyn Leon is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.