
3. Variability

Variability describes how much the values within a dataset differ from one another. Essentially, a measure of variability tells us how spread out or clustered a set of data is. For example, take two datasets:

1, 2, 3, 4, 5, and 6

3, 3, 3, 4, 4, and 4

Both datasets have a mean of 3.5, but which dataset is more spread out (and thus has more variability)? Intuitively, we would say the first dataset, but how can we actually measure variability?
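As a quick check of the claim above, the following Python sketch computes the mean of each dataset. The variable names `first` and `second` are illustrative only; the snippet assumes nothing beyond Python's standard `statistics` module.

```python
import statistics

first = [1, 2, 3, 4, 5, 6]
second = [3, 3, 3, 4, 4, 4]

# Both means are 3.5, so the mean alone cannot tell the datasets apart.
print(statistics.mean(first))   # 3.5
print(statistics.mean(second))  # 3.5
```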

Three ways of measuring variability are the range, standard deviation, and variance.

Range

Range = highest value – lowest value

 

To compute the range, subtract the lowest value in the dataset from the highest value. This is also known as the exclusive range. Imagine you are a professor who gives a statistics exam on which the highest score is 98 and the lowest score is 77. What is the range? 98-77=21, so the range of exam scores is 21.

The inclusive range is less commonly used than the exclusive range in statistics and is computed by subtracting the lowest value from the highest value and then adding 1. The inclusive range counts both endpoints, and we often use it in our daily lives without even realizing it. For example, if we ask a friend to “Pick a number between 1 and 10,” how many options do they have? The inclusive range in this example is 10, since 10-1+1=10. Thus, they have 10 possible values to choose from.
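The two range calculations can be sketched in a few lines of Python. This is only an illustration using the exam scores and the 1-to-10 example from the text; the function names `exclusive_range` and `inclusive_range` are made up for this sketch and do not come from any library, and the inclusive version assumes whole-number values.

```python
def exclusive_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)


def inclusive_range(values):
    """Highest minus lowest, plus 1 (assumes whole-number values)."""
    return max(values) - min(values) + 1


# Only the highest and lowest exam scores are given in the text.
exam_scores = [98, 77]
print(exclusive_range(exam_scores))   # 21

# "Pick a number between 1 and 10": the whole numbers 1 through 10.
print(inclusive_range(range(1, 11)))  # 10
```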

Standard Deviation

[latexpage]$s=\sqrt{\dfrac{\displaystyle\sum \left( X-\overline{X}\right) ^{2}}{n-1}}$

Note that there are several different ways to write statistical formulas. Standard deviation, for example, can also be written as:

$s=\sqrt{\frac{1}{N-1}\displaystyle\sum_{i=1}^N(X_i-\bar{X})^2}$

A simplified notation has generally been used throughout this text.

Essentially, standard deviation examines the average variation about the mean, or on average, how much each score differs from the center score. But if standard deviation is just the average variation about the mean, why doesn’t the formula just do that – take the average of each score’s distance from the mean? Take the following dataset: 1, 2, 3, 4, and 5. The mean is 3 and the distance each score is from the mean is listed in the table below:

| Value | Distance from the mean ($X-\bar{X}$) |
|---|---|
| 1 | -2 |
| 2 | -1 |
| 3 | 0 |
| 4 | 1 |
| 5 | 2 |

Now, to find the mean distance from the mean, we would first sum all the distance values. However, when you sum the distance values, you get a value of 0, because the negative distances below the mean exactly cancel the positive distances above it. This will be the case for every dataset. Thus, standard deviation uses the squared distance from the mean.

| Value | Distance from the mean ($X-\bar{X}$) | Squared distance from the mean ($(X-\bar{X})^2$) |
|---|---|---|
| 1 | -2 | 4 |
| 2 | -1 | 1 |
| 3 | 0 | 0 |
| 4 | 1 | 1 |
| 5 | 2 | 4 |

The sum of the squared distances from the mean in this example is 10. Dividing 10 by n-1, or 10/(5-1), gives 2.5. Finally, the square root of 2.5 (a step students often forget) is approximately 1.58. Thus, the standard deviation of this dataset is approximately 1.58.
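The worked example can be retraced step by step in Python. This is a minimal sketch of the sample standard deviation formula applied to the dataset 1, 2, 3, 4, and 5; the variable names are illustrative and not part of the original text.

```python
import math

data = [1, 2, 3, 4, 5]
n = len(data)
mean = sum(data) / n                    # 3.0

# Distance of each value from the mean, then each distance squared.
deviations = [x - mean for x in data]   # [-2.0, -1.0, 0.0, 1.0, 2.0]
squared = [d ** 2 for d in deviations]  # [4.0, 1.0, 0.0, 1.0, 4.0]

sum_of_squares = sum(squared)           # 10.0
variance = sum_of_squares / (n - 1)     # 2.5
std_dev = math.sqrt(variance)           # the step students often forget

print(sum(deviations))                  # 0.0 (raw distances cancel out)
print(round(std_dev, 2))                # 1.58
```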

Typically, the standard deviation is calculated from sample data and serves as an estimate of the population standard deviation. As a result, the formula divides by n-1 rather than by n in order to produce a less biased estimate. If the dataset covers the entire population, the formula for the population standard deviation, which divides by n, can be used instead.
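To see the effect of dividing by n-1 versus n, the sketch below uses Python's standard `statistics` module, where `stdev` computes the sample standard deviation and `pstdev` computes the population standard deviation. Reusing the dataset 1, 2, 3, 4, and 5 here is a choice made for illustration.

```python
import statistics

data = [1, 2, 3, 4, 5]

sample_sd = statistics.stdev(data)       # divides by n - 1
population_sd = statistics.pstdev(data)  # divides by n

print(round(sample_sd, 2))               # 1.58
print(round(population_sd, 2))           # 1.41
```

Dividing by the smaller number n-1 makes the sample value slightly larger than the population value computed from the same numbers.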

Variance

[latexpage]$s^2=\dfrac{\displaystyle\sum \left( X-\overline{X}\right) ^{2}}{n-1}$

Note that variance is simply the standard deviation squared. Variance is less commonly reported since it is stated in squared units, whereas standard deviation uses the same units as the dataset. For example, if I said that the average (mean) exam score was 85 and the variance was 100, that would be more challenging to understand than saying that the average exam score was 85 and the standard deviation was 10.
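The relationship between the two measures can be checked with a short Python sketch. It assumes the standard `statistics` module, whose `variance` function divides by n-1 like the formula above, and it reuses the dataset 1, 2, 3, 4, and 5 purely for illustration.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]
variance = statistics.variance(data)  # sample variance, divides by n - 1
std_dev = statistics.stdev(data)      # sample standard deviation

# Squaring the standard deviation recovers the variance
# (compared with isclose to allow for floating-point rounding).
print(math.isclose(std_dev ** 2, variance))  # True

# Exam example from the text: a variance of 100 is a standard deviation of 10.
exam_variance = 100
print(math.sqrt(exam_variance))              # 10.0
```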

License


Quantitative Methods in Geography: A Lab Manual Copyright © by Nathan Burtch and Caitlin Finlayson is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.