12
The last frontier of critically consuming evaluation data is the use of a statistical test referred to as regression analysis. Saying this term out loud often elicits anxiety among my students, but in truth, regression analysis is all about prediction.
Specifically, regression analysis allows us to understand whether groups of factors predict a certain outcome. There are two main types of regression analysis used in evaluation: logit regression and ordinary least squares (OLS) regression. Logit regression is for nominal outcomes, and ordinary least squares is for continuous outcomes. Before your head begins to spin at the terminology, let’s focus on the basics of regression analysis.
Let’s go right to a concrete example related to our child welfare social worker’s evaluation of her parenting education intervention. All of the parents on her caseload have the same goal: reunifying with their children. Therefore, the outcome of interest in this evaluation is whether or not family reunification took place (yes or no). Note that because family reunification is a yes/no measure, it is classified as a nominal variable. Therefore, logit regression is the appropriate test to choose because it works with an outcome measure that is nominal.
In order to plan her future work with parents involved in the child welfare system, she may be curious about what factors are related to the family reunification outcome. Factors that might be related to this outcome could include a parent’s child abuse potential score and whether they accomplished all of their reunification goals, among other factors.
As we have measures for both of these factors, we can conduct a regression analysis to see how much these factors, taken together, predict or explain a positive family reunification outcome. Each of these measures is called an independent measure or an independent variable.
Let’s start to understand the utility of logit regression analysis by interpreting what’s in Table 10.7. After grounding ourselves with the title, so we know what we are focusing on, we move on to the second row. We can see that there is a column where the independent measures – or factors we are interested in – are listed.
The next column has what appears to be gibberish: Exp(B). This is statistical language, but what’s important to know is that this column gives us what is called an “odds ratio,” which links the independent measure to the outcome measure. We’ll interpret that in just a bit. The third column gives us “confidence intervals,” which are akin to a margin of error, telling us how low and how high the odds ratio could be. Finally, the p value tells us whether there is a statistically significant relationship between the independent measure and the outcome measure.
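For the curious, Exp(B) is nothing more exotic than the regression coefficient (B) raised as a power of e. A minimal Python sketch shows the conversion; the coefficient value here is hypothetical, not taken from the evaluation:

```python
import math

# Logit regression reports coefficients (B) on the log-odds scale.
# Exponentiating B turns it into the odds ratio reported as Exp(B).
B = 0.833  # hypothetical log-odds coefficient
odds_ratio = math.exp(B)
print(round(odds_ratio, 2))  # an odds ratio of about 2.3
```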
The last thing we need to pay attention to is the last row, where we see the term “Nagelkerke R².” This is a statistic that tells us how much the set of independent measures (together) explains the variation in the outcome. You can think of this as a quality measure for the model.
Table 10.7: Logit regression predicting family reunification

| Independent measures | Exp(B) | Confidence intervals | P value |
|---|---|---|---|
| Has low child abuse potential score at end of intervention (yes/no) | 2.30 | 1.75–2.46 | p<.001 |
| Achieved all reunification goals within one year (yes/no) | 10.1 | 9.2–11.6 | p<.05 |

*p<.05; **p<.01; ***p<.001; Nagelkerke R²=0.54
There are two ways to interpret logistic regression analyses in evaluation. The first approach looks at how good the set of independent measures is at predicting the outcome. The second approach looks at how each independent measure relates to the outcome measure.
Interpretation 1: In this interpretation we are using logistic regression to predict family reunification (our outcome measure) among child welfare-involved parents, and we have a two-measure “model.” In regression analysis, a “model” is a set of independent measures that are thought to relate to the outcome measure of interest. The goal is to test the model to see how much of the variation in the outcome, family reunification in this case, is explained.
This interpretation relies on the Nagelkerke R² that we mentioned above. This statistic should be read as a percentage. Remember from fourth-grade math that 1.0 is equal to 100%, 0.99 is equal to 99%, and so on. Interpreting it as a percentage, we can see that 54% of the variation in the outcome was explained by the combination of independent measures we chose to include in our regression analysis.
This tells us that while our two-measure “model” explains over half of the variation in the outcome, some measures are missing. In other words, there are other factors that predict the family reunification outcome that are not included in this model. Something to consider for our next evaluation. This interpretation helps us to see the spectrum of factors we should work on with clients to move them toward a positive outcome.
Interpretation 2: In this interpretation we are still using logistic regression to predict family reunification (our outcome measure) among child welfare-involved parents, and we are looking at the same two-measure “model.” In this approach to interpretation, we look at one independent measure’s relationship to the outcome measure at a time. One of our measures tells us whether child welfare involved parents had a low or high child abuse potential score (yes/no), so let’s start there.
Our interpretation focuses on parents with low child abuse potential scores as compared to parents with high child abuse potential scores. To interpret the odds ratio along this line of the table, we look at the number in the child abuse potential score row and the Exp(B) column. We see the number 2.30. This tells us that parents with low child abuse potential scores were 2.3 times more likely to be reunified with their children, “controlling for” (taking into consideration) the other independent measure in the model, which is about achieving reunification goals. Our p value tells us that this is a statistically significant finding, meaning it is not likely due to chance.
Let’s say that our odds ratio had been 0.23 instead of 2.3. In this situation, we subtract 0.23 from 1 and get 0.77. We interpret this as a percent. This would tell us that parents with low child abuse potential scores were 77% less likely to reunify with their children (which doesn’t make a whole lot of common sense, but this is just an example). When our odds ratios are above 1, we talk about “times more likely,” and when they are below 1, we talk about “percent less likely.” So, an odds ratio of 0 point anything is always about lower likelihood.
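The rule of thumb above can be sketched as a small helper function. This is just an illustration of the phrasing, not part of the statistical test itself:

```python
def interpret_odds_ratio(odds_ratio):
    """Put an odds ratio into the plain-language phrasing used in this chapter."""
    if odds_ratio >= 1:
        # At or above 1: report as "times more likely."
        return f"{odds_ratio:g} times more likely"
    # Below 1: subtract from 1 and read the result as a percentage.
    return f"{round((1 - odds_ratio) * 100)}% less likely"

print(interpret_odds_ratio(2.3))   # 2.3 times more likely
print(interpret_odds_ratio(0.23))  # 77% less likely
```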
But back to our real finding, that parents with lower scores on the child abuse potential scale may have better family reunification outcomes. This incentivizes us to work with our parents who have higher scores on this measure so that they can do better at managing the tasks and challenges of parenting.
If we move to a focus on the other independent measure, we are focusing on parents who achieved their reunification goals versus those who did not with respect to whether their family was reunified. Our odds ratio tells us that parents who achieved their reunification goals were 10.1 times more likely to reunify with their children, controlling for the child abuse potential score measure. This is also a statistically significant difference.
This finding tells us that it is not only important to craft the right goals (that parents buy into) but also to facilitate a pathway to goal completion by our clients. It also tells us that we can’t look at goal completion separately from child abuse potential scores, which are also an important factor in family reunification outcomes.
OK, so you’ve worked through the basics of logistic regression for evaluation. Now, let’s turn our attention to the interpretation of an ordinary least squares (OLS) regression, which is also focused on prediction. This time, instead of focusing on the prediction of a yes/no outcome, we are predicting an outcome measure that is a continuous variable, meaning it gives a score. In this case, we are trying to predict what increases the outcome measure of child abuse potential, measured on the 1–100 scale we discussed above.
Table 10.8: OLS regression analysis predicting child abuse potential score

| Independent measures | Beta | P value |
|---|---|---|
| Parental age | 0.24 | NS |
| Days of child welfare involvement | 1.08 | p<.05 |

*p<.05; **p<.01; ***p<.001; NS=not significant; R²=0.72
Let’s start to understand OLS regression analysis by interpreting what’s in Table 10.8. After grounding ourselves with the title, so we know what we are focusing on, we move on to the second row. We can see the terms “beta” and p value. We know that the p value tells us about statistical significance, but what about this beta thing? The beta score tells us about the relationship between the independent measure and the outcome measure. We’ll interpret that in just a bit.
Moving to the first column, we see that the “model” (or set of independent measures) includes parental age and the number of days a family has been involved in the child welfare system. The latter measure could be considered a proxy or stand-in measure for the complexity of the child welfare case. As with logit regression, there are two ways to interpret an OLS regression.
Interpretation 1: In this interpretation we are using OLS regression to predict child abuse potential scores (our outcome measure) among child welfare-involved parents, and we have a two-measure “model.” We set out to predict change in child abuse potential score. In this interpretation we focus on the R² percentage (note that it is not the Nagelkerke R² but the regular, plain old R²). In this evaluation, our model predicted 72% of the variation in the outcome, the child abuse potential score.
Interpretation 2: In this interpretation we are still using OLS regression to predict child abuse potential score (our outcome measure) among child welfare-involved parents, and we are looking at the same two-measure “model.” In this approach to interpretation, we look at one independent measure’s relationship to the outcome measure at a time. One of our measures tells us how parental age is (or is not) related to child abuse potential score, so let’s start there.
Right off the bat, we may notice that the finding is not significant; this means that a parent’s age is not related to child abuse potential scores when controlling for days of child welfare involvement. If this finding had been statistically significant, it would have meant that for every year of a parent’s age, the child abuse potential score went up by 0.24 points, controlling for the other measure in the model. This would have told us that as parents get older, their child abuse potential goes up a little bit.
Now, we need to see what the relationship is between days of child welfare involvement and child abuse potential score. Looking at the beta score, we see that for every additional day a family is child welfare-involved, their child abuse potential score goes up by 1.08 points. This suggests that something about child welfare involvement is not conducive to being a good parent, which is counterintuitive. This is something for the evaluation team to consider carefully in order to try to change practice in this area.
In summary, univariate, bivariate and multivariate statistical tests are used to analyze evaluation data. This requires social workers to be able to interpret findings on the most basic level, so that they can inform their practice. Univariate information looks at summary data about a whole group. Bivariate data allows for the statistical comparison of groups or groups across time. There are different types of bivariate tests for measures that are continuous or nominal variables. Multivariate data analysis allows us to consider how sets of factors work together to explain outcomes.
Once social workers embrace their ethical duty to be critical consumers of research for evidence-based practice, the work of interpreting statistics becomes more of a priority. Hopefully this chapter has given you the basics you need to engage in practice evaluation data consumption with or without a team!
Discussion questions for chapter 12
- Thinking of your current internship or work placement, how could you use univariate, bivariate and multivariate statistics to inform your work?
- What are the differences between univariate, bivariate and multivariate statistics?