Multiple (linear) regression is arguably one of the most common statistical analyses used in the social sciences. Any time researchers want to predict an approximately normally distributed outcome from more than one predictor, they use multiple regression. Multiple regression provides unstandardized partial coefficients that indicate the estimated number of units the outcome increases (a decrease simply being a negative increase) for every one-unit increase in a predictor, holding all other predictors constant. Hypothesis tests and confidence intervals are available to determine whether the unstandardized partial coefficients differ significantly from zero.
The problem with unstandardized partial coefficients from multiple regression is that they are difficult to interpret. Both the predictors and the outcome are often measured in arbitrary units that have no objective meaning (e.g., Likert scales), so it is hard to make sense of unit increases in the outcome or one-unit increases in the predictors. Instead, it is common practice to interpret standardized partial coefficients as effect sizes in multiple regression. These are the unstandardized partial coefficients from a multiple regression in which the outcome and predictors have been transformed to z-scores, so the units are standard deviations. Standardized partial coefficients have the same interpretation as unstandardized partial coefficients except that the units are now standard deviations rather than arbitrary units: the number of standard deviations the outcome increases for every standard deviation increase in the predictor, holding all other predictors constant. Most researchers have an intuition for standard deviations as a unit of measurement, theoretically making standardized partial coefficients easier to interpret than their unstandardized counterparts.
The problem with standardized partial coefficients is that researchers want to interpret them as a type of correlation. Researchers are familiar with correlations: they range from -1 to +1, have standard deviation units, and researchers know what values are considered weak versus strong in their scientific field. Indeed, the standardized coefficient from a simple regression is the (zero-order) correlation between the predictor and the outcome. However, when moving to multiple regression, standardized partial coefficients are not on the correlation metric. The added phrase “while holding all other predictors constant” changes the interpretation. Standardized partial coefficients range from -∞ to +∞ rather than -1 to +1, and their units are fractions of standard deviations rather than whole standard deviations. As a result, it is unclear what counts as a weak versus a strong standardized partial coefficient.
In this blog post, I will explain the correct way to interpret standardized partial coefficients, show how difficult that interpretation is, and advocate for instead using semi-partial correlations as effect sizes in multiple regression. This blog post was motivated by colleagues who interpret standardized partial coefficients from multiple regression as a type of correlation. They use Cohen’s heuristics for zero-order correlations to interpret standardized partial coefficients: ±.1 for a small effect size, ±.3 for a moderate effect size, and ±.5 for a large effect size. Other colleagues believe that standardized partial coefficients and semi-partial correlations are the same statistic. As I will show, standardized partial coefficients are not a type of correlation and thus should not be interpreted the same.
How to interpret the standardized partial coefficient
To understand the correct interpretation of standardized partial coefficients, we must first understand unstandardized partial coefficients. Unstandardized partial coefficients are actually the coefficients from separate simple regressions of residuals. The key point is that residuals are used rather than the original predictor and outcome variables. Let’s look at the two-predictor case, where Y is the outcome and X1 and X2 are the predictors. To obtain the unstandardized partial coefficient for X1, the first step is to conduct two other regressions and save their unstandardized residuals as new variables. In one regression, X2 predicts X1 and the residuals are saved. These saved residuals represent X1 while holding X2 constant, and I will refer to them as X1.X2. In the other regression, X2 predicts Y and the residuals are saved. Those saved residuals represent Y while holding X2 constant, and I will refer to them as Y.X2. The second step is to conduct a simple regression with the two sets of residuals: X1.X2 predicting Y.X2. The unstandardized coefficient from this simple regression of the residuals is equal to the unstandardized partial coefficient from the multiple regression of the original variables. Because of this equivalence, we can interpret the unstandardized partial coefficient of X1 as the unstandardized coefficient from the simple regression of the residuals. (As a reminder, the unstandardized coefficient from a simple regression is the number of units the outcome increases for every one-unit increase in the predictor; there is nothing to hold constant because there is only one predictor.) Therefore, the unstandardized partial coefficient for X1 is the number of units the Y.X2 residuals increase for every one-unit increase in the X1.X2 residuals. The important thing to realize is that the residuals X1.X2 and Y.X2 do not have the same units as X1 and Y, respectively. Their scale is smaller because the variance shared with X2 has been removed.
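This residual route can be checked numerically. Below is a minimal pure-Python sketch; the toy data and helper names (slope, residuals) are my own invention, not from any real dataset. The partial coefficient for X1 from the two-predictor normal equations matches the slope from the simple regression of residuals:

```python
def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    # sample covariance
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def slope(x, y):
    # simple-regression coefficient of y on x
    return cov(x, y) / cov(x, x)

def residuals(x, y):
    # residuals from the simple regression of y on x
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# hypothetical toy data
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y  = [1.5, 2.0, 3.5, 3.0, 5.5, 5.0]

# unstandardized partial coefficient of X1 from the
# two-predictor normal equations
num = cov(x1, y) * cov(x2, x2) - cov(x2, y) * cov(x1, x2)
den = cov(x1, x1) * cov(x2, x2) - cov(x1, x2) ** 2
b1_multiple = num / den

# the same coefficient via the residual route
x1_dot_x2 = residuals(x2, x1)  # X1 holding X2 constant
y_dot_x2 = residuals(x2, y)    # Y holding X2 constant
b1_residual = slope(x1_dot_x2, y_dot_x2)

print(b1_multiple, b1_residual)  # the two values agree
```

The same check works with any statistics package: regress X1 on X2 and Y on X2, save both residual series, and regress one on the other.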
Moving to standardized partial coefficients, we simply replace the predictor and outcome variables with their z-score transformations. Call the z-scored variables X1z, X2z, and Yz. To obtain the standardized partial coefficient of X1, we conduct the two other regressions on the z-scores, save the unstandardized residuals as new variables, and then run the simple regression with those residuals. As with the unstandardized partial coefficient, the standardized partial coefficient of X1 is equal to the unstandardized coefficient from the simple regression of the residuals. Therefore, we can interpret the standardized partial coefficient of X1 as the number of units the Yz.X2z residuals increase for every one-unit increase in the X1z.X2z residuals.
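The same equivalence can be sketched on the z-score metric (pure Python, hypothetical toy data). For two predictors, the standardized partial coefficient can also be computed directly from the correlations as (r1y - r2y*r12) / (1 - r12**2), which the residual route reproduces:

```python
def mean(v):
    return sum(v) / len(v)

def sd(v):
    # sample standard deviation
    m = mean(v)
    return (sum((x - m) ** 2 for x in v) / (len(v) - 1)) ** 0.5

def zscore(v):
    m, s = mean(v), sd(v)
    return [(x - m) / s for x in v]

def corr(a, b):
    za, zb = zscore(a), zscore(b)
    return sum(x * y for x, y in zip(za, zb)) / (len(a) - 1)

def slope(x, y):
    # simple-regression coefficient of y on x
    return corr(x, y) * sd(y) / sd(x)

def residuals(x, y):
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# hypothetical toy data
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y  = [1.5, 2.0, 3.5, 3.0, 5.5, 5.0]

# z-score everything first, then residualize and regress
z1, z2, zy = zscore(x1), zscore(x2), zscore(y)
beta1_residual = slope(residuals(z2, z1), residuals(z2, zy))

# standardized partial coefficient from the correlation matrix
r1y, r2y, r12 = corr(x1, y), corr(x2, y), corr(x1, x2)
beta1 = (r1y - r2y * r12) / (1 - r12 ** 2)

print(beta1_residual, beta1)  # the two values agree
```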
Now, in simple regression the interpretation advantage of the standardized coefficient is that the units of both the predictor and the outcome are z-scores, or standard deviations, and the coefficient can therefore be interpreted as a correlation. However, the simple regression above involves the residuals of z-scores, which do not have standard deviation units. They have some fraction of standard deviation units, depending on how much variance X1 and Y share with X2. It may be 7/8ths of a standard deviation or 4/5ths of a standard deviation or any other fraction. I don’t know about you, but fractions of standard deviation units are difficult for me to interpret. So although standardized partial coefficients are intended to provide researchers with interpretable units, they actually fail to do so! Both unstandardized and standardized partial coefficients have units that are difficult to interpret.
However, standardized partial coefficients do have a big advantage over unstandardized partial coefficients: they indicate relative strength of prediction. A predictor with a larger standardized partial coefficient magnitude is a stronger predictor, and a predictor with a smaller standardized partial coefficient magnitude is a weaker predictor. This is not necessarily true of unstandardized partial coefficients. However, standardized partial coefficients do not indicate objective strength of prediction. They range from -∞ to +∞, and it is not always clear what an objectively strong or weak standardized partial coefficient is. This is not the same as interpreting a correlation, where a researcher has an idea of what strong and weak mean within the bounds of -1 and +1. To reiterate, this is because the units are not standard deviations but fractions of standard deviations.
The case for the semi-partial correlation
However, we can do better than relative strength of prediction alone. We can have an effect size in multiple regression that provides objective strength of prediction and is easier to interpret. The semi-partial correlation is a statistic that does both. A predictor with a larger semi-partial correlation magnitude is a stronger predictor, and the semi-partial correlation can be interpreted on the familiar correlation metric. The semi-partial correlation is the correlation between the outcome and the aspects of the predictor that are unique from all the other predictors. The unique aspects of a predictor are created by saving, as a new variable, the unstandardized residuals from a regression of all the other predictors predicting the predictor of interest. For our two-predictor example and the semi-partial correlation of X1, the residuals would come from the regression of X2 predicting X1. These saved residuals represent X1 while holding X2 constant, and I will refer to them as X1.X2. The semi-partial correlation for X1 is the correlation between Y and X1.X2. There is no complex interpretation involving unit increases or decreases; instead, we can understand the semi-partial correlation as a type of correlation. In addition, hypothesis tests and confidence intervals are available for semi-partial correlations (the t- and p-values are the same as those of the partial regression coefficients). So researchers can provide effect sizes, hypothesis tests, and confidence intervals for multiple regression through semi-partial correlations alone.
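As a sketch (pure Python, hypothetical toy data and helper names of my own), the semi-partial correlation of X1 is just the ordinary correlation between Y and the X1-on-X2 residuals, and for two predictors it matches the textbook formula sr1 = (r1y - r2y*r12) / sqrt(1 - r12**2):

```python
def mean(v):
    return sum(v) / len(v)

def sd(v):
    # sample standard deviation
    m = mean(v)
    return (sum((x - m) ** 2 for x in v) / (len(v) - 1)) ** 0.5

def corr(a, b):
    ma, mb = mean(a), mean(b)
    c = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)
    return c / (sd(a) * sd(b))

def residuals(x, y):
    # residuals from the simple regression of y on x
    b = corr(x, y) * sd(y) / sd(x)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# hypothetical toy data
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y  = [1.5, 2.0, 3.5, 3.0, 5.5, 5.0]

x1_dot_x2 = residuals(x2, x1)   # the part of X1 unique from X2
sr1 = corr(y, x1_dot_x2)        # semi-partial correlation of X1

# textbook formula for the two-predictor case
r1y, r2y, r12 = corr(x1, y), corr(x2, y), corr(x1, x2)
sr1_formula = (r1y - r2y * r12) / (1 - r12 ** 2) ** 0.5

print(sr1, sr1_formula)  # the two values agree
```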
The relationship between the standardized partial coefficient and semi-partial correlations depends on multicollinearity. When there is no multicollinearity, the standardized partial coefficients and semi-partial correlations are the exact same value and the standardized partial coefficients can essentially be interpreted as semi-partial correlations. However, as multicollinearity increases, the discrepancy between the standardized partial coefficient and semi-partial correlation increases. This formula represents the relation:
β1 = sr1 / √(1-R1^2) = sr1 / √tolerance1
β1 refers to the standardized partial coefficient of predictor 1, sr1 refers to the semi-partial correlation of predictor 1, and R1^2 refers to the proportion of variance in predictor 1 explained by all the other predictors. The expression 1 - R1^2 is referred to as the tolerance and represents the proportion of variance in a predictor that is free to predict the outcome in a multiple regression. The tolerance is often used as an index of multicollinearity in multiple regression: the greater the multicollinearity between predictors, the smaller the tolerance. From the equation above, a smaller tolerance leads to a larger standardized partial coefficient. Therefore, greater multicollinearity leads to a standardized coefficient that is larger in magnitude than the semi-partial correlation. This becomes a problem when standardized coefficients are interpreted as semi-partial correlations: the greater the multicollinearity, the more the perceived effect size is upwardly biased.
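The relation is easy to verify numerically. A pure-Python sketch on hypothetical toy data (all names and values mine):

```python
def mean(v):
    return sum(v) / len(v)

def sd(v):
    # sample standard deviation
    m = mean(v)
    return (sum((x - m) ** 2 for x in v) / (len(v) - 1)) ** 0.5

def corr(a, b):
    ma, mb = mean(a), mean(b)
    c = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)
    return c / (sd(a) * sd(b))

def residuals(x, y):
    # residuals from the simple regression of y on x
    b = corr(x, y) * sd(y) / sd(x)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# hypothetical toy data
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y  = [1.5, 2.0, 3.5, 3.0, 5.5, 5.0]

r1y, r2y, r12 = corr(x1, y), corr(x2, y), corr(x1, x2)

beta1 = (r1y - r2y * r12) / (1 - r12 ** 2)  # standardized partial coef.
sr1 = corr(y, residuals(x2, x1))            # semi-partial correlation
tolerance1 = 1 - r12 ** 2                   # two-predictor tolerance

# beta1 equals sr1 divided by the square root of the tolerance
print(beta1, sr1 / tolerance1 ** 0.5)  # the two values agree

# a correlation of .60 between predictors gives a 1.25x inflation factor
print(1 / (1 - 0.60 ** 2) ** 0.5)
```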
Let’s look at a simple, yet classic, example from clinical psychology. A researcher wants to see whether anxiety can predict some outcome after controlling for depression. She wants to show that her correlation between anxiety and the outcome is not just due to depression, so she conducts a multiple regression with anxiety and depression as the two predictors. Anxiety and depression are highly related, usually correlating about .60. With only two predictors, R1^2 is simply the squared correlation between the two predictors. Assuming the correlation between anxiety and depression is .60, the tolerance of each predictor would be .64 (i.e., 1 - .60^2). Plugging that tolerance value into the above equation, we get the following relation between the standardized partial coefficient and the semi-partial correlation for anxiety:
β1 = 1.25*sr1
If the standardized partial coefficient was interpreted as the semi-partial correlation, the perceived effect size would be inflated by 25%! Consequently, a researcher would interpret a predictor as having a larger effect than it truly had. For multiple regressions with many predictors (e.g., 20), the tolerance of a predictor can be less than .64, resulting in even greater perceived upward bias. To be fair, the standardized partial coefficient and semi-partial correlation will be very similar and the perceived bias very small when there is negligible multicollinearity.
Famous researchers in articles published in top-tier journals interpret standardized partial coefficients as if they were semi-partial correlations, using Jacob Cohen’s heuristics for small (r = .10), medium (r = .30), and large (r = .50) correlations. In the abstract of one of Angela Duckworth and colleagues’ (2012) papers, they state that “conscientiousness demonstrated beneficial associations of small-to-medium magnitude with all success outcomes” based upon standardized partial coefficients around .20. However, their models had 9 other predictors, including cognitive ability. The tolerance of conscientiousness could easily have been well below 1, inflating its perceived effect size. Maybe the effect size of conscientiousness was actually small rather than small-to-medium.
In conclusion, I am not sure why researchers use standardized partial coefficients as their main effect size in multiple regression. Although more informative than unstandardized partial coefficients, they are still difficult to interpret. When I see a researcher report standardized partial coefficients, I try to find each predictor’s tolerance and convert them to semi-partial correlations – a statistic I know how to interpret. Some researchers incorrectly interpret standardized partial coefficients as semi-partial correlations, which overestimates the predictors’ perceived effects. Instead, I argue for simply using semi-partial correlations as the main effect size in multiple regression. They can easily be understood as a type of correlation, an interpretation most researchers are familiar with.