My main concern about using regression analysis is that it is suitable for some situations and not others. It is suitable for use where:
- Causal processes can be expected to be symmetric, i.e. the causes of the outcome's absence can be expected to be the absence of the causes of its presence
- One single model is being sought to account for all cases of outcomes present and absent
- The variables that make up a regression model can be assumed to act independently of each other.
These assumptions are different from those embedded in a QCA perspective, which assumes:
- Causal processes may not be symmetric
- There may be multiple different packages of causes generating all known cases of an outcome
- Some causes may only work when present as part of a package of causes, i.e. they are not independent
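The contrast between these two sets of assumptions can be made concrete with a small sketch. The Python below uses an invented outcome rule, (A and B) or C, over three hypothetical binary conditions. The rule, the condition names and the helper `always_yields` are illustrative assumptions, not part of any QCA software or of EvalC3.

```python
from itertools import product

CONDITIONS = ("A", "B", "C")

def outcome(case):
    # Hypothetical causal rule (an assumption for illustration):
    # the outcome occurs when A and B occur together, or when C occurs.
    return (case["A"] and case["B"]) or case["C"]

# All eight possible configurations of three binary conditions.
cases = [dict(zip(CONDITIONS, values)) for values in product([False, True], repeat=3)]

def always_yields(package, expected_outcome):
    """True if every case matching the package has the expected outcome."""
    matching = [c for c in cases if all(c[k] == v for k, v in package.items())]
    return all(outcome(c) == expected_outcome for c in matching)

# Two different packages each guarantee the outcome.
two_packages = always_yields({"A": True, "B": True}, True) and always_yields({"C": True}, True)
# A on its own does not guarantee the outcome: it only works alongside B.
a_alone = always_yields({"A": True}, True)
# The absence of A does not guarantee the absence of the outcome.
absence_of_a = always_yields({"A": False}, False)

print(two_packages, a_alone, absence_of_a)
```

In this toy data, two different packages each guarantee the outcome, A only works in combination with B, and the absence of A does not guarantee the absence of the outcome. A single additive model with independent terms for A, B and C would not represent any of these three features directly.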
In Barbara Befani’s very informed 2016 book “Pathways to Change: Evaluating development interventions with Qualitative Comparative Analysis (QCA)”, Annex B explains the differences between QCA and regression analysis. The same explanation also applies to EvalC3, because like QCA it is also a form of comparative configurational analysis. I have quoted the annex in full here:
“The QCA is often compared with regression analysis because both methods attempt to establish an association between a number of causal factors and an outcome (see for example (Vis, 2012)). In regression analysis, these factors are referred to as “variables” because they usually can take any value in an interval of real numbers; while in QCA they are referred to as “conditions” because they denote presence or absence of a certain quality or state in a given case. However, despite some apparent similarities, the differences between QCA and regression are numerous and substantial (Thiem, Baumgartner, & Bol, 2015).
First of all, in regression analysis, association is intended as “concomitant variation” between a single variable and an outcome (see Annex A): if the value of the outcome tends to increase with the value of the independent variable, we observe a correlation between the variable and the outcome. By contrast, in QCA, association is intended as a set relation: union, intersection or inclusion. If the outcome is “included” in the condition, or logically implies the condition, the association will be one of “necessity”; conversely, if the condition is “included” in the outcome and logically implies the outcome, the association will be one of “sufficiency”. While correlation is symmetrical (if x is correlated with y, then y is correlated with x), association in QCA isn’t: conditions can be necessary but not sufficient, or sufficient but not necessary. This property is also referred to as “causal asymmetry”.
The second important difference between QCA and the most common type of regression analysis (that doesn’t take interaction effects into account) is that, while in regression analyses associations are established between the outcome and one variable at a time, QCA considers cases “as wholes” or “packages”, analysing associations between combinations of conditions and the outcome; which makes the emergence of contextual influence easier to spot. While in regression analysis the causal power of one variable, identified by the regression coefficient, is valid “on average” across the entire sample, in QCA the causal power of one condition is dependent on which other conditions it is combined with. In other words, the association is “conjunctural” (hence the word “conjunctural” in multiple conjunctural causation, see Annex A), or dependent on a specific context or setting.
Thirdly, while regression analysis aims at the identification of the one single model that fits the data best, QCA allows the identification of multiple, equally important pathways to the outcome; for example, two or more conditions that can be equally necessary for an outcome; or two or more combinations of conditions that are equally sufficient (hence the term “multiple” in multiple-conjunctural causality).”
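The set relations of necessity and sufficiency described in the annex can be sketched in a few lines of Python. The case identifiers and condition names below are made up for illustration, and the two helper functions are assumptions of this sketch rather than functions from any QCA package.

```python
# Hypothetical sets of case IDs in which the outcome or a condition is present.
outcome_present = {"case1", "case2"}
funding = {"case1", "case2", "case3", "case4"}
local_champion = {"case1"}

def is_necessary(condition, outcome):
    # Necessity: the outcome set is included in the condition set,
    # so the outcome never occurs without the condition.
    return outcome <= condition

def is_sufficient(condition, outcome):
    # Sufficiency: the condition set is included in the outcome set,
    # so the condition never occurs without the outcome.
    return condition <= outcome

for name, condition in [("funding", funding), ("local_champion", local_champion)]:
    print(name, is_necessary(condition, outcome_present), is_sufficient(condition, outcome_present))
```

With these invented cases, `funding` is necessary but not sufficient, while `local_champion` is sufficient but not necessary. The two checks are not mirror images of each other, which is the asymmetry the annex contrasts with correlation.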