Pro and Contra QCA

What I like about QCA

  1. The perspective on causality: equifinality, asymmetry, conjunctural causation, the concepts of necessary and/or sufficient causes, all described in more detail here
  2. The combination of cross-case analysis and within-case analysis, the idea of moving back and forth between these levels of analysis

Where I am in disagreement

    1. Defining Necessity and Sufficiency
    2. Measuring Consistency and Coverage
    3. Using the Quine McCluskey algorithm
    4. The consequences of using a Truth Table

1. Defining Necessity and Sufficiency

EvalC3 uses a categorical definition of necessity and sufficiency. Following colloquial and philosophical use, a prediction model attribute is either necessary or not, and either sufficient or not. It is a black-and-white status: there are no degrees of necessity or degrees of sufficiency. To me, the idea of having degrees of sufficiency or necessity contradicts the very kernel of the meaning of both of those terms.

Yet QCA experts allow for this possibility when they talk about consistency of sufficient conditions and consistency of necessary conditions. For example, a configuration that has 20 True Positives and 5 False Positives would be described as having a Sufficiency consistency of 80%. Or a configuration with 20 True Positives and 10 False Negatives would be described as having a Necessity consistency of 67%. Along with this comes the more difficult notion of a threshold on these measures, above which a set of conditions (i.e. a model) qualifies for the more categorical status of being sufficient, or necessary. For example, anything having more than 75% Sufficiency consistency is deemed to be Sufficient. But how this threshold is to be defined in any objective and accountable way escapes me. All Schneider and Wagemann (2012) can say is “…the notion that the exact location of the consistency threshold is heavily dependent on the specific research context”.
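The arithmetic behind the two examples above can be sketched in a few lines of Python. This is only an illustration: the function names are mine, and the 75% cut-off is the illustrative threshold mentioned above, not a recommended value.

```python
def sufficiency_consistency(tp: int, fp: int) -> float:
    """Share of cases matching the configuration that also show the outcome."""
    return tp / (tp + fp)


def necessity_consistency(tp: int, fn: int) -> float:
    """Share of cases showing the outcome that also match the configuration."""
    return tp / (tp + fn)


# Illustrative threshold only, taken from the 75% example in the text.
THRESHOLD = 0.75

suff = sufficiency_consistency(tp=20, fp=5)   # 20 / 25
nec = necessity_consistency(tp=20, fn=10)     # 20 / 30

print(f"Sufficiency consistency: {suff:.1%}, passes threshold: {suff > THRESHOLD}")
print(f"Necessity consistency: {nec:.1%}, passes threshold: {nec > THRESHOLD}")
```

Moving the threshold a few points in either direction changes which of the two configurations "qualifies", which is the difficulty described above.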

2. Measuring consistency and coverage

QCA experts have made the task of communicating their analyses to others more challenging by defining these two terms differently, according to whether they are talking about conditions that are necessary or sufficient.

  • Consistency of sufficient conditions = True Positives / (True Positives + False Positives)
  • Consistency of necessary conditions = True Positives / (True Positives + False Negatives)
  • Coverage of sufficient conditions = True Positives / (True Positives + False Negatives)
  • Coverage of necessary conditions = True Positives / (True Positives + False Positives)

Again, keeping closer to the commonplace meaning of these terms, EvalC3 has only one definition each for consistency and coverage:

  • Consistency of a model = True Positives / (True Positives + False Positives)
  • Coverage of a model = True Positives / (True Positives + False Negatives)

These two terms have other names in other fields of work:

  • Consistency is also known as Positive Predictive Value (PPV), or Precision
  • Coverage is also known as True Positive Rate (TPR), Recall, or Sensitivity
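As a minimal sketch of the single-definition approach, here is how consistency (precision) and coverage (recall) could be computed from case-by-case predictions. The function name and the eight example cases are made up for illustration; this is not EvalC3's own code.

```python
def confusion_counts(predicted, actual):
    """Count true/false positives and negatives from two lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    fn = sum(not p and a for p, a in pairs)
    tn = sum(not p and not a for p, a in pairs)
    return tp, fp, fn, tn


# Hypothetical data: what a model predicts for 8 cases vs. what was observed.
predicted = [True, True, True, True, False, False, False, False]
actual = [True, True, True, False, True, True, False, False]

tp, fp, fn, tn = confusion_counts(predicted, actual)

consistency = tp / (tp + fp)  # a.k.a. precision, positive predictive value
coverage = tp / (tp + fn)     # a.k.a. recall, sensitivity, true positive rate

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"Consistency={consistency:.2f} Coverage={coverage:.2f}")
```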

3. Using the Quine McCluskey algorithm

This is called a “minimisation” algorithm, because it tries to reduce a larger set of configurations down to a smaller subset that still accounts for all cases of outcomes present and absent. As I understand it, the key to the way this algorithm works is finding pairs of cases whose configurations differ by only one condition/attribute, and which either both have the outcome present or both have it absent. Because the presence or absence of this one condition seems to make no difference to the outcome, it is treated as disposable and removed from both configurations. The search then continues for any other case that is the same as these two reduced configurations, except for the presence of one other condition, or the absence of one other existing condition. The same reducing rule applies: if the outcome is the same whether the condition is present or absent, the condition can be removed from the configurations being examined. The process of comparing cases with different configurations continues until no more redundant conditions can be removed. The simplified (i.e. shortened) configurations that remain are the “solutions”, i.e. the predictive models found by the algorithm.
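The merging step described above can be sketched as follows. This is a simplified illustration under my own assumptions, not the implementation used in any QCA package: it covers only the pairwise merging stage for configurations sharing the same outcome, it leaves out the later step of choosing among the reduced configurations, and the names `merge` and `minimise` are hypothetical.

```python
from itertools import combinations


def merge(a, b):
    """Merge two configurations that differ in exactly one condition.

    Returns the merged configuration with "-" in the differing position,
    or None if the two differ in zero or in more than one position.
    """
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 1:
        return None
    i = diff[0]
    return a[:i] + ("-",) + a[i + 1:]


def minimise(configs):
    """Repeat pairwise merging until no further merges are possible."""
    current = set(configs)
    reduced = set()
    while current:
        merged, used = set(), set()
        for a, b in combinations(current, 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        reduced |= current - used  # keep whatever could not be merged
        current = merged
    return reduced


# Hypothetical configurations (conditions A, B, C), all with the outcome present.
full_diversity = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0)]
print(minimise(full_diversity))

# Two configurations (conditions A, B, C, D) that differ in two conditions.
limited_diversity = [(1, 1, 0, 0), (1, 0, 1, 0)]
print(minimise(limited_diversity))
```

In the first call all four configurations collapse into one, where only condition A matters. In the second call the two configurations differ in two conditions, so no merge takes place and both are returned unchanged.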

The problem with this algorithm, as I see it, is that because it is very incremental, only continuing to work where there is a one-condition difference, it seems by definition unable to find common minimal configurations in cases with two or more differences. This is not a problem when the data set contains all possible configurations of conditions. But it becomes problematic as the case diversity becomes a smaller and smaller sub-set of all the possible configurations. In this situation, the final set of “solutions” (models) may be more numerous than those that can be found by other algorithms, like Decision Tree searches.

In contrast, search algorithms of the kind used in EvalC3 don’t depend so much on adequate diversity within a set of cases. They can find the best fitting set of attributes in two very different configurations. That said, they can still generate more than one equally good fitting model where there are relatively few cases and relatively many attributes.
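To illustrate the general idea of a search that does not rely on one-condition differences, here is a minimal sketch of an exhaustive search over attribute combinations. It is not EvalC3's actual implementation. The data, the function names, and the use of simple accuracy as the fit measure are all assumptions made for the example.

```python
from itertools import combinations, product


def predicts(model, case):
    """A model is a tuple of (attribute, required value) pairs, all of which must hold."""
    return all(case[attr] == value for attr, value in model)


def accuracy(model, cases, outcome="outcome"):
    """Proportion of cases where the model's prediction matches the observed outcome."""
    hits = sum(predicts(model, case) == case[outcome] for case in cases)
    return hits / len(cases)


def exhaustive_search(cases, attributes, outcome="outcome"):
    """Try every combination of attributes and values; keep the first best-scoring one."""
    best_model, best_score = None, -1.0
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            for values in product([True, False], repeat=size):
                model = tuple(zip(attrs, values))
                score = accuracy(model, cases, outcome)
                if score > best_score:  # ties go to the earlier, simpler model
                    best_model, best_score = model, score
    return best_model, best_score


# Hypothetical cases with three attributes and one outcome.
cases = [
    {"A": True, "B": True, "C": False, "outcome": True},
    {"A": True, "B": True, "C": True, "outcome": True},
    {"A": True, "B": False, "C": True, "outcome": False},
    {"A": False, "B": True, "C": False, "outcome": False},
    {"A": False, "B": False, "C": True, "outcome": False},
]

model, score = exhaustive_search(cases, ["A", "B", "C"])
print(model, score)
```

Note that the two cases with the outcome present are compared with every candidate model directly, rather than with each other, so the search does not need neighbouring configurations to exist in the data set.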

“Limited diversity” in a data set also presents another problem common to both approaches. It means that any good fitting model may have limited external validity. Other new cases with new and different configurations may well contradict and thus cause the failure of these models.

4. The consequences of using the Truth Table

Here is an example of a Truth Table, as seen in this recent paper: Kien, Christina, Ludwig Grillich, Barbara Nussbaumer-Streit, and Rudolf Schoberberger. 2018. ‘Pathways Leading to Success and Non-Success: A Process Evaluation of a Cluster Randomized Physical Activity Health Promotion Program Applying Fuzzy-Set Qualitative Comparative Analysis’. BMC Public Health 18 (1): 1386. https://doi.org/10.1186/s12889-018-6284-x. PS: My use of this data set as an example is not a critique of this paper; it simply happens to be the one I am most immediately familiar with.

 

Truth Table

Each row represents a type of configuration, a unique pattern of case attributes. The column “n” tells us how many cases have each of these unique patterns. It is the rows in this table that QCA works with. More specifically, the Quine McCluskey minimisation algorithm works with these rows to find the simplest possible set of versions of them (i.e. “solutions”) that still accounts for all the outcomes observed and not observed. The performance of each of these solutions is measured in terms of their coverage and consistency.

My understanding of the calculation of these measures is that they are also based on the contents of the truth table, i.e. the incidence of each type of configuration (16 above), not the total number of cases they represent (24 above). This is an important difference, especially for someone wanting to operationalise the findings in real life. In the worst case there may be many cases with one type of configuration but only one case with each of the others. This could seriously skew the significance of the consistency and coverage measures of a given configuration.
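The possible size of this skew can be sketched with some made-up numbers. The rows below are hypothetical, not taken from the Kien et al. truth table, and the function is only an illustration of the difference between counting configurations and counting cases.

```python
def consistency(rows, weight_by_cases):
    """Consistency of a model, counting each row once or weighting it by its n."""
    tp = fp = 0
    for row in rows:
        if not row["matches_model"]:
            continue  # rows the model does not cover play no part in consistency
        weight = row["n"] if weight_by_cases else 1
        if row["outcome"]:
            tp += weight
        else:
            fp += weight
    return tp / (tp + fp)


# Hypothetical truth table rows covered by one model: two rare configurations
# with the outcome present, one common configuration with the outcome absent.
rows = [
    {"matches_model": True, "outcome": True, "n": 1},
    {"matches_model": True, "outcome": True, "n": 1},
    {"matches_model": True, "outcome": False, "n": 10},
]

by_rows = consistency(rows, weight_by_cases=False)   # 2 of 3 configurations
by_cases = consistency(rows, weight_by_cases=True)   # 2 of 12 cases

print(f"Counting configurations: {by_rows:.2f}")
print(f"Counting cases: {by_cases:.2f}")
```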

In contrast, when the same data set is used for prediction modelling, the Truth Table is “unpacked” into rows that represent all the cases, one by one, as shown below for the Truth Table above.

Truth Table unpacked
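The unpacking step itself is simple. A minimal sketch, assuming each truth table row is held as a dictionary with an "n" count (the three rows below are hypothetical, not the ones in the table above):

```python
def unpack(truth_table):
    """Expand each truth table row into n identical case rows."""
    cases = []
    for row in truth_table:
        attributes = {key: value for key, value in row.items() if key != "n"}
        cases.extend(dict(attributes) for _ in range(row["n"]))
    return cases


# Hypothetical packed truth table: 3 configurations representing 6 cases.
truth_table = [
    {"A": 1, "B": 1, "outcome": 1, "n": 1},
    {"A": 1, "B": 0, "outcome": 0, "n": 2},
    {"A": 0, "B": 0, "outcome": 0, "n": 3},
]

cases = unpack(truth_table)
print(len(truth_table), "configurations ->", len(cases), "cases")
```

Any measure calculated on `cases` rather than on `truth_table` then reflects how often each configuration actually occurs.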

Here are two Decision Tree models generated from the original and unpacked versions of the Truth Table. The unpacked version has 8 more rows i.e. 50% more.

Truth Table unpacked DT with measures

Truth Table packed DT with measures