While there is much that I like about QCA, there are three areas where I am in disagreement with QCA practice:
EvalC3 uses a categorical definition of necessity and sufficiency. Following colloquial and philosophical use, a prediction model attribute is either necessary or not, and either sufficient or not. It is a black-and-white status: there are no degrees of necessity or degrees of sufficiency. To me, the idea of having degrees of sufficiency or necessity contradicts the very kernel of the meaning of both of those terms.
Yet QCA experts allow for this possibility when they talk about consistency of sufficient conditions and consistency of necessary conditions. For example, a configuration that has 20 True Positives and 5 False Positives would be described as having a Sufficiency consistency of 80%. Or a configuration with 20 True Positives and 10 False Negatives would be described as having a Necessity consistency of 67%. Along with this comes the more difficult notion of a threshold on these measures, above which a set of conditions (i.e. a model) qualifies for the more categorical status of being sufficient or necessary. For example, anything having more than 75% Sufficiency consistency is deemed to be Sufficient. But how this threshold is to be defined in any objective and accountable way escapes me. All Schneider and Wagemann (2012) can say is “…the notion that the exact location of the consistency threshold is heavily dependent on the specific research context”.
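The arithmetic behind these two figures, and the contrast with a categorical reading, can be sketched in a few lines of Python. This is only an illustration: the function names and the `THRESHOLD` constant are hypothetical, the 75% value is simply the example used above, and the "categorical" checks reflect my reading that a sufficient model has no False Positives and a necessary model has no False Negatives.

```python
def sufficiency_consistency(tp: int, fp: int) -> float:
    """QCA-style consistency of a sufficient condition: TP / (TP + FP)."""
    return tp / (tp + fp)


def necessity_consistency(tp: int, fn: int) -> float:
    """QCA-style consistency of a necessary condition: TP / (TP + FN)."""
    return tp / (tp + fn)


# The two examples from the text.
suff = sufficiency_consistency(tp=20, fp=5)   # 20 / 25
nec = necessity_consistency(tp=20, fn=10)     # 20 / 30

# Hypothetical cut-off; the text uses 75% purely as an example.
THRESHOLD = 0.75
deemed_sufficient = suff > THRESHOLD
deemed_necessary = nec > THRESHOLD


def is_sufficient_categorical(fp: int) -> bool:
    """Categorical reading (an assumption): sufficient only if no False Positives."""
    return fp == 0


def is_necessary_categorical(fn: int) -> bool:
    """Categorical reading (an assumption): necessary only if no False Negatives."""
    return fn == 0


print(f"Sufficiency consistency: {suff:.2f}, above threshold: {deemed_sufficient}")
print(f"Necessity consistency:   {nec:.2f}, above threshold: {deemed_necessary}")
print("Categorically sufficient:", is_sufficient_categorical(fp=5))
print("Categorically necessary: ", is_necessary_categorical(fn=10))
```

With the threshold approach the first configuration counts as sufficient, whereas under the categorical reading it does not, because five False Positives remain.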
QCA experts have made the task of communicating their analyses to others more challenging by defining the terms consistency and coverage differently, according to whether they are talking about conditions that are necessary or sufficient.
- Consistency of sufficient conditions = True Positive / (True Positive + False Positive)
- Consistency of necessary conditions = True Positive / (True Positive + False Negative)
- Coverage of sufficient conditions = True Positive / (True Positive + False Negative)
- Coverage of necessary conditions = True Positive / (True Positive + False Positive)
Again, keeping closer to the commonplace meaning of these terms, EvalC3 has only one definition each for consistency and coverage:
- Consistency of a model = True Positive / (True Positive + False Positive)
- Coverage of a model = True Positive / (True Positive + False Negative)
These two terms have other names in other fields of work:
- Consistency is also known as Positive Predictive Value (PPV), or Precision
- Coverage is also known as True Positive Rate (TPR), Recall, or Sensitivity
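As a minimal sketch of the single pair of definitions, the following Python computes consistency and coverage from a list of cases. The function names and the example data are hypothetical and are not taken from EvalC3 itself; `predicted` marks whether a case matches the model and `actual` marks whether the outcome is present.

```python
from typing import Sequence, Tuple


def confusion_counts(predicted: Sequence[bool],
                     actual: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for paired predictions and observed outcomes."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    return tp, fp, fn, tn


def consistency(tp: int, fp: int) -> float:
    """Consistency of a model = TP / (TP + FP), i.e. PPV or Precision."""
    return tp / (tp + fp)


def coverage(tp: int, fn: int) -> float:
    """Coverage of a model = TP / (TP + FN), i.e. TPR, Recall or Sensitivity."""
    return tp / (tp + fn)


# Hypothetical data: eight cases, four of which match the model.
predicted = [True, True, True, True, False, False, False, False]
actual = [True, True, True, False, True, True, False, False]

tp, fp, fn, tn = confusion_counts(predicted, actual)
model_consistency = consistency(tp, fp)
model_coverage = coverage(tp, fn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"Consistency (precision): {model_consistency:.2f}")
print(f"Coverage (recall):       {model_coverage:.2f}")
```

Because the same two formulas apply whatever kind of model is being assessed, the results can be read directly as precision and recall by anyone from a field that uses those names.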
As I understand it, the key to the minimization algorithm used in QCA is finding pairs of cases whose configurations differ by only one condition (attribute) and which have the same outcome, either both present or both absent. Because the presence or absence of that one condition seems to make no difference to the outcome, it is treated as disposable and removed from both configurations. The search then continues for any other case that matches the reduced configuration except for one further condition being present or absent. The same reduction rule applies: if the outcome is the same whether the condition is present or absent, the condition can be removed from the configurations being examined. This process of comparing configurations continues until no more redundant conditions can be removed. The simplified, i.e. shortened, configurations that remain are the “solutions”, i.e. the predictive models found by the algorithm.
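The pairwise reduction described above can be sketched as follows. This is a simplified illustration of my understanding, not the implementation found in any QCA software: it handles only crisp (0/1) conditions, it expects configurations that all share the same outcome, and it stops at the reduced configurations without any later step that selects among them. The names `merge` and `minimise` are hypothetical, and `"-"` marks a condition that has been removed.

```python
from itertools import combinations


def merge(a: tuple, b: tuple):
    """Merge two configurations that differ in exactly one 0/1 condition.

    Returns the reduced configuration, with "-" in place of the condition
    that made no difference, or None if the pair cannot be merged.
    """
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 1:
        return None
    i = diff[0]
    if "-" in (a[i], b[i]):
        return None  # the differing position must hold a real 0/1 contrast
    return a[:i] + ("-",) + a[i + 1:]


def minimise(configs: set) -> set:
    """Repeatedly merge configurations until no further reduction is possible.

    All configurations passed in are assumed to have the same outcome.
    """
    current = set(configs)
    solutions = set()
    while current:
        merged, used = set(), set()
        for a, b in combinations(current, 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        # Configurations that could not be merged with anything are kept.
        solutions |= current - used
        current = merged
    return solutions


# Hypothetical data: conditions (A, B, C); the outcome is present in all four.
cases_with_outcome = {(1, 1, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0)}
result = minimise(cases_with_outcome)
print(result)
```

In this example B and C are removed step by step, leaving the single shortened configuration in which only A is present.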
The problem with this algorithm is that, because it is very incremental and only proceeds where configurations differ by one condition, it seems by definition unable to find common minimal configurations among cases with two or more differences. This is not a problem when the data set contains all possible configurations of conditions. But it becomes problematic as the diversity of cases shrinks to a smaller and smaller subset of all the possible configurations. In this situation, the final set of “solutions” (models) may be more numerous than those found by other algorithms, such as Decision Tree searches.
In contrast, search algorithms of the kind used in EvalC3 don’t depend on adequate diversity within a set of cases. They can find the best fitting set of attributes in two very different configurations.
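To illustrate the point, here is a minimal sketch of a brute-force search over all combinations of condition values. It is a stand-in for the general idea only and should not be read as EvalC3's actual search procedure; the function names, the example data and the use of accuracy as the performance measure are all assumptions made for the example.

```python
from itertools import combinations, product


def accuracy(model: dict, cases: list) -> float:
    """Share of cases where the model's prediction matches the outcome.

    model maps a condition index to the value it must take (0 or 1);
    a case is predicted positive when it matches every entry in the model.
    """
    correct = 0
    for config, outcome in cases:
        predicted = all(config[i] == v for i, v in model.items())
        correct += predicted == outcome
    return correct / len(cases)


def exhaustive_search(cases: list, n_conditions: int):
    """Try every combination of condition values and keep the best scorer.

    Smaller models are tried first and are only replaced by a strictly
    better one, so ties are resolved in favour of the simpler model.
    """
    best_model, best_score = None, -1.0
    for size in range(1, n_conditions + 1):
        for idxs in combinations(range(n_conditions), size):
            for values in product((0, 1), repeat=size):
                model = dict(zip(idxs, values))
                score = accuracy(model, cases)
                if score > best_score:
                    best_model, best_score = model, score
    return best_model, best_score


# Hypothetical data: conditions (A, B, C). The two cases with the outcome
# present differ from each other on two conditions (B and C).
cases = [
    ((1, 1, 1), True),
    ((1, 0, 0), True),
    ((0, 1, 0), False),
    ((0, 0, 1), False),
]
best_model, best_score = exhaustive_search(cases, n_conditions=3)
print(best_model, best_score)
```

Here the two positive cases differ on two conditions, so a one-difference-at-a-time comparison could not reduce them, yet the search still identifies the one attribute they share (condition A present) as a perfectly fitting model.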
That said, “limited diversity” in a data set does present another problem common to both approaches. It means that any good fitting model may have limited external validity. Other new cases with new and different configurations may well contradict and thus cause the failure of these models.