A very useful book by Goertz and Mahoney (A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences, 2012) distinguishes between within-case analysis and cross-case analysis. EvalC3 is designed primarily to facilitate cross-case analysis. But to get the maximum value from this kind of analysis, it needs to be well informed by within-case analysis at two different stages: before and after the cross-case analysis.
When and why
- Before a cross-case analysis: When selecting which attributes to include in a data set and to use when analysing that data, whether with EvalC3 or with other methods such as QCA or Decision Tree algorithms. Ideally, the choice of which attributes to investigate, and in relation to which outcomes, would be informed by some prior notion or theory of what might be happening, rather than by random choice. Those views are likely to be improved by familiarity with the details of the cases that make up the data set.
- After a cross-case analysis: When good prediction rules have been found and modal (i.e. representative) cases have been identified (see Selecting Cases). Once modal cases have been selected they can be put to use in various ways:
- As illustrative examples of the results predicted by the model (True Positives), and/or of incorrect results (False Positives). At the same time, within-case inspection can be used to verify whether the attributes of the case in the data set correctly describe the actual modal case, i.e. to do a measurement validity check.
- As sources of causal explanations. The examination of individual cases should provide much more detailed information, which could shed light on what (if any) causal mechanisms are at work that make the prediction work.
- As sources of contradictory information, not available within the data set, which could disprove the causal explanations that are developed. These could include confounders, i.e. background factors that are a cause of both the attributes in a model and the associated outcome.
Steps to take to identify and test likely causal mechanisms
There are four types of cases that can be selected for more in-depth inquiries about any underlying causal mechanisms that may be at work.
- Cases which exemplify the True Positive results, where the model correctly predicted the presence of the outcome. Look within these cases to find any likely causal mechanisms connecting the conditions that make up the configuration. Two sub-types would be useful to compare:
- Modal cases, which represent the average characteristics of cases in this group, taking all attributes into account, not just those within the prediction model. Click the Calculate Similarity button in View Cases to find these cases.
- Outlier cases, which are the cases most dissimilar to all other cases in this group, apart from sharing the same prediction model characteristics. Click the Calculate Similarity button in View Cases to find these cases.
- Cases which exemplify the False Positives, where the model incorrectly predicted the presence of the outcome. There are at least two possible explanations that could be explored:
- In the False Positive cases, there are one or more other factors that all the cases have in common, which are blocking the model configuration from working, i.e. from delivering the outcome.
- In the True Positive cases, there are one or more other factors that all the cases have in common, which are enabling the model configuration to work, i.e. to deliver the outcome, but which are absent in the False Positive cases.
- Cases which exemplify the False Negatives, where the outcome occurred despite the absence of the attributes of the model. There are three types of interest here:
- There may be some False Negative cases that have all but one of the attributes found in the prediction model. These cases would be worth examining, in order to understand why the absence of a particular attribute that is part of the predictive model does not prevent the outcome from occurring. There may be some counter-balancing enabling factor at work, enabling the outcome. Such almost-the-same cases can be found using the Compare function in View Cases.
- Where a data set has some missing data points (i.e. blank cells), it is possible that some cases have been classed as False Negatives (FNs) because they lack data on crucial attributes that would otherwise have classed them as True Positives (TPs). In these circumstances it would be worth investigating the incidence of missing data on each of the attributes of a well-performing model, and then scanning FN cases for those which have many of the necessary attributes but where the data on the others are missing.
- Where multiple models have been developed by using EvalC3 or QCA, it is possible that some cases with the expected outcome are still not covered by any of the models. By default, these will fall into the False Negative category. These cases should be given particular attention because it is likely that the attributes that predict this outcome are outside the data set. They can only be discovered by doing a within-case investigation of these uncovered cases.
- Cases which exemplify the True Negatives, where the absence of the attributes of the model is associated with the absence of the outcome.
- There may be cases here with all but one of the model attributes. These can be found using the Compare function in View Cases, after selecting a modal case in the True Positives group as the comparator. If such cases are found, then the missing attribute may be viewed as an INUS attribute, i.e. an attribute that is Insufficient but Necessary in a configuration that is Unnecessary but Sufficient for the outcome (see Befani, 2016). It would then be worth investigating how these critical attributes have their effects by doing a detailed within-case analysis of the cases with the critical missing attribute.
- Caveat: INUS status cannot be claimed for an attribute if the same configuration, with all but one of the essential model attributes, can also be found in the False Negatives group of cases (i.e. where the outcome is present).
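The logic behind the four case types, the "all but one attribute" comparisons, the missing-data scan and the INUS caveat can be sketched in a few lines of Python. This is only an illustrative sketch, not how EvalC3 itself is implemented. It assumes binary attributes, a model that predicts the outcome only when all of its attributes are present, and blank cells represented as None. The function names and the sample data are hypothetical.

```python
from typing import Dict, Optional, Set

# attribute name -> 1 (present), 0 (absent) or None (blank cell)
Case = Dict[str, Optional[int]]


def classify(case: Case, model: Set[str], outcome: str) -> str:
    """Label a case TP, FP, FN or TN. The model predicts the outcome only
    when every model attribute is recorded as present (1)."""
    predicted = all(case.get(a) == 1 for a in model)
    actual = case.get(outcome) == 1
    if predicted:
        return "TP" if actual else "FP"
    return "FN" if actual else "TN"


def near_misses(cases: Dict[str, Case], model: Set[str], outcome: str,
                group: str) -> Dict[str, str]:
    """Cases in `group` with exactly one model attribute absent and none blank.
    Returns case id -> the single absent attribute."""
    found = {}
    for case_id, case in cases.items():
        if classify(case, model, outcome) != group:
            continue
        absent = [a for a in model if case.get(a) == 0]
        blank = [a for a in model if case.get(a) is None]
        if len(absent) == 1 and not blank:
            found[case_id] = absent[0]
    return found


def false_negatives_with_blanks(cases: Dict[str, Case], model: Set[str],
                                outcome: str) -> Dict[str, Set[str]]:
    """FN cases where every recorded model attribute is present but at least
    one is blank, so the case might have been a TP with complete data."""
    found = {}
    for case_id, case in cases.items():
        if classify(case, model, outcome) != "FN":
            continue
        absent = {a for a in model if case.get(a) == 0}
        blank = {a for a in model if case.get(a) is None}
        if blank and not absent:
            found[case_id] = blank
    return found


def inus_candidates(cases: Dict[str, Case], model: Set[str],
                    outcome: str) -> Set[str]:
    """Attributes missing from a TN near miss, excluding any attribute that is
    also the one missing from an FN near miss (the caveat above)."""
    tn_missing = set(near_misses(cases, model, outcome, "TN").values())
    fn_missing = set(near_misses(cases, model, outcome, "FN").values())
    return tn_missing - fn_missing


# Hypothetical data: model attributes A, B, C and outcome Y.
model = {"A", "B", "C"}
cases = {
    "c1": {"A": 1, "B": 1, "C": 1, "Y": 1},     # True Positive
    "c2": {"A": 1, "B": 1, "C": 1, "Y": 0},     # False Positive
    "c3": {"A": 1, "B": 1, "C": 0, "Y": 0},     # True Negative, only C absent
    "c4": {"A": 1, "B": 0, "C": 1, "Y": 0},     # True Negative, only B absent
    "c5": {"A": 1, "B": 0, "C": 1, "Y": 1},     # False Negative, only B absent
    "c6": {"A": 1, "B": None, "C": 1, "Y": 1},  # False Negative, B is blank
    "c7": {"A": 0, "B": 0, "C": 1, "Y": 0},     # True Negative, two absent
}

labels = {cid: classify(c, model, "Y") for cid, c in cases.items()}
candidates = inus_candidates(cases, model, "Y")
blank_fns = false_negatives_with_blanks(cases, model, "Y")
```

In this made-up data set, B cannot be treated as an INUS attribute because case c5 shows the outcome occurring without it, whereas C remains a candidate. Case c6 is the kind of False Negative that may simply reflect a blank cell.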
The cases that fit each of the four types can be seen in the “View Cases” worksheet, and found by using the Calculate Similarity and Compare functions.
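To make the idea of modal and outlier cases concrete, here is a minimal Python sketch. It assumes that similarity between two cases is the proportion of attributes on which they have the same value (a simple matching measure). That measure is an assumption made for illustration and may differ from what the Calculate Similarity button computes. The function names and the sample True Positive group are hypothetical.

```python
from typing import Dict, List, Tuple

Case = Dict[str, int]  # attribute name -> 1 (present) or 0 (absent)


def similarity(a: Case, b: Case, attributes: List[str]) -> float:
    """Proportion of attributes on which two cases have the same value."""
    return sum(a[k] == b[k] for k in attributes) / len(attributes)


def rank_by_mean_similarity(group: Dict[str, Case],
                            attributes: List[str]) -> List[Tuple[str, float]]:
    """Rank cases by their average similarity to every other case in the group,
    most similar (modal) first and least similar (outlier) last."""
    scores = {}
    for case_id, case in group.items():
        others = [similarity(case, other, attributes)
                  for other_id, other in group.items() if other_id != case_id]
        scores[case_id] = sum(others) / len(others)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical True Positive group: A and B are the model attributes,
# C to F are other attributes in the data set.
attributes = ["A", "B", "C", "D", "E", "F"]
true_positives = {
    "t1": {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0, "F": 0},
    "t2": {"A": 1, "B": 1, "C": 1, "D": 0, "E": 0, "F": 0},
    "t3": {"A": 1, "B": 1, "C": 0, "D": 1, "E": 0, "F": 0},
    "t4": {"A": 1, "B": 1, "C": 0, "D": 0, "E": 1, "F": 1},
}

ranking = rank_by_mean_similarity(true_positives, attributes)
modal_case = ranking[0][0]     # highest average similarity to the rest
outlier_case = ranking[-1][0]  # lowest average similarity to the rest
```

All attributes are used in the comparison, not just those in the model, in line with the description of modal cases above.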
When looking at individual True Positive cases in order to find causal mechanisms at work it may be of value to look at particular attributes in the model. Tweaking of a model, by selectively removing and replacing one attribute at a time, will show which attributes make the biggest difference to the model’s overall performance. It is these attributes which should be of particular interest when looking for the causal mechanism at work within a TP case.
There is now a Sensitivity button on the Design and Evaluate view, under the Explore section. Clicking on this will highlight the attribute in the currently loaded model whose removal makes the biggest difference to the model's performance.
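The "remove one attribute at a time" idea can be illustrated with a short Python sketch. It is not the code behind the Sensitivity button. It assumes a model in which all attributes must be present to predict the outcome, and it uses accuracy as the performance measure, which is an assumption made purely for illustration. The function names and data are hypothetical.

```python
from typing import Dict, List, Set

Case = Dict[str, int]  # attribute name -> 1 (present) or 0 (absent)


def accuracy(cases: List[Case], model: Set[str], outcome: str) -> float:
    """Share of cases where the prediction (all model attributes present)
    matches the observed outcome."""
    correct = sum(
        all(case[a] == 1 for a in model) == (case[outcome] == 1)
        for case in cases
    )
    return correct / len(cases)


def sensitivity(cases: List[Case], model: Set[str],
                outcome: str) -> Dict[str, float]:
    """Drop in accuracy when each attribute is removed from the model in turn,
    leaving the other attributes in place."""
    baseline = accuracy(cases, model, outcome)
    return {
        attribute: baseline - accuracy(cases, model - {attribute}, outcome)
        for attribute in sorted(model)
    }


# Hypothetical data: model attributes A and B, outcome Y.
cases = [
    {"A": 1, "B": 1, "Y": 1},
    {"A": 1, "B": 1, "Y": 1},
    {"A": 1, "B": 0, "Y": 0},
    {"A": 1, "B": 0, "Y": 0},
    {"A": 0, "B": 1, "Y": 0},
    {"A": 0, "B": 0, "Y": 0},
]

drops = sensitivity(cases, {"A", "B"}, "Y")
most_influential = max(drops, key=drops.get)
```

Here removing B costs more accuracy than removing A, so B would be the attribute to look at first when searching for a causal mechanism within a TP case.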
Elizabeth A. Stuart (2010) Matching methods for causal inference: A review and a look forward
Gary Goertz (2017) Multimethod Research, Causal Mechanisms, and Case Studies: An Integrated Approach, Princeton University Press