This is the point in the workflow where the focus changes from across-case analysis to within-case analysis, and where case selection strategies and tools become relevant. Before doing any within-case investigations, choices need to be made about which case(s) to focus on.
EvalC3 now has three sets of tools for comparing cases and selecting among them.
This is the first screen that becomes visible after clicking on “View Cases”.
Here you can see the cases listed row by row. Their attributes are listed column by column, with the outcome column being on the far right (often initially out of sight).
In the Status column on the left, all the cases are sorted into four groups, representing the four categories of cases seen in the Confusion Matrix (True Positive, False Positive, False Negative, True Negative). The values of the attributes that are part of the model currently loaded in the Design and Evaluate view are now shown in red font (see Quotas = 1, in red above).
Now click on “Calculate Similarity”. This will generate the next view.
In the Similarity column, there are now some percentage figures. Similarity is measured as 1 minus the Hamming Distance. Hamming Distance is the proportion of all values in one row which are different from the values in a row representing another case. In the worksheet shown above, the Similarity figure for each case is its average similarity to all other cases in the dataset.
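The calculation described above can be sketched in a few lines of Python. This is only an illustration of the 1 minus Hamming Distance idea, not EvalC3's own code: the function names and the three toy cases (A, B, C) with four binary attributes are hypothetical.

```python
def hamming_distance(a, b):
    """Proportion of positions at which two equal-length rows differ."""
    if len(a) != len(b):
        raise ValueError("rows must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def similarity(a, b):
    """Similarity is 1 minus the Hamming Distance."""
    return 1 - hamming_distance(a, b)


def average_similarity(cases):
    """Map each case name to its mean similarity with every other case."""
    result = {}
    for name, row in cases.items():
        sims = [similarity(row, other)
                for other_name, other in cases.items()
                if other_name != name]
        # With a single case there is nothing to compare against.
        result[name] = sum(sims) / len(sims) if sims else None
    return result


# Hypothetical toy data: three cases, four binary attributes each.
cases = {
    "A": [1, 0, 1, 1],
    "B": [1, 0, 0, 1],
    "C": [0, 1, 0, 0],
}
avg = average_similarity(cases)
for name, value in avg.items():
    print(f"{name}: {value:.1%}")
```

In this toy data, case B has the highest average similarity and case C the lowest.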
It is best to focus on one Confusion Matrix category at a time, by using the Excel filter option at the head of the Status column. Start by filtering out all but the True Positive cases. The similarity measure will then show you how similar each True Positive case is to all other True Positives. The row highlighted in color, across the whole table, is the case with the highest similarity to the others in view. We can call this a Modal case because it is a type of average: it has many attributes in common with other cases in that group. Cases with the lowest similarity measure can be called Outlier cases because they have few attributes in common with the other cases in that group.
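As a rough sketch of the Modal and Outlier idea, the Python below restricts the comparison to one Confusion Matrix category and then picks the cases with the highest and lowest average similarity. The case names, attribute values, and the `modal_and_outlier` helper are all hypothetical, and the code is an illustration rather than a description of how EvalC3 is implemented.

```python
def similarity(a, b):
    """1 minus the proportion of attribute values that differ."""
    return 1 - sum(x != y for x, y in zip(a, b)) / len(a)


def modal_and_outlier(cases, status):
    """Return (modal, outlier, averages) for the cases with the given status."""
    group = {name: c["attrs"] for name, c in cases.items()
             if c["status"] == status}
    if len(group) < 2:
        raise ValueError("need at least two cases in the group to compare")
    averages = {}
    for name, row in group.items():
        sims = [similarity(row, other)
                for other_name, other in group.items()
                if other_name != name]
        averages[name] = sum(sims) / len(sims)
    modal = max(averages, key=averages.get)    # most in common with the group
    outlier = min(averages, key=averages.get)  # least in common with the group
    return modal, outlier, averages


# Hypothetical toy data: four True Positives and one True Negative.
cases = {
    "M": {"status": "TP", "attrs": [1, 1, 1, 1]},
    "X": {"status": "TP", "attrs": [0, 1, 1, 1]},
    "Y": {"status": "TP", "attrs": [1, 0, 1, 1]},
    "Z": {"status": "TP", "attrs": [1, 1, 0, 0]},
    "N": {"status": "TN", "attrs": [0, 0, 0, 0]},
}
modal, outlier, averages = modal_and_outlier(cases, "TP")
print("Modal case:", modal)
print("Outlier case:", outlier)
```

Because the True Negative case is filtered out before averaging, it has no influence on which True Positive counts as Modal or Outlier.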
Now select a case of interest with a cursor click, then click on the Compare button. The following screen will appear.
To the left, there are now two new columns. The selected case (Benin, highlighted in blue) can be any case that is of particular interest. Clicking on Compare generates the percentage values seen in the MS & MD column. The cases highlighted in light green are those most similar (MS) to the selected case; the cases highlighted in beige are those most different (MD) from it. Whenever we choose another row as the selected case, the percentages are recalculated and the highlighting moves to the cells with the highest and lowest values. The Compare function gives us a view of how specific cases compare to each other.
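The logic behind the Compare step can be sketched as follows. This is an assumption-laden illustration, not EvalC3's code: the `compare` function and the toy cases are hypothetical, and returning every tied case as MS or MD is a design choice made here for simplicity.

```python
def similarity(a, b):
    """1 minus the proportion of attribute values that differ."""
    return 1 - sum(x != y for x, y in zip(a, b)) / len(a)


def compare(cases, selected):
    """Score every other case against the selected one and flag MS and MD."""
    target = cases[selected]
    scores = {name: similarity(target, row)
              for name, row in cases.items() if name != selected}
    highest, lowest = max(scores.values()), min(scores.values())
    most_similar = [n for n, s in scores.items() if s == highest]
    most_different = [n for n, s in scores.items() if s == lowest]
    return scores, most_similar, most_different


# Hypothetical toy data: "S" plays the role of the selected case.
cases = {
    "S": [1, 0, 1, 1],
    "P": [1, 0, 1, 0],
    "Q": [1, 0, 0, 1],
    "R": [0, 1, 0, 0],
}
scores, most_similar, most_different = compare(cases, "S")
print("MS:", most_similar)
print("MD:", most_different)
```

Selecting a different case simply means calling `compare` again with another name, which mirrors the recalculation that happens when a new row is selected.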
3. Case filtering by attribute
We can also carry out more focused comparisons, according to our interest. By opening the drop-down menu on any field, we can choose to remove some types of cases from the current view. For example, we may only want to find MS & MD among the cases that do have the outcome present. If we do this, the MS & MD values will automatically be recalculated.
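The effect of filtering before comparing can be illustrated with a short Python sketch. Everything here is hypothetical: the attribute names, the toy rows, and the `compare_filtered` helper. It also assumes that the outcome column is used only for filtering and is left out of the similarity calculation, which may differ from what EvalC3 does.

```python
# Hypothetical attribute columns compared between cases (outcome excluded).
ATTRS = ["attr1", "attr2", "attr3", "attr4"]


def similarity(a, b, fields):
    """1 minus the proportion of the given fields whose values differ."""
    return 1 - sum(a[f] != b[f] for f in fields) / len(fields)


def compare_filtered(rows, selected, keep):
    """Compare the selected case only with the cases that pass the filter."""
    visible = {name: row for name, row in rows.items() if keep(row)}
    if selected not in visible:
        raise ValueError("the selected case was removed by the filter")
    target = visible[selected]
    return {name: similarity(target, row, ATTRS)
            for name, row in visible.items() if name != selected}


# Hypothetical toy data: Q matches S on every attribute but lacks the outcome.
rows = {
    "S": {"attr1": 1, "attr2": 0, "attr3": 1, "attr4": 1, "outcome": 1},
    "P": {"attr1": 1, "attr2": 0, "attr3": 1, "attr4": 0, "outcome": 1},
    "Q": {"attr1": 1, "attr2": 0, "attr3": 1, "attr4": 1, "outcome": 0},
    "R": {"attr1": 0, "attr2": 1, "attr3": 0, "attr4": 0, "outcome": 1},
}

unfiltered = compare_filtered(rows, "S", keep=lambda row: True)
filtered = compare_filtered(rows, "S", keep=lambda row: row["outcome"] == 1)
print("Most similar, all cases:", max(unfiltered, key=unfiltered.get))
print("Most similar, outcome present:", max(filtered, key=filtered.get))
```

In this toy data the most similar case changes from Q to P once cases without the outcome are filtered out, which is why the MS & MD values need recalculating after any filter change.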
The next step is to select cases for subsequent within-case investigations, to identify causal mechanisms that may be at work underlying the associations represented in the predictive model. See the within-case analysis page for more information on the options here.