Finding Positive Deviants

For background reading on the value of finding “positive deviants” see these resources:

This brief post outlines how EvalC3 can help find cases which may be usable examples of Positive Deviance.

First, develop a predictive model that is good at predicting the absence of the outcome, rather than its presence, which is what we usually try to predict. The easiest way to do this may be to test combinations of attributes that, according to prior knowledge and theory, are not conducive to the outcome occurring, especially attributes of this kind that are quite common.

Then focus on the False Positives, i.e. those cases where the model's attributes predicted the absence of the outcome but in practice the outcome was present. These cases qualify, at first glance, as Positive Deviants. They are the cases where a within-case investigation would be well worthwhile, in order to find out how they managed to succeed against the odds.

Try to minimise, but not totally eliminate, the number of False Positives. If there are a lot of False Positives, all this may tell us is that the model is not very good and is lacking some important attributes. If there are very few, perhaps only one, it is more likely that each is a genuine Positive Deviance case, achieving the outcome despite all the odds being against it.
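The steps above can be sketched in a few lines of Python. This is only an illustration of the logic, not how EvalC3 itself works: the cases, the attribute names (remote, low_funding) and the helper functions are all hypothetical. Note that because the model predicts the absence of the outcome, a "positive" prediction here means "outcome expected to be absent".

```python
# Hypothetical cases: two binary attributes plus the observed outcome (1 = present).
cases = {
    "Case A": {"remote": 1, "low_funding": 1, "outcome": 0},
    "Case B": {"remote": 1, "low_funding": 1, "outcome": 0},
    "Case C": {"remote": 1, "low_funding": 1, "outcome": 0},
    "Case D": {"remote": 1, "low_funding": 1, "outcome": 1},
    "Case E": {"remote": 0, "low_funding": 1, "outcome": 0},
    "Case F": {"remote": 0, "low_funding": 0, "outcome": 1},
    "Case G": {"remote": 1, "low_funding": 0, "outcome": 1},
}


def predicts_absence(case):
    """Hypothetical model: both unfavourable attributes present -> outcome expected to be absent."""
    return case["remote"] == 1 and case["low_funding"] == 1


def confusion_matrix(all_cases):
    """Sort case names into the four Confusion Matrix cells for the absence-predicting model."""
    cells = {"TP": [], "FP": [], "FN": [], "TN": []}
    for name, case in all_cases.items():
        predicted_absent = predicts_absence(case)
        outcome_absent = case["outcome"] == 0
        if predicted_absent and outcome_absent:
            cells["TP"].append(name)   # absence predicted, outcome absent
        elif predicted_absent:
            cells["FP"].append(name)   # absence predicted, outcome present
        elif outcome_absent:
            cells["FN"].append(name)   # absence not predicted, outcome absent
        else:
            cells["TN"].append(name)   # absence not predicted, outcome present
    return cells


cells = confusion_matrix(cases)
# The False Positives are the candidate Positive Deviants.
positive_deviant_candidates = cells["FP"]
print(positive_deviant_candidates)
```

With this made-up data the model has three True Positives and a single False Positive, which is the situation described above: few enough exceptions that the remaining one is worth a within-case investigation.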

This approach can be tested using the Krook data set, which is built into EvalC3. In that data set, the absence of quotas for women in parliament is almost sufficient for low levels of women’s participation in parliament. It predicts 13 of the 14 countries with such low levels. The one exception to sufficiency is Lesotho, where there are no quotas but there are high levels of participation of women in parliament. This is an example of a “positive deviance” case that would be worth investigating.

Outliers and anomalies

Positive deviance cases are one type of outlier. An outlier is a case whose attribute(s) are far from those of the average case. An anomaly is an unexpected case, given certain assumptions about what should be happening. So a False Negative case may be anomalous if the model being used represents what was officially expected to happen. But there may be outlier cases within each of the Confusion Matrix categories.

To find outliers, go to the View Cases view, click on Hamming Distance, and then look within any Confusion Matrix category for the cases that have the highest Hamming Distance measure.
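For readers unfamiliar with the measure, the Hamming distance between two cases is the number of attributes on which they differ. The sketch below shows one way an outlier could be picked out with it. It assumes that a case's score is its mean Hamming distance to the other cases in the same Confusion Matrix category; that is an assumption made for illustration and may not match how EvalC3 calculates the figure it displays. The cases and attribute values are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length attribute tuples differ."""
    if len(a) != len(b):
        raise ValueError("cases must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def mean_distance_to_others(category_cases):
    """Mean Hamming distance from each case to every other case in the same category.

    Assumption: this averaging is only for illustration and may differ from EvalC3's own measure.
    """
    scores = {}
    for name, attrs in category_cases.items():
        others = [o for n, o in category_cases.items() if n != name]
        if not others:
            scores[name] = 0.0  # a lone case has nothing to be distant from
            continue
        scores[name] = sum(hamming(attrs, o) for o in others) / len(others)
    return scores


# Hypothetical cases from a single Confusion Matrix category, four binary attributes each.
true_positives = {
    "Case P": (1, 1, 0, 0),
    "Case Q": (1, 1, 0, 1),
    "Case R": (1, 0, 0, 0),
    "Case S": (0, 0, 1, 1),
}

scores = mean_distance_to_others(true_positives)
outlier = max(scores, key=scores.get)  # the case furthest, on average, from the rest
print(outlier, scores[outlier])
```

In this made-up category, Case S differs from the others on most attributes and so has the highest score, making it the outlier to look at first.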