The default approach to building a predictive model is manual.
- Once the Design and Evaluate view is open, look at “Design” on the left side. Here you choose a value for each of the attributes, which are listed automatically. The drop-down menu in the Status column provides three options: N/A = ignore this attribute; 1 = this attribute is present; 0 = this attribute is absent.
- The default status for each attribute when this view is first opened is N/A.
- You also need to choose whether the Outcome is expected to be present or absent when these attributes are as described above, using the same kind of drop-down menu in the Status column.
- This combination of attribute values and the selected outcome then constitutes a predictive model.
- The performance of this model can then be seen immediately in the Confusion Matrix under the heading “Evaluate”, which is explained further below.
- Click on the Save button (above left) to save details of this model and its performance. You will need to save the model with a name you will recognize later.
- If you want to remove all the attributes of a model in one go, i.e. reset them all to N/A, click on the round “Stop” sign to the right of the attribute “Status”.
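The manual steps above can be sketched in code. This is only an illustration of the logic, not the tool's own implementation: the case data, the attribute names (`funding`, `training`) and the function `confusion_matrix` are all hypothetical. A model is a mapping from each attribute to 1, 0 or `None` (standing in for N/A), plus the expected status of the outcome.

```python
from typing import Dict, List, Optional

# Hypothetical representation: one dict per case, 1 = present, 0 = absent.
Case = Dict[str, int]
# A model maps each attribute to 1, 0, or None (None stands for N/A: ignore).
Model = Dict[str, Optional[int]]


def confusion_matrix(cases: List[Case], model: Model,
                     outcome: str = "outcome", expected: int = 1) -> Dict[str, int]:
    """Count TP, FP, FN and TN for a model over a list of cases."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for case in cases:
        # The model applies when every non-ignored attribute has the chosen status.
        predicted = all(case[a] == s for a, s in model.items() if s is not None)
        # The prediction is correct when the outcome has the expected status.
        actual = case[outcome] == expected
        if predicted and actual:
            counts["TP"] += 1
        elif predicted:
            counts["FP"] += 1
        elif actual:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Made-up example data, for illustration only.
cases = [
    {"funding": 1, "training": 1, "outcome": 1},
    {"funding": 1, "training": 0, "outcome": 1},
    {"funding": 0, "training": 1, "outcome": 0},
    {"funding": 1, "training": 1, "outcome": 0},
    {"funding": 0, "training": 0, "outcome": 0},
]

# "funding" is set to present, "training" is left as N/A.
model = {"funding": 1, "training": None}
print(confusion_matrix(cases, model))
```

With this made-up data the model yields 2 True Positives, 1 False Positive, 0 False Negatives and 2 True Negatives. Note that in this sketch a model with every attribute set to N/A applies to every case.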
Explore alternative approaches to building a better predictive model
- Choose which type of model you want to find. There are four options, each represented by a button you can click on:
- Necessary & Sufficient: This kind of model will consist of a single attribute, or set of attributes, which are both necessary and sufficient for the outcome. In the Confusion Matrix, there will be no False Positives and no False Negatives. In reality, this kind of model is rare.
- Necessary but Insufficient: This kind of model will consist of a single attribute, or set of attributes, which are necessary but insufficient for the outcome. In the Confusion Matrix, there will be no False Negatives but there will be some False Positives.
- Sufficient but Unnecessary: This kind of model will consist of a single attribute, or set of attributes, which are sufficient but unnecessary for the outcome. In the Confusion Matrix, there will be no False Positives but there will be some False Negatives.
- Most predictive – of any kind: This kind of model will consist of a single attribute, or set of attributes, which are likely to be insufficient and unnecessary for the outcome, but are still good predictors, as measured by Accuracy, for example. In the Confusion Matrix, there will be some False Positives and some False Negatives. But it may also be the case that this search finds one of the above three models.
- Clicking any of the above buttons opens a Find New Models pop-up menu, which presents a choice of four search algorithms. See Search Options on this website for more detailed information about these choices.
- Choose the performance indicator: the measure that should be maximised by the best models that can be found. There are three groups of these: Overall, Specific and Relative. For more information on these see Evaluate Model. Tip: start with the most widely used measure, Accuracy.
- Set constraints. These can be of three types, which can be used by themselves or in combination:
- Particular attributes in the Design view whose values need to remain fixed, for example as present or absent
- Specific performance measures other than the one selected as the objective, for example that Lift >= 100%
- Specific values for one or more cells in the Confusion Matrix
- Try setting False Positive = 0, to find Sufficient but Unnecessary attributes (or configurations of attributes)
- Try setting False Negative = 0, to find Necessary but Insufficient attributes (or configurations of attributes)
- Try setting False Negative = 0 and False Positive = 0 to find Necessary and Sufficient attributes
- Postscript: There are now three radio button options that can be used to set these constraints with one click.
- Run the search by clicking Okay.
- If using exhaustive search, watch the progress bar to assess whether the results will be ready within the time available. If not, cancel.
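The search steps above can be illustrated with a minimal sketch of an exhaustive search that maximises Accuracy subject to optional limits on the False Positive and False Negative cells. It is not the tool's own algorithm; the function names (`exhaustive_search`, `model_type`), the parameters (`max_fp`, `max_fn`) and the case data are hypothetical, and `None` stands in for N/A.

```python
from itertools import product


def confusion_matrix(cases, model, outcome="outcome"):
    """Count TP, FP, FN and TN; None in the model means N/A (ignored)."""
    cm = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for case in cases:
        predicted = all(case[a] == s for a, s in model.items() if s is not None)
        actual = case[outcome] == 1
        cm[("T" if predicted == actual else "F") + ("P" if predicted else "N")] += 1
    return cm


def accuracy(cm):
    return (cm["TP"] + cm["TN"]) / sum(cm.values())


def exhaustive_search(cases, attributes, max_fp=None, max_fn=None):
    """Try every combination of N/A, present and absent; keep the most accurate."""
    best_model, best_score = None, -1.0
    for statuses in product((None, 1, 0), repeat=len(attributes)):
        if all(s is None for s in statuses):
            continue  # skip the empty model
        model = dict(zip(attributes, statuses))
        cm = confusion_matrix(cases, model)
        # Constraints on Confusion Matrix cells, as in the steps above.
        if max_fp is not None and cm["FP"] > max_fp:
            continue
        if max_fn is not None and cm["FN"] > max_fn:
            continue
        if accuracy(cm) > best_score:
            best_model, best_score = model, accuracy(cm)
    return best_model, best_score


def model_type(cm):
    """Label a model by which error cells of its Confusion Matrix are empty."""
    if cm["FP"] == 0 and cm["FN"] == 0:
        return "Necessary & Sufficient"
    if cm["FN"] == 0:
        return "Necessary but Insufficient"
    if cm["FP"] == 0:
        return "Sufficient but Unnecessary"
    return "Neither necessary nor sufficient"


# Made-up example data, for illustration only.
cases = [
    {"funding": 1, "training": 1, "outcome": 1},
    {"funding": 1, "training": 0, "outcome": 1},
    {"funding": 0, "training": 1, "outcome": 0},
    {"funding": 1, "training": 1, "outcome": 0},
    {"funding": 0, "training": 0, "outcome": 0},
]
attributes = ["funding", "training"]

for label, limits in [("FP = 0", {"max_fp": 0}), ("FN = 0", {"max_fn": 0})]:
    found, score = exhaustive_search(cases, attributes, **limits)
    print(label, found, score, model_type(confusion_matrix(cases, found)))
```

In this sketch each attribute has three possible statuses, so the number of candidate models grows as 3 to the power of the number of attributes. That growth is why an exhaustive search can take a long time on larger data sets. If no model satisfies the constraints, the function returns `None` as the model.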
View the results of the search, given the settings above.
The attributes that have been found as the best predictors of the outcome (known as “the model”) will appear in the Design area, replacing any previous selection. The model found is saved automatically, and its saved name is visible to the right of the “Save Model” button.
The raw results of the prediction model will be shown in the Confusion Matrix in the Evaluate area. See Evaluate Model for more information on how to read the Confusion Matrix.
The performance measures derived from the Confusion Matrix are listed below the matrix. These summarise the performance of the current model in predicting the outcome of interest.
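As a rough illustration of how such measures follow from the four cells of a Confusion Matrix, the sketch below computes Accuracy together with three other widely used measures. The function name `performance_measures` is hypothetical, and the definition of Lift used here (precision divided by the overall prevalence of the outcome) is a common one that is assumed rather than taken from the tool; see Evaluate Model for the measures the tool actually reports.

```python
from typing import Dict, Optional


def performance_measures(cm: Dict[str, int]) -> Dict[str, Optional[float]]:
    """Derive summary measures from TP, FP, FN and TN counts."""
    tp, fp, fn, tn = cm["TP"], cm["FP"], cm["FN"], cm["TN"]
    total = tp + fp + fn + tn

    def ratio(numerator: float, denominator: float) -> Optional[float]:
        # Return None rather than fail when a measure is undefined.
        return numerator / denominator if denominator else None

    precision = ratio(tp, tp + fp)       # share of predicted cases with the outcome
    prevalence = ratio(tp + fn, total)   # share of all cases with the outcome
    # Assumed definition of Lift: precision relative to prevalence.
    lift = precision / prevalence if precision is not None and prevalence else None
    return {
        "accuracy": ratio(tp + tn, total),  # share of all cases predicted correctly
        "precision": precision,
        "recall": ratio(tp, tp + fn),       # share of outcome cases the model found
        "lift": lift,
    }


# Hypothetical counts, for illustration only.
measures = performance_measures({"TP": 2, "FP": 1, "FN": 0, "TN": 2})
for name, value in measures.items():
    print(name, value)
```

Under this assumed definition, a Lift above 1 (that is, above 100%) means the model identifies the outcome more often than would be expected from the outcome's prevalence alone.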
Revise the results
Within the Design & Evaluate worksheet you can tweak the values of the attributes in the new model in order to:
- Incrementally improve performance of the model
- Identify what attributes in the model contribute most/least to its overall performance. For more on this option see Sensitivity Analysis
- Postscript: There is now a Sensitivity button on the right which, when clicked, highlights the attribute in the current model that contributes most to its performance. This is measured by comparing the percentage-point reduction in model performance when each attribute is removed from the model in turn.
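The comparison described above can be sketched as follows: set each attribute in the model to N/A in turn and record how many percentage points of performance are lost. This is an illustration under stated assumptions, not the tool's implementation: Accuracy is assumed as the performance measure, and the function name `sensitivity` and the case data are hypothetical.

```python
def confusion_matrix(cases, model, outcome="outcome"):
    """Count TP, FP, FN and TN; None in the model means N/A (ignored)."""
    cm = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for case in cases:
        predicted = all(case[a] == s for a, s in model.items() if s is not None)
        actual = case[outcome] == 1
        cm[("T" if predicted == actual else "F") + ("P" if predicted else "N")] += 1
    return cm


def accuracy(cm):
    return (cm["TP"] + cm["TN"]) / sum(cm.values())


def sensitivity(cases, model):
    """Percentage-point drop in Accuracy when each attribute is removed in turn."""
    baseline = accuracy(confusion_matrix(cases, model))
    drops = {}
    for attribute, status in model.items():
        if status is None:
            continue  # attribute is already N/A, so it is not part of the model
        reduced = dict(model)
        reduced[attribute] = None  # remove this one attribute only
        reduced_score = accuracy(confusion_matrix(cases, reduced))
        drops[attribute] = round((baseline - reduced_score) * 100, 1)
    return drops


# Made-up example data, for illustration only.
cases = [
    {"funding": 1, "training": 1, "outcome": 1},
    {"funding": 1, "training": 0, "outcome": 1},
    {"funding": 0, "training": 1, "outcome": 0},
    {"funding": 1, "training": 1, "outcome": 0},
    {"funding": 0, "training": 0, "outcome": 0},
]
model = {"funding": 1, "training": 0}

drops = sensitivity(cases, model)
print(drops)
print("Largest contributor:", max(drops, key=drops.get))
```

With this made-up data, removing `funding` costs the most Accuracy, so it is the attribute that would be highlighted. A drop of zero or below suggests the attribute adds nothing to the model on this measure.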
Save the results
Save the results of each version of the model that you find to be of value. This will be done automatically, with a unique name, if exhaustive or evolutionary searches have been carried out. But if there has been any manual tweaking, the resulting model will need to be saved manually.