Re-iterating an analysis



There are two ways in which this can be done:

The first relates to a situation where a multi-attribute model already exists but still produces False Positives. This model might have been developed manually, or by using exhaustive or evolutionary search. The steps involved are as follows:

  1. Implement the additional single attribute search. In effect, this treats the attributes of the current model as constraints and tries to find the additional attribute that is best able to improve its performance.
  2. This process can be re-iterated to build a simple model into a progressively more complex one.
  3. It should stop when either:
    1. There are no more False Positives, i.e. the existing model is Sufficient for the outcome to be present, or
    2. No additional attribute improves the model on the performance measure being used.
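
The steps above can be sketched in code. This is only an illustration, not the tool's own implementation: the function names, the use of accuracy as the performance measure, the choice to try both presence (1) and absence (0) of each candidate attribute, and the small data set are all assumptions made for the example.

```python
# Made-up cases: binary attributes A, C, G and a binary outcome.
cases = [
    {"A": 1, "C": 1, "G": 1, "outcome": 1},
    {"A": 1, "C": 1, "G": 0, "outcome": 1},
    {"A": 1, "C": 0, "G": 1, "outcome": 0},
    {"A": 0, "C": 1, "G": 0, "outcome": 0},
    {"A": 0, "C": 1, "G": 1, "outcome": 0},
    {"A": 0, "C": 0, "G": 0, "outcome": 0},
]


def confusion(cases, model, outcome="outcome"):
    """Return (TP, FP, FN, TN) for a model such as {"A": 1, "C": 1}."""
    tp = fp = fn = tn = 0
    for case in cases:
        # The model predicts "outcome present" only if every condition holds.
        predicted = all(case[attr] == value for attr, value in model.items())
        actual = case[outcome] == 1
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(cases, model, outcome="outcome"):
    """Assumed performance measure: share of cases classified correctly."""
    tp, fp, fn, tn = confusion(cases, model, outcome)
    return (tp + tn) / len(cases)


def best_additional_attribute(cases, model, attributes, outcome="outcome"):
    """Keep the current model's attributes fixed and look for the one extra
    condition that improves performance most. Return None if nothing helps."""
    best_model, best_score = None, accuracy(cases, model, outcome)
    for attr in attributes:
        if attr in model:
            continue
        for value in (1, 0):
            candidate = {**model, attr: value}
            score = accuracy(cases, candidate, outcome)
            if score > best_score:
                best_model, best_score = candidate, score
    return best_model


def reiterate(cases, model, attributes, outcome="outcome"):
    """Repeat the search until one of the two stopping conditions applies."""
    model = dict(model)
    while True:
        _, fp, _, _ = confusion(cases, model, outcome)
        if fp == 0:           # stopping condition 1: no False Positives left
            return model
        improved = best_additional_attribute(cases, model, attributes, outcome)
        if improved is None:  # stopping condition 2: no attribute improves it
            return model
        model = improved


final_model = reiterate(cases, {"A": 1}, ["A", "C", "G"])
```

With this toy data, the starting model A=1 has one False Positive, and the loop ends once a model with no False Positives is found. A different performance measure can be swapped in by replacing the `accuracy` function.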

The second approach involves the development of what is known as a Decision Tree. It has the following form. The values on the links indicate whether the attribute above is present (1) or absent (0).


Attribute A is discovered by using the single attribute search to find the best performing model, i.e. the one with the fewest False Positives and False Negatives. If the model says the outcome is present when the attribute is present, we go down the right-hand branch (labelled 1) and try to find Attribute C. We do this by re-iterating the single attribute search. If the new and enlarged model (A=1&C=1) says the outcome is present when Attributes A and C are both present, then we go further down the right-most branch and try to find Attribute G, again by re-iterating the single attribute search.

We stop this re-iteration when either there are no more False Positives or the additional attribute does not improve the performance of the model it is building on, e.g. if A=1&C=1&G=1 is no better than A=1&C=1.

If there are False Negatives in the A=1&C=1 model, we need to re-iterate the analysis to reduce these. The current A=1&C=1 model can be manually edited to A=1&C=0 to convert these to False Positives. We then re-iterate the single attribute search to see if an additional attribute (Attribute F) will reduce these. If it does, we continue the re-iteration until either of the two stopping conditions applies (no False Positives left, or no improvement in the model).

We repeat this process, exploring the consequences of both the presence and absence of each attribute in the tree. Each branch of the tree is in effect a prediction model for a sub-set of the cases in the data set.
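
The tree-building process can be approximated by a short recursive routine. This is a simplified sketch rather than the procedure exactly as described: it automates the manual editing step by always exploring both the present (1) and absent (0) branches, it counts misclassified cases as the performance measure, and the function name, the nested-dictionary layout and the toy data are assumptions made for the example.

```python
# Made-up cases: binary attributes A, C, G and a binary outcome.
cases = [
    {"A": 1, "C": 1, "G": 1, "outcome": 1},
    {"A": 1, "C": 1, "G": 0, "outcome": 1},
    {"A": 1, "C": 0, "G": 1, "outcome": 0},
    {"A": 0, "C": 1, "G": 0, "outcome": 0},
    {"A": 0, "C": 1, "G": 1, "outcome": 0},
    {"A": 0, "C": 0, "G": 0, "outcome": 0},
]


def build_tree(cases, attributes, outcome="outcome"):
    """Grow a tree by re-running a single attribute search on each sub-set."""
    present_count = sum(case[outcome] for case in cases)
    absent_count = len(cases) - present_count
    leaf = {"predict": int(present_count >= absent_count)}

    # Errors made if we stop here and predict the majority outcome.
    baseline_errors = min(present_count, absent_count)
    if baseline_errors == 0:
        return leaf  # every case in this sub-set is already classified correctly

    # Only attributes that actually divide this sub-set are worth testing.
    candidates = [a for a in attributes if len({c[a] for c in cases}) == 2]
    if not candidates:
        return leaf

    def errors(attr):
        # Misclassified cases for the single-attribute model, reading the
        # attribute in whichever direction (present or absent) fits better.
        agree = sum(case[attr] == case[outcome] for case in cases)
        return min(agree, len(cases) - agree)

    best = min(candidates, key=errors)
    if errors(best) >= baseline_errors:
        return leaf  # the additional attribute does not improve the model

    remaining = [a for a in attributes if a != best]
    return {
        "attribute": best,
        1: build_tree([c for c in cases if c[best] == 1], remaining, outcome),
        0: build_tree([c for c in cases if c[best] == 0], remaining, outcome),
    }


tree = build_tree(cases, ["A", "C", "G"])
```

Each path from the root to a leaf reads as one model: in this toy data, following branch 1 under A and then branch 1 under C corresponds to the model A=1&C=1.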

A working example: The diagram below shows the types of re-iteration possible. It uses the Krook data set available under the Data Sets tab of this website. 

CM tree Krook Conventional



