UML: FILTERS - NMAI-lab/JLOAF GitHub Wiki

Filters

Filters are responsible for reducing the size of the case base.

The Filter interface in the jLOAF framework follows the Strategy pattern: it declares a single method, filter, which is responsible for filtering the case base, and every filter that implements the interface provides its own implementation of that method. Filtering happens in the preprocessing phase so that the number of cases in the case base is reduced, which leads to faster case retrieval at run time. The following sections give a brief explanation of the filters in this framework:
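As a rough illustration of the Strategy pattern described above, here is a minimal sketch. The types `SimpleCase`, `CaseFilter`, `DuplicateRemovalFilter` and `Preprocessor` are hypothetical stand-ins invented for this example; they are not the actual jLOAF classes or signatures, and the case base is simplified to a plain list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a case: an input feature vector plus an action label.
final class SimpleCase {
    final double[] input;
    final String action;

    SimpleCase(double[] input, String action) {
        this.input = input;
        this.action = action;
    }
}

// Strategy interface (assumed shape): one method, many interchangeable implementations.
interface CaseFilter {
    List<SimpleCase> filter(List<SimpleCase> caseBase);
}

// One concrete strategy, for illustration only: drop exact duplicate cases.
final class DuplicateRemovalFilter implements CaseFilter {
    @Override
    public List<SimpleCase> filter(List<SimpleCase> caseBase) {
        List<SimpleCase> kept = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (SimpleCase c : caseBase) {
            String key = Arrays.toString(c.input) + "|" + c.action;
            if (seen.add(key)) {   // add() returns false if the key was already present
                kept.add(c);
            }
        }
        return kept;
    }
}

// Preprocessing depends only on the interface, so filters can be swapped or chained.
final class Preprocessor {
    static List<SimpleCase> run(List<SimpleCase> caseBase, List<CaseFilter> filters) {
        List<SimpleCase> current = caseBase;
        for (CaseFilter f : filters) {
            current = f.filter(current);
        }
        return current;
    }
}
```

Because `Preprocessor.run` only sees `CaseFilter`, adding a new kind of filter means writing one more implementing class without touching the preprocessing code.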

1- FeatureSelection

FeatureSelection is a filter that filters the features of the case inputs rather than the cases themselves. It still implements the Filter interface, but it is an abstract class that is extended by the filters sharing this behavior, i.e. those that filter features rather than cases. The Template Method pattern is used for feature-selection filtering: every feature-selection filter implements the filterFeature method declared in the FeatureSelection abstract class. Feature-selection filters fall into two groups:

1.1- Filter Features, such as the SequentialBackwardSelection class, which returns a subset of the input features, i.e. it removes the features that are unlikely to have a large effect on the agent's performance.

1.2- Weight Selection, such as HillClimbingAlgrithm and GeneticAlgorithm. These classes assign weights to the features rather than removing any of them.
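The Template Method arrangement and a backward-selection style filter can be sketched as follows. This is an illustration under stated assumptions, not the jLOAF code: the class names `FeatureSelectionSketch` and `BackwardSelectionSketch` are hypothetical, cases are reduced to a `double[][]` of inputs with `String` labels, and leave-one-out 1-nearest-neighbour accuracy is an assumed scoring function. Only the hook name `filterFeature` is taken from the text above, and its signature here is a guess.

```java
import java.util.ArrayList;
import java.util.List;

// Template Method: the abstract class fixes the steps, subclasses choose the features.
abstract class FeatureSelectionSketch {
    // Template method: pick feature indices, then project every input onto them.
    public final double[][] filter(double[][] inputs, String[] labels) {
        List<Integer> keep = filterFeature(inputs, labels);
        double[][] projected = new double[inputs.length][keep.size()];
        for (int i = 0; i < inputs.length; i++) {
            for (int j = 0; j < keep.size(); j++) {
                projected[i][j] = inputs[i][keep.get(j)];
            }
        }
        return projected;
    }

    // Hook implemented by each concrete feature-selection filter (assumed signature).
    protected abstract List<Integer> filterFeature(double[][] inputs, String[] labels);

    // Assumed score: leave-one-out 1-NN accuracy using only the given features.
    protected static double looAccuracy(double[][] x, String[] y, List<Integer> feats) {
        int correct = 0;
        for (int i = 0; i < x.length; i++) {
            int best = -1;
            double bestDist = Double.POSITIVE_INFINITY;
            for (int j = 0; j < x.length; j++) {
                if (i == j) continue;
                double d = 0;
                for (int f : feats) {
                    double diff = x[i][f] - x[j][f];
                    d += diff * diff;
                }
                if (d < bestDist) { bestDist = d; best = j; }
            }
            if (best >= 0 && y[best].equals(y[i])) correct++;
        }
        return x.length == 0 ? 0.0 : (double) correct / x.length;
    }
}

// Greedy backward selection: keep dropping a feature while the score does not get worse.
final class BackwardSelectionSketch extends FeatureSelectionSketch {
    @Override
    protected List<Integer> filterFeature(double[][] inputs, String[] labels) {
        List<Integer> current = new ArrayList<>();
        int n = inputs.length == 0 ? 0 : inputs[0].length;
        for (int f = 0; f < n; f++) current.add(f);
        double currentScore = looAccuracy(inputs, labels, current);
        boolean removed = true;
        while (removed && current.size() > 1) {
            removed = false;
            int bestPos = -1;
            double bestScore = currentScore;
            for (int k = 0; k < current.size(); k++) {
                List<Integer> candidate = new ArrayList<>(current);
                candidate.remove(k);              // remove by position, not by value
                double s = looAccuracy(inputs, labels, candidate);
                if (s >= bestScore) { bestScore = s; bestPos = k; }
            }
            if (bestPos >= 0) {
                current.remove(bestPos);          // remove by position
                currentScore = bestScore;
                removed = true;
            }
        }
        return current;
    }
}
```

A weight-selection filter would fit the same template, except that its hook would return one weight per feature instead of a list of surviving indices.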

2- Clustering

The clustering algorithms look for similar cases and combine them into one, which reduces the number of cases. There is more than one way to cluster, which is why the clustering class was made abstract and the different clustering classes extend it. Again, the Template Method pattern is used here.
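To make the idea concrete, the sketch below pairs an abstract clustering class with one simple subclass. The names `ClusteringSketch` and `LeaderClusteringSketch` are hypothetical, and the grouping rule (a case joins the first cluster whose first member lies within a fixed radius) is a stand-in chosen for brevity, not necessarily one of the algorithms jLOAF ships. Only input vectors are modelled; a real filter would also have to decide how to merge the actions of the combined cases.

```java
import java.util.ArrayList;
import java.util.List;

abstract class ClusteringSketch {
    // Template method: group the cases, then replace each group by one merged case.
    public final List<double[]> filter(List<double[]> cases) {
        List<List<double[]>> groups = cluster(cases);
        List<double[]> reduced = new ArrayList<>();
        for (List<double[]> group : groups) {
            reduced.add(mean(group));
        }
        return reduced;
    }

    // Hook: each concrete clustering class decides how cases are grouped.
    protected abstract List<List<double[]>> cluster(List<double[]> cases);

    // Merge a non-empty group into its component-wise mean.
    static double[] mean(List<double[]> group) {
        double[] m = new double[group.get(0).length];
        for (double[] c : group) {
            for (int i = 0; i < m.length; i++) {
                m[i] += c[i] / group.size();
            }
        }
        return m;
    }
}

// Illustrative grouping rule: join the first cluster whose leader is within `radius`.
final class LeaderClusteringSketch extends ClusteringSketch {
    private final double radius;

    LeaderClusteringSketch(double radius) {
        this.radius = radius;
    }

    @Override
    protected List<List<double[]>> cluster(List<double[]> cases) {
        List<List<double[]>> groups = new ArrayList<>();
        for (double[] c : cases) {
            List<double[]> home = null;
            for (List<double[]> g : groups) {
                if (distance(g.get(0), c) <= radius) { home = g; break; }
            }
            if (home == null) {            // no cluster is close enough: start a new one
                home = new ArrayList<>();
                groups.add(home);
            }
            home.add(c);
        }
        return groups;
    }

    private static double distance(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            s += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(s);
    }
}
```

Swapping in a different clustering approach only requires another subclass that overrides `cluster`; the merge step in `filter` stays the same.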

3- Sampling

Sampling reduces the number of cases in the case base. Sampling approaches typically oversample the minority class or undersample the majority class. However, our algorithm samples in a novel manner: it starts from an empty case base and adds a case to it only if the cases kept so far could not predict the correct class for that case. This way it only adds cases that provide new information.
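The incremental procedure described above might look roughly like the sketch below. It assumes a 1-nearest-neighbour predictor over the cases kept so far, which is an assumption made for this example; jLOAF would presumably use its own reasoning to predict the class. `LabeledCase` and `IncrementalSamplingSketch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a case: input vector plus the class (action) label.
final class LabeledCase {
    final double[] input;
    final String action;

    LabeledCase(double[] input, String action) {
        this.input = input;
        this.action = action;
    }
}

final class IncrementalSamplingSketch {
    // Start empty; keep a case only if the cases kept so far mispredict it.
    static List<LabeledCase> filter(List<LabeledCase> caseBase) {
        List<LabeledCase> kept = new ArrayList<>();
        for (LabeledCase c : caseBase) {
            String predicted = predict(kept, c.input);
            if (!c.action.equals(predicted)) {   // also true when predicted is null
                kept.add(c);
            }
        }
        return kept;
    }

    // Assumed predictor: label of the nearest kept case, or null if none are kept yet.
    static String predict(List<LabeledCase> kept, double[] query) {
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (LabeledCase k : kept) {
            double d = 0;
            for (int i = 0; i < query.length; i++) {
                double diff = k.input[i] - query[i];
                d += diff * diff;
            }
            if (d < bestDist) { bestDist = d; best = k.action; }
        }
        return best;
    }
}
```

Note that the result of a single pass like this depends on the order in which cases are visited, so shuffling the case base first may give a different reduced set.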