Classification v2 - RodentDataAnalytics/mwm-ml-gen GitHub Wiki

The classification process requires a segmentation object created by the Segmentation process and a file containing Labelling Data. Furthermore, a predefined number of clusters needs to be provided (an ideal number can be found by using the Num of Clusters functionality).

Contents

  1. Classification Overview
  2. The Classification Panel

Classification Overview

As described in the Labelling process, a semi-supervised clustering algorithm is used to classify the trajectory segments. This algorithm needs only a small amount of labelling data in order to classify the rest of the segments. As with most clustering algorithms, the number of target clusters must be specified in advance. A suitable number can be identified by using the Num of Clusters functionality.
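To illustrate how a small labelled subset can steer a clustering, here is a minimal sketch of a seeded k-means variant in plain Python. It is an illustration under stated assumptions, not necessarily the algorithm this program uses: the function name `seeded_kmeans`, the feature vectors, and the seed labels are all hypothetical.

```python
import math

def seeded_kmeans(points, seeds, k, iterations=20):
    """Illustrative seeded k-means (hypothetical helper, not the tool's code).

    points: list of feature vectors (lists of floats), one per segment
    seeds:  dict mapping point index -> cluster id in range(k),
            i.e. the small manually labelled subset
    """
    dim = len(points[0])

    # Initial centroids: mean of the labelled points of each cluster.
    centroids = []
    for c in range(k):
        members = [points[i] for i, lab in seeds.items() if lab == c]
        if not members:
            raise ValueError(f"cluster {c} has no labelled seed")
        centroids.append([sum(p[d] for p in members) / len(members)
                          for d in range(dim)])

    assignment = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: labelled points keep their label,
        # unlabelled points go to the nearest centroid.
        new_assignment = [
            seeds[i] if i in seeds
            else min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for i, p in enumerate(points)
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment

        # Update step: recompute each centroid from its current members.
        # Every cluster keeps at least its seed points, so it is never empty.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            centroids[c] = [sum(p[d] for p in members) / len(members)
                            for d in range(dim)]
    return assignment

# Six made-up segments with two features each; only two of them are labelled.
points = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
          [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
seeds = {0: 0, 3: 1}
labels = seeded_kmeans(points, seeds, k=2)
print(labels)
```

In this toy example two labelled segments are enough to assign the remaining four, which mirrors the idea that only a small amount of labelling data is required.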

The Classification Panel

  • Number of Clusters: Specifies the predefined number of target clusters. To identify an ideal number, use the Num of Clusters functionality.

  • Classify (Button): Runs the Classification process. To run this process, a Segmentation Configurations object and a labelling data CSV file need to be provided (by default the program uses the ones specified in the Segmentation Configurations Object Path and in the Labelling Data Path). During this process, the program classifies all the segments and computes some common classification metrics (error, percentage of classified and unclassified segments, etc.). When the process is finished, the classification results and metrics are saved in an object called classification_configs_#id, which is stored inside the specified Output Folder.

  • Classification Configuration Object Path: This text field shows the full path of the classification_configs_#id object created by the classification process. This will be the default object for the rest of the program. To use another classification object, specify its path here.
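The metrics mentioned under Classify can be illustrated with a short sketch. The exact definitions used by the program are not given here, so the code below makes assumptions: class 0 stands for an unclassified segment, and the error is the share of manually labelled segments whose assigned class differs from their label. The function name and the sample data are hypothetical.

```python
def classification_metrics(predicted, labelled, unclassified=0):
    """Hypothetical helper computing simple classification metrics.

    predicted:    list of class ids, one per segment
    labelled:     dict mapping segment index -> manually assigned class id
    unclassified: class id assumed to mean "no class assigned"
    """
    total = len(predicted)
    n_unclassified = sum(1 for c in predicted if c == unclassified)
    pct_unclassified = 100.0 * n_unclassified / total
    pct_classified = 100.0 - pct_unclassified

    # Error: percentage of labelled segments whose predicted class
    # disagrees with the manual label.
    wrong = sum(1 for i, lab in labelled.items() if predicted[i] != lab)
    error = 100.0 * wrong / len(labelled) if labelled else 0.0

    return {
        "classified_pct": pct_classified,
        "unclassified_pct": pct_unclassified,
        "error_pct": error,
    }

# Made-up result for eight segments, four of which were labelled by hand.
predicted = [1, 2, 0, 1, 2, 2, 0, 1]
labelled = {0: 1, 1: 2, 4: 1, 7: 1}
metrics = classification_metrics(predicted, labelled)
print(metrics)
```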

Note: the generated id has the format DATE (yyyy-mm-dd) followed by TIME (hh-mm). For example, classification_configs_2016-07-13-17-45 is the classification object created on 13 July 2016 at 17:45.
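The id format in the note above can be read back into a timestamp with the standard library. This is a small sketch for scripts that work with the output folder; the helper names are hypothetical and are not part of the program.

```python
from datetime import datetime

PREFIX = "classification_configs_"
ID_FORMAT = "%Y-%m-%d-%H-%M"  # yyyy-mm-dd-hh-mm, as described in the note

def parse_classification_id(name):
    """Return the creation time encoded in a classification object name."""
    if not name.startswith(PREFIX):
        raise ValueError(f"unexpected object name: {name!r}")
    return datetime.strptime(name[len(PREFIX):], ID_FORMAT)

def make_classification_name(when):
    """Build an object name from a datetime, following the same format."""
    return PREFIX + when.strftime(ID_FORMAT)

created = parse_classification_id("classification_configs_2016-07-13-17-45")
print(created)                            # 2016-07-13 17:45:00
print(make_classification_name(created))  # classification_configs_2016-07-13-17-45
```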
