Results - RodentDataAnalytics/mwm-ml-gen GitHub Wiki

This section will focus on the results generated by pressing the buttons of the Results panel of the main menu.

Contents

  1. Demo
  2. Metrics
  3. Strategies
  4. Transitions
  5. Probabilities
  6. Class Statistics
  7. Other Results
  8. The Friedman Test
  9. Extra Options

Demo

The demo button runs a full analysis using the original rat data from EPFL (reproduces the results of Vouros et al.).

Metrics

Generates figures showing (a) the escape latency, (b) the average movement speed and (c) the average path length of the animals over the trials for one or two animal groups. In each case two figures are generated: the first illustrates the results as a boxplot (bars represent the first and third quartiles of the data; the gray or white line splitting each bar is the median; whiskers indicate the minimum and maximum values; crosses are the outliers) and the second illustrates the results as a barplot. The raw numbers of the boxplots are also saved in separate CSV files. A default segmentation must be selected in order to run. If more than one animal group is specified, a pop-up dialog box asks for which one or two groups the results will be generated. If two groups are provided, the white lines refer to the first specified group and the black lines to the second. Moreover, in this case, the p-values of the Friedman test are computed (see The Friedman Test) and saved in the Friedman_p.csv file. The Friedman test is computed across the trials (column 2) and across the days (column 3).

All results are saved in the results folder of the project, inside a folder called metrics and a subfolder specifying the selected group(s).

metrics

Strategies

Generates figures showing the number of segments for each strategy adopted by the animals throughout the trials. For each strategy two figures are generated: the first illustrates the number of segments falling under each class on each trial as a boxplot (bars represent the first and third quartiles of the data; the gray or white line splitting each bar is the median; whiskers indicate the minimum and maximum values; crosses are the outliers) and the second illustrates the same results as percentages for each trial in a barplot. The raw numbers of the boxplots are also saved in separate CSV files. A default segmentation and a default classification must be selected in order to run. If more than one animal group is present, a pop-up dialog box asks for which one or two groups the results will be generated. If two groups are provided, the white lines refer to the first specified group and the black lines to the second. Moreover, in this case, the p-values of the Friedman test are computed (see The Friedman Test). This process is repeated [number of iterations] times.

All results are saved in the results folder of the project inside a folder named Strategies-class _ [number of labels] _ [number of segments] _ [segments length] _ [segments overlap] _ [number of classifiers used] _ [number of iterations] _ [merging rule] [optional: -user note]. This folder contains a number of subfolders, depending on the [number of classifiers] or the [number of iterations] used for the selected classification, named g[animal group(s)]res_[x] (where [x] runs from 1 to the [number of classifiers] or the [number of iterations]), plus one more subfolder, g[animal group(s)]res_summary. Each of these subfolders contains results in both image and CSV format. The final 'summary' subfolder contains the average of all the generated results and some additional files:

  • pvalues_summary: Contains a summary of all the p-values, with an additional column holding an agreement score (how many classifiers or ensembles agree, for each strategy, that there is a significant difference between the two animal groups).

  • binomial: A collection of plots showing the 95% binomial confidence intervals for the agreement of the classifiers or ensembles on whether there is a significant difference between the two specified animal groups on each strategy. Squares indicate the proportion of classifiers/ensembles that report a significant difference on each strategy; error bars are the 95% confidence intervals; the dashed line indicates the threshold of interest (0.5 or 50%). To be confident that there is indeed a significant difference between the two animal groups on a strategy or on the strategy transitions, the confidence interval should lie clearly above 0.5 (or 50%). The raw values are also exported to a text file.

  • pvalues_summary: Contains a figure of all the p-values of each strategy plotted as boxplots.
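
The agreement score and its confidence interval can be illustrated with a small sketch. It computes the share of classifiers whose p-value falls below a significance level and a 95% interval around that share. The Wilson score method, the 0.05 level, the function name and the sample p-values are all assumptions made for illustration; the tool itself may compute the interval differently.

```python
import math

def agreement_interval(p_values, alpha=0.05, z=1.959964):
    """Return (proportion, lower, upper) for the share of p-values below alpha.

    Uses the Wilson score interval (an assumption, not necessarily the
    method used by the tool). z = 1.959964 corresponds to 95% confidence.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p-value is required")
    k = sum(1 for p in p_values if p < alpha)  # classifiers reporting significance
    p_hat = k / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return p_hat, max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical p-values from 10 classifiers for a single strategy.
p_values = [0.01, 0.03, 0.002, 0.04, 0.2, 0.01, 0.03, 0.004, 0.02, 0.01]
proportion, lower, upper = agreement_interval(p_values)

# The difference is considered reliable only if the whole interval is above 0.5.
confident = lower > 0.5
```

In this made-up example nine of the ten classifiers report a significant difference, and the lower bound of the interval stays above the 0.5 threshold.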

strategies

strategies

Note: If the user does not wish to run the Friedman test but needs the classification results for custom analysis, this procedure may be run multiple times, selecting a different group each time. The appropriate folder will also contain two CSV files with the raw classification results, arranged as follows:

  • AnimalID: The ID of the animal which performed this trajectory.

  • Trajectory: The ID of the trajectory.

  • TrialNo: The trial in which this swimming path was performed.

  • OriginalGroup: The group of the animal as specified in the beginning of the project.

  • TargetGroup: The new group of the animal in case animal groups have been merged prior to the results generation (refer to Extra Options).

  • Behaviours_: The rest of the row contains the class ID (file: RawData.csv) for each one of the segments of the trajectory (refer to List of Labels for the default class IDs) or the time that the animal spent performing each strategy (file: RawData_time.csv).

    • 0 indicates that this segment wasn't classified.

    • -1 indicates the end of the trajectory. Trajectories with no segments will have only -1.
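
For custom analysis, a row of RawData.csv can be turned into per-class segment counts along the following lines. This is a minimal sketch: it assumes the columns appear in the order listed above (five metadata columns followed by the class IDs) and that the file has a header row. The sample rows and column names B1 to B4 are invented; check both assumptions against an actual exported file.

```python
import csv
import io
from collections import Counter

METADATA_COLUMNS = 5  # assumed: AnimalID, Trajectory, TrialNo, OriginalGroup, TargetGroup

def segment_classes(row):
    """Return the class IDs of one trajectory, reading up to the -1 end marker."""
    classes = []
    for cell in row[METADATA_COLUMNS:]:
        value = int(cell)
        if value == -1:  # end of the trajectory
            break
        classes.append(value)  # 0 means the segment was not classified
    return classes

# Hypothetical excerpt standing in for RawData.csv.
sample = io.StringIO(
    "AnimalID,Trajectory,TrialNo,OriginalGroup,TargetGroup,B1,B2,B3,B4\n"
    "7,12,1,1,1,2,2,0,-1\n"
    "7,13,2,1,1,-1\n"
)
reader = csv.reader(sample)
next(reader)  # skip the assumed header row

# Map each trajectory ID to a count of segments per class ID.
counts = {}
for row in reader:
    counts[int(row[1])] = Counter(segment_classes(row))
```

Here trajectory 12 has two segments of class 2 and one unclassified segment, while trajectory 13 has no segments at all.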

strategies

Transitions

Generates a figure showing the number of transitions between strategies adopted by the animals over the trials. Requirements and results are equivalent to those described in the Strategies section above.
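
As a rough illustration of what is being counted, the sketch below counts the changes of class between consecutive segments of one trajectory. Treating a transition as any change between two consecutive classified segments, and skipping unclassified segments (class 0), are assumptions made here; the tool may define transitions differently.

```python
def count_transitions(classes):
    """Count changes of class between consecutive classified segments."""
    # Drop unclassified segments (0) and the end-of-trajectory marker (-1).
    labelled = [c for c in classes if c > 0]
    return sum(1 for a, b in zip(labelled, labelled[1:]) if a != b)

# Hypothetical class IDs for one trajectory: 1 -> 2 and 2 -> 3 are the two transitions.
n_transitions = count_transitions([1, 1, 2, 0, 2, 3, -1])
```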

Probabilities

Calculates the transition probabilities of strategies adopted by the animals within trials. A default segmentation and a default classification must be selected in order to run. If more than one animal group is specified, a pop-up dialog box asks for which one or two groups the results will be generated.

All results are saved in the results folder of the project inside a folder named Transitions-class _ [number of labels] _ [number of segments] _ [segments length] _ [segments overlap] _ [number of classifiers used] _ [number of iterations] _ [merging rule] [optional: -user note]. This folder contains a number of subfolders, depending on the [number of iterations] used to generate the selected default classification, named g[animal group(s)]res_[x] (where [x] runs from 1 to [number of iterations]), plus one more subfolder, g[animal group(s)]res_summary. Each of these subfolders contains a text file (.txt) with the results. The final 'summary' subfolder contains the average results exported to a CSV file (.csv).
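
One way to picture a transition probability table is sketched below: transitions are counted over all trajectories and each row is normalised to sum to 1. Counting only changes of class (no self-transitions), skipping unclassified segments and the row-wise normalisation are assumptions for illustration, not a description of the tool's exact computation.

```python
from collections import defaultdict

def transition_probabilities(sequences):
    """Estimate P(next class | current class) from class-ID sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for sequence in sequences:
        labelled = [c for c in sequence if c > 0]  # skip 0 (unclassified) and -1 (end)
        for current, following in zip(labelled, labelled[1:]):
            if current != following:  # assumed: only changes of class count
                counts[current][following] += 1
    probabilities = {}
    for current, row in counts.items():
        total = sum(row.values())
        probabilities[current] = {nxt: n / total for nxt, n in row.items()}
    return probabilities

# Two hypothetical trajectories given as class-ID sequences.
probs = transition_probabilities([[1, 2, 1, 3, -1], [1, 2, 2, 3, -1]])
```

With these made-up sequences, class 1 is followed by class 2 in two of three transitions and by class 3 in the remaining one.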

probabilities

Class Statistics

Calculates the number of strategies detected by each classifier or ensemble and computes the agreement between the classifiers or the ensembles. Requires a default segmentation and a default classification to be selected in order to run.

Generates two CSV files (.csv): one contains numerical results and the other percentages. These files are generated inside the subfolder statistics-class _ [number of labels] _ [number of segments] _ [segments length] _ [segments overlap] _ [number of classifiers used] _ [number of iterations] _ [merging rule] [optional: -user note].

statistics

Moreover, it generates a series of CSV-files (.csv) which contain the agreement among the classifiers of the ensembles (results also exported in MAT-format (.mat)):

agreement

Finally, an overall 'agreement matrix' of the classification agreement is generated and exported as an image file, agreement _matix_icon1.[specified image format], a CSV file (.csv), agreement _matrix.csv, and a MAT file (.mat), agreement _matrix.mat. Multiple image files may be generated, each one holding a 10x10 grid of the overall matrix; in that case the 10x10 grid moves from top to bottom and from right to left.
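
To give an idea of what such a matrix holds, the sketch below computes a pairwise agreement matrix from the labels that several classifiers assign to the same segments. Defining agreement as the percentage of segments that receive the same class from both classifiers is an assumption made here, and the labels are invented.

```python
def agreement_matrix(labelings):
    """Percentage of segments on which each pair of classifiers assigns the same class."""
    n = len(labelings)
    length = len(labelings[0]) if n else 0
    if length == 0 or any(len(labels) != length for labels in labelings):
        raise ValueError("all classifiers must label the same non-empty set of segments")
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            same = sum(1 for a, b in zip(labelings[i], labelings[j]) if a == b)
            matrix[i][j] = 100.0 * same / length
    return matrix

# Hypothetical class IDs given by three classifiers to the same four segments.
labelings = [
    [1, 2, 2, 3],
    [1, 2, 3, 3],
    [1, 1, 2, 3],
]
matrix = agreement_matrix(labelings)
```

The result is symmetric with 100 on the diagonal; in this example the second and third classifiers agree on half of the segments.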

agreement  matrix

Other Results

  • The folder exported_pics_segmentation_[number] contains image files of the segments exported via Browse Trajectories. Each file has a specific name traj[number] or traj[number]seg[number] showing the exact location of the trajectory or segment.

  • The folder labels _ [number of labels] _ [segments length] _ [segments overlap] _ cross _ validation contains figures of the Labelling Quality process and a CSV file containing the numeric values.

The Friedman Test

If two animal groups with unequal numbers of animals are specified, some animals need to be excluded from the larger group so that the two groups have the same size. This is a requirement of the Friedman test. To discard animals the following window will appear:

equalize_groups

  1. Information on how many animals exist in each of the two specified groups and how many animals need to be removed from one of them in order for both groups to have the same number of animals.

  2. The left listbox lists all the animal ids of the group with the larger number of animals. Each of these ids can be selected and moved to the right listbox, which contains the excluded animals. The buttons => and <= are used to move the animal ids between the two listboxes. If the button => is greyed out, no more animals may be excluded because the two groups now have the same number of animals. If <= is greyed out, the right listbox does not contain any animal ids.

  3. To ease the exclusion process, four sort buttons are provided which sort the animal ids by animal speed (Sort by Speed), animal path length (Sort by Path Length), animal latency (Sort by Latency) and animal id value (Sort by Value). The animals are always sorted in ascending order.

  4. Once the appropriate number of animal ids has been excluded, the OK button becomes clickable and pressing it resumes the generation of the results. Clicking the Cancel button returns to the main menu.
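
The bookkeeping behind this window can be sketched as follows: work out how many animals must be removed from the larger group, sort its animals in ascending order of a chosen metric, and take candidates from that ordering. Picking the first animals of the sorted list is only one hypothetical rule; in the tool the user decides which ids to exclude. The function name and the latency values are invented.

```python
def animals_to_exclude(metric_by_animal, other_group_size):
    """Suggest animal ids to drop so the larger group matches the smaller one.

    metric_by_animal maps animal id -> metric value (e.g. latency) for the
    larger group; other_group_size is the size of the smaller group.
    """
    surplus = len(metric_by_animal) - other_group_size
    if surplus < 0:
        raise ValueError("metric_by_animal must describe the larger group")
    # Ascending order by metric, ties broken by animal id.
    ranked = sorted(metric_by_animal, key=lambda animal: (metric_by_animal[animal], animal))
    return ranked[:surplus]

# Hypothetical latencies (seconds) for a group of 4 animals; the other group has 2.
latency = {3: 41.0, 5: 12.5, 8: 60.0, 9: 27.3}
excluded = animals_to_exclude(latency, other_group_size=2)
```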

Extra Options

  1. When the dataset contains more than one animal group, for some results a pop-up window will appear asking for which animal group(s) the analysis should be performed. The user can select one or two groups (in the case of two groups the Friedman test is also performed) or choose to merge groups by using a colon (:). For example, 1:2 merges animal groups 1 and 2, while 1:2,3:4 merges groups 1 and 2 and compares them with the merged groups 3 and 4.

select groups

  2. If the selected classification contains multiple classifiers or ensembles, a pop-up window will appear asking whether figures should be generated for each classifier/ensemble. In this case the results of interest are those inside the summary folder, and generating all the results for each classifier/ensemble takes more time.

detailed results

  3. Small swimming paths with no segments are automatically assigned to the class Direct Finding. A pop-up window will appear asking whether these should be assigned to other classes. If yes, the Browse Trajectories GUI will appear with limited functionality: only the unsegmented trajectories can be visualised and only one label can be placed on each of them; the other functions are disabled. Note: this extra labelling is temporary and applies only to the current generation of the results; if the results need to be re-generated, it must be redone.

extra segments

  4. For the Class Statistics, a pop-up window will appear asking whether the results should be generated before or after the smoothing function has been applied. In summary, since the classification is performed on overlapping segments of the animals' swimming paths, the segments need to be mapped back to the whole trajectories in order to compute the evolution of the strategies; this is done by a smoothing function which depends on the arena dimensions. If the statistics are generated from the final segments mapped back to the trajectories, the folder holding these results will have the note '_smooth' in its name.

statistics smooth
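
The group selection syntax described in the first option above (a colon merges groups, a comma separates the two sides of a comparison) can be sketched as a small parser. The function name and the returned structure are hypothetical and only illustrate how a string such as 1:2,3:4 might be read.

```python
def parse_group_selection(text):
    """Parse a selection such as '1:2,3:4' into [[1, 2], [3, 4]].

    Each inner list holds the animal groups merged into one side of the
    analysis; two inner lists mean the two sides are compared.
    """
    selections = [[int(group) for group in part.split(":")] for part in text.split(",")]
    if not 1 <= len(selections) <= 2:
        raise ValueError("select one or two groups (or merged groups)")
    return selections

single = parse_group_selection("1")          # one group, no comparison
merged = parse_group_selection("1:2")        # groups 1 and 2 merged, no comparison
compared = parse_group_selection("1:2,3:4")  # merged 1+2 compared with merged 3+4
```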