3. Mindtagger - Mcirino/pdd GitHub Wiki

What is Mindtagger?

Mindtagger is a general data annotation tool that provides a customizable, interactive graphical user interface. An annotation task takes a list of items and a template that defines how each item should be rendered and what annotations can be added as inputs.

Mindtagger provides an interactive interface for users to look at the rendered items and quickly go through each of them to manually leave annotations. The annotations collected over time can then be exported in various forms (SQL, CSV/TSV, and JSON) for use outside the tool, for example to augment the ground truth of a DeepDive application. Note that Mindtagger currently does not help with sampling; it only supports the labeling task for precision/recall estimation. Producing the right samples for a correct estimate is therefore the Mindtagger user's responsibility.

More information on Mindtagger is available here.

Using Mindtagger

Step One: Installation

First, install Mindbender. If you are on a Mac, use the Darwin version.

Step Two: Set-up and Configuration

mindbender.sh is a Unix script that should live in a directory containing one sub-folder per dataset. In this case, the sub-folders are body_size_0 and body_size_1.

Each of these folders should contain the dataset, mindtagger.conf, and template.html. template.html defines the interface; you can customize it to control the formatting and what is displayed for each item.
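Before starting the tool, it can help to confirm that every dataset folder has the files listed above. The following is a minimal Python sketch, not part of Mindbender itself. The function names check_task_dir and check_all are hypothetical, and the script assumes the dataset is a .tsv or .csv file sitting next to mindtagger.conf.

```python
from pathlib import Path

# Files that each dataset folder is expected to contain.
REQUIRED_FILES = ("mindtagger.conf", "template.html")
# Assumption: the dataset itself is a .tsv or .csv file in the same folder.
DATASET_SUFFIXES = (".tsv", ".csv")


def check_task_dir(task_dir):
    """Return a list of problems found in one dataset folder (empty if none)."""
    task_dir = Path(task_dir)
    problems = [
        f"missing {name}"
        for name in REQUIRED_FILES
        if not (task_dir / name).is_file()
    ]
    has_dataset = any(
        p.is_file() and p.suffix.lower() in DATASET_SUFFIXES
        for p in task_dir.iterdir()
    )
    if not has_dataset:
        problems.append("no .tsv or .csv dataset found")
    return problems


def check_all(root="."):
    """Check every sub-folder of root and map folder name to its problems."""
    return {
        d.name: check_task_dir(d)
        for d in sorted(Path(root).iterdir())
        if d.is_dir()
    }
```

Calling check_all() from the directory that holds mindbender.sh returns a dictionary such as one entry for body_size_0 and one for body_size_1; an empty list means the folder looks complete.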

mindtagger.conf is only a few lines long. It specifies the dataset filename and the name of the column that holds the unique key IDs. In this case, body_size_0.tsv uses the image name as the unique key for each entry.
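Because the key column must identify each entry uniquely, it is worth checking the dataset for duplicate keys before tagging. The sketch below is one way to do that in Python. It assumes the TSV file has a header row, and the column name image_name used in the usage note is a hypothetical placeholder for whatever column your mindtagger.conf names.

```python
import csv
from collections import Counter


def duplicate_keys(tsv_path, key_column):
    """Return the sorted key values that appear more than once in the TSV."""
    with open(tsv_path, newline="", encoding="utf-8") as f:
        # Assumption: the first row of the file holds the column names.
        reader = csv.DictReader(f, delimiter="\t")
        if key_column not in (reader.fieldnames or []):
            raise KeyError(f"column {key_column!r} not found in {tsv_path}")
        counts = Counter(row[key_column] for row in reader)
    return sorted(key for key, n in counts.items() if n > 1)
```

For example, duplicate_keys("body_size_0/body_size_0.tsv", "image_name") returns an empty list when every image name occurs only once.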

Step Three: Run Mindtagger

To start Mindtagger, open Terminal and run mindbender.sh with the tagger command, passing it the configuration file (mindtagger.conf) of the dataset you want to tag. To open Mindtagger, copy the http:// address shown in the Terminal into a browser.

Now that Mindtagger is running, you are ready to tag datasets! Pressing Shift + / (the ? key) brings up a help menu with keyboard shortcuts. You can mark entries as correct, incorrect, or unknown, and you can also add tags and notes. When you are done, you can export the tagged data as a .json file, which can be converted to a .csv file for use in Excel.
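The JSON-to-CSV conversion can be done with a short Python script. The sketch below is only an illustration: it assumes the exported file is a JSON list with one object per tagged item, which may not match the actual export layout, so check your file first. Nested objects are flattened into dotted column names, and the function names flatten and json_to_csv are hypothetical.

```python
import csv
import json


def flatten(obj, prefix=""):
    """Flatten nested dictionaries into a single level with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        elif isinstance(value, list):
            # Keep lists (for example, several tags) as JSON text in one cell.
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def json_to_csv(json_path, csv_path):
    """Write the records in json_path to csv_path and return the column names."""
    with open(json_path, encoding="utf-8") as f:
        records = json.load(f)
    # Assumption: the export is a list of objects; wrap a single object.
    if isinstance(records, dict):
        records = [records]
    rows = [flatten(record) for record in records]

    # Collect column names in order of first appearance.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return columns
```

Items that lack a field simply get an empty cell in that column, so the resulting file opens in Excel with one row per tagged item.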