Understanding Our Training Data - CjMoor3/ArcCI-Collab-Repo GitHub Wiki

Arc-CI Training Data (COCO Format)

Our dataset will be stored in the COCO (Common Objects in Context) format. On this page, we will explain the organizational structure of COCO and describe how it is relevant to using this module.

COCO file objects are .json files divided into two top-level dictionaries: "images" and "annotations". The "images" dictionary holds data pertaining to the images used in manual classification, and the "annotations" dictionary contains the segmentation masks created from those images during classification. Both dictionaries have unique data keys, which are discussed below.
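To make the layout concrete, here is a minimal sketch of what one of our training files looks like when loaded in Python. The field values are made-up placeholders; only the key names and nesting come from this page.

```python
import json

# Hypothetical minimal training file with the two top-level
# dictionaries described above (placeholder values throughout).
coco = {
    "images": [
        {"file_name": "img-<uuid>.png", "height": 256, "width": 256,
         "id": "<uuid>"},
    ],
    "annotations": [
        {"id": 0, "image_id": "<uuid>", "category_id": 4,
         "segmentation": {"counts": [5, 3, 8], "size": [256, 256]}},
    ],
}

# Round-trip through JSON to show the on-disk form is plain .json.
parsed = json.loads(json.dumps(coco, indent=2))
assert sorted(parsed.keys()) == ["annotations", "images"]
```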

"images"

The "images" dictionary holds the file name, size, and identification number of images used in the classification process so that we can link our classification masks.

"file_name"

"file_name" is the first key in the "images" dictionary. When data is saved, every image used to create annotations is appended to the "images" dictionary, starting with that image's file name. Each image is given a standardized file name that starts with "img-", followed by the image's corresponding Universally Unique ID (UUID) and the image's file extension. The UUID in the file name should be identical to the "id" data key of the same image entry.
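A sketch of how such a file name could be generated, assuming Python's standard uuid module; the helper name is hypothetical, but the "img-" + UUID + extension pattern is the one described above.

```python
import uuid

def make_image_filename(ext: str = ".png") -> tuple[str, str]:
    """Build a standardized file name ("img-" + UUID + extension) and
    return it with the UUID, so the same value can be stored in the
    image entry's "id" field. Hypothetical helper, not the GUI's code."""
    image_id = str(uuid.uuid4())
    return f"img-{image_id}{ext}", image_id

file_name, image_id = make_image_filename()
# The UUID embedded in the file name matches the "id" value exactly.
assert file_name == f"img-{image_id}.png"
```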

"height", "width"

"height" and "width" describe the height and width of each image in pixels. The "Using the ArcCI Training GUI" section of this wiki explains why these keys matter: to optimize the accuracy of our data, we should only use images that have been resized to 256x256 pixels.
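A quick way to enforce that rule when reading a training file is to check every "images" entry against the expected size. The helper below is a hypothetical sketch of such a validation step.

```python
def is_valid_size(image_entry: dict, expected: int = 256) -> bool:
    """Return True only when an "images" entry was resized to the
    expected square size (256x256 for our dataset)."""
    return (image_entry.get("height") == expected
            and image_entry.get("width") == expected)

# Entries at the wrong size should be excluded from training.
assert is_valid_size({"height": 256, "width": 256, "id": "abc"})
assert not is_valid_size({"height": 512, "width": 512, "id": "def"})
```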

"id"

The "id" key is the last key in each image entry and stores the UUID of the image. This value is also present in the corresponding image's file name and is used to link images to the annotations created from them. The "image_id" key in the "annotations" dictionary should match this value.

"annotations"

Annotations are created when a user presses a classification button while a segment is selected. The program then "annotates" the selected segment by marking it as the class that corresponds to the button pressed. This dictionary holds the image masks used to train deep learning models, along with the data needed to keep our dataset organized and uniform.

"id"

When an image's segmentation parameters are set in the training GUI, the program counts the segments in the image and assigns each segment an id out of the total number of segments that the image has. If an image has n segments, each segment is given a number starting at 0 and ending at n - 1. This number appears here in the training data once segments are classified or saved.
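The numbering scheme is the same as enumerating the segment list, as this small sketch shows (the segment values are placeholders):

```python
# With n segments, ids run 0 .. n - 1, exactly as enumerate assigns them.
segments = ["seg_a", "seg_b", "seg_c"]  # n = 3 segments from one image
segment_ids = [i for i, _ in enumerate(segments)]
assert segment_ids == [0, 1, 2]  # starts at 0, ends at n - 1
```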

"image_id"

This value should correspond to the "id" of the image used to make the respective annotations in the "annotations" dictionary.
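This linkage can be resolved in code by indexing the "images" entries by their "id" and looking up each annotation's "image_id". The key names below come from this page; the sample values are made up.

```python
# Sample entries with placeholder ids (real ids are UUIDs).
images = [
    {"file_name": "img-aaa.png", "height": 256, "width": 256, "id": "aaa"},
    {"file_name": "img-bbb.png", "height": 256, "width": 256, "id": "bbb"},
]
annotations = [
    {"id": 0, "image_id": "aaa", "category_id": 0},
    {"id": 1, "image_id": "aaa", "category_id": 5},
    {"id": 0, "image_id": "bbb", "category_id": 4},
]

# Index images by id, then resolve each annotation back to its source.
images_by_id = {img["id"]: img for img in images}
sources = [images_by_id[ann["image_id"]]["file_name"] for ann in annotations]
assert sources == ["img-aaa.png", "img-aaa.png", "img-bbb.png"]
```

A KeyError in the lookup would indicate a dangling "image_id" with no matching image entry.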

"category_id"

This key stores the numeric id of the class assigned to each segment:

  • Water 0
  • Thin Ice 1
  • Shadow 2
  • Submerged Ice 3
  • Snow/Ice 4
  • Melt Pond 5
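The class ids above translate directly into a lookup table; a reverse map is handy when decoding "category_id" values read from the annotations. The constant names here are illustrative, not from the codebase.

```python
# Class ids exactly as listed above.
CLASS_IDS = {
    "Water": 0,
    "Thin Ice": 1,
    "Shadow": 2,
    "Submerged Ice": 3,
    "Snow/Ice": 4,
    "Melt Pond": 5,
}
# Reverse map: category_id -> human-readable class name.
ID_TO_CLASS = {v: k for k, v in CLASS_IDS.items()}

assert ID_TO_CLASS[5] == "Melt Pond"
assert CLASS_IDS["Water"] == 0
```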

"segmentation"

This key is a nested dictionary with two sub-keys: "counts" and "size".

"size" is the [height, width] pixel size of the image used to make this segment; for our data it should always be [256, 256].

"counts" holds each segment's mask, encoded with run-length encoding (RLE). See this resource to learn more about RLE: https://iq.opengenus.org/run-length-encoding/
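A minimal sketch of the idea behind "counts", assuming the uncompressed COCO convention: the counts alternate between runs of 0s and 1s over the flattened mask, starting with 0s (the first count is 0 if the mask begins with a 1). Note that real COCO tooling flattens the mask in column-major order; this toy version just works on an already-flat 0/1 list.

```python
def rle_encode(bits):
    """Run-length encode a flat 0/1 mask: counts alternate between
    runs of 0s and 1s, starting with a run of 0s."""
    counts, current, run = [], 0, 0
    for b in bits:
        if b == current:
            run += 1
        else:
            counts.append(run)
            current, run = b, 1
    counts.append(run)
    return counts

def rle_decode(counts):
    """Invert rle_encode back into the flat 0/1 mask."""
    bits, value = [], 0
    for run in counts:
        bits.extend([value] * run)
        value ^= 1  # alternate 0 -> 1 -> 0 ...
    return bits

mask = [0, 0, 1, 1, 1, 0, 1]
assert rle_encode(mask) == [2, 3, 1, 1]
assert rle_decode(rle_encode(mask)) == mask
```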