DataSet Explorer (DSE) tool - Clean-CaDET/dataset-explorer GitHub Wiki

This section presents the DataSet Explorer (DSE) functionalities. We describe the most important ones, illustrate them with activity diagrams, and link to videos that demonstrate them. The playlist with all the videos can be found here.

We also present class diagrams showing the entities and their relationships. Understanding these entities helps users and developers customize DataSet Explorer to fit their needs and integrate it with other software or systems.

Functionalities are divided into the following sections:

Annotation schema
New dataset
Annotations

1. Annotation schema

1.1 CRUD (create, read, update, delete)

The DSE tool allows users to define an annotation schema by creating code smells and, for each code smell, its heuristics and severities. Users can create any number of code smells, heuristics, and severities. They can also read, update, and delete these annotation schema entities. An activity diagram of the CREATE functionality can be seen below:

Annotation schema entities and their relationships can be examined in the class diagram below:

WATCH A VIDEO DEMONSTRATING THESE FUNCTIONALITIES.
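
As a rough illustration of the schema entities described above, the following Python sketch models code smells with their heuristics and severities, plus basic CRUD operations. It is not the DSE implementation: all class names, field names (such as `snippet_type` and `description`), and the in-memory store are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Heuristic:
    name: str
    description: str = ""  # hypothetical field


@dataclass
class Severity:
    value: str
    description: str = ""  # hypothetical field


@dataclass
class CodeSmell:
    name: str
    snippet_type: str  # assumed values such as "Class" or "Function"
    heuristics: list[Heuristic] = field(default_factory=list)
    severities: list[Severity] = field(default_factory=list)


class AnnotationSchema:
    """Hypothetical in-memory store of code smells, keyed by name."""

    def __init__(self) -> None:
        self._smells: dict[str, CodeSmell] = {}

    def create(self, smell: CodeSmell) -> None:
        if smell.name in self._smells:
            raise ValueError(f"code smell {smell.name!r} already exists")
        self._smells[smell.name] = smell

    def read(self, name: str) -> CodeSmell:
        return self._smells[name]

    def update(self, name: str, smell: CodeSmell) -> None:
        # Replace the stored entry; the name itself may change.
        if name not in self._smells:
            raise KeyError(name)
        del self._smells[name]
        self._smells[smell.name] = smell

    def delete(self, name: str) -> None:
        del self._smells[name]

    def names(self) -> list[str]:
        return sorted(self._smells)
```

Nesting heuristics and severities inside each code smell mirrors the description that they are defined per code smell.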

1.2 Search and filter

The DSE tool allows users to search code smells and heuristics based on names. Users can also search severities based on values. Next, users can filter code smells by code snippet type.

WATCH A VIDEO DEMONSTRATING THESE FUNCTIONALITIES.

2. New dataset

2.1 CRUD (create, read, update, delete)

The DSE tool allows users to create, read, update, and delete datasets. After creating an empty dataset, the tool allows users to add projects to the dataset. Users can also read, update, and delete projects within the dataset. Below is an activity diagram of the CREATE functionality:

Dataset entities and their relationships can be examined in the class diagram below:

Because a dataset can cover several code smells, the tool divides the instances within each project into groups, one for each code smell selected for the dataset. Each group is represented by the SmellCandidateInstances class. Example: the user creates a dataset for two code smells, Long Method and Large Class, and adds projects to it. The tool creates two groups (one for Long Method and one for Large Class) and assigns instances to these groups:

WATCH A VIDEO DEMONSTRATING THESE FUNCTIONALITIES.
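
The grouping in the Long Method / Large Class example can be sketched as follows. This is a minimal illustration only. The source does not state how instances are assigned to groups, so the rule used here (matching an instance's snippet type to the snippet type of the code smell) is an assumption, as are the `Instance` class, its fields, and the `SMELL_SNIPPET_TYPE` mapping.

```python
from dataclasses import dataclass

# Assumption: each code smell targets exactly one kind of code snippet.
SMELL_SNIPPET_TYPE = {"Long Method": "Function", "Large Class": "Class"}


@dataclass(frozen=True)
class Instance:
    code_snippet_id: str  # hypothetical identifier of the class or method
    snippet_type: str     # "Function" or "Class" in this sketch


def group_instances(instances, smells):
    """Return one group of candidate instances per selected code smell."""
    groups = {smell: [] for smell in smells}
    for instance in instances:
        for smell in smells:
            if SMELL_SNIPPET_TYPE[smell] == instance.snippet_type:
                groups[smell].append(instance)
    return groups
```

Filtering by group, as in section 2.2, then amounts to looking up a single key, for example `groups["Long Method"]`.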

2.2 Search and filter

The DSE tool allows users to search datasets, projects, and instances. It also allows users to filter instances by group. For example, suppose the user creates a dataset for two code smells: Long Method and Large Class. The tool creates two groups (one for Long Method and one for Large Class) and assigns instances to them. Filtering by group displays only the instances that belong to that group.

WATCH A VIDEO DEMONSTRATING THESE FUNCTIONALITIES.

3. Annotations

3.1 CRU (create, read, update)

The DSE tool allows users to annotate datasets. The user chooses a project within the dataset, a group within the project, and an instance within the group. The user then analyzes the instance and fills in the annotation form. Users can also read and update previously created annotations. An activity diagram of the CREATE functionality can be seen below:

Annotation-related entities and their relationships can be examined in the class diagram below:

WATCH A VIDEO DEMONSTRATING THESE FUNCTIONALITIES.

3.2 Automatic annotation mode

The DSE tool allows users to speed up the annotation process by switching to "Automatic annotation mode". When this mode is enabled, the tool automatically moves the user to the next instance after the user annotates an instance.

WATCH A VIDEO DEMONSTRATING THIS FUNCTIONALITY.
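
The behavior of this mode can be sketched in a few lines. The sketch assumes that "next instance" means the following instance in the group's display order and that the last instance has no successor. The function name and parameters are hypothetical.

```python
from typing import Optional, Sequence


def next_instance(instances: Sequence[str], current: str,
                  automatic_mode: bool) -> Optional[str]:
    """Return the instance to show after `current` has been annotated."""
    if not automatic_mode:
        return current  # manual mode: stay on the annotated instance
    index = instances.index(current)
    if index + 1 < len(instances):
        return instances[index + 1]
    return None  # assumed behavior: nothing left in this group
```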

3.3 Filter

The DSE tool allows users to filter instances by their annotation state. Instances can be filtered by:

whether instances are annotated or not
what severity was assigned
whether the user left a note or not

WATCH A VIDEO DEMONSTRATING THIS FUNCTIONALITY.
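
A minimal sketch of how the three filters could be combined is shown below. The `InstanceRow` class and its fields are assumptions for illustration. In particular, a missing severity is taken here to mean that the instance has not been annotated yet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstanceRow:
    name: str
    severity: Optional[str] = None  # None is assumed to mean "not annotated"
    note: str = ""


def filter_instances(rows, annotated=None, severity=None, has_note=None):
    """Keep rows matching every filter that is set; None disables a filter."""
    result = []
    for row in rows:
        if annotated is not None and (row.severity is not None) != annotated:
            continue
        if severity is not None and row.severity != severity:
            continue
        if has_note is not None and bool(row.note.strip()) != has_note:
            continue
        result.append(row)
    return result
```

For example, `filter_instances(rows, annotated=True, has_note=True)` would keep only annotated instances that carry a note.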

3.4 Annotations analysis

The DSE tool allows users to identify instances that require further annotation. These include:

instances without any annotations
instances annotated by a single user
instances annotated by multiple users whose annotations conflict, so there is no consensus

The DSE tool also allows users to identify fully annotated instances (instances annotated by all annotators) whose annotations disagree. For such instances, users can see the annotation forms completed by other users. After the users have discussed the instance, they should update their annotations to reach a consensus.

WATCH A VIDEO DEMONSTRATING THESE FUNCTIONALITIES.
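
The categories above can be expressed as a small classification function. This is a sketch under assumptions: each annotation is reduced to its severity value, and any difference between severities counts as a conflict. The status labels are invented for this example.

```python
def annotation_status(severities: list[str]) -> str:
    """Classify an instance by the severities its annotators assigned."""
    if not severities:
        return "not annotated"
    if len(severities) == 1:
        return "single annotator"
    if len(set(severities)) > 1:
        return "conflict"  # annotators disagree, no consensus
    return "consensus"


def requires_further_annotation(severities: list[str]) -> bool:
    return annotation_status(severities) != "consensus"
```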

3.5 Exporting annotated dataset

The dataset can be exported in several ways:

The user can export their own annotations (a draft dataset) at any time, regardless of whether they have annotated all instances.
The user can export a completely annotated dataset, meaning that at least two annotators annotated each instance. This export is built from the draft datasets previously exported by each user, so the user must specify the path to a text file listing those draft datasets. The exported complete dataset contains the following:

Annotation files containing annotations of all users for each instance, and the final annotation obtained using the majority vote algorithm
Heuristics files containing heuristics marked as applicable by each user for each instance
Metrics files containing values of structural metrics for each instance

WATCH A VIDEO DEMONSTRATING THESE FUNCTIONALITIES.
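
The final annotation is described as the result of a majority vote over the users' annotations. A minimal sketch of such a vote over severity values is shown below. The source does not say how ties are resolved, so returning `None` for a tie is an assumption, as is the function name.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(severities: Sequence[str]) -> Optional[str]:
    """Return the most frequent severity, or None if there is no clear winner."""
    if not severities:
        return None
    ranked = Counter(severities).most_common()
    # Assumption: a tie between the top two values yields no final annotation.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]
```

With three annotators, `majority_vote(["1", "1", "0"])` picks `"1"`. With two annotators who disagree, the sketch yields no result, which corresponds to the conflicting instances described in section 3.4.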
