LAB 4 REPORT - SAISRIHARSHAS/Big-Data-Analytics-and-Applications-CS5542 GitHub Wiki

Create your own dataset for an image classification problem. Use the workflow discussed in the Tutorial 4 session with the Decision Tree algorithm. Report the accuracy and confusion matrix obtained. In the wiki page, include a brief description of your dataset and the purpose behind the image classification problem.

SOLUTION: Step 1

a) Selected the input dataset based on our project, "Virtual Reality project for Climate Change Awareness", and divided the images into training and test sets. https://github.com/SAISRIHARSHAS/Big-Data-Analytics-and-Applications-CS5542/tree/master/Lab%204/Source/data
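The split into training and test sets can be illustrated with a minimal sketch. The class names, file names, and the 80/20 ratio below are assumptions for illustration only; the report does not state how the images were actually divided.

```python
import random

def split_train_test(images_by_class, test_fraction=0.2, seed=42):
    """Split each class's image list into training and test subsets."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, test = {}, {}
    for label, paths in images_by_class.items():
        shuffled = list(paths)
        rng.shuffle(shuffled)
        # Hold out a fraction of each class; at least one image if possible.
        n_test = max(1, int(len(shuffled) * test_fraction)) if len(shuffled) > 1 else 0
        test[label] = shuffled[:n_test]
        train[label] = shuffled[n_test:]
    return train, test

# Hypothetical class and file names, not the ones in the repository.
images_by_class = {
    "class_a": [f"class_a_{i}.jpg" for i in range(10)],
    "class_b": [f"class_b_{i}.jpg" for i in range(10)],
}
train, test = split_train_test(images_by_class)
print({label: (len(train[label]), len(test[label])) for label in train})
```

Splitting per class keeps every class represented in both subsets, which matters for small hand-collected datasets.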

b) Extracted features by finding key descriptors.

Key Descriptor Output:

Step 2

a) Created a Bag of Features (BOF) using SIFT. Created a Bag of Words (BOW) by applying k-means clustering to the BOF to encode each image from the training set. Finally, the SIFT key vectors were plotted in the plane.
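To make the clustering step concrete, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) that turns a set of descriptors into a visual vocabulary. It is not the lab's implementation: the 2-D toy points stand in for real SIFT descriptors (which are 128-dimensional), and the function names, `k`, and iteration count are illustrative assumptions.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))

def kmeans(points, k, iterations=20, seed=0):
    """Cluster points into k groups and return the k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids

# Toy 2-D "descriptors" forming two well-separated groups (assumed data).
descriptors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
               (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
vocabulary = kmeans(descriptors, k=2)
print(sorted(vocabulary))
```

Each returned centroid plays the role of one "visual word"; the number of clusters sets the vocabulary size.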

KMeans Output (result):

b) Created a histogram based on the k-means clustering. The histogram acts as the feature vector for the image.
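As a rough sketch of how such a histogram can be built, each descriptor of an image is assigned to its nearest cluster centre and the counts per centre are normalized. The vocabulary and descriptors below are made-up 2-D values, and `bow_histogram` is a hypothetical helper name, not a function from the lab code.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bow_histogram(descriptors, vocabulary):
    """Count how many descriptors fall nearest to each visual word."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda i: squared_distance(d, vocabulary[i]))
        counts[nearest] += 1
    total = sum(counts)
    # Normalize so images with different numbers of keypoints are comparable.
    return [c / total for c in counts] if total else [0.0] * len(counts)

# Assumed toy vocabulary (cluster centres) and descriptors of one image.
vocabulary = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
image_descriptors = [(0.1, 0.2), (4.8, 5.1), (5.2, 4.9), (9.7, 0.3)]
histogram = bow_histogram(image_descriptors, vocabulary)
print(histogram)  # → [0.25, 0.5, 0.25]
```

The histogram has one entry per visual word, so every image is encoded as a vector of the same length regardless of how many keypoints it contains.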

Histogram Output:

Step 3

Trained an image classifier on the BOW features using the Decision Tree algorithm. Analyzed its predictions to obtain a confusion matrix.
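For readers unfamiliar with how a decision tree learns from histogram features, the following is a small illustrative sketch in pure Python using Gini impurity. It is not the code used in this lab; the training histograms, the labels `class_a` and `class_b`, and the `max_depth` value are assumptions. Labels are assumed to be strings, since tuples are used for internal nodes.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=5):
    """Recursively build a tree; leaves are labels, nodes are tuples."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # majority-class leaf
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Assumed toy BOW histograms (3 visual words) with hypothetical labels.
train_histograms = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1],
                    [0.1, 0.2, 0.7], [0.2, 0.1, 0.7]]
train_labels = ["class_a", "class_a", "class_b", "class_b"]
tree = build_tree(train_histograms, train_labels)
print(predict(tree, [0.65, 0.25, 0.10]))  # → class_a
```

A split is accepted only when it strictly reduces impurity, so a node that is already pure becomes a leaf immediately.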

Sample Code:

Decision Tree Output:

Confusion Matrix Output:

Predicting Test Image:

Step 4

Classified images based on BOW. The accuracy obtained after classification was 76%. The confusion matrix shows how the predicted classes compare with the actual classes.
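The relationship between the confusion matrix and the accuracy figure can be sketched as follows: rows are actual classes, columns are predicted classes, and accuracy is the sum of the diagonal divided by the total. The labels and predictions here are toy values chosen for illustration and do not reproduce the 76% result reported above.

```python
def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of predictions on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0

# Hypothetical labels and predictions for five test images.
labels = ["class_a", "class_b"]
actual = ["class_a", "class_a", "class_a", "class_b", "class_b"]
predicted = ["class_a", "class_a", "class_b", "class_b", "class_b"]

matrix = confusion_matrix(actual, predicted, labels)
print(matrix)            # → [[2, 1], [0, 2]]
print(accuracy(matrix))  # → 0.8
```

Off-diagonal entries show which classes are confused with each other, which a single accuracy number cannot reveal.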

Final Confusion Matrix and Accuracy: