Lab 5 - nikhitasharma/RTBigDataAnalytics_Project GitHub Wiki

Lab 5 Task:
Implement a classification model (Decision Tree) built on Spark MLlib that classifies students correctly, training the model with student images and a sample video using feature extraction. Report the F-measure, precision, recall, and confusion matrix.

Feature Extraction:
Features are extracted from the individual student images and used to test the model.
Features are extracted from the classroom video and used to train the model.
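
The lab does not state which features were extracted, so the following is only a minimal sketch of turning an image into a fixed-length feature vector. It assumes a grayscale image given as nested lists and uses a normalized intensity histogram as a stand-in feature. The function name `intensity_histogram`, the bin count, and the sample image are all hypothetical.

```python
def intensity_histogram(image, bins=4, max_value=255):
    """Return a normalized histogram of pixel intensities as a feature vector.

    `image` is a list of rows, each row a list of integer grayscale values
    in the range 0..max_value. This is an assumed stand-in feature, not the
    feature extractor used in the lab.
    """
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            # Map the pixel value to a bin index; clamp to the last bin.
            index = min(pixel * bins // (max_value + 1), bins - 1)
            counts[index] += 1
            total += 1
    if total == 0:
        return [0.0] * bins
    # Normalize so vectors from images of different sizes are comparable.
    return [c / total for c in counts]


# Hypothetical 2x4 grayscale "image" used only for illustration.
sample_image = [
    [0, 10, 200, 255],
    [64, 128, 130, 90],
]
features = intensity_histogram(sample_image)
```

Each video frame or student image would yield one such vector, so the training and test inputs have the same length.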

Training the Model:
We used the decision tree algorithm as the classification algorithm to build the model.
Feature vectors from the classroom video are given as input for training the model.
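
To illustrate what a decision tree does with these feature vectors, here is a minimal pure-Python sketch of choosing a single split by Gini impurity. It is not the Spark MLlib implementation used in the lab; it only shows the core idea on one tree level. The helper names, feature vectors, and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    impurity = 1.0
    for cls in set(labels):
        p = labels.count(cls) / n
        impurity -= p * p
    return impurity


def best_split(vectors, labels):
    """Return (weighted impurity, feature index, threshold) of the best split."""
    best = None
    n = len(vectors)
    for f in range(len(vectors[0])):
        for threshold in sorted({v[f] for v in vectors}):
            left = [y for v, y in zip(vectors, labels) if v[f] <= threshold]
            right = [y for v, y in zip(vectors, labels) if v[f] > threshold]
            if not left or not right:
                continue  # a split must send samples to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            # Keep the first split with the strictly lowest impurity.
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


# Hypothetical feature vectors and student labels (0 and 1), for illustration only.
train_vectors = [[0.1, 0.9], [0.2, 0.8], [0.7, 0.3], [0.9, 0.2]]
train_labels = [0, 0, 1, 1]
score, feature, threshold = best_split(train_vectors, train_labels)
```

A full decision tree repeats this search recursively on each side of the chosen split until the nodes are pure or a depth limit is reached.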

Testing the Model:
The feature vectors from the individual student images are used to test the model. The higher the accuracy, the better the model performs. This step was done to estimate the performance of the classification model.
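
As a rough sketch of this evaluation step, the confusion-matrix counts and the accuracy can be computed by comparing predicted labels with actual labels. The label lists below are hypothetical, not the lab's data; they were chosen so that the counts match those reported in the Results. The function name `confusion_counts` is also an assumption.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


# Hypothetical labels for two test images (not the lab's actual data).
actual = [1, 1]
predicted = [1, 0]

counts = confusion_counts(actual, predicted)
# Accuracy is the share of predictions that match the actual label.
accuracy = (counts["tp"] + counts["tn"]) / len(actual)
```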

Screenshots link: https://github.com/nikhitasharma/RTBigDataAnalytics_Project/issues/3
Panopto Video Link:
https://umkc.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=28e6614e-2623-4eea-b525-c30158099bf0

Results:
Accuracy: 0.50 (50 %)

Confusion Matrix:

True Positive = 1; False Positive = 0; True Negative = 0; False Negative = 1

Precision: tp / (tp + fp) = 1 / (1 + 0) = 1

Recall: tp / (tp + fn) = 1 / (1 + 1) = 0.5

F-measure:

2 * (Precision * Recall) / (Precision + Recall) = 2 * (1 * 0.5) / (1 + 0.5) ≈ 0.67
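
The metrics can be recomputed from the confusion-matrix counts with the standard definitions, where the F-measure is taken to be the F1 score (the harmonic mean of precision and recall). This is a minimal sketch; the function name `precision_recall_f1` is hypothetical, and the zero-denominator handling is an added assumption.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    # Guard against empty denominators by returning 0.0 (an assumed convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts taken from the confusion matrix reported above.
precision, recall, f1 = precision_recall_f1(tp=1, fp=0, fn=1)
```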