Lab Assignment 5 Image Classification Using Random Forest & Client Application using Spark API - rashmitripathi/Big_Data_Analytics_And_Apps GitHub Wiki

1. Spark Programming:

Write a Spark program for the following machine learning task.
Create your own dataset for Image Classification Problem. Use the workflow as discussed in the Tutorial 4 Session using any classification algorithm (e.g., Random Forest, Naïve Bayes) excluding Decision Tree. Report the accuracy and confusion matrix obtained.

The data set chosen for this task consists of images of different types of cars, as shown below:

This helps us understand which types of cars are more popular among users, based on their selections.

The image data set is divided into training and test data:

    75% of the data set is used as training data.
    25% of the data set is used as test data.
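The 75/25 split above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the file names are hypothetical stand-ins for the car images, and the lab itself ran on Spark, where a call such as `randomSplit` on an RDD plays a similar role (with approximate rather than exact proportions).

```python
import random


def split_dataset(items, train_fraction=0.75, seed=42):
    """Shuffle a copy of items and split it into (train, test) lists."""
    shuffled = list(items)
    # A fixed seed makes the shuffle, and therefore the split, reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the car images.
images = [f"car_{i:03d}.jpg" for i in range(20)]
train, test = split_dataset(images)
print(len(train), len(test))  # 15 5
```

Shuffling before cutting keeps any ordering in the original file list (for example, images grouped by class) from leaking into the split.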

Images are classified using the Random Forest algorithm. A random forest is an ensemble classifier that consists of many decision trees and outputs the class that is the mode of the classes predicted by the individual trees.
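The "mode of the individual trees" idea can be illustrated with a small majority-vote sketch. The three trees, the feature names, and the class labels here are all hypothetical and hand-written for the example; a real random forest learns its trees from bootstrapped samples of the training data.

```python
from collections import Counter


def forest_predict(trees, features):
    """Return the most common class among the individual trees' predictions."""
    votes = [tree(features) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


# Hand-written stand-ins for trained decision trees (hypothetical rules).
def tree_a(f):
    return "suv" if f["height"] > 1.6 else "sedan"


def tree_b(f):
    return "suv" if f["length"] > 4.5 else "sedan"


def tree_c(f):
    return "sedan" if f["doors"] == 4 else "coupe"


sample = {"height": 1.7, "length": 4.8, "doors": 4}
prediction = forest_predict([tree_a, tree_b, tree_c], sample)
print(prediction)  # suv (two votes for "suv", one for "sedan")
```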

Test Images:

Training Images:

Generated histogram:

Confusion matrix:

Accuracy: 0.107
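For reference, a confusion matrix and the accuracy derived from it can be computed as in the sketch below. The labels and predictions are toy values invented for the example, not the results of this lab; rows are actual classes and columns are predicted classes.

```python
def confusion_matrix(actual, predicted, labels):
    """Build a matrix where entry [i][j] counts items of class i predicted as class j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of items on the diagonal, i.e. predicted correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy data (hypothetical classes and predictions).
labels = ["sedan", "suv", "truck"]
actual = ["sedan", "sedan", "suv", "suv", "truck", "truck"]
predicted = ["sedan", "suv", "suv", "suv", "truck", "sedan"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
print(accuracy(matrix))
```

Accuracy is the sum of the diagonal divided by the total number of test images, so off-diagonal entries show exactly which classes are being confused with each other.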

2. Client Application using Spark API

Write a client application using the Spark API to connect Spark and your client. Your client can be either a web application or an Android application.

For this part, I chose to classify weather images, which are given as input to the Image Classifier Scala program.

Front end (making an AJAX call to the Scala program):
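The request a client sends to the classifier could look roughly like the sketch below, written in Python rather than browser JavaScript. Everything specific in it is an assumption: the page does not give the real endpoint URL, HTTP method, or payload format, so `CLASSIFIER_URL` and the base64-in-JSON body are hypothetical.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint; the real URL and payload format are not given on this page.
CLASSIFIER_URL = "http://localhost:8080/classify"


def build_classify_request(image_bytes, url=CLASSIFIER_URL):
    """Wrap raw image bytes in a JSON POST request (assumed format)."""
    payload = json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_classify_request(b"fake image bytes")
print(request.get_method(), request.full_url)
# Sending it would be urllib.request.urlopen(request), which needs a running server.
```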

Correct Predictions:

Wrong Predictions:

3. Google Conversation Actions API

Build a simple application that holds a conversation, using the Google Conversation Actions API, about the summary you generated for your video.

(Already done in Lab Assignment 4)

Refer to the link below for more information:

https://github.com/rashmitripathi/Big_Data_Analytics_And_Apps/wiki/Lab-Assignment-4--Confusion-Matrix-for-Image-Classfication-&-Google-Conversational-App

References

Class Tutorial