Lab Assignment 3: Video Summarization, Linear Regression, and K-Means Clustering - rashmitripathi/Big_Data_Analytics_And_Apps GitHub Wiki

## 1. Spark Programming:

Write a Spark program for the following machine learning tasks.

A group of primatologists wants to study the details of the daily movement, activities, and interactions of a group of 6 chimpanzees living on "chimp island", a natural, though somewhat open habitat about 50 meters in diameter, bounded on all sides by water, in the San Diego Zoo. Since they don't want to sit all day every day recording the second-by-second positions and activities of the chimps, they have come to you, a computer vision expert, for automated assistance. They are interested both in compiling statistics about the movement and location of individuals, and in the frequency and locations of different interactions and activities (feeding, sleeping, grooming, fighting, etc.). They are willing to help in labeling relevant activities, even to the point of answering a few hundred quick questions per day of data (what's she doing here?), but they don't want to sit through 24 hours of video to do it. Ultimately they want an automated database that they can use to find out how many hours a day chimp Jane sleeps and where, histogram preferred eating locations, obtain statistics on who grooms whom, etc.

1. Build a linear regression model on two selected parameters of the chimpanzees’ daily movement, activities, and interactions. Define your own dataset.

2. Implement K-Means clustering to cluster the chimpanzees’ activities. Define your own dataset.

### Linear Regression

We selected two parameters: the first is the hour and the second is the position of the chimpanzee, as shown below. When the position stays the same from one hour to the next, we infer that the chimpanzee is sleeping.

Input

LinearInput

Output: the computed mean squared error

LinearOutput

Source Code:

SourceCode
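As a rough illustration of the regression step described above, the following is a minimal sketch in plain Python rather than Spark. It fits a line to (hour, position) pairs by ordinary least squares and reports the mean squared error. The sample values and the function names `fit_line` and `mean_squared_error` are assumptions made up for this example; they are not taken from the lab's dataset or source code.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between observed and predicted positions."""
    residuals = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return sum(residuals) / len(residuals)


# Hypothetical samples: hour of day and position (meters from a reference point).
hours = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
positions = [12.0, 12.0, 12.5, 20.0, 27.5, 35.0]

slope, intercept = fit_line(hours, positions)
mse = mean_squared_error(hours, positions, slope, intercept)
print("slope:", slope, "intercept:", intercept, "mse:", mse)
```

A Spark version would presumably distribute the same computation over a much larger dataset; the arithmetic per sample is unchanged.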

### K-Means Clustering Method

We selected three parameters: the first is the hour, the second is the position of the chimpanzee, and the third is the chimpanzee's heartbeat, as shown below. When the position and heartbeat are similar, we infer that the chimpanzee is performing the same activity.

Input:

kMeansInput

Output: each data point is assigned to one of three clusters, labeled 0, 1, and 2

kMeansOutput

Source Code:

SourceCode
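The clustering step above can be sketched in plain Python (not Spark) as a minimal version of Lloyd's K-Means algorithm. The (hour, position, heartbeat) rows are hypothetical, and using the first `k` points as initial centroids is an assumption made only to keep the example deterministic; it is not necessarily how the lab's Spark program initializes its clusters.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100):
    """Minimal Lloyd's algorithm; returns (labels, centroids)."""
    # Assumption: seed the centroids with the first k points (deterministic).
    centroids = [list(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # Assignments stopped changing, so the result is stable.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical rows: (hour, position, heartbeat).
points = [
    (1.0, 5.0, 50.0),
    (9.0, 20.0, 90.0),
    (14.0, 40.0, 120.0),
    (2.0, 5.0, 52.0),
    (10.0, 22.0, 88.0),
    (15.0, 42.0, 118.0),
]

labels, centroids = kmeans(points, k=3)
print(labels)
print(centroids)
```

Because the features are on different scales, the heartbeat column contributes most to the distance here; rescaling the columns before clustering may be worth considering on real data.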

## 2. Video Annotation:

Build a simple application that summarizes a video using the Clarifai API. Use the OpenIMAJ library to extract the key-frame images that are sent to the Clarifai API.

Sample video used for summary

Input video

inputvideo

Key frames were generated by the code, and the main frames were then generated from them

KeyFrames

MainFrame

MainFrame
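To give a feel for what key-frame extraction involves, here is a minimal sketch in plain Python. It does not use OpenIMAJ and is not the method used in this lab; it substitutes a simple frame-differencing rule, keeping a frame whenever it differs enough from the last kept frame. The tiny four-pixel grayscale "frames", the threshold value, and the function names are all assumptions for illustration.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equal-sized frames."""
    total = sum(abs(a - b) for a, b in zip(frame_a, frame_b))
    return total / len(frame_a)


def select_key_frames(frames, threshold):
    """Return indices of frames that differ strongly from the last key frame."""
    if not frames:
        return []
    key_indices = [0]  # The first frame is always kept.
    for i in range(1, len(frames)):
        last_key = frames[key_indices[-1]]
        if mean_abs_diff(frames[i], last_key) > threshold:
            key_indices.append(i)
    return key_indices


# Hypothetical video: each frame is a flat list of grayscale pixel values.
frames = [
    [10, 10, 10, 10],
    [11, 10, 10, 10],      # nearly identical to frame 0
    [200, 200, 200, 200],  # scene change
    [201, 200, 199, 200],  # nearly identical to frame 2
    [50, 60, 50, 60],      # another scene change
]

key_frames = select_key_frames(frames, threshold=20)
print(key_frames)
```

Comparing against the last kept frame, rather than the immediately preceding one, keeps slow gradual changes from being missed entirely.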

Home page of the application, built with Maven and a Servlet, where the user provides the video path

KeyFrames

Home page with the input provided by the user

MainFrame

Summary shown to the user in a text box, produced from the Clarifai API results

MainFrame
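One plausible way to turn per-frame annotations into a short summary is to count how many frames each tag appears in and keep the most frequent ones. The sketch below, in plain Python, assumes that each main frame has already been annotated with a list of tags; the tag values and the function name `summarize_tags` are hypothetical, no Clarifai API call is made, and this is not necessarily how the lab's application builds its summary.

```python
from collections import Counter


def summarize_tags(frame_tags, top_n=3):
    """Return the top_n tags ranked by the number of frames they appear in."""
    # Count each tag at most once per frame.
    counts = Counter(tag for tags in frame_tags for tag in set(tags))
    # Sort by descending frequency, breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_n]]


# Hypothetical tags for three main frames.
frame_tags = [
    ["dog", "grass", "outdoor"],
    ["dog", "ball"],
    ["dog", "grass"],
]

top_tags = summarize_tags(frame_tags)
print("The video mainly shows: " + ", ".join(top_tags))
```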
