Lab Assignment 3 - nikky4222/BigDataSpring2017 GitHub Wiki

LAB ASSIGNMENT 3


Question

1. Build a linear regression model for two selected parameters of a chimpanzee's daily movement, activities and interaction. Define your own dataset.
2. Implement K-Means clustering to cluster the chimpanzee's activities. Define your own dataset.

Linear Regression

In this post I implement linear regression and see it work on data. Linear regression is one of the oldest and most widely used predictive models in machine learning. The goal is to fit a straight line to a set of data points by minimizing the sum of the squared errors.
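To make the idea of minimizing the sum of squared errors concrete, here is a minimal sketch of a closed-form least squares fit for one input and one output. The function names and the sample values (hours spent eating and sleeping) are made up for illustration and are not taken from the lab's dataset or program.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (both left unnormalized; the factor cancels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared differences between the line and the observed values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical values: hours spent eating (x) and sleeping (y)
eating = [1.0, 2.0, 3.0, 4.0]
sleeping = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(eating, sleeping)
sse = sum_squared_errors(eating, sleeping, slope, intercept)
print(slope, intercept, sse)
```

Because the sample points lie exactly on a line, the fitted line passes through all of them and the error is zero. With real measurements the error would be positive.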

Data Sets

A dataset with four parameters (eating, sleeping, grooming, fighting) was designed in Matlab.





The values for the other two parameters are obtained in the same way, resulting in the final dataset shown below.



The program takes the text file mentioned above as input and calculates the error. The step size and the number of iterations can be adjusted to improve accuracy.



After the program has run, it reports the mean training error and the mean test error.
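The role of the step size, the number of iterations, and the train and test errors can be sketched with a small batch gradient descent. This is not the lab's actual program. The data, the train/test split, and all function names below are assumptions chosen for illustration.

```python
def gradient_descent(points, step_size, iterations):
    """Fit y = w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(iterations):
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2.0 / n * sum((w * x + b - y) * x for x, y in points)
        grad_b = 2.0 / n * sum((w * x + b - y) for x, y in points)
        w -= step_size * grad_w
        b -= step_size * grad_b
    return w, b


def mean_squared_error(points, w, b):
    """Average squared difference between predictions and observed values."""
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)


# Hypothetical (x, y) pairs lying close to y = 2x + 1
data = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8), (5.0, 11.1), (6.0, 12.9)]
train, test = data[:4], data[4:]  # assumed split: first four points for training

w, b = gradient_descent(train, step_size=0.01, iterations=5000)
train_error = mean_squared_error(train, w, b)
test_error = mean_squared_error(test, w, b)

# The same step size with far fewer iterations gives a worse fit
w_short, b_short = gradient_descent(train, step_size=0.01, iterations=10)
short_error = mean_squared_error(train, w_short, b_short)

print(train_error, test_error, short_error)
```

Too few iterations stop before the line has settled, while a step size that is too large can make the updates diverge, so both values usually need some trial and error.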

K Means Clustering

K-means clustering aims to partition n observations into k clusters so that each observation belongs to the cluster with the nearest mean, which serves as the prototype of the cluster.
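The assignment and update steps described above can be sketched in plain Python as follows. This is a minimal illustration, not the program used in the lab. The four-value rows (eating, sleeping, grooming, fighting) and the choice of starting centroids are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centroids, max_iterations=100):
    centroids = list(initial_centroids)
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            new_centroids.append(mean_point(members) if members else old)
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical rows: hours of (eating, sleeping, grooming, fighting)
activities = [
    (1.0, 8.0, 0.5, 0.1),
    (1.2, 7.5, 0.6, 0.2),
    (4.0, 3.0, 2.0, 1.5),
    (4.2, 2.8, 2.2, 1.4),
]
centroids, labels = k_means(activities, [activities[0], activities[2]])
print(labels)
```

Starting from fixed centroids keeps the sketch deterministic. In practice the starting centroids are usually picked at random, and the result can depend on that choice.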

Dataset

A dataset with four parameters is created in Matlab in the same way as for linear regression. The data is shown below.



The mean squared error is calculated from the clusters and their centroids.
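One common way to compute such an error is to average the squared distance from each point to its nearest centroid. The sketch below assumes that definition, which may differ from the one used by the lab's program (some libraries report the sum rather than the mean). The points and centroids are made up.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def clustering_mse(points, centroids):
    """Mean squared distance from each point to its nearest centroid."""
    total = sum(min(squared_distance(p, c) for c in centroids) for p in points)
    return total / len(points)


# Hypothetical two-dimensional points forming two obvious groups
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]

two_clusters = clustering_mse(points, [(1.0, 0.0), (11.0, 0.0)])
one_cluster = clustering_mse(points, [(6.0, 0.0)])
print(two_clusters, one_cluster)
```

A lower value means the points sit closer to their centroids, which is why the error drops when the two groups each get their own centroid.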



Image Annotation Using Clarifai API

The Clarifai API offers image and video recognition as a service. Whether you have one image or billions, you are only steps away from using artificial intelligence to recognize your visual content. The API is built around a simple idea. You send inputs (images) to the service and it returns predictions.
The type of prediction is based on what model you run the input through. For example, if you run your input through the 'food' model, the predictions it returns will contain concepts that the 'food' model knows about. If you run your input through the 'color' model, it will return predictions about the dominant colors in your image.
The names and weights that are displayed as output are saved to a text file, which can be used for further processing.
First, key frames and main frames are generated from the video that is passed as input.
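Saving the names and weights to a text file for later processing could look like the sketch below. The tab-separated layout, the file name, and the sample annotations are assumptions, since the lab's actual file format is not shown.

```python
import os
import tempfile


def save_annotations(path, annotations):
    """Write one 'name<TAB>weight' pair per line (assumed layout)."""
    with open(path, "w", encoding="utf-8") as handle:
        for name, weight in annotations:
            handle.write(f"{name}\t{weight}\n")


def load_annotations(path):
    """Read the pairs back as (name, weight) tuples."""
    annotations = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            name, weight = line.rstrip("\n").split("\t")
            annotations.append((name, float(weight)))
    return annotations


# Hypothetical annotations for a single frame
sample = [("chimpanzee", 0.98), ("primate", 0.95), ("wildlife", 0.90)]

path = os.path.join(tempfile.mkdtemp(), "annotations.txt")
save_annotations(path, sample)
restored = load_annotations(path)
print(restored)
```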

Key Frames

Several key frames are generated from the video that is sent as input.









Main Frames




Image Annotation & Summary

These main frames are given as input to the image annotation program, which finds the name and weight of each annotation and displays the annotations present in each main frame.



An output is generated with the names and weights.
For the summary, the top 6 words are printed as shown below by looping through the main frames.
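A loop of this kind could be sketched as follows. The source does not say how the top 6 words are chosen, so this sketch assumes they are ranked by the highest weight seen for each name across all main frames. The frame contents are hypothetical.

```python
def top_words(frames, limit=6):
    """Return the names with the highest weight seen in any frame."""
    best = {}
    for annotations in frames:
        for name, weight in annotations:
            # Keep the largest weight observed for each name
            if weight > best.get(name, 0.0):
                best[name] = weight
    # Sort by weight (descending), then by name to break ties
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


# Hypothetical (name, weight) annotations for three main frames
frames = [
    [("chimpanzee", 0.99), ("primate", 0.97), ("tree", 0.80)],
    [("primate", 0.98), ("forest", 0.85), ("grass", 0.60)],
    [("chimpanzee", 0.95), ("mammal", 0.90), ("wildlife", 0.88), ("rock", 0.40)],
]

summary = top_words(frames)
print(summary)
```

Ranking by summed weight or by how often a name appears would be equally reasonable choices and could give a different summary.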



The output for each main frame is shown below.









All other values are displayed in the same way. The final annotations are plotted at random positions on the image.
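Choosing random positions for the annotation labels could be sketched like this. The image size, the margin, and the function name are assumptions. The sketch only computes coordinates and leaves the actual drawing to whatever imaging library is in use.

```python
import random


def place_labels(labels, width, height, margin=10, seed=None):
    """Pick a random (x, y) position inside the image for each label."""
    rng = random.Random(seed)  # a fixed seed makes the layout repeatable
    placements = []
    for label in labels:
        x = rng.randint(margin, width - margin)
        y = rng.randint(margin, height - margin)
        placements.append((label, x, y))
    return placements


# Hypothetical labels and a hypothetical 640 x 480 frame
labels = ["chimpanzee", "primate", "forest"]
placements = place_labels(labels, width=640, height=480, seed=42)
print(placements)
```

The margin keeps each label's starting point away from the border, although long labels placed near the right edge can still run past it.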





The remaining main frames are annotated in the same way.