Lab Assignment 4 - GeoSnipes/Big-Data GitHub Wiki

LAB ASSIGNMENT #4

Team Members

Pranoop Mutha - 15

Objective:

  • Write a TensorFlow program for the following tasks.

  • Implement linear regression for your project dataset.

  • Plot the training cost using Matplotlib in Python.

  • Implement softmax classification for your project dataset and report the accuracy.

  • Visualize both training and testing in TensorBoard.

Features

TensorFlow

TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.

MNIST Classification

MNIST is a simple computer vision dataset consisting of images of handwritten digits.

Each image in MNIST has a corresponding label, a number between 0 and 9, telling us which digit is drawn in the image. For example, the labels for the above images are 5, 0, 4, and 1.

Linear Regression

Linear regression is a basic and commonly used type of predictive analysis. The overall idea of regression is to examine two questions:

(1) Does a set of predictor variables do a good job of predicting an outcome (dependent) variable?

(2) Which variables in particular are significant predictors of the outcome variable, and in what way do they impact it, as indicated by the magnitude and sign of the beta estimates?

These regression estimates are used to explain the relationship between one dependent variable and one or more independent variables. The simplest form of the regression equation with one dependent and one independent variable is defined by the formula y = c + b*x, where y = estimated dependent variable score, c = constant, b = regression coefficient, and x = score on the independent variable.
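As an illustration of the formula y = c + b*x, the following is a minimal sketch of an ordinary least-squares fit in plain Python. The function name `fit_line` and the sample data are hypothetical and are not taken from the lab code.

```python
def fit_line(xs, ys):
    """Return (c, b) that minimize the squared error of y = c + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # regression coefficient (slope)
    c = mean_y - b * mean_x  # constant (intercept)
    return c, b


# Hypothetical data generated from y = 1 + 2*x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
c, b = fit_line(xs, ys)
print(c, b)
```

Because the sample points lie exactly on a line, the fit recovers the constant 1 and the coefficient 2.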

SoftMax Regression Model

Softmax regression is a generalized form of logistic regression that can be used in multi-class classification problems where the classes are mutually exclusive. The handwritten digit dataset (MNIST) is a good example.
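To make the idea concrete, here is a minimal sketch of the softmax function in plain Python. It turns a list of raw class scores into probabilities that sum to 1, and the predicted class is the one with the highest probability. The scores used here are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three mutually exclusive classes.
probs = softmax([2.0, 1.0, 0.1])
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```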

Steps:

We are using our project dataset and converting its images to MNIST format.

This is the code we used to convert the images to MNIST format.

The conversion completed successfully.
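For readers unfamiliar with the MNIST file layout, the sketch below shows how pixel data and labels can be packed into the IDX format that the MNIST files use. It is not the conversion code used in this lab. The function names are hypothetical, the tiny 2x2 "images" are placeholders for real pixel data, and loading and resizing the image files is left out.

```python
import struct


def images_to_idx(images):
    """Pack equally sized grayscale images (rows of 0-255 ints) as IDX3 bytes."""
    rows, cols = len(images[0]), len(images[0][0])
    # Header: magic number 2051, image count, rows, cols (big-endian uint32).
    out = bytearray(struct.pack(">IIII", 2051, len(images), rows, cols))
    for img in images:
        for row in img:
            out.extend(bytes(row))  # one unsigned byte per pixel
    return bytes(out)


def labels_to_idx(labels):
    """Pack integer labels (0-255) as IDX1 bytes."""
    # Header: magic number 2049 and label count, then one byte per label.
    return struct.pack(">II", 2049, len(labels)) + bytes(labels)


# Two hypothetical 2x2 images; real MNIST images are 28x28.
images = [[[0, 255], [128, 64]], [[1, 2], [3, 4]]]
image_bytes = images_to_idx(images)
label_bytes = labels_to_idx([3, 7])
print(len(image_bytes), len(label_bytes))
```

The resulting byte strings could be written to files (optionally gzip-compressed) so that an MNIST loader can read them.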

Linear Regression

The generated MNIST_data was used as input. We used the following code to obtain the accuracy of our linear regression model.

We set the validation size to 200 and the learning rate to 0.1.
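The role of the learning rate can be sketched with a small gradient-descent loop in plain Python. This is an illustrative stand-in rather than the TensorFlow code used in the lab: the function `train_linear`, the number of epochs, and the sample data are assumptions, and only the learning rate of 0.1 comes from the setup above.

```python
def train_linear(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit y = c + b*x by gradient descent; return (c, b, costs)."""
    c, b = 0.0, 0.0
    n = len(xs)
    costs = []
    for _ in range(epochs):
        errors = [c + b * x - y for x, y in zip(xs, ys)]
        # Half mean squared error, recorded once per epoch.
        costs.append(sum(e * e for e in errors) / (2 * n))
        grad_c = sum(errors) / n
        grad_b = sum(e * x for e, x in zip(errors, xs)) / n
        c -= learning_rate * grad_c
        b -= learning_rate * grad_b
    return c, b, costs


# Hypothetical data generated from y = 1 + 2*x.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [1.0, 1.5, 2.0, 2.5, 3.0]
c, b, costs = train_linear(xs, ys)
print(c, b, costs[0], costs[-1])
```

The `costs` list holds the training cost per epoch, so it could be passed to Matplotlib (for example `plt.plot(costs)`) to produce the training-cost plot named in the objectives.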

SoftMax Regression

The generated MNIST_data was used as input.

We used the following code to obtain the accuracy of our softmax regression model.

We set the validation size to 100.

First, we run mnist_train.py to train our model. The code ran successfully and mnist_model was generated.

Next, we use the trained model to test its accuracy.

The accuracy of our model is 19.13%.
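Accuracy here means the fraction of samples whose predicted class matches the true label. A minimal sketch of that calculation follows, assuming the model outputs one probability list per sample. The function name and the four sample predictions are hypothetical and unrelated to the 19.13% result.

```python
def accuracy(predicted_probs, true_labels):
    """Fraction of samples whose most probable class equals the true label."""
    correct = 0
    for probs, label in zip(predicted_probs, true_labels):
        if probs.index(max(probs)) == label:
            correct += 1
    return correct / len(true_labels)


# Hypothetical softmax outputs for four samples over three classes.
predicted_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
]
true_labels = [0, 1, 2, 2]
acc = accuracy(predicted_probs, true_labels)
print(f"{acc:.2%}")
```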

TensorBoard Visualizations:

Launch TensorBoard to view the different visualizations.

Visualization Outputs:

Main Graph:

The graph visualization helps in understanding the structure of a complicated TensorFlow computation.

Cross-entropy: The cross-entropy decreased very slowly, from approximately 2.30 in the first epoch to approximately 2.21 in the 1000th epoch.
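One way to read the starting value is to compare it with the cross-entropy of a model that assigns equal probability to every class. The sketch below assumes ten classes, as in MNIST. The number of classes in the project dataset is not stated here, so treat this as an interpretation aid rather than a result.

```python
import math


def cross_entropy(probs, true_class):
    """Cross-entropy loss for one sample: -log of the true class probability."""
    return -math.log(probs[true_class])


num_classes = 10  # assumption: ten classes, as in MNIST
uniform = [1.0 / num_classes] * num_classes
baseline = cross_entropy(uniform, true_class=3)
print(round(baseline, 4))  # 2.3026
```

Under that assumption, a value near 2.30 corresponds to near-uniform predictions, and a drop to only about 2.21 suggests the model has moved little from that baseline.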

Cross-hist Distribution and Histogram:

Bias Distributions and Histogram:

Max Weights Distribution and Histogram:

Weights Distribution and Histogram: