ICP8
5-2 15 Naga Venkata Satya Pranoop Mutha
5-2 23 Geovanni West
- Write TensorFlow code implementing SOFTMAX classification for the MNIST dataset and report its accuracy.
- Display the TensorBoard visualization for the task.
Before going into the assignment, we will cover some terminology and basics regarding Softmax classification, TensorFlow, deep learning, and some of the commands used in this code.
TensorFlow:
It is an open-source machine learning framework and software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (i.e., tensors) communicated between them. This lets us deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device using a single API. TensorFlow was developed at Google.
Comparison of Logistic and Softmax Regression:
Softmax regression (synonyms: multinomial logistic regression, maximum entropy classifier, or simply multi-class logistic regression) is a generalization of logistic regression that we can use for multi-class classification, under the assumption that the classes are mutually exclusive. In contrast, we use the (standard) logistic regression model for binary classification tasks.
Computation:
The softmax function turns the raw class scores (logits) into a probability distribution over all classes, whereas logistic regression applies the sigmoid function to a single score; both are sketched below.
Differences:
Logistic regression handles exactly two mutually exclusive outcomes, while softmax regression handles K mutually exclusive classes whose predicted probabilities sum to one.
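A brief sketch of the two computations in standard notation (this is general background, not the exact formulas from the assignment screenshots):

```latex
% Logistic (binary) regression: probability of the positive class
P(y = 1 \mid x) = \sigma(w^{\top} x + b) = \frac{1}{1 + e^{-(w^{\top} x + b)}}

% Softmax (multi-class) regression: probability of class j out of K classes
P(y = j \mid x) = \frac{e^{w_j^{\top} x + b_j}}{\sum_{k=1}^{K} e^{w_k^{\top} x + b_k}}
```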
The placeholder_inputs() function creates two tf.placeholder ops that define the shape of the inputs, including the batch size, to the rest of the graph and into which the actual training examples will be fed.
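A minimal sketch of what placeholder_inputs() might look like (the 784-pixel flattened image size and the int32 label type are assumptions based on standard MNIST usage, not the assignment's exact code):

```python
import tensorflow as tf

IMAGE_PIXELS = 28 * 28  # assumed size of a flattened MNIST image

def placeholder_inputs(batch_size):
    # Placeholder for a batch of flattened 28x28 grayscale images.
    images_placeholder = tf.placeholder(tf.float32,
                                        shape=(batch_size, IMAGE_PIXELS))
    # Placeholder for the corresponding integer class labels (0-9).
    labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size,))
    return images_placeholder, labels_placeholder
```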
After creating placeholders for the data, the graph is built from the mnist.py file according to a 3-stage pattern: inference(), loss(), and training().
- inference() - Builds the graph as far as is required for running the network forward to make predictions. It takes the images placeholder as input and builds on top of it a pair of fully connected layers with ReLU activation followed by a ten-node linear layer specifying the output logits (a sketch follows the commands below).
Commands:
- hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)
- hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)
- logits = tf.matmul(hidden2, weights) + biases
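Putting those three commands together, a hedged sketch of inference() (the hidden layer sizes of 128 and 32 units are assumptions; in mnist.py each layer lives in its own name scope with its own weights and biases variables):

```python
import math
import tensorflow as tf

NUM_CLASSES = 10
IMAGE_PIXELS = 28 * 28  # assumed size of a flattened MNIST image

def inference(images, hidden1_units=128, hidden2_units=32):
    # First fully connected layer with ReLU activation.
    with tf.name_scope('hidden1'):
        weights = tf.Variable(
            tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                                stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
            name='weights')
        biases = tf.Variable(tf.zeros([hidden1_units]), name='biases')
        hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)
    # Second fully connected layer with ReLU activation.
    with tf.name_scope('hidden2'):
        weights = tf.Variable(
            tf.truncated_normal([hidden1_units, hidden2_units],
                                stddev=1.0 / math.sqrt(float(hidden1_units))),
            name='weights')
        biases = tf.Variable(tf.zeros([hidden2_units]), name='biases')
        hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)
    # Final ten-node linear layer producing one logit per digit class.
    with tf.name_scope('softmax_linear'):
        weights = tf.Variable(
            tf.truncated_normal([hidden2_units, NUM_CLASSES],
                                stddev=1.0 / math.sqrt(float(hidden2_units))),
            name='weights')
        biases = tf.Variable(tf.zeros([NUM_CLASSES]), name='biases')
        logits = tf.matmul(hidden2, weights) + biases
    return logits
```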
- loss() - Adds to the inference graph the ops required to generate loss. First, the values from the labels_placeholder are converted to 64-bit integers. Then, a tf.nn.sparse_softmax_cross_entropy_with_logits op is added to automatically produce 1-hot labels from the labels_placeholder and compare the output logits from the inference() function with those 1-hot labels.
Command:
- labels = tf.to_int64(labels)
- cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits, name='xentropy')
It then uses tf.reduce_mean to average the cross entropy values across the batch dimension (the first dimension) as the total loss.
- loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')
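Assembled into a single function, a hedged sketch of loss() following the commands above:

```python
def loss(logits, labels):
    # The labels arrive as int32; the sparse cross-entropy op expects int64.
    labels = tf.to_int64(labels)
    # Computes softmax internally and compares the logits against the labels.
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits, name='xentropy')
    # Average the per-example cross-entropy over the batch dimension.
    return tf.reduce_mean(cross_entropy, name='xentropy_mean')
```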
- training() - Adds to the loss graph the ops required to compute and apply gradients. First, it takes the loss tensor from the loss() function and hands it to tf.summary.scalar, an op for generating summary values into the events file when used with a tf.summary.FileWriter (see below). In this case, it will emit the snapshot value of the loss every time the summaries are written out. A sketch of the full function follows the commands below.
Commands:
- tf.summary.scalar('loss', loss)
- optimizer = tf.train.GradientDescentOptimizer(learning_rate)
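A hedged sketch of training() built from those commands (the global_step counter mirrors the standard mnist.py pattern and is an assumption here):

```python
def training(loss, learning_rate):
    # Emit a scalar summary of the loss for TensorBoard.
    tf.summary.scalar('loss', loss)
    # Plain gradient descent optimizer with the given learning rate.
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    # Counter for how many training steps have been applied so far.
    global_step = tf.Variable(0, name='global_step', trainable=False)
    # One op that both computes the gradients and applies them.
    train_op = optimizer.minimize(loss, global_step=global_step)
    return train_op
```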
Once the graph is built, it can be iteratively trained and evaluated in a loop controlled by the user code in fully_connected_feed.py.
The feed mechanism is used to patch a tensor directly into any operation in the graph. A feed temporarily replaces the output of an operation with a tensor value. In the fill_feed_dict() function, the given DataSet is queried for its next batch_size set of images and labels, and tensors matching the placeholders are filled with the next images and labels.
A python dictionary object is then generated with the placeholders as keys and the representative feed tensors as values.
Command: feed_dict = { images_placeholder: images_feed, labels_placeholder: labels_feed, }
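A minimal sketch of fill_feed_dict() built from that description (the batch_size default of 100 is an assumption):

```python
def fill_feed_dict(data_set, images_placeholder, labels_placeholder,
                   batch_size=100):
    # Pull the next batch of images and labels from the DataSet.
    images_feed, labels_feed = data_set.next_batch(batch_size)
    # Map each placeholder to the tensor that will replace it at run time.
    feed_dict = {
        images_placeholder: images_feed,
        labels_placeholder: labels_feed,
    }
    return feed_dict
```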
Input: the MNIST dataset, a collection of images of handwritten digits.
The first step is to run the training data and perform SOFTMAX classification on the input data. The model equation was shown as a screenshot; its standard form is sketched below.
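Since the screenshot is not reproduced here, this is the usual softmax regression model for MNIST written in conventional notation (an assumption, not the original screenshot):

```latex
y = \mathrm{softmax}(Wx + b), \qquad
\mathrm{softmax}(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{10} e^{z_k}}
```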
Stochastic Training:
Here we run the training update in a loop over only 1000 iterations, feeding a small random batch of training examples at each step.
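A hedged sketch of the stochastic training loop (the batch size of 100 and learning rate of 0.5 follow the standard MNIST softmax tutorial and are assumptions, not values confirmed by this report):

```python
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# Softmax regression model: y = softmax(Wx + b).
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)

# Cross-entropy loss against the one-hot labels.
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(
    -tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

# Stochastic training: 1000 iterations over small random batches.
for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
```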
The accuracy was reported as 90.6%. This is a fairly good value, but we still cannot say the model is accurate enough; compared to MNIST's record accuracy of 99.7%, this model is clearly less accurate.
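The accuracy can be computed with an evaluation op such as the following (a sketch that continues the training-loop example above, not the report's exact code):

```python
# Fraction of test images whose most probable class matches the true label.
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                    y_: mnist.test.labels}))
```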
Then we visualize the run using TensorBoard. The plots captured below include:
1. Gradient Descent
2. Cross-Entropy
3. Bias Distribution
4. Bias Histograms
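A hedged sketch of how those summaries and the TensorBoard event files could be produced, continuing the training-loop example above (the log directory name and the choice of summary ops are assumptions suggested by the plot titles):

```python
# Scalar summary of the cross-entropy loss (shows up under "Scalars").
tf.summary.scalar('cross_entropy', cross_entropy)
# Histogram summary of the bias variable (shows up under both
# "Distributions" and "Histograms").
tf.summary.histogram('bias', b)

merged = tf.summary.merge_all()
writer = tf.summary.FileWriter('logs/mnist_softmax', sess.graph)

for step in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    summary, _ = sess.run([merged, train_step],
                          feed_dict={x: batch_xs, y_: batch_ys})
    writer.add_summary(summary, step)

writer.close()
# Launch TensorBoard with: tensorboard --logdir logs/mnist_softmax
```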
Link to Source Code: https://github.com/GeoSnipes/Big-Data/tree/master/ICP/icp8/Source/MNIST_SOFTMAX