Deep_Learning_3 - SaranAkkiraju/Python_and_Deep_Learning_Programming_ICP GitHub Wiki
Objective:
- In the code provided there are three mistakes which stop the code from running successfully; find those mistakes and explain why they need to be corrected in order for the code to run.
- Add embedding layer to the model, did you experience any improvement?
- Apply the code on the 20_newsgroup dataset we worked with in the previous classes:

```python
from sklearn.datasets import fetch_20newsgroups

newsgroups_train = fetch_20newsgroups(subset='train', shuffle=True, categories=categories,)
```
The 3 mistakes in the given code are:
- The input dimension of the first layer should be vocab_size, so that it matches the size of the tokenized vocabulary being fed into the model.
- The output layer should have 3 neurons, one for each class (positive, negative, neutral) in the target column of the dataset.
- The output layer activation function should be softmax, which is the standard choice for multi-class classification because it produces a probability distribution over the classes.
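A minimal sketch of the corrected model (the layer width and vocab_size here are assumptions, since the original broken code is not reproduced on this page):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

vocab_size = 2000  # assumed; must match the tokenizer's vocabulary size

model = Sequential()
# Fix 1: the input dimension must equal vocab_size so the first weight
# matrix lines up with the tokenized input vectors
model.add(Input(shape=(vocab_size,)))
model.add(Dense(300, activation='relu'))
# Fix 2 + 3: 3 output neurons (positive / negative / neutral),
# with softmax so the outputs form a probability distribution
model.add(Dense(3, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
```

With sparse_categorical_crossentropy the labels can stay as integer class ids (0, 1, 2) rather than one-hot vectors.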
Importing libraries
Reading data
Data Preprocessing
Modelling with an added embedding layer
- Compressed the dataset to 20,000 records.
- Tokenized the data and converted the text into matrix form.
- Used LabelEncoder to convert the text labels to integers, then fit and transformed the data.
- Split the data into train and test sets, holding out 25% as test data.
- Used a deep learning Sequential model with 2 layers.
- 1st layer: an Embedding layer, which learns dense word vectors that capture semantic relationships within the data (similar in spirit to word2vec embeddings, but trained jointly with the rest of the model).
- 2nd layer: 300 neurons with relu activation.
- Output layer: 20 neurons with softmax activation.
- Number of epochs is 5, batch size is 256, and the loss function is sparse_categorical_crossentropy.
- Accuracy is 62%.
Loss
Accuracy
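The steps above can be sketched end to end as below; the toy texts, vocabulary limit, sequence length, and embedding size are assumptions standing in for the actual 20,000-record dataset:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, GlobalAveragePooling1D, Dense
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

# toy stand-in data; the wiki uses the compressed 20,000-record dataset
texts = ["good movie", "bad plot", "average acting", "great film"]
labels = ["pos", "neg", "neutral", "pos"]

max_words, max_len = 2000, 100  # assumed vocabulary and sequence limits

# tokenize and pad the text into fixed-length integer sequences
tokenizer = Tokenizer(num_words=max_words)
tokenizer.fit_on_texts(texts)
X = pad_sequences(tokenizer.texts_to_sequences(texts), maxlen=max_len)

# label-encode the text labels into integers
y = LabelEncoder().fit_transform(labels)

# 25% of the data held out as the test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = Sequential()
model.add(Embedding(max_words, 50))        # learned embedding layer
model.add(GlobalAveragePooling1D())        # collapse the sequence dimension
model.add(Dense(300, activation='relu'))   # 2nd layer: 300 neurons, relu
model.add(Dense(20, activation='softmax')) # 20 output classes, as in the wiki
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
# training would then be: model.fit(X_train, y_train, epochs=5, batch_size=256)
```

Note that the Embedding layer expects padded integer sequences, not the texts_to_matrix output used by the non-embedding model.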
Reading the News20 group dataset
- Applied the same logic as above to the 20_newsgroup data.
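Loading the data for that step might look like the sketch below (passing no categories argument keeps all 20 classes; the train split's text and integer targets then feed the same tokenize/pad/fit pipeline):

```python
from sklearn.datasets import fetch_20newsgroups

# fetch the train split; omitting `categories` keeps all 20 newsgroups
newsgroups_train = fetch_20newsgroups(subset='train', shuffle=True)

texts = newsgroups_train.data     # raw article text
labels = newsgroups_train.target  # integer labels 0..19
# labels are already integers, so LabelEncoder is not strictly needed here
```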