Wiki Report for ICP 10 - NagaSurendraBethapudi/Python-ICP GitHub Wiki

Video Links:

  1. https://drive.google.com/file/d/1FtUSmBPqY_EdyywWMzpw-cyvwGeN3jcj/view?usp=sharing
  2. https://drive.google.com/file/d/1WZwcW4mRB9_jv7Z6QBIteHGBxFtMD8Fw/view?usp=sharing

Question 1 & Question 2:

  1. In the code provided at https://umkc.box.com/s/3so2s3dx7cjp4hwnurjx6t3it161ptey there are three mistakes that stop the code from running successfully; find those mistakes and explain why they need to be corrected for the code to run.
  2. Add an embedding layer to the model. Did you experience any improvement?

Explanation:

  1. Imported the libraries
  2. Imported the data
  3. Mistake 1: input_dim is used when building the model but is never defined, so the script fails before training can start.
  4. Mistake 2: with the cross-entropy loss, the output layer should use softmax activation instead of sigmoid, because softmax distributes the probability across the output nodes so that they sum to 1.
  5. Mistake 3: since the goal is to predict only positive or negative sentiment, the 'unsup' value is removed from the label column and the number of classes is passed as 2 instead of 3. A sketch of these three fixes is shown after this list.
  6. Accuracy and loss plots
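The sketch below shows how the three fixes fit together. It is only an illustration: the file name imdb_master.csv, the column names review and label, and the layer sizes are assumptions, not the exact code from the Box link.

```python
# Minimal sketch of the three fixes (file name and column names are assumptions)
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import to_categorical

df = pd.read_csv('imdb_master.csv', encoding='latin-1')

# Fix 3: keep only positive/negative reviews, so there are 2 classes instead of 3
df = df[df['label'] != 'unsup']

tokenizer = Tokenizer(num_words=2000)
tokenizer.fit_on_texts(df['review'])
X = tokenizer.texts_to_matrix(df['review'], mode='tfidf')
y = to_categorical(LabelEncoder().fit_transform(df['label']), num_classes=2)

# Fix 1: define input_dim before the first Dense layer uses it
input_dim = X.shape[1]

model = Sequential()
model.add(Dense(512, input_dim=input_dim, activation='relu'))
# Fix 2: softmax output pairs with categorical cross-entropy
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```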

Adding the Embedding layer:
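Below is a hedged sketch of how an Embedding layer can be added to the corrected model, reusing df and tokenizer from the sketch above: the reviews are fed as padded word-index sequences instead of a TF-IDF matrix, and the Embedding output is flattened before the Dense layers. max_review_len and the layer sizes are assumed values.

```python
# Sketch: feed padded word-index sequences through an Embedding layer
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_review_len = 200                                  # assumed maximum sequence length
vocab_size = len(tokenizer.word_index) + 1            # tokenizer from the sketch above

X_seq = tokenizer.texts_to_sequences(df['review'])
X_pad = pad_sequences(X_seq, maxlen=max_review_len)   # one fixed-length row per review

model = Sequential()
model.add(Embedding(vocab_size, 50, input_length=max_review_len))  # learn 50-d word vectors
model.add(Flatten())                                  # flatten before the Dense layers
model.add(Dense(512, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```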

  1. Accuracy and loss plots

Question 3: Apply the code to the 20_newsgroups dataset we worked with in previous classes.

Explanation:

  1. Imported the libraries
  2. Imported the data. While running the code given in the question, we got an error that categories is not defined, so I looked on Kaggle for a list of categories (https://www.kaggle.com/arihantjain09/20-groups-best-predictions-and-visuals) and used:
     • categories = ['alt.atheism', 'sci.space']
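The model in the next step expects the variables vocab_size, max_review_len, padded_train/padded_test and y_train/y_test, so here is a hedged sketch of how the two 20_newsgroups categories can be loaded and turned into padded sequences (num_words and the way max_review_len is chosen are assumptions, not the graded settings):

```python
# Sketch: load the two 20_newsgroups categories and build padded sequences
import numpy as np
from sklearn.datasets import fetch_20newsgroups
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

categories = ['alt.atheism', 'sci.space']
train = fetch_20newsgroups(subset='train', categories=categories)
test = fetch_20newsgroups(subset='test', categories=categories)

tokenizer = Tokenizer(num_words=2000)
tokenizer.fit_on_texts(train.data)
vocab_size = len(tokenizer.word_index) + 1

max_review_len = max(len(doc.split()) for doc in train.data)   # longest training document

padded_train = pad_sequences(tokenizer.texts_to_sequences(train.data), maxlen=max_review_len)
padded_test = pad_sequences(tokenizer.texts_to_sequences(test.data), maxlen=max_review_len)

y_train = np.array(train.target)   # binary 0/1 labels, so one sigmoid output works
y_test = np.array(test.target)
```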
  3. Build the model

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dropout
from tensorflow.keras import layers

model = Sequential()
model.add(Embedding(vocab_size, 50, input_length=max_review_len))
model.add(Flatten())      # flatten has to be done after embedding
model.add(Dropout(0.2))   # randomly drop 20% of units to reduce overfitting
model.add(layers.Dense(300, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

# compilation and training
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
history = model.fit(padded_train, y_train, epochs=5, verbose=True,
                    validation_data=(padded_test, y_test), batch_size=64)
```
  4. Found the accuracy and loss of the model
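As an illustration only (not the exact notebook cell), the final numbers can be read either from model.evaluate on the test split or from the last entry of the history object returned by model.fit:

```python
# Sketch: report final loss/accuracy; continues from the training code above
test_loss, test_acc = model.evaluate(padded_test, y_test, verbose=0)
print(f"Test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}")
print(f"Final training accuracy: {history.history['acc'][-1]:.4f}")
```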

  5. Accuracy and loss plots
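The plots can be reproduced from the same history object; below is a minimal matplotlib sketch (the figure size and styling are arbitrary choices, not the ones used in the report):

```python
# Sketch: plot training vs. validation accuracy and loss from the History object
import matplotlib.pyplot as plt

fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))

ax_acc.plot(history.history['acc'], label='train')
ax_acc.plot(history.history['val_acc'], label='validation')
ax_acc.set_title('Accuracy'); ax_acc.set_xlabel('epoch'); ax_acc.legend()

ax_loss.plot(history.history['loss'], label='train')
ax_loss.plot(history.history['val_loss'], label='validation')
ax_loss.set_title('Loss'); ax_loss.set_xlabel('epoch'); ax_loss.legend()

plt.tight_layout()
plt.show()
```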


Learnings :

I learned about word embeddings, frequency-based encoding, prediction-based encoding, different activation functions, and adding embedding layers to a model.

Difficulties:

None; everything worked as expected.