Wiki Report for ICP 10 - NagaSurendraBethapudi/Python-ICP GitHub Wiki

Video Links:

  1. https://drive.google.com/file/d/1FtUSmBPqY_EdyywWMzpw-cyvwGeN3jcj/view?usp=sharing
  2. https://drive.google.com/file/d/1WZwcW4mRB9_jv7Z6QBIteHGBxFtMD8Fw/view?usp=sharing

Question 1 & Question 2:

  1. In the code provided at https://umkc.box.com/s/3so2s3dx7cjp4hwnurjx6t3it161ptey there are three mistakes that stop the code from running successfully; find those mistakes and explain why they need to be corrected for the code to run.
  2. Add an embedding layer to the model. Did you experience any improvement?

Explanation:

  1. Imported the libraries
  2. Imported the data
  3. Mistake 1: input_dim is used when building the model but is never defined, so the script fails before training can start.
  4. Mistake 2: with the cross-entropy loss, the output layer should use softmax activation instead of sigmoid, because softmax distributes the probability across the output nodes so that they sum to 1.
  5. Mistake 3: since the goal is to predict only positive or negative sentiment, the 'unsup' value is removed from the label column and the number of classes is passed as 2 instead of 3. A sketch of these three fixes is shown after this list.
  6. Accuracy and loss plots
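The sketch below shows how the three fixes fit together. It is only an illustration: the file name imdb_master.csv, the column names review and label, and the layer sizes are assumptions, not the exact code from the Box link.

```python
# Minimal sketch of the three fixes (file name and column names are assumptions)
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import to_categorical

df = pd.read_csv('imdb_master.csv', encoding='latin-1')

# Fix 3: keep only positive/negative reviews, so there are 2 classes instead of 3
df = df[df['label'] != 'unsup']

tokenizer = Tokenizer(num_words=2000)
tokenizer.fit_on_texts(df['review'])
X = tokenizer.texts_to_matrix(df['review'], mode='tfidf')
y = to_categorical(LabelEncoder().fit_transform(df['label']), num_classes=2)

# Fix 1: define input_dim before the first Dense layer uses it
input_dim = X.shape[1]

model = Sequential()
model.add(Dense(512, input_dim=input_dim, activation='relu'))
# Fix 2: softmax output pairs with categorical cross-entropy
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```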

Adding the Embedding layer:
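Below is a hedged sketch of how an Embedding layer can be added to the corrected model, reusing df and tokenizer from the sketch above: the reviews are fed as padded word-index sequences instead of a TF-IDF matrix, and the Embedding output is flattened before the Dense layers. max_review_len and the layer sizes are assumed values.

```python
# Sketch: feed padded word-index sequences through an Embedding layer
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_review_len = 200                                  # assumed maximum sequence length
vocab_size = len(tokenizer.word_index) + 1            # tokenizer from the sketch above

X_seq = tokenizer.texts_to_sequences(df['review'])
X_pad = pad_sequences(X_seq, maxlen=max_review_len)   # one fixed-length row per review

model = Sequential()
model.add(Embedding(vocab_size, 50, input_length=max_review_len))  # learn 50-d word vectors
model.add(Flatten())                                  # flatten before the Dense layers
model.add(Dense(512, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```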

  1. Accuracy and loss plots

Question 3: Apply the code to the 20_newsgroups dataset we worked with in previous classes.

Explanation:

  1. Imported the libraries
  2. Imported the data. While running the code given in the question, we got an error that categories is not defined, so I looked on Kaggle for a list of categories (https://www.kaggle.com/arihantjain09/20-groups-best-predictions-and-visuals) and used:
     • categories = ['alt.atheism', 'sci.space']
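The model in the next step expects the variables vocab_size, max_review_len, padded_train/padded_test and y_train/y_test, so here is a hedged sketch of how the two 20_newsgroups categories can be loaded and turned into padded sequences (num_words and the way max_review_len is chosen are assumptions, not the graded settings):

```python
# Sketch: load the two 20_newsgroups categories and build padded sequences
import numpy as np
from sklearn.datasets import fetch_20newsgroups
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

categories = ['alt.atheism', 'sci.space']
train = fetch_20newsgroups(subset='train', categories=categories)
test = fetch_20newsgroups(subset='test', categories=categories)

tokenizer = Tokenizer(num_words=2000)
tokenizer.fit_on_texts(train.data)
vocab_size = len(tokenizer.word_index) + 1

max_review_len = max(len(doc.split()) for doc in train.data)   # longest training document

padded_train = pad_sequences(tokenizer.texts_to_sequences(train.data), maxlen=max_review_len)
padded_test = pad_sequences(tokenizer.texts_to_sequences(test.data), maxlen=max_review_len)

y_train = np.array(train.target)   # binary 0/1 labels, so one sigmoid output works
y_test = np.array(test.target)
```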
  3. Build the model

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dropout
from tensorflow.keras import layers

model = Sequential()
model.add(Embedding(vocab_size, 50, input_length=max_review_len))
model.add(Flatten())      # flatten has to be done after embedding
model.add(Dropout(0.2))   # randomly drop 20% of units to reduce overfitting
model.add(layers.Dense(300, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

# compilation and training
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
history = model.fit(padded_train, y_train, epochs=5, verbose=True,
                    validation_data=(padded_test, y_test), batch_size=64)
```
  4. Found the accuracy and loss of the model
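As an illustration only (not the exact notebook cell), the final numbers can be read either from model.evaluate on the test split or from the last entry of the history object returned by model.fit:

```python
# Sketch: report final loss/accuracy; continues from the training code above
test_loss, test_acc = model.evaluate(padded_test, y_test, verbose=0)
print(f"Test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}")
print(f"Final training accuracy: {history.history['acc'][-1]:.4f}")
```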

  5. Accuracy and loss plots
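The plots can be reproduced from the same history object; below is a minimal matplotlib sketch (the figure size and styling are arbitrary choices, not the ones used in the report):

```python
# Sketch: plot training vs. validation accuracy and loss from the History object
import matplotlib.pyplot as plt

fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))

ax_acc.plot(history.history['acc'], label='train')
ax_acc.plot(history.history['val_acc'], label='validation')
ax_acc.set_title('Accuracy'); ax_acc.set_xlabel('epoch'); ax_acc.legend()

ax_loss.plot(history.history['loss'], label='train')
ax_loss.plot(history.history['val_loss'], label='validation')
ax_loss.set_title('Loss'); ax_loss.set_xlabel('epoch'); ax_loss.legend()

plt.tight_layout()
plt.show()
```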


Learnings :

I learned about word embeddings, frequency-based encoding, prediction-based encoding, different activation functions, and adding embedding layers to a model.

Difficulties:

None; everything worked as expected.