Wiki Report for ICP 10 - NagaSurendraBethapudi/Python-ICP GitHub Wiki
Video Link :
- https://drive.google.com/file/d/1FtUSmBPqY_EdyywWMzpw-cyvwGeN3jcj/view?usp=sharing
- https://drive.google.com/file/d/1WZwcW4mRB9_jv7Z6QBIteHGBxFtMD8Fw/view?usp=sharing
Question 1, Question 2:
In the code provided (https://umkc.box.com/s/3so2s3dx7cjp4hwnurjx6t3it161ptey) there are three mistakes that stop the code from running successfully. Find those mistakes and explain why they need to be corrected for the code to run. Then add an embedding layer to the model: did you experience any improvement?
Explanation :
- Imported the libraries
- Imported the data
- Mistake 1 : `input_dim` is used in the model but never defined.
- Mistake 2 : We use the softmax activation instead of sigmoid with the cross-entropy loss, because softmax distributes the probability across all output nodes so the class probabilities sum to 1.
- Mistake 3 : Since the goal of the prediction is positive vs. negative, we remove the 'unsup' value from the label column and pass the number of classes as 2 instead of 3.
- Accuracy and loss plots
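The three fixes above can be sketched as follows. This is a minimal illustration, not the actual ICP code: the toy DataFrame, the column names, and the `input_dim` value of 2000 are all assumptions.

```python
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Toy stand-in for the labels; the real data comes from the Box link above
df = pd.DataFrame({"review": ["great movie", "terrible", "no label", "loved it"],
                   "label": ["pos", "neg", "unsup", "pos"]})

# Mistake 3: drop the 'unsup' rows so only two classes remain
df = df[df["label"] != "unsup"]
num_classes = df["label"].nunique()  # 2, not 3

# Mistake 1: define input_dim (e.g. the vocabulary size of the tokenizer)
input_dim = 2000

model = Sequential()
model.add(Dense(64, input_dim=input_dim, activation="relu"))
# Mistake 2: softmax distributes probability across the output nodes
model.add(Dense(num_classes, activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["acc"])
```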
Embedding the layers:
- Accuracy and loss plots
Question 3 : Apply the code on 20_newsgroup data set we worked in the previous classes
Explanation
- Imported the libraries
- Imported the data. While running the code given in the question, we got an error saying `categories` is not defined, so I looked on Kaggle for the categories: https://www.kaggle.com/arihantjain09/20-groups-best-predictions-and-visuals
- categories = ['alt.atheism', 'sci.space']
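Loading that two-category subset can be sketched with scikit-learn's built-in loader (it downloads the data on first use; this mirrors the Kaggle notebook's setup and may differ in details from the exact ICP code):

```python
from sklearn.datasets import fetch_20newsgroups

categories = ['alt.atheism', 'sci.space']

# subset='train' / 'test' selects the standard 20 Newsgroups split
train = fetch_20newsgroups(subset='train', categories=categories,
                           shuffle=True, random_state=42)
test = fetch_20newsgroups(subset='test', categories=categories,
                          shuffle=True, random_state=42)
```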
- Build the model
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dropout
from tensorflow.keras import layers

model = Sequential()
model.add(Embedding(vocab_size, 50, input_length=max_review_len))
model.add(Flatten())     # flatten must come after the embedding
model.add(Dropout(0.2))  # randomly drop 20% of units to reduce overfitting
model.add(layers.Dense(300, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))  # binary output for two categories

# compilation
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
history = model.fit(padded_train, y_train, epochs=5, verbose=True,
                    validation_data=(padded_test, y_test), batch_size=64)
```
- Found the accuracy and loss of the model
- Accuracy and loss plots
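The accuracy and loss plots can be produced from the `history` object returned by `model.fit()`. A minimal sketch, using illustrative numbers in place of the real training history:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so no display is needed
import matplotlib.pyplot as plt

# Stand-in for history.history from model.fit(); values are illustrative
history = {"acc":      [0.62, 0.74, 0.81, 0.85, 0.88],
           "val_acc":  [0.60, 0.70, 0.75, 0.76, 0.77],
           "loss":     [0.66, 0.52, 0.41, 0.33, 0.27],
           "val_loss": [0.68, 0.58, 0.52, 0.50, 0.49]}

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(history["acc"], label="train")
ax1.plot(history["val_acc"], label="validation")
ax1.set_title("Accuracy"); ax1.set_xlabel("epoch"); ax1.legend()
ax2.plot(history["loss"], label="train")
ax2.plot(history["val_loss"], label="validation")
ax2.set_title("Loss"); ax2.set_xlabel("epoch"); ax2.legend()
fig.savefig("training_curves.png")
```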
Learnings :
I learned about word embeddings, frequency-based encoding, prediction-based encoding, different activation functions, and embedding layers.
Difficulties:
None; everything worked as expected.