ICP_12 DeepLearning - acvc279/Python_Deeplearning GitHub Wiki

VIDEO LINK: https://drive.google.com/file/d/1VUY1ypPOTh_NAVjkqW9d2183LwjVduBq/view?usp=drivesdk

1. Save the model and use the saved model to predict on new text data (e.g., “A lot of good things are happening. We are respected again throughout the world, and that's a great thing.@realDonaldTrump”)

First, import all the libraries required to run the code.
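A minimal set of imports that covers the steps below might look like this (assuming standalone Keras 2.x, which matches the keras.wrappers import used in the GridSearchCV part):

import re
import numpy as np
import pandas as pd
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential, load_model
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from keras.utils import to_categorical
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split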

  • Load the data: data = pd.read_csv('/content/drive/MyDrive/Sentiment.csv')
  • Keep only the necessary columns: data = data[['text','sentiment']]
  • Clean the text data: keep only letters, digits, and whitespace, and strip the 'rt' retweet marker
data['text'] = data['text'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x))
for idx, row in data.iterrows():
    row[0] = row[0].replace('rt', ' ')  # note: assigning through iterrows() may not modify the DataFrame in place
  • Perform tokenization
max_fatures = 2000
tokenizer = Tokenizer(num_words=max_fatures, split=' ')
tokenizer.fit_on_texts(data['text'].values)
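The fitted tokenizer is then used to turn the tweets into padded integer sequences; a short sketch, with X as the assumed name of the padded matrix:

X = tokenizer.texts_to_sequences(data['text'].values)
X = pad_sequences(X)  # pad every sequence to the length of the longest tweet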
  • Create a model
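The GridSearchCV snippet in part 2 wraps a createmodel function, so the model was presumably built along these lines. This is a sketch only; embed_dim, lstm_out, and the dropout rates are assumed values:

embed_dim = 128   # assumed embedding size
lstm_out = 196    # assumed number of LSTM units

def createmodel():
    model = Sequential()
    model.add(Embedding(max_fatures, embed_dim, input_length=X.shape[1]))
    model.add(SpatialDropout1D(0.4))
    model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))
    model.add(Dense(3, activation='softmax'))  # one unit per sentiment class (3 assumed here)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = createmodel()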
  • Label encoder
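A sketch of the label-encoding step, turning the sentiment strings into integer classes and then into one-hot targets (the variable names labelencoder, integer_encoded, and Y are assumptions):

labelencoder = LabelEncoder()
integer_encoded = labelencoder.fit_transform(data['sentiment'])
Y = to_categorical(integer_encoded)  # one-hot targets for categorical_crossentropy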
  • Fit the model and compute its accuracy
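For example, splitting the data and evaluating on the held-out portion; the split ratio, batch size, and epoch count below are assumptions, while X_train and Y_train match the names used in the GridSearchCV step later on:

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33, random_state=42)
model.fit(X_train, Y_train, epochs=1, batch_size=32, verbose=2)
score, acc = model.evaluate(X_test, Y_test, batch_size=32, verbose=2)
print("Test accuracy:", acc)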
  • Save the model: model.save('sentimentsaved.h5')
  • Next, predict whether a given tweet is positive or negative.
  • Load the saved model: model = load_model('sentimentsaved.h5')
  • Tokenize the new tweet text and convert it to an integer sequence
tweet = ['A lot of good things are happening. We are respected again throughout the world, and thats a great thing.@realDonaldTrump']
max_fatures = 2000
tokenizer = Tokenizer(num_words=max_fatures, split=' ')
tokenizer.fit_on_texts(tweet)
tweet = tokenizer.texts_to_sequences(tweet)
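Note that refitting a fresh Tokenizer on the single tweet produces word indices that do not match the training vocabulary, so ideally the tokenizer fitted on data['text'] above should be reused instead. Either way, the sequence has to be padded to the input length the model was trained with before calling predict; a sketch, where X.shape[1] refers to the padded training matrix from the tokenization step:

tweet = pad_sequences(tweet, maxlen=X.shape[1])  # match the model's expected input length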
  • Predicting the tweet
sentiment = model.predict(tweet, batch_size=1, verbose=2)
if(np.argmax(sentiment) == 0):
    print("negative")
elif (np.argmax(sentiment) == 1):
    print("positive")
  • Output of the source code:
  • Output of the tweet-prediction segment:

2. Apply GridSearchCV to the source code provided in class

  • Start from the source code used in the first question.
  • Apply GridSearchCV to that source code
# GridSearchCV over batch size and number of epochs
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

# wrap the Keras model so scikit-learn can treat it as an estimator
model = KerasClassifier(build_fn=createmodel, verbose=0)
batch_size = [10, 20, 40]
epochs = [1, 2, 3]
param_grid = dict(batch_size=batch_size, epochs=epochs)

grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X_train, Y_train)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
  • Output after performing GridSearchCV:

3. Apply the code to the spam dataset available in the source code (text classification on the spam.csv data set)

  • Run the same source code on the spam dataset.

  • Load the spam dataset: data = pd.read_csv('spam.csv', encoding='latin-1')

  • Keep the required columns: data = data[['v2','v1']]

  • Now apply the same operations as in the first question's source code and run it.
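A brief sketch of how the pipeline changes for spam.csv: 'v2' holds the message text and 'v1' the ham/spam label, so the text cleaning and tokenization target 'v2' and the output layer gets one unit per class. Everything else mirrors the sentiment code above; max_fatures, embed_dim, and lstm_out are reused from earlier, and the exact hyperparameters are assumptions:

data['v2'] = data['v2'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x.lower()))

tokenizer = Tokenizer(num_words=max_fatures, split=' ')
tokenizer.fit_on_texts(data['v2'].values)
X = pad_sequences(tokenizer.texts_to_sequences(data['v2'].values))

labelencoder = LabelEncoder()
Y = to_categorical(labelencoder.fit_transform(data['v1']))  # ham/spam -> 2 one-hot columns

# same architecture as before, but the Dense layer size follows the number of classes
def createmodel():
    model = Sequential()
    model.add(Embedding(max_fatures, embed_dim, input_length=X.shape[1]))
    model.add(SpatialDropout1D(0.4))
    model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))
    model.add(Dense(Y.shape[1], activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33, random_state=42)
model = createmodel()
model.fit(X_train, Y_train, epochs=1, batch_size=32, verbose=2)
print(model.evaluate(X_test, Y_test, batch_size=32, verbose=2))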

  • Output:

  • Learned from this ICP: RNN, GridSearchCV, LSTM

  • Difficulties faced: none, all good.