ICP7 - SaranAkkiraju/Python_and_Deep_Learning_Programming_ICP GitHub Wiki

Objectives

1. Extract the text of the following web URL using BeautifulSoup: https://en.wikipedia.org/wiki/Google

2. Save it in input.txt

Url
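Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the code from the screenshot: the helper names `extract_text` and `save_page_text`, the `User-Agent` string, and the choice to drop `script` and `style` tags are all assumptions made for this example.

```python
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

URL = "https://en.wikipedia.org/wiki/Google"


def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: script and style contents are not wanted in the corpus.
    for tag in soup(["script", "style"]):
        tag.decompose()
    return soup.get_text(separator=" ", strip=True)


def save_page_text(url=URL, path="input.txt"):
    """Download the page, extract its text, and write it to `path`."""
    # Hypothetical User-Agent; some sites reject requests without one.
    request = Request(url, headers={"User-Agent": "icp7-example/0.1"})
    with urlopen(request) as response:
        html = response.read()
    text = extract_text(html)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text


# Example (requires network access):
# save_page_text()
```

Keeping the parsing in `extract_text` separate from the download makes it possible to try the extraction on a small HTML string without touching the network.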

3. Apply the following to the text and show the output:

a. Tokenization

word

sent
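A minimal sketch of word and sentence tokenization, assuming NLTK's `word_tokenize` and `sent_tokenize` are the functions behind the two outputs above (the page does not name them). Both need the Punkt model, installed once with `nltk.download("punkt")`; recent NLTK releases name the resource `punkt_tab` instead. The wrapper name `tokenize` is hypothetical.

```python
from nltk.tokenize import sent_tokenize, word_tokenize


def tokenize(text):
    """Return (sentences, words) for the given text.

    Raises LookupError if the Punkt model has not been downloaded.
    """
    sentences = sent_tokenize(text)
    words = word_tokenize(text)
    return sentences, words


# Example, once input.txt exists and the Punkt data is installed:
# with open("input.txt", encoding="utf-8") as handle:
#     sentences, words = tokenize(handle.read())
```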

b. POS

POS
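Part-of-speech tagging could look like the sketch below, assuming NLTK's default tagger via `nltk.pos_tag`. It needs the `averaged_perceptron_tagger` resource (named `averaged_perceptron_tagger_eng` in recent NLTK releases). The helper name `tag_tokens` is an assumption for illustration.

```python
import nltk


def tag_tokens(tokens):
    """Return a list of (token, tag) pairs using NLTK's default tagger.

    Raises LookupError if the tagger model has not been downloaded.
    """
    return nltk.pos_tag(tokens)


# Example, once the tagger data is installed:
# tag_tokens(["Google", "builds", "software"])
```

The tags follow the Penn Treebank tag set, so proper nouns come back as `NNP` and past-tense verbs as `VBD`.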

c. Stemming

stem
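A small stemming sketch, assuming the Porter stemmer (the page does not say which stemmer was used; NLTK also ships Lancaster and Snowball stemmers). No extra NLTK data is required. The helper name `stem_tokens` is hypothetical.

```python
from nltk.stem import PorterStemmer


def stem_tokens(tokens):
    """Reduce each token to its Porter stem."""
    stemmer = PorterStemmer()
    return [stemmer.stem(token) for token in tokens]


print(stem_tokens(["searching", "searches", "searched"]))
# → ['search', 'search', 'search']
```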

e. Trigram

trigram
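Trigrams are every run of three consecutive tokens. One way to build them, assuming `nltk.util.ngrams` with `n=3` (the helper name `trigrams_of` is an assumption for this example):

```python
from nltk.util import ngrams


def trigrams_of(tokens):
    """Return all runs of three consecutive tokens as tuples."""
    return list(ngrams(tokens, 3))


print(trigrams_of(["Google", "is", "a", "company"]))
# → [('Google', 'is', 'a'), ('is', 'a', 'company')]
```

A list of n tokens yields n - 2 trigrams, so inputs shorter than three tokens produce an empty list.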

f. Named Entity Recognition

NER
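For named entity recognition, a sketch assuming NLTK's `ne_chunk`, which takes POS-tagged tokens and returns a tree whose labelled subtrees are the entities. It needs the `maxent_ne_chunker` and `words` resources (recent NLTK releases use `maxent_ne_chunker_tab`). The helper names `entities_from_tree` and `named_entities` are hypothetical.

```python
import nltk


def entities_from_tree(tree):
    """Collect (entity text, entity label) pairs from a chunk tree."""
    entities = []
    for node in tree:
        # Plain tokens are (word, tag) tuples; entities are nested Trees.
        if isinstance(node, nltk.Tree):
            name = " ".join(word for word, _tag in node.leaves())
            entities.append((name, node.label()))
    return entities


def named_entities(tagged_tokens):
    """Run NLTK's chunker on (word, tag) pairs and list the entities.

    Raises LookupError if the chunker data has not been downloaded.
    """
    return entities_from_tree(nltk.ne_chunk(tagged_tokens))


# Example, once the chunker data is installed:
# named_entities([("Larry", "NNP"), ("Page", "NNP"),
#                 ("founded", "VBD"), ("Google", "NNP")])
```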

4. Change the classifier in the given code to

a. KNeighborsClassifier, and see how the accuracy changes

KNN
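The "given code" is not reproduced on this page, so the sketch below uses a tiny made-up corpus in its place to show how `KNeighborsClassifier` slots into a TF-IDF pipeline. The documents, labels, and `n_neighbors=1` are assumptions for illustration only. The default `n_neighbors=5` would exceed the four training samples here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the course dataset.
train_docs = [
    "the search engine ranks web pages",
    "the browser loads a web page quickly",
    "the team won the football match",
    "the striker scored a late goal",
]
train_labels = ["tech", "tech", "sport", "sport"]
test_docs = ["a new search engine for the web", "the goal decided the match"]
test_labels = ["tech", "sport"]

# Vectorize the text, then classify by the nearest training document.
model = make_pipeline(TfidfVectorizer(), KNeighborsClassifier(n_neighbors=1))
model.fit(train_docs, train_labels)

predicted = model.predict(test_docs)
accuracy = accuracy_score(test_labels, predicted)
print(f"KNN accuracy: {accuracy:.2f}")
```

On a real dataset the accuracy depends heavily on `n_neighbors` and on the vectorizer settings, so it is worth comparing against the original classifier with the same train/test split.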

b. Change the TfidfVectorizer to use bigrams and see how the accuracy changes: TfidfVectorizer(ngram_range=(1,2))

norm

bi
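To see what `ngram_range=(1,2)` changes, the sketch below compares the vocabulary learned with and without bigrams. The two documents are made up for this example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical documents for illustration.
docs = ["google search engine", "google cloud platform"]

unigram_vec = TfidfVectorizer().fit(docs)
bigram_vec = TfidfVectorizer(ngram_range=(1, 2)).fit(docs)

# vocabulary_ maps each learned term to its column index.
print(sorted(unigram_vec.vocabulary_))
print(sorted(bigram_vec.vocabulary_))
```

With `ngram_range=(1,2)` the vectorizer keeps the single words and adds each pair of adjacent words (for example "search engine") as an extra feature, which enlarges the feature space the classifier sees.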

c. Pass the argument stop_words='english' to the vectorizer and see how the accuracy changes

stop_words
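The effect of `stop_words='english'` can be inspected directly on the learned vocabulary. A minimal sketch, with made-up documents:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical documents for illustration.
docs = ["the search engine is popular", "it is in the browser"]

default_vec = TfidfVectorizer().fit(docs)
filtered_vec = TfidfVectorizer(stop_words="english").fit(docs)

# Terms present by default but dropped by the built-in English stop list.
removed = sorted(set(default_vec.vocabulary_) - set(filtered_vec.vocabulary_))
print(removed)
```

Removing very common words shrinks the feature space. Whether that raises or lowers accuracy depends on the dataset and classifier, which is why the exercise asks for a before-and-after comparison.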