Lab Assignment 1 B - SaratM34/KDM-Lab-Assignments GitHub Wiki

Name: Mudunuri Sri Sai Sarat Chandra Varma
ClassID: 14
Mail: [email protected]

Objective: The objective of this lab is to perform NLP tasks (POS tagging, NER, co-reference resolution) manually on the given sentences, to run the same tasks on datasets using CoreNLP, and finally to create a simple question answering system.

IDEs used:

  • IntelliJ IDEA
  • PyCharm

Screenshots and Explanation:

1) Performing the NLP tasks manually on the following sentences.

a) The dog saw John in the park

Part-Of-Speech (POS) tagger

  • saw-VBD
  • dog-NN
  • The-DT
  • John-NNP
  • park-NN
  • in-IN
  • the-DT

Named entity recognizer (NER)

  • The-The
  • dog-ANIMAL
  • saw-saw
  • John-PERSON
  • in-in
  • the-the
  • park-PLACE

Co-reference resolution system

  • The sentence contains no pronouns or repeated mentions that refer to the same entity, so there are no co-reference chains to resolve.

b) The little bear saw the fine fat trout in the rocky brook.

Part-Of-Speech (POS) tagger

  • saw-VBD
  • bear-NN
  • The-DT
  • little-JJ
  • trout-NNS
  • the-DT
  • fat-JJ
  • fine-JJ
  • brook-NN
  • in-IN
  • the-DT
  • rocky-JJ
  • ./. (punct)

Named entity recognizer (NER)

  • The-The
  • little-little
  • bear-ANIMAL
  • saw-saw
  • the-the
  • fine-fine
  • fat-fat
  • trout-trout
  • in-in
  • the-the
  • rocky-rocky
  • brook-brook
  • .-. (punct)

Co-reference resolution system

  • The sentence contains no pronouns or repeated mentions that refer to the same entity, so there are no co-reference chains to resolve.

2) Create an NLP project for the following tasks using CoreNLP

Input: I took a dataset file from the entertainment category and used it as input.

  • Reading input from a file

Part-Of-Speech (POS) tagger

Named entity recognizer (NER)

Co-reference resolution system

Sentiment Analysis
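
The four tasks above can also be requested from a running CoreNLP server over HTTP. The following is a minimal sketch and not the code used in this lab: it assumes a CoreNLP server is listening on `http://localhost:9000` (an assumed address), and the helper names `annotate`, `summarize` and `coref_chains` are hypothetical. The field names read from the response (`sentences`, `tokens`, `word`, `pos`, `ner`, `sentiment`, `corefs`) follow the server's JSON output format as I understand it; check them against the CoreNLP version in use.

```python
import json
import urllib.parse
import urllib.request

# Annotators covering POS, NER, co-reference and sentiment.
ANNOTATORS = "tokenize,ssplit,pos,lemma,ner,parse,coref,sentiment"


def annotate(text, server_url="http://localhost:9000"):
    """POST text to a CoreNLP server (address is an assumption) and return its JSON."""
    props = {"annotators": ANNOTATORS, "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(props)})
    request = urllib.request.Request(server_url + "/?" + query,
                                     data=text.encode("utf-8"))
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(annotation):
    """Return, per sentence, its (word, POS, NER) triples and sentiment label."""
    rows = []
    for sentence in annotation.get("sentences", []):
        tokens = [(t["word"], t.get("pos"), t.get("ner"))
                  for t in sentence.get("tokens", [])]
        rows.append({"tokens": tokens, "sentiment": sentence.get("sentiment")})
    return rows


def coref_chains(annotation):
    """Return each co-reference chain as a list of mention texts."""
    return [[mention["text"] for mention in mentions]
            for mentions in annotation.get("corefs", {}).values()]
```

With a server running, `summarize(annotate(text))` gives the POS, NER and sentiment results for the text read from the input file, and `coref_chains(annotate(text))` lists the mentions that refer to the same entity.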

3) Create a simple question answering system as an extension of the dataset and tasks done in (2).

Steps: I gave the datasets (Politics, Entertainment and Sports) as input and wrote questions based on the text in each dataset. The program takes two input files: a questions file and a story file, i.e. the dataset. It reads the questions file, processes each question against the dataset, and prints the answer, or "no answer" if none is found. I created a total of 9 questions (What, Where, Who). For a Who question the program returns a person name, for a Where question a location name, and for a What question the relevant information from the dataset.
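
The question-to-answer mapping described in the steps can be sketched as follows. This is an illustrative approximation, not the program submitted for the lab: it assumes the story has already been tagged into sentences of `(word, NER label)` pairs, that the tagger uses the labels `PERSON` and `LOCATION`, and that the best sentence is picked by simple word overlap with the question. The names `answer`, `entities`, `content_words` and the stopword list are hypothetical.

```python
import re

# Question word -> NER label expected in the answer (labels are an assumption).
WH_TO_LABEL = {"who": "PERSON", "where": "LOCATION"}
STOPWORDS = {"who", "where", "what", "the", "a", "an", "is", "was", "were",
             "did", "do", "does", "in", "on", "at", "of", "to"}


def content_words(text):
    """Lowercased words of the text with stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9']+", text.lower())
            if w not in STOPWORDS}


def entities(tokens, label):
    """Join consecutive tokens tagged with `label` into entity strings."""
    found, current = [], []
    for word, ner in tokens:
        if ner == label:
            current.append(word)
        elif current:
            found.append(" ".join(current))
            current = []
    if current:
        found.append(" ".join(current))
    return found


def answer(question, sentences):
    """Answer a Who/Where/What question from tagged sentences, else 'no answer'."""
    words = question.strip().split()
    wh_word = words[0].lower() if words else ""
    q_words = content_words(question)
    # Pick the sentence sharing the most content words with the question.
    best, best_score = None, 0
    for tokens in sentences:
        score = len(q_words & content_words(" ".join(w for w, _ in tokens)))
        if score > best_score:
            best, best_score = tokens, score
    if best is None:
        return "no answer"
    if wh_word in WH_TO_LABEL:
        # Skip entities that the question itself already mentions.
        found = [e for e in entities(best, WH_TO_LABEL[wh_word])
                 if e.lower() not in question.lower()]
        return found[0] if found else "no answer"
    if wh_word == "what":
        return " ".join(w for w, _ in best)
    return "no answer"
```

Calling `answer` once per line of the questions file, with the tagged story as `sentences`, reproduces the behaviour described above: a person name for Who, a location name for Where, the matching sentence for What, and "no answer" when nothing in the story overlaps with the question.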

Output:

  • In the screenshots below, QuestionID identifies the question and Answer is the answer retrieved by processing the given dataset for that question.
