Why NLP - doraithodla/notes GitHub Wiki
I can give a dozen reasons why you should learn NLP. Let us start with a few.
- NLP is a rapidly evolving field with many unsolved problems. This provides plenty of opportunities for product innovation.
- Much of the world's information is in text format. The ability to analyze text, mine information from it, and extract insights opens up large markets.
- It is an interesting field of study at the intersection of AI, ML, and CS, along with other disciplines such as cognitive science.
Until recently, all we could do with text was read and write it. Digital text changes that: it can also be processed and analyzed by programs.
Our focus in this series of articles is analyzing text and deriving insights from it.
We will do the following:
- Provide a brief description of the problem we are trying to solve
- Provide code examples that demonstrate simple solutions
- Discuss how these examples can be improved
- Provide some references and theoretical background.
Here is an example.
Problem:
- Identify and cluster similar news items based on their title and first paragraph
Approach:
- Get news from RSS feeds (from multiple sources)
- Extract the title and first paragraph
- Vectorize the text
- Cluster the news items
- Compute similarity scores
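The steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production pipeline, and it makes several assumptions: the feed is a made-up sample inlined as a string (a real version would download RSS from several sources), the `<description>` element stands in for the first paragraph, and the names `SAMPLE_RSS`, `extract_items`, `tokenize`, `tfidf_vectors`, `cosine`, and `cluster`, the tiny stopword list, and the `0.3` threshold are all hypothetical choices made for this example. Clustering here is a simple greedy threshold pass, swapped in for a proper algorithm such as k-means or hierarchical clustering.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up sample feed, inlined so the sketch runs offline. A real pipeline
# would download RSS XML from several sources instead.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>Central bank raises interest rates</title>
<description>The central bank raised interest rates to curb inflation.</description></item>
<item><title>Interest rates raised by central bank</title>
<description>Inflation fears push the central bank to raise interest rates again.</description></item>
<item><title>Local team wins football final</title>
<description>Fans celebrated as the local team won the football final.</description></item>
</channel></rss>"""

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "to", "by", "as", "of", "and", "in"}


def extract_items(rss_xml):
    """Return one 'title + description' string per <item> in the feed."""
    root = ET.fromstring(rss_xml)
    return [
        f"{item.findtext('title', '')} {item.findtext('description', '')}"
        for item in root.iter("item")
    ]


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors as dicts mapping term -> weight."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF so terms found in every document keep a nonzero weight.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, threshold=0.3):
    """Greedy pass: join the first cluster whose seed item is similar enough."""
    clusters = []  # lists of document indices; the first index is the seed
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


docs = extract_items(SAMPLE_RSS)
vectors = tfidf_vectors(docs)
clusters = cluster(vectors)

for members in clusters:
    print([docs[i][:40] for i in members])
```

On this sample, the two interest-rate stories share most of their vocabulary and land in one cluster, while the football story shares no terms with them and forms its own. The threshold is the main knob: raising it splits clusters, lowering it merges them.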
Other similar applications:
- Tweet similarity
- Research paper clustering and similarity
- Customer support call log analysis
Technologies: tokenization, stemming, lemmatization, WordNet, collocation, relationship mining, part-of-speech (POS) tagging, named entity recognition (NER), text cleaning, vectorization, TF-IDF, topic mining, intent mining, question answering, calculating similarity with other texts