Common packages for natural language processing

  1. Beautiful Soup is a Python library for pulling data out of HTML and XML files.
  2. text2vec provides a source-agnostic streaming API, which allows researchers to analyze collections of documents that are larger than available RAM.
  3. rvest is an R package that works with magrittr to make it easy to scrape information from web pages, much as Beautiful Soup does in Python.
  4. tidytext supports text mining for word processing and sentiment analysis using 'dplyr', 'ggplot2', and other tidy tools.
  5. stringr is a particularly handy package for working with regular expressions, as it provides several useful pattern-matching functions.
  6. spacyr provides a convenient R wrapper around the Python package spaCy, making it easy to access spaCy's powerful functionality in a simple format.
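
To make the typical workflow behind these packages concrete, here is a minimal sketch of a scrape, tokenize, and count pipeline. It is an illustration under stated assumptions, not the API of any package listed above: it uses only Python's standard library, with `html.parser` standing in for Beautiful Soup or rvest, a regular expression standing in for stringr-style pattern matching, and `collections.Counter` standing in for a tidytext-style word count. The class `TextExtractor`, the helper functions, and the `SAMPLE_HTML` page are all hypothetical names invented for this example.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string as one space-joined line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def tokenize(text):
    """Lowercase the text and split it into word tokens with a regex."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(html):
    """Scrape, tokenize, and count words in one pass."""
    return Counter(tokenize(html_to_text(html)))


# Hypothetical sample page, used only for illustration.
SAMPLE_HTML = """
<html><head><style>p { color: red; }</style></head>
<body><h1>Text mining</h1>
<p>Text mining turns raw text into data.</p>
<script>var x = 1;</script></body></html>
"""

counts = word_counts(SAMPLE_HTML)
print(counts.most_common(2))
```

In practice, Beautiful Soup or rvest would replace the hand-written parser, and tidytext or text2vec would replace the counting step; the sketch only shows how the stages fit together.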