T4SA - sporedata/researchdesigneR GitHub Wiki

General description

The Twitter for Sentiment Analysis (T4SA) corpus is a collection of tweets containing text and images, collected from July to December 2016. During this period, the researchers used Twitter’s Sample API to access a random 1% sample of the stream of all globally produced tweets, discarding:

  • tweets not containing any static image or containing other media (i.e., tweets containing only videos and/or animated GIFs were also discarded)
  • tweets not written in the English language
  • tweets whose text was less than 5 words long
  • retweets.
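
The four discard rules above can be sketched as a single filter function. This is only an illustrative sketch, not the researchers' actual pipeline: the `Tweet` record and its field names (`text`, `lang`, `media_types`, `is_retweet`) are hypothetical simplifications, and counting words by splitting on whitespace is an assumption about how the 5-word threshold was applied.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tweet:
    # Hypothetical, simplified record; not the actual Twitter API schema.
    text: str
    lang: str
    # Assumed media labels, e.g. "photo", "video", "animated_gif".
    media_types: List[str] = field(default_factory=list)
    is_retweet: bool = False


def keep_tweet(tweet: Tweet, min_words: int = 5) -> bool:
    """Return True if the tweet survives all four discard rules."""
    # Rule 1: must contain at least one static image and no other media.
    if not tweet.media_types or any(m != "photo" for m in tweet.media_types):
        return False
    # Rule 2: must be written in English.
    if tweet.lang != "en":
        return False
    # Rule 3: text must be at least `min_words` words long
    # (whitespace tokenization is an assumption).
    if len(tweet.text.split()) < min_words:
        return False
    # Rule 4: retweets are discarded.
    if tweet.is_retweet:
        return False
    return True


sample = [
    Tweet("what a lovely sunny day outside", "en", ["photo"]),
    Tweet("too short", "en", ["photo"]),
    Tweet("this one only has a video attached", "en", ["video"]),
]
kept = [t for t in sample if keep_tweet(t)]
```

In this sketch only the first sample tweet is kept; the second fails the word-count rule and the third fails the static-image rule.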

Related publications / Literature

Data access

You can download the T4SA dataset at Cross-Media Learning for Image Sentiment Analysis in the Wild.