[GSoC 2013] Some new Insights by Joe Mathai - nilakshdas/ThinkUp GitHub Wiki

1 Uniqueness meter for TWEET/POST

An average tweet-savvy person might have more than 4000 tweets to their credit. The probability of the same idea repeating within those 4000 tweets is very high, and if the person is conscious about what they tweet, it would be a great help to provide a uniqueness meter as a plugin. It would alert the user with a ratio or percentage match against previous tweet(s) and display those tweets too. The same idea can be extended to posts.

Implementation detail:

The idea for implementation is to use the shingling technique (set similarity using the Jaccard coefficient) for tweets. This can identify subsets that are common to multiple tweets, and since it is a kind of fuzzy matching, a ratio or percentage threshold can be set so that the Insight is generated only when the similarity is above it. If posts are also checked for uniqueness, the technique should still give a useful result, since it can identify a small portion of repeated text within a larger body of text.

Alternative: If shingling is not feasible, the tweets can be analysed for common words instead, ignoring a list of commonly used words that is stored beforehand. This will not be as effective as set similarity.
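The shingling approach above can be sketched roughly as follows. This is only an illustration in Python, not ThinkUp code: the function names, the word-level shingle size `k=2`, and the `0.5` threshold are all assumptions chosen for the example.

```python
import re


def shingles(text, k=2):
    """Return the set of k-word shingles in a tweet (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short tweets become a single shingle (or nothing at all).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two shingle sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_past_tweets(new_tweet, past_tweets, threshold=0.5, k=2):
    """Return (score, tweet) pairs for past tweets at or above the threshold.

    The threshold is a hypothetical default; the proposal leaves it configurable.
    """
    new_set = shingles(new_tweet, k)
    matches = []
    for old in past_tweets:
        score = jaccard(new_set, shingles(old, k))
        if score >= threshold:
            matches.append((score, old))
    # Highest similarity first.
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return matches
```

The highest score returned could then drive the gauge, for example as a "percentage match" value.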

Visualization: For a visual representation, the Google Charts gauge can be used.

Links: http://en.wikipedia.org/wiki/W-shingling https://google-developers.appspot.com/chart/interactive/docs/gallery/gauge


2 Additional feature to standout plugin --AMPLIFICATION factor

The standout plugin currently just shows whether someone interesting (based on follower count) has followed you. It would be a more interesting and productive insight if we could also calculate an amplification factor for that person, that is, the number of additional people you could reach if that (standout) person retweets your tweet. Amplification factor data also gives you an insight into the audience you miss out on when someone stops following you, and projections can be made from it.

Implementation: The basic idea is to compare the followers of that person with your own followers and identify the number of unique followers. This is again a probabilistic measure: it only predicts the extent of the reach you could possibly gain from that person on top of your own followers.
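As a rough sketch of the comparison described above, the unique followers can be found with a set difference over follower IDs. The function name, the returned fields, and the idea of expressing the result as a ratio of your own follower count are assumptions made for this example; the result is an upper bound, since not every follower will see a retweet.

```python
def amplification_factor(my_follower_ids, their_follower_ids):
    """Estimate the extra audience reachable through a standout follower.

    Both arguments are iterables of follower IDs (hypothetical inputs; how
    they are fetched is outside this sketch).
    """
    mine = set(my_follower_ids)
    theirs = set(their_follower_ids)

    new_reach = theirs - mine   # people who follow them but not you
    overlap = theirs & mine     # people you already reach

    # Ratio of new audience to existing audience; undefined with no followers.
    ratio = len(new_reach) / len(mine) if mine else None
    return {"new_reach": len(new_reach), "overlap": len(overlap), "ratio": ratio}
```

For instance, with four followers of your own and a standout follower who has six, two of them shared, the sketch reports a new reach of four people and a ratio of 1.0.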

Alternative: Along with an amplification factor, one can also form a word cloud from the followers of the standout person (using the most common words in their bios). Comparing the word cloud of your followers with his/hers again gives you an insight into the kind of people you are further reaching out to.

Visualization: Possibly a word cloud; I haven't decided yet on the visualization for the amplification factor itself.

Links: http://en.wikipedia.org/wiki/Tag_cloud
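The word counts behind such a cloud could be gathered along these lines. This is a minimal sketch: the stop-word list is a tiny placeholder, and the function name and the `top_n` parameter are assumptions for illustration.

```python
import re
from collections import Counter

# Placeholder stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "at", "the", "of", "in", "to", "for", "is", "i", "my"}


def bio_word_counts(bios, top_n=10):
    """Return the top_n most common non-stop-words across follower bios."""
    counts = Counter()
    for bio in bios:
        words = re.findall(r"[a-z']+", bio.lower())
        counts.update(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top_n)
```

Running it once on your own followers' bios and once on the standout person's followers would give two lists whose differences hint at the new kind of audience.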


3 Track-me plugin

This plugin aims to give the user an insight into their movements on the map, traced over a week/month/year from the data they generate. The idea is to collect data from Facebook check-ins, Foursquare check-ins, and place information in tweets, and then plot the movement on a map similar to the "Going global !" plugin, but with lines to show the movement back and forth.

This gives the user an insight into where they generate different data from and how frequently they visit a particular place. Additionally, a timestamp can be associated with each location to show where the user was at a given time and the content they generated at that place.

In addition, there could be a feature that shows the most popular tweets generated from a particular place, as well as a comparison of this week's movement with that of the last week or month.
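The movement trace and visit frequency described above could be derived from merged check-in records roughly like this. The record shape (a dict with `place` and an ISO 8601 `time` string) and the function name are assumptions for the example, not the format returned by any of the services.

```python
from collections import Counter
from datetime import datetime


def build_track(checkins):
    """Order check-ins by time and derive map segments and visit counts.

    Each check-in is a dict like {"place": "Office", "time": "2013-04-01T09:00:00"}
    (a hypothetical, already-merged format).
    """
    ordered = sorted(checkins, key=lambda c: datetime.fromisoformat(c["time"]))

    # One line segment per move between two different consecutive places.
    segments = [
        (a["place"], b["place"])
        for a, b in zip(ordered, ordered[1:])
        if a["place"] != b["place"]
    ]

    # How often each place was visited in the period.
    visits = Counter(c["place"] for c in ordered)
    return segments, visits
```

Each segment would become one line on the map, and the visit counts could scale the marker for each place.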

Implementation: The Facebook API documents check-ins, so that data can be fetched directly. The same goes for Foursquare and Google+, which also have a check-in feature and the required documentation. The challenge lies in extracting the location from Twitter. I have gone through their forums, and the search API's geocode feature doesn't quite work the way it does in the other APIs.

Alternative: An alternative to a Twitter check-in is to extract locations from the tweet text and mark them on the map.

Example: @joe_mathai hey i am at xyz restaurant with my friends and ....

Code can be written to extract the location "xyz restaurant" and check whether it exists (using the Google Places API) or is otherwise mappable. This is just a possibility.
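A very naive version of that extraction step might look like the sketch below. The pattern only handles phrasing like the example tweet ("i am at ... with ..."), and both the regular expression and the function name are assumptions. The follow-up check against a places service is deliberately left out.

```python
import re

# Matches "i am at <place>" / "i'm at <place>" up to a connecting word or punctuation.
_AT_PLACE = re.compile(
    r"\b(?:i am|i'm|im)\s+at\s+(.+?)(?=\s+(?:with|and|for|to)\b|[.,!?]|$)",
    re.IGNORECASE,
)


def candidate_place(tweet):
    """Return a candidate place name from a tweet, or None if nothing matches.

    The result is only a guess and would still need to be verified
    against a places lookup before being plotted.
    """
    match = _AT_PLACE.search(tweet)
    return match.group(1).strip() if match else None
```

Applied to the example tweet above, this yields the candidate "xyz restaurant"; tweets that phrase the location differently would simply return nothing.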

Visualization: The Google Maps API would best suit this plugin, as it lets you draw lines of different specifications on the map, which could be used to track movements.

Links: https://developers.google.com/maps/documentation/javascript/v2/overlays https://developers.facebook.com/docs/reference/api/checkin/ https://developers.google.com/places/documentation/search https://developer.foursquare.com/docs/


4 Acceptance level

This plugin aims to give a user who shares an opinion a sense of how a particular tweet is being received by their followers. The idea is to analyse the replies to a particular tweet (sentiment analysis), calculate a level of acceptance for the tweet from them, and represent it on a bar graph that can be placed above current insights such as "conversation starter".

Implementation:

In the ThinkUp plugins there is wordfrequency.php, which is already used for the conversation starter insight and has the necessary data for this plugin; the plugin could also be implemented as a subroutine of it. The analysis can be done with a sentiment analysis model built using Google's Prediction API: https://developers.google.com/prediction/docs/sentiment_analysis The Google Prediction API lets you create a model from your data, with a label associated with each example. It should not be very difficult to train a model for tweet analysis, and after a few test runs it can be tuned as needed. Depending on the labels returned, the net acceptance can then be calculated.

Visualization: A level-indicating bar graph might be suitable for representation.
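Once each reply has a label, the net acceptance could be computed along these lines. The label names (`positive`, `negative`, `neutral`) and the scoring formula are assumptions for this sketch; the actual labels depend on how the model is trained.

```python
from collections import Counter


def net_acceptance(labels):
    """Score replies from -1.0 (all negative) to 1.0 (all positive).

    `labels` is a list of per-reply sentiment labels (hypothetical names).
    Neutral replies dilute the score but do not change its sign.
    """
    if not labels:
        return None  # no replies, so nothing to measure
    counts = Counter(labels)
    return (counts["positive"] - counts["negative"]) / len(labels)
```

With two positive replies, one negative, and one neutral, the score is 0.25, which could be mapped directly onto the level of the bar graph.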