Queries and Datasets - Texera/texera GitHub Wiki

Datasets

1. A snippet of the Twitter dataset.

Each tweet is stored in JSON format, one record per line. To read a record more easily, we suggest an online JSON viewer, such as JsonViewer.

{"create_at": "2017-03-26T16:39:13.000Z", "id": 846144537676431360, "text": "@carrieunderwood @opry hi carrie", "in_reply_to_status": 845836315648315392, "in_reply_to_user": 386244525, "favorite_count": 0, "retweet_count": 0, "lang": "en", "is_retweet": false, "user_mentions": [386244525, 19772559], "user": {"id": 4217866818, "name": "Lisa", "screen_name": "296_3676", "profile_image_url": "http://pbs.twimg.com/profile_images/664966945435992065/Tw4npe2S_normal.jpg", "lang": "en", "location": "null", "create_at": "2015-11-12", "description": "null", "followers_count": 31, "friends_count": 67, "statues_count": 182}, "place": {"country": "United States", "country_code": "United States", "full_name": "Beaver Dam, WI", "id": "1389f2635209d576", "name": "Beaver Dam", "place_type": "city", "bounding_box": [[-88.870587, 43.431528], [-88.786438, 43.508406]]}, "geo_tag": {"stateID": 55, "stateName": "Wisconsin", "countyID": 55027, "countyName": "Dodge", "cityID": 5505900, "cityName": "Beaver Dam"}}
{"create_at": "2017-08-09T11:22:09.000Z", "id": 895349496749596672, "text": "Join the Noodles & Co. team! See our latest #job opening here: https://t.co/NljQxBaLc0 #Veterans #MilSpouse #Greenville, NC #Hiring", "in_reply_to_status": -1, "in_reply_to_user": -1, "favorite_count": 0, "coordinate": [-77.3818152, 35.5794052], "retweet_count": 0, "lang": "en", "is_retweet": false, "hashtags": ["job", "Veterans", "MilSpouse", "Greenville", "Hiring"], "user": {"id": 88254516, "name": "TMJ-NC HRTA Jobs", "screen_name": "tmj_nc_hrta", "profile_image_url": "http://pbs.twimg.com/profile_images/667871532920639488/VroXHje4_normal.jpg", "lang": "en", "location": "North Carolina", "create_at": "2009-11-07", "description": "Follow this account for geo-targeted Hospitality/Restaurant/Tourism job tweets in North Carolina. Need help? Tweet us at @CareerArc!", "followers_count": 399, "friends_count": 275, "statues_count": 474}, "place": {"country": "United States", "country_code": "United States", "full_name": "North Carolina, USA", "id": "3b98b02fba3f9753", "name": "North Carolina", "place_type": "admin", "bounding_box": [[-84.321948, 33.752879], [-75.40012, 36.588118]]}, "geo_tag": {"stateID": 37, "stateName": "North Carolina", "countyID": 37147, "countyName": "Pitt", "cityID": 3728080, "cityName": "Greenville"}}
{"create_at": "2017-06-12T10:05:40.000Z", "id": 874311754897137666, "text": "I told Tommy he had an obsession to something and he goes \"you're my obsession\" and wow it was so cute I love him so much\ufffd\ufffd", "in_reply_to_status": -1, "in_reply_to_user": -1, "favorite_count": 0, "retweet_count": 0, "lang": "en", "is_retweet": false, "user": {"id": 398387501, "name": "Amber Hargis", "screen_name": "AmberHargis", "profile_image_url": "http://pbs.twimg.com/profile_images/873390596974669824/2P3J0Hiw_normal.jpg", "lang": "en", "location": "Columbus, OH", "create_at": "2011-10-25", "description": "@tommymalone2\u2764\ufe0f", "followers_count": 825, "friends_count": 912, "statues_count": 9757}, "place": {"country": "United States", "country_code": "United States", "full_name": "Gahanna, OH", "id": "c97807ac2cd60207", "name": "Gahanna", "place_type": "city", "bounding_box": [[-82.905845, 39.987076], [-82.802554, 40.05651]]}, "geo_tag": {"stateID": 39, "stateName": "Ohio", "countyID": 39049, "countyName": "Franklin", "cityID": 3929106, "cityName": "Gahanna"}}
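
As an illustration of the record layout, the following minimal sketch parses one record and reads a few top-level and nested fields. The `line` string is an abbreviated copy of the first record above, and the helper name `summarize_tweet` is hypothetical. Fields such as `hashtags`, `coordinate`, and `user_mentions` appear only in some of the records shown, so the sketch reads optional fields with `.get()`.

```python
import json

# Abbreviated copy of the first record above; most fields are omitted for brevity.
line = (
    '{"create_at": "2017-03-26T16:39:13.000Z", "id": 846144537676431360, '
    '"text": "@carrieunderwood @opry hi carrie", "lang": "en", '
    '"user": {"screen_name": "296_3676", "followers_count": 31}, '
    '"geo_tag": {"stateName": "Wisconsin", "cityName": "Beaver Dam"}}'
)

def summarize_tweet(raw_line):
    """Parse one JSON record and pull out a few top-level and nested fields."""
    tweet = json.loads(raw_line)
    return {
        "id": tweet["id"],
        "text": tweet["text"],
        "screen_name": tweet["user"]["screen_name"],
        # Optional fields: fall back to a default when a record lacks them.
        "state": tweet.get("geo_tag", {}).get("stateName"),
        "hashtags": tweet.get("hashtags", []),
    }

summary = summarize_tweet(line)
print(summary["screen_name"], summary["state"])
```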

2. A snippet of the COCO dataset.

{"id": 10000, "text": "train2014/COCO_train2014_000000105363.jpg"}
{"id": 10001, "text": "val2014/COCO_val2014_000000402233.jpg"}
{"id": 10002, "text": "val2014/COCO_val2014_000000559252.jpg"}

3. A snippet of the UCF101 dataset.

{"id": 5000, "text": "Haircut/v_Haircut_g20_c02.avi"}
{"id": 5001, "text": "ApplyLipstick/v_ApplyLipstick_g19_c02.avi"}
{"id": 5002, "text": "HandstandWalking/v_HandstandWalking_g02_c01.avi"}
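
In the COCO and UCF101 snippets, the `text` field holds a relative path to an image or video file rather than free text. The sketch below splits that path into its folder, file stem, and extension. It assumes the paths always use forward slashes, as in the records shown; the helper name `split_media_path` is hypothetical.

```python
import json
from pathlib import PurePosixPath

# One record copied from each of the COCO and UCF101 snippets.
records = [
    '{"id": 10000, "text": "train2014/COCO_train2014_000000105363.jpg"}',
    '{"id": 5001, "text": "ApplyLipstick/v_ApplyLipstick_g19_c02.avi"}',
]

def split_media_path(raw_line):
    """Return (id, folder, file stem, extension) for one record."""
    record = json.loads(raw_line)
    path = PurePosixPath(record["text"])
    return record["id"], path.parent.name, path.stem, path.suffix

parsed = [split_media_path(r) for r in records]
for row in parsed:
    print(row)
```

In the UCF101 records shown, the folder name matches the activity name embedded in the file name (for example `ApplyLipstick`), so the folder can serve as a quick label lookup for those records.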

Queries

1. Ten queries on the Twitter dataset.

To study the behavior of CORE with different numbers of predicates, we randomly select five queries with two strongly correlated predicates (q1 to q5) and five queries with three strongly correlated predicates (q6 to q10).

Id Queries
q1 SentimentStanfordNLP ('negative', 'neutral')→ POSTaggerSpacyLG ('VBD', 'WRB', 'IN')
q2 SentimentStanfordNLP ('negative', 'neutral')→ POSTaggerSpacyLG ('PRP')
q3 SentimentStanfordNLP ('neutral', 'positive')→ POSTaggerSpacyLG ('NNPS', 'VB', 'VBZ', 'WRB')
q4 SentimentStanfordNLP ('neutral', 'positive')→ POSTaggerSpacySM ('VBD', 'WRB', 'PRP')
q5 SentimentStanfordNLP ('neutral', 'positive')→ POSTaggerSpacySM ('PRP')
q6 SentimentStanfordNLP ('neutral', 'positive')→ POSTaggerStanfordNLP ('NNPS', 'VBP', 'WRB', '.')→ POSTaggerSpacyLG ('NNPS', 'VBD', 'VBN', 'WRB', 'DT')
q7 SentimentStanfordNLP ('positive')→ POSTaggerStanfordNLP ('NNPS', 'VB', 'VBD', 'VBN')→ POSTaggerSpacyLG ('NNPS', 'VB', 'VBZ', 'WRB')
q8 SentimentStanfordNLP ('neutral', 'positive')→ POSTaggerStanfordNLP ('NNPS', 'VB', 'VBD', 'VBN')→ POSTaggerSpacyLG ('NNPS', 'VBD', 'VBN', 'WRB', 'DT')
q9 SentimentStanfordNLP ('neutral', 'positive')→ POSTaggerStanfordNLP ('NNPS', 'VB', 'VBD', 'VBN')→ POSTaggerSpacyLG ('NNPS', 'VB', 'VBZ', 'WRB')
q10 SentimentStanfordNLP ('neutral')→ POSTaggerStanfordNLP ('VBP', 'VBZ', 'WRB')→ POSTaggerSpacyLG ('NNPS', 'VBD', 'VBN', 'WRB', 'DT')
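
Each query in the table is written as a chain of predicates separated by `→`, where a predicate is an operator name followed by a parenthesized list of quoted labels. The sketch below parses that notation into a list of `(operator, labels)` pairs. It covers only the textual notation used on this page; the function name `parse_query` and the regular expressions are illustrative assumptions, not part of CORE or Texera.

```python
import re

# One predicate: an operator name followed by a parenthesized label list.
PREDICATE_RE = re.compile(r"\s*(\w+)\s*\((.*)\)\s*$")
# One label: any run of characters between single quotes.
LABEL_RE = re.compile(r"'([^']*)'")

def parse_query(query):
    """Split a query on the arrow and return a list of (operator, labels) pairs."""
    predicates = []
    for part in query.split("→"):
        match = PREDICATE_RE.match(part)
        if match is None:
            raise ValueError(f"cannot parse predicate: {part!r}")
        operator, label_text = match.groups()
        predicates.append((operator, tuple(LABEL_RE.findall(label_text))))
    return predicates

q1 = "SentimentStanfordNLP ('negative', 'neutral')→ POSTaggerSpacyLG ('VBD', 'WRB', 'IN')"
parsed_q1 = parse_query(q1)
print(len(parsed_q1), parsed_q1)
```

Counting the parsed predicates is a quick way to confirm that q1 to q5 have two predicates each and q6 to q10 have three.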

2. Ten queries on the COCO dataset.

To study the behavior of CORE with different orders of predicates, we randomly select four pairs of queries. The two queries in a pair contain the same predicates in opposite orders, such as q2 and q3.

Id Queries
q1 ObjectDetection ('car', 'chair', 'dining table', 'bench', 'bed', 'bird', 'vase')→ ObjectDetection ('person')
q2 ObjectDetection ('person')→ ObjectDetection ('car', 'chair', 'cup', 'dog', 'handbag', 'sink', 'pizza')
q3 ObjectDetection ('car', 'chair', 'cup', 'dog', 'handbag', 'sink', 'pizza')→ ObjectDetection ('person')
q4 ObjectDetection ('person')→ ObjectDetection ('car', 'chair', 'cup', 'bottle', 'bed', 'cell phone', 'motorcycle')
q5 ObjectDetection ('car', 'chair', 'cup', 'bottle', 'bed', 'cell phone', 'motorcycle')→ ObjectDetection ('person')
q6 ObjectDetection ('person')→ ObjectDetection ('car', 'chair', 'cup', 'tv', 'bed', 'bench', 'sink')
q7 ObjectDetection ('car', 'chair', 'cup', 'tv', 'bed', 'bench', 'sink')→ ObjectDetection ('person')
q8 ObjectDetection ('person')→ ObjectDetection ('car', 'chair', 'bottle', 'bowl', 'handbag', 'book', 'bird')
q9 ObjectDetection ('car', 'chair', 'bottle', 'bowl', 'handbag', 'book', 'bird')→ ObjectDetection ('person')
q10 ObjectDetection ('person')→ ObjectDetection ('car', 'chair', 'dining table', 'book', 'surfboard', 'bird', 'vase')
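
The pairing between queries such as q2 and q3 can be checked mechanically. The sketch below writes three of the queries from the table as lists of `(operator, labels)` pairs and tests whether two queries contain the same predicates in a different order. The representation and the helper name `is_reordering` are assumptions made for illustration only.

```python
# Predicates copied from q2, q3, and q4 in the table above.
PERSON = ("ObjectDetection", ("person",))
OBJECTS_A = ("ObjectDetection", ("car", "chair", "cup", "dog", "handbag", "sink", "pizza"))
OBJECTS_B = ("ObjectDetection", ("car", "chair", "cup", "bottle", "bed", "cell phone", "motorcycle"))

queries = {
    "q2": [PERSON, OBJECTS_A],
    "q3": [OBJECTS_A, PERSON],
    "q4": [PERSON, OBJECTS_B],
}

def is_reordering(first, second):
    """True when both queries hold the same predicates but in a different order."""
    return first != second and sorted(first) == sorted(second)

print(is_reordering(queries["q2"], queries["q3"]))
print(is_reordering(queries["q3"], queries["q4"]))
```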

3. Ten queries on the UCF101 dataset.

For the UCF101 dataset, we randomly select ten queries with strong correlations.

Id Queries
q1 ActivityRecognition ('Archery', 'BalanceBeam', 'Basketball', 'BandMarching', 'BasketballDunk', 'Biking', 'BreastStroke', 'BenchPress', 'BoxingPunchingBag', 'BlowDryHair', 'Bowling', 'BabyCrawling', 'ApplyLipstick')→ ObjectDetection ('chair', 'sports ball', 'dog', 'car', 'tv', 'horse', 'bicycle', 'skateboard', 'tennis racket', 'surfboard', 'bird')
q2 ActivityRecognition ('Archery', 'BalanceBeam', 'Basketball', 'BandMarching', 'Biking', 'BreastStroke', 'BrushingTeeth', 'BaseballPitch', 'BoxingPunchingBag', 'BoxingSpeedBag', 'Bowling', 'ApplyLipstick')→ ObjectDetection ('chair', 'sports ball', 'dog', 'car', 'tv', 'horse', 'bicycle', 'skateboard', 'tennis racket', 'surfboard', 'bird')
q3 ActivityRecognition ('Archery', 'BalanceBeam', 'Basketball', 'BandMarching', 'Biking', 'BodyWeightSquats', 'BreastStroke', 'BrushingTeeth', 'BaseballPitch', 'Bowling', 'BabyCrawling')→ ObjectDetection ('chair', 'sports ball', 'dog', 'car', 'tv', 'horse', 'bicycle', 'skateboard', 'tennis racket', 'surfboard', 'bird')
q4 ActivityRecognition ('Archery', 'BalanceBeam', 'Basketball', 'BasketballDunk', 'BlowingCandles', 'Biking', 'BreastStroke', 'BrushingTeeth', 'BlowDryHair', 'BoxingSpeedBag')→ ObjectDetection ('chair', 'sports ball', 'dog', 'car', 'tv', 'horse', 'bicycle', 'skateboard', 'tennis racket', 'surfboard', 'bird')
q5 ActivityRecognition ('Archery', 'BalanceBeam', 'Basketball', 'BlowingCandles', 'BreastStroke', 'BrushingTeeth', 'BaseballPitch', 'BenchPress', 'BlowDryHair', 'BoxingSpeedBag', 'Bowling')→ ObjectDetection ('chair', 'sports ball', 'dog', 'car', 'tv', 'horse', 'bicycle', 'skateboard', 'tennis racket', 'surfboard', 'bird')
q6 ActivityRecognition ('Archery', 'Basketball', 'BandMarching', 'BasketballDunk', 'Biking', 'BodyWeightSquats', 'BreastStroke', 'BoxingPunchingBag', 'BoxingSpeedBag', 'Bowling', 'ApplyLipstick')→ ObjectDetection ('chair', 'sports ball', 'dog', 'car', 'tv', 'horse', 'bicycle', 'skateboard', 'tennis racket', 'surfboard', 'bird')
q7 ActivityRecognition ('Archery', 'BalanceBeam', 'BasketballDunk', 'BlowingCandles', 'BodyWeightSquats', 'BreastStroke', 'BaseballPitch', 'BoxingPunchingBag', 'BoxingSpeedBag', 'Bowling', 'BabyCrawling')→ ObjectDetection ('chair', 'sports ball', 'dog', 'car', 'tv', 'horse', 'bicycle', 'skateboard', 'tennis racket', 'surfboard', 'bird')
q8 ActivityRecognition ('Archery', 'Basketball', 'BandMarching', 'BasketballDunk', 'BlowingCandles', 'Biking', 'BrushingTeeth', 'BaseballPitch', 'BenchPress', 'Bowling')→ ObjectDetection ('chair', 'sports ball', 'dog', 'car', 'tv', 'horse', 'bicycle', 'skateboard', 'tennis racket', 'surfboard', 'bird')
q9 ObjectDetection ('chair', 'sports ball', 'dog', 'car', 'tv', 'horse', 'bicycle', 'skateboard', 'tennis racket', 'boat', 'cup')→ ActivityRecognition ('Archery', 'BalanceBeam', 'Basketball', 'BandMarching', 'BasketballDunk', 'BlowingCandles', 'BodyWeightSquats', 'BreastStroke', 'BrushingTeeth', 'BaseballPitch', 'BoxingPunchingBag', 'BoxingSpeedBag', 'BabyCrawling', 'ApplyLipstick')
q10 ActivityRecognition ('Archery', 'BalanceBeam', 'Basketball', 'BandMarching', 'BasketballDunk', 'BlowingCandles', 'BodyWeightSquats', 'BreastStroke', 'BrushingTeeth', 'BaseballPitch', 'BoxingPunchingBag', 'BoxingSpeedBag', 'BabyCrawling', 'ApplyLipstick')→ ObjectDetection ('chair', 'sports ball', 'dog', 'car', 'tv', 'horse', 'bicycle', 'skateboard', 'tennis racket', 'boat', 'cup')