ICP4 - gabriellawillis/BigData GitHub Wiki

Tasks

Create Hive Tables and Perform Queries for Use Case based on Petrol Data

a) CREATE petrol table and load data.

b) SELECT distributer_id, vol_OUT FROM petrol order by vol_OUT desc limit 10;

c) SELECT distributer_id, vol_OUT FROM petrol order by vol_OUT limit 10;

d) List all distributors who have this difference, along with the year and the difference which they have in that year.

e) Sort By

f) Cluster By

g) Distribute By
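
The petrol tasks above can be sketched in HiveQL roughly as follows. This is a minimal sketch, not the original solution: only distributer_id, vol_OUT and year appear in the tasks, so the other columns, the comma delimiter, the file path, and the definition of "this difference" (taken here as vol_IN minus vol_OUT above a placeholder threshold of 500) are all assumptions.

```sql
-- a) Assumed schema: only distributer_id, vol_OUT and year are named in the tasks;
--    every other column, the delimiter and the path below are hypothetical.
CREATE TABLE IF NOT EXISTS petrol (
  distributer_id   STRING,
  distributer_name STRING,
  amt_IN           INT,
  amt_OUT          INT,
  vol_IN           INT,
  vol_OUT          INT,
  year             INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/tmp/petrol.txt' INTO TABLE petrol;  -- hypothetical path

-- d) Distributors with a volume difference, per year.
--    Assumption: "difference" means vol_IN - vol_OUT, and 500 is a placeholder threshold.
SELECT distributer_id, year, (vol_IN - vol_OUT) AS vol_diff
FROM petrol
WHERE (vol_IN - vol_OUT) > 500;

-- e) SORT BY orders rows within each reducer only; the overall output
--    is globally ordered only when a single reducer is used.
SELECT distributer_id, vol_OUT
FROM petrol
SORT BY vol_OUT DESC;

-- g) DISTRIBUTE BY routes all rows with the same key to the same reducer,
--    without ordering them.
SELECT distributer_id, vol_OUT
FROM petrol
DISTRIBUTE BY distributer_id;

-- f) CLUSTER BY x is shorthand for DISTRIBUTE BY x SORT BY x (ascending).
SELECT distributer_id, vol_OUT
FROM petrol
CLUSTER BY distributer_id;
```

ORDER BY, used in b) and c), gives a total order through a single reducer, which is why it pairs naturally with LIMIT; SORT BY, DISTRIBUTE BY and CLUSTER BY trade that global guarantee for parallelism.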

Create Hive Tables and Perform Queries for Use Case based on Olympics Data

a) Create olympic table and load data.

b) select country, SUM(total) from olympic where sport = 'Swimming' GROUP BY country;

c) select year, SUM(total) from olympic where country = 'India' GROUP BY year;

d) select country, SUM(total) from olympic GROUP BY country;

e) select country, SUM(gold) from olympic GROUP BY country;

f) Which country got medals for Shooting, year wise classification?
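
One possible way to write tasks a) and f) for the Olympics data is shown below. The tasks only name the columns country, year, sport, gold and total; the remaining columns, the tab delimiter, and the file path are assumptions made for illustration.

```sql
-- a) Assumed schema: athlete, age, closing_date, silver and bronze are guesses;
--    country, year, sport, gold and total come from the queries above.
CREATE TABLE IF NOT EXISTS olympic (
  athlete      STRING,
  age          INT,
  country      STRING,
  year         INT,
  closing_date STRING,
  sport        STRING,
  gold         INT,
  silver       INT,
  bronze       INT,
  total        INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/tmp/olympic_data.csv' INTO TABLE olympic;  -- hypothetical path

-- f) Medals won in Shooting, classified by country and year.
--    HAVING drops any country/year group whose medal total is zero.
SELECT country, year, SUM(total) AS medals
FROM olympic
WHERE sport = 'Shooting'
GROUP BY country, year
HAVING SUM(total) > 0
ORDER BY year, country;
```

Grouping by both country and year is what produces the year-wise classification; grouping by country alone would collapse all years into a single total.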

Create Hive Tables and Perform Queries for Use Case based on Movielens

a) Create three tables called movies, ratings and users, and load the data into them.

b) List all movies whose genre is both "Action" and "Drama".

c) List movie ids of all movies with rating equal to 5.

d) Find top 11 average rated "Action" movies with descending order of rating.
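
The MovieLens tasks could be approached along these lines. This is a sketch under stated assumptions: the column names, the comma-delimited files, the file paths, and the genres column holding a single delimited string such as Action|Drama are not given in the tasks and are chosen only for illustration.

```sql
-- a) Assumed layouts; all column names and paths are hypothetical.
--    Plain comma splitting breaks on titles that contain commas,
--    so the input is assumed to be free of them.
CREATE TABLE IF NOT EXISTS movies (
  movieid INT,
  title   STRING,
  genres  STRING          -- assumed form: 'Action|Drama|Thriller'
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

CREATE TABLE IF NOT EXISTS ratings (
  userid    INT,
  movieid   INT,
  rating    DOUBLE,
  rating_ts BIGINT        -- named rating_ts because TIMESTAMP is a Hive type keyword
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

CREATE TABLE IF NOT EXISTS users (
  userid     INT,
  gender     STRING,
  age        INT,
  occupation STRING,
  zipcode    STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/tmp/movies.csv'  INTO TABLE movies;   -- hypothetical paths
LOAD DATA LOCAL INPATH '/tmp/ratings.csv' INTO TABLE ratings;
LOAD DATA LOCAL INPATH '/tmp/users.csv'   INTO TABLE users;

-- b) Movies tagged with both Action and Drama.
SELECT movieid, title, genres
FROM movies
WHERE genres LIKE '%Action%' AND genres LIKE '%Drama%';

-- c) Ids of movies that received at least one rating of 5.
SELECT DISTINCT movieid
FROM ratings
WHERE rating = 5;

-- d) Top 11 Action movies by average rating, highest first.
SELECT m.movieid, m.title, AVG(r.rating) AS avg_rating
FROM movies m
JOIN ratings r ON m.movieid = r.movieid
WHERE m.genres LIKE '%Action%'
GROUP BY m.movieid, m.title
ORDER BY avg_rating DESC
LIMIT 11;
```

DISTINCT in c) avoids listing the same movie once per 5-star rating, and the join in d) is needed because genres live in movies while the scores live in ratings.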