ICP 10 - Gnkhakimova/CS5590-BigData GitHub Wiki
ICP 10
Data Frame and SQL
Task 1
- Import the dataset and create data frames directly on import

- Save data to file
Saved file

- Check for duplicate records in the dataset

- Apply a union operation on the dataset and order the output alphabetically by country name



- Use a GroupBy query based on treatment
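
The code for these steps is not reproduced on this page, so the following is only a rough sketch of the duplicate check, the ordered union, and the group-by. It uses Python's built-in `sqlite3` module as a stand-in SQL engine, which is an assumption and not necessarily the engine used in the ICP. The table name `survey`, the `Age` column, and all sample rows are made up for illustration; only `Country` and `treatment` come from the task text.

```python
import sqlite3

# Toy rows standing in for the dataset. Country and treatment follow the
# task text; the Age column and every value here are assumptions.
ROWS = [
    ("United States", 37, "Yes"),
    ("Canada", 32, "No"),
    ("United Kingdom", 31, "Yes"),
    ("Canada", 32, "No"),  # exact duplicate of the second row
    ("Australia", 29, "No"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (Country TEXT, Age INTEGER, treatment TEXT)")
conn.executemany("INSERT INTO survey VALUES (?, ?, ?)", ROWS)

# Duplicate check: group on every column and keep groups seen more than once.
duplicates = conn.execute(
    """
    SELECT Country, Age, treatment, COUNT(*) AS n
    FROM survey
    GROUP BY Country, Age, treatment
    HAVING COUNT(*) > 1
    """
).fetchall()

# Union of two slices of the table, ordered alphabetically by Country.
# UNION ALL keeps duplicate rows; plain UNION would drop them.
union_rows = conn.execute(
    """
    SELECT Country, Age, treatment FROM survey WHERE Age < 32
    UNION ALL
    SELECT Country, Age, treatment FROM survey WHERE Age >= 32
    ORDER BY Country
    """
).fetchall()

# Group by treatment and count the rows in each group.
by_treatment = conn.execute(
    "SELECT treatment, COUNT(*) FROM survey GROUP BY treatment ORDER BY treatment"
).fetchall()

print(duplicates)
print([row[0] for row in union_rows])
print(by_treatment)
```

The `GROUP BY ... HAVING COUNT(*) > 1` pattern reports each duplicated row once together with how often it occurs, which is usually more informative than a plain yes/no check.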

Task 2
- Apply basic queries that use joins and aggregate functions (at least 2)


- Write a query to fetch the 13th row in the dataset.
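
As a hedged illustration of the Task 2 queries, the sketch below runs a join combined with aggregates, a second aggregate query, and a 13th-row lookup. It again relies on Python's built-in `sqlite3` as a stand-in engine (an assumption). The `regions` lookup table, the `id` and `Age` columns, and all generated rows are hypothetical and exist only to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE survey (id INTEGER, Country TEXT, Age INTEGER, treatment TEXT)"
)
conn.execute("CREATE TABLE regions (Country TEXT, Region TEXT)")

# Fifteen made-up rows that cycle through three countries.
countries = ["Canada", "United States", "Germany"]
rows = [
    (i, countries[i % 3], 20 + i, "Yes" if i % 2 else "No")
    for i in range(1, 16)
]
conn.executemany("INSERT INTO survey VALUES (?, ?, ?, ?)", rows)
conn.executemany(
    "INSERT INTO regions VALUES (?, ?)",
    [
        ("Canada", "North America"),
        ("United States", "North America"),
        ("Germany", "Europe"),
    ],
)

# Query 1: inner join plus aggregates (row count and average age per region).
per_region = conn.execute(
    """
    SELECT r.Region, COUNT(*) AS n, AVG(s.Age) AS avg_age
    FROM survey AS s
    JOIN regions AS r ON s.Country = r.Country
    GROUP BY r.Region
    ORDER BY r.Region
    """
).fetchall()

# Query 2: plain aggregates over the whole table.
age_range = conn.execute("SELECT MIN(Age), MAX(Age) FROM survey").fetchone()

# 13th row: impose an explicit order, skip 12 rows, take one.
thirteenth = conn.execute(
    "SELECT * FROM survey ORDER BY id LIMIT 1 OFFSET 12"
).fetchone()

print(per_region)
print(age_range)
print(thirteenth)
```

"13th row" only has a stable meaning once an ordering is fixed, which is why the last query sorts on the hypothetical `id` column before applying `OFFSET 12`.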

Bonus
- Write a parseLine method that splits a comma-delimited row, and create a data frame from the parsed rows.
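
One possible shape for the `parseLine` part of this task is sketched below in plain Python. The `Record` type, its three fields, and the sample lines are assumptions chosen for illustration; the real dataset's columns are not listed on this page. The sketch stops at a list of typed records and does not build the data frame itself.

```python
import csv
from typing import NamedTuple


class Record(NamedTuple):
    """Hypothetical row layout; the real dataset has its own columns."""
    country: str
    age: int
    treatment: str


def parseLine(line: str) -> Record:
    """Split one comma-delimited row into a typed Record."""
    # csv.reader is used instead of str.split so that quoted fields
    # containing commas stay in one piece.
    fields = next(csv.reader([line]))
    country, age, treatment = (field.strip() for field in fields)
    return Record(country, int(age), treatment)


# Made-up sample lines, including one with a quoted comma.
lines = [
    "United States,37,Yes",
    '"Korea, Republic of",29,No',
]
records = [parseLine(line) for line in lines]

for record in records:
    print(record)
```

A list of named tuples like `records` is the kind of input a data frame constructor typically accepts. If the ICP uses Spark, passing the parsed rows to `createDataFrame` would be the natural next step, though that part is not shown here.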
