ICP 5 - navyagonug/CS5590-BIG-DATA-PROGRAMMING-USING-HADOOP-AND-SPARK GitHub Wiki
PROBLEM STATEMENT
1. Use Sqoop to import and export MySQL tables to HDFS.
2. Create Hive tables through an HQL script; use Sqoop to import and export tables to relational databases.
3. Perform three queries on the databases.
FEATURES
In this in-class programming, Sqoop is used to transfer data (tables) from a relational database (in this case, MySQL) to Hadoop as well as Hive. A dataset is also considered, with queries for word count, identifying patterns using regular expressions, and computing statistics on that data. MySQL, Sqoop, and Hive are used.
CONFIGURATIONS
Sqoop is downloaded from the official site and installed. Similarly, the MySQL JDBC connector is downloaded, installed, and moved to Sqoop's lib directory.
QUESTION
Use Sqoop to import and export MySQL tables to HDFS.
APPROACH
Initially, a table is created in MySQL and values are inserted into it. Then, using the Sqoop import command, the data is transferred from MySQL to HDFS. In the other direction, from HDFS to MySQL, a table with a matching schema must first be created manually in MySQL, and the Sqoop export command transfers the data. Below is a series of screenshots, along with the size of the data and the time taken to process it.
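The import and export steps described above can be sketched as the following commands. This is a minimal sketch: the hostname, database name (`icpdb`), table names (`employee`, `employee_export`), user, and HDFS paths are hypothetical placeholders, not taken from the assignment.

```shell
# Import: copy a MySQL table into HDFS as delimited text files.
# Database icpdb, table employee, and the target directory are placeholders.
sqoop import \
  --connect jdbc:mysql://localhost:3306/icpdb \
  --username root -P \
  --table employee \
  --target-dir /user/cloudera/employee \
  -m 1

# Export: the target table (employee_export) must already exist in MySQL
# with a schema matching the files under --export-dir.
sqoop export \
  --connect jdbc:mysql://localhost:3306/icpdb \
  --username root -P \
  --table employee_export \
  --export-dir /user/cloudera/employee \
  --input-fields-terminated-by ',' \
  -m 1
```

The `-m 1` flag runs a single mapper, which is convenient for small tables that lack a primary key to split on.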
QUESTION
Create Hive tables through an HQL script; use Sqoop to import and export tables to relational databases.
APPROACH
A similar approach is followed: the data is transferred between MySQL and Hive using Sqoop's import and export commands.
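The Hive-side transfer can be sketched as below; again the database, table names, and warehouse path are hypothetical placeholders, assuming Hive's default warehouse location.

```shell
# Import directly into a Hive table; Sqoop creates the Hive table if it
# does not already exist. Names (icpdb, employee, employee_hive) are placeholders.
sqoop import \
  --connect jdbc:mysql://localhost:3306/icpdb \
  --username root -P \
  --table employee \
  --hive-import \
  --hive-table employee_hive \
  -m 1

# Export from the Hive warehouse directory back to an existing MySQL table.
# Hive's default field delimiter is Ctrl-A (\001), so it must be declared here.
sqoop export \
  --connect jdbc:mysql://localhost:3306/icpdb \
  --username root -P \
  --table employee_export \
  --export-dir /user/hive/warehouse/employee_hive \
  --input-fields-terminated-by '\001' \
  -m 1
```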
QUESTION
Perform three queries on the databases.
APPROACH
Below are screenshots that include the queries as well as their outputs. Initially, the data is loaded into MySQL and then imported into Hive.
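The three query types mentioned in FEATURES (word count, regular-expression matching, and statistics) could be run from the shell roughly as follows. The table and column names (`docs.line`, `employee.salary`) are hypothetical placeholders and depend on the dataset actually loaded.

```shell
# 1. Word count over a text column, splitting each line into words.
hive -e "SELECT word, COUNT(*) AS cnt
         FROM docs LATERAL VIEW explode(split(line, ' ')) t AS word
         GROUP BY word
         ORDER BY cnt DESC;"

# 2. Pattern matching with a regular expression
#    (here: lines that start with a vowel).
hive -e "SELECT * FROM docs WHERE line RLIKE '^[AEIOUaeiou]';"

# 3. Basic statistics on a numeric column.
hive -e "SELECT MIN(salary), MAX(salary), AVG(salary), STDDEV(salary)
         FROM employee;"
```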