ICP 5 - Gnkhakimova/CS5590-BigData GitHub Wiki

ICP 5

Apache Sqoop

Tasks

  1. Create a table in MySQL, import it into HDFS through Sqoop, and export the table from HDFS back to MySQL.
  2. Create Hive tables using HQL.
  3. Create a Hive table, export it to MySQL, and run three queries.

Configuration

  1. Oracle VirtualBox
  2. Cloudera
  3. Terminal
  4. MySQL
  5. Sqoop
  6. Hive

Features

For this task we performed several operations that move data between different sources using Sqoop, which transfers the data and loads it into the target location.

Task 1
Created a table in MySQL and moved it to HDFS using Sqoop.
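
The round trip for this task can be sketched roughly as follows. This is a minimal sketch, not the exact commands used in the lab: the database `icp5`, the table `employees`, the user `root`, and the HDFS path `/user/cloudera/employees` are all hypothetical placeholders. The commands are wrapped in functions so that nothing runs until a function is called.

```shell
# All names below (database, table, user, HDFS path) are hypothetical placeholders.
DB_NAME="icp5"
DB_USER="root"
TABLE="employees"
HDFS_DIR="/user/cloudera/employees"
JDBC_URL="jdbc:mysql://localhost/${DB_NAME}"

# Create and populate a small MySQL table (mysql prompts for the password).
create_mysql_table() {
  mysql -u "$DB_USER" -p -e "
    CREATE DATABASE IF NOT EXISTS ${DB_NAME};
    USE ${DB_NAME};
    CREATE TABLE IF NOT EXISTS ${TABLE} (id INT, name VARCHAR(50), salary DOUBLE);
    INSERT INTO ${TABLE} VALUES (1, 'Ann', 5000), (2, 'Bob', 4200);"
}

# Import the MySQL table into HDFS as comma-delimited text, using one mapper.
import_to_hdfs() {
  sqoop import --connect "$JDBC_URL" --username "$DB_USER" -P \
    --table "$TABLE" --target-dir "$HDFS_DIR" \
    --fields-terminated-by ',' -m 1
}

# Export the HDFS files into a MySQL table that already exists.
export_to_mysql() {
  sqoop export --connect "$JDBC_URL" --username "$DB_USER" -P \
    --table "${TABLE}_export" --export-dir "$HDFS_DIR" \
    --input-fields-terminated-by ',' -m 1
}
```

Calling `create_mysql_table`, `import_to_hdfs`, and `export_to_mysql` in that order performs the round trip. Sqoop does not create the export target, so `employees_export` (also a placeholder name) has to exist in MySQL with matching columns before the export is run.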



Task 2
Created a Hive table using HQL.
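
A table definition of this kind might look like the sketch below, passed to Hive with `hive -e`. The table name `employees` and its columns are assumptions made for illustration, not the schema used in the lab.

```shell
# Table and column names are hypothetical; adjust them to the real data set.
create_hive_table() {
  hive -e "
    CREATE TABLE IF NOT EXISTS employees (
      id INT,
      name STRING,
      salary DOUBLE
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    STORED AS TEXTFILE;"
}
```

Running `create_hive_table` creates a text-backed table whose rows are comma-delimited, which keeps the files easy to read back with Sqoop later.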




Task 3
Loaded a data set and ran different queries on it.
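
Loading a file and querying it could be sketched as below. The local path, the table `employees`, and the three queries are illustrative assumptions; the queries actually run in the lab are not recorded on this page.

```shell
# Hypothetical local CSV file with id,name,salary rows.
CSV_PATH="/home/cloudera/employees.csv"

# Load the file into an existing Hive table, then run three example queries.
load_and_query() {
  hive -e "
    LOAD DATA LOCAL INPATH '${CSV_PATH}' OVERWRITE INTO TABLE employees;
    SELECT COUNT(*) FROM employees;
    SELECT name, salary FROM employees ORDER BY salary DESC LIMIT 5;
    SELECT AVG(salary) FROM employees WHERE salary > 1000;"
}
```

`load_and_query` assumes the `employees` table already exists with a comma delimiter that matches the file. `OVERWRITE` replaces any rows loaded earlier, so repeated runs do not duplicate data.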



Limitations

We had an issue when transferring data from Hive to MySQL: because MySQL does not support Hive's complex data types, the reduce job was throwing an exception. We had to use a different dataset that uses only primitive data types.
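
One way to catch this earlier is to inspect the Hive column types before exporting. The sketch below assumes a Hive table named `employees` stored under the default warehouse directory `/user/hive/warehouse`, a comma field delimiter, and a pre-created MySQL table `employees_from_hive`; all of these names are hypothetical.

```shell
# Hypothetical names; the warehouse path assumes Hive's default location.
JDBC_URL="jdbc:mysql://localhost/icp5"
DB_USER="root"
MYSQL_TABLE="employees_from_hive"
WAREHOUSE_DIR="/user/hive/warehouse/employees"

# Print the Hive column types; ARRAY, MAP, or STRUCT columns have no
# direct MySQL counterpart and should be avoided in the exported table.
check_hive_types() {
  hive -e "DESCRIBE employees;"
}

# Export the Hive table's files into the existing MySQL table.
export_hive_table() {
  sqoop export --connect "$JDBC_URL" --username "$DB_USER" -P \
    --table "$MYSQL_TABLE" --export-dir "$WAREHOUSE_DIR" \
    --input-fields-terminated-by ',' -m 1
}
```

If the Hive table was created without an explicit delimiter, its files use Hive's default `\001` separator, and `--input-fields-terminated-by` would need to be set to that value instead of a comma.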

Reference

  1. https://www.educba.com/sqoop-commands/
  2. https://www.ukdataservice.ac.uk/media/604456/hiveworkshoppractical.pdf