Module 1: ICP #6 - SnehaMishra28/BigData_Programming_Summer2018 GitHub Wiki

Team: 12
Professor: Yugyung Lee

Name: Sneha Mishra
Class ID: 11
Email: [email protected]
MyGitHub

Technical Partner:
Name: Aditya Soman
Class ID: 19
Email: [email protected]
GitHub

Objective

Introduction to Sqoop.

Features

  1. Install Sqoop.
  2. Use Sqoop to import and export MySQL tables to and from HDFS.
  3. Create Hive tables through an HQL script, then use Sqoop to import and export those tables to and from relational databases.
  4. Run three queries against the databases.

Steps:

Step 1: Install Sqoop (or Cloudera)
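
A quick way to confirm the installation is to check that the `sqoop` launcher is on the `PATH` and ask it for its version. The snippet below is a minimal sketch, not part of the original lab; the helper name `sqoop_version` is hypothetical, and it assumes only that the standard `sqoop version` tool is available once Sqoop (or the Cloudera QuickStart VM) is installed.

```python
import shutil
import subprocess


def sqoop_version():
    """Return the output of `sqoop version`, or None if sqoop is not on PATH."""
    path = shutil.which("sqoop")
    if path is None:
        return None
    # Sqoop may write its banner to stdout or stderr depending on the setup.
    result = subprocess.run([path, "version"], capture_output=True, text=True)
    return result.stdout.strip() or result.stderr.strip()


version = sqoop_version()
if version is None:
    print("sqoop was not found on PATH; check the installation.")
else:
    print(version)
```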

Step 2: Part 1

Import and export MySQL tables to and from HDFS
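
The import and export in this part could be driven by commands along the following lines. This is a sketch under stated assumptions rather than the commands actually used in the lab: the JDBC URL, database name `icp6`, user `root`, table names, and HDFS paths are all hypothetical placeholders. The script only builds and prints the commands so they can be reviewed before being run on a machine with Sqoop installed.

```python
import shlex

# Hypothetical connection details: adjust to the local MySQL setup.
JDBC_URL = "jdbc:mysql://localhost/icp6"
DB_USER = "root"


def sqoop_import_cmd(table, target_dir, mappers=1):
    """Build a `sqoop import` command that copies a MySQL table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", DB_USER,
        "-P",                        # prompt for the password at run time
        "--table", table,
        "--target-dir", target_dir,  # HDFS directory that receives the rows
        "-m", str(mappers),          # number of parallel map tasks
    ]


def sqoop_export_cmd(table, export_dir, delimiter=","):
    """Build a `sqoop export` command that loads HDFS files into a MySQL table."""
    return [
        "sqoop", "export",
        "--connect", JDBC_URL,
        "--username", DB_USER,
        "-P",
        "--table", table,            # target table must already exist in MySQL
        "--export-dir", export_dir,  # HDFS directory holding the data files
        "--input-fields-terminated-by", delimiter,
    ]


# Hypothetical table names and paths, printed for copy and paste into a shell.
print(shlex.join(sqoop_import_cmd("employees", "/user/cloudera/employees")))
print(shlex.join(sqoop_export_cmd("employees_copy", "/user/cloudera/employees")))
```

Note that `sqoop export` does not create the destination table, so `employees_copy` would need to be created in MySQL with a matching schema first.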

Step 2: Part 2

Create Hive tables and import and export them to and from relational databases
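
One possible shape for this part is sketched below: an HQL script creates a comma-delimited Hive table, `sqoop import --hive-import` loads a MySQL table into Hive, and `sqoop export` pushes the Hive table's warehouse files back to MySQL. The schema, the table name `students`, the connection details, and the warehouse path `/user/hive/warehouse` (Hive's default for the `default` database) are assumptions for illustration, not values taken from the lab.

```python
import shlex

# Hypothetical values: adjust to the local setup.
JDBC_URL = "jdbc:mysql://localhost/icp6"
DB_USER = "root"
WAREHOUSE = "/user/hive/warehouse"  # default Hive warehouse directory (assumed)


def create_table_hql(table):
    """Return HQL that creates a comma-delimited text table (hypothetical schema)."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (id INT, name STRING, score DOUBLE)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE;\n"
    )


def hive_import_cmd(mysql_table, hive_table):
    """Build a `sqoop import` command that loads a MySQL table into Hive."""
    return [
        "sqoop", "import",
        "--connect", JDBC_URL, "--username", DB_USER, "-P",
        "--table", mysql_table,
        "--hive-import", "--hive-table", hive_table,
        "--fields-terminated-by", ",",
        "-m", "1",
    ]


def hive_export_cmd(hive_table, mysql_table):
    """Build a `sqoop export` command that reads the Hive table's files from HDFS."""
    return [
        "sqoop", "export",
        "--connect", JDBC_URL, "--username", DB_USER, "-P",
        "--table", mysql_table,  # must already exist in MySQL
        "--export-dir", f"{WAREHOUSE}/{hive_table}",
        "--input-fields-terminated-by", ",",
    ]


# Save the HQL to a file and run it with `hive -f <file>` before the export.
print(create_table_hql("students"))
print(shlex.join(hive_import_cmd("students", "students")))
print(shlex.join(hive_export_cmd("students", "students_export")))
```

Declaring the delimiter explicitly in the HQL keeps the export simple, because the `--input-fields-terminated-by` value then matches the files Hive writes.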

Step 2: Part 3

Run three queries against the databases
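
The three queries are not listed on this page, so the following is only an illustrative sketch of how such queries might be run from the command line with `hive -e`. The table `students` and its columns are hypothetical, and the queries are examples rather than the ones used in the lab.

```python
import shlex

# Three example HQL queries against a hypothetical `students` table.
QUERIES = {
    "row count": "SELECT COUNT(*) FROM students;",
    "top three scores": "SELECT name, score FROM students ORDER BY score DESC LIMIT 3;",
    "average passing score": "SELECT AVG(score) FROM students WHERE score >= 50;",
}


def hive_query_cmd(hql):
    """Build a `hive -e` command that runs one HQL statement and prints the result."""
    return ["hive", "-e", hql]


for label, hql in QUERIES.items():
    print(f"{label}: {shlex.join(hive_query_cmd(hql))}")
```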

References:

  1. https://dzone.com/articles/sqoop-import-data-from-mysql-to-hive
  2. https://stackoverflow.com/questions/22404641/using-sqoop-to-import-data-from-mysql-to-hive
  3. https://stackoverflow.com/questions/23472688/data-import-from-mysql-with-apache-sqoop-error-no-manager-for-connect-string
  4. https://stackoverflow.com/questions/26515700/mysql-jdbc-driver-5-1-33-time-zone-issue
  5. http://community.cloudera.com/t5/Hadoop-101-Training-Quickstart/I-run-a-Hadoop-job-but-it-got-stucked-and-nothing-is/td-p/47856/page/2
  6. https://stackoverflow.com/questions/27282535/sqoop-cannot-find-mysqldump-when-using-direct-import-into-hdfs