ICP 5 - manaswinivedula/Big-Data-Programming GitHub Wiki
Sqoop Transfers between Hive and MySQL
Task1
1. Started MySQL, opened a MySQL shell, and created a database named db2.
sudo service mysqld start
mysql -u root -pcloudera
show databases;
create database db2;
use db2;
2. Created the employees table and inserted records into it using the below commands.
create table employees(name VARCHAR(100), salary FLOAT, id INT);
insert into employees values('Manaswini',23456, 1),('mike',2345,2),('jhon',3456,3);
3. Now, importing the MySQL table employees into HDFS using the sqoop import command.
sqoop import --connect jdbc:mysql://localhost/db2 --username root --password cloudera --table employees --m 1
4. Checking the imported records using the below commands.
hadoop fs -ls employees/
hadoop fs -cat employees/*
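The import above lands in the user's HDFS home directory rather than in a Hive table. As a hedged aside (not one of the original steps): Sqoop can also load the data straight into the Hive warehouse with its --hive-import flag. The sketch below assumes the same db2 connection details and a default Hive configuration.

```shell
# Sketch: import the MySQL employees table directly into a Hive table
# instead of plain HDFS files (assumes a default Hive install).
sqoop import \
  --connect jdbc:mysql://localhost/db2 \
  --username root --password cloudera \
  --table employees \
  --hive-import \
  --m 1
```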
5. Now importing the MySQL table into a specific HDFS target directory, result2.
sqoop import --connect jdbc:mysql://localhost/db2 --username root --password cloudera --table employees --m 1 --target-dir result2
6. Checking the imported records using the below command
hadoop fs -cat result2/part-m-*
7. Creating a new table named e in the MySQL database db2.
create table e(name VARCHAR(100), salary FLOAT, id INT);
8. Now, using the sqoop export command, exporting the data in the result2 directory of HDFS to the table e in the database db2 of MySQL.
sqoop export --connect jdbc:mysql://localhost/db2 --username root --password cloudera --table e --export-dir result2/part-m-00000
9. Checking the exported results in MySQL.
select * from e;
Task2
1. The file tables-schemas.hql creates a table in the Hive warehouse and loads the local data into that table.
hive -f tables-schemas.hql
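The contents of tables-schemas.hql are not reproduced on this page. As a hypothetical reconstruction only: since step 3 exports the warehouse directory new into a MySQL table emp1(name, salary, id), the script presumably looks something like the following. The table name new is inferred from the export directory, and the local CSV path is an assumed placeholder, not the actual one.

```sql
-- Hypothetical sketch of tables-schemas.hql (not the actual file).
-- Column layout mirrors the MySQL emp1 table created in the next step;
-- the local input path below is an assumed placeholder.
CREATE TABLE new (name STRING, salary FLOAT, id INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/home/cloudera/Downloads/emp.csv' INTO TABLE new;
```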
2. Created a table named emp1 in the MySQL database db1.
create table emp1(name VARCHAR(100), salary FLOAT, id INT);
3. Now, using the sqoop export command, exporting the data in the Hive warehouse directory new to the table emp1 in the database db1 of MySQL.
sqoop export --connect jdbc:mysql://localhost/db1 --username root --password cloudera --table emp1 --export-dir /user/hive/warehouse/new -m 1
4. Checking the exported records in MySQL.
select * from emp1;
Task3
1. Creating a Hive table stocks1 with the required columns and data types using the below command.
create table stocks1(dt date, val1 float,val2 float, val3 float, val4 float, val5 bigint, val6 float) row format delimited fields terminated by ',' stored as textfile;
2. Loading stocks.csv from the local filesystem into the stocks1 table in the Hive warehouse.
load data local inpath '/home/cloudera/Downloads/stocks.csv' into table stocks1;
3. Created a new table named stocks1 in the MySQL database db2.
create table stocks1(dt DATE,val1 FLOAT, val2 FLOAT, val3 FLOAT, val4 FLOAT,val5 BIGINT, val6 FLOAT);
4. Now, using the sqoop export command, exporting the data in the stocks1 directory of the Hive warehouse to the table stocks1 in the database db2 of MySQL.
sqoop export --connect jdbc:mysql://localhost/db2 --username root --password cloudera --table stocks1 --export-dir /user/hive/warehouse/stocks1/stocks.csv -m 1
5. Checking the exported records in MySQL.
select * from stocks1 limit 10;
6. Computing the statistics of the stocks1 table in Hive.
analyze table stocks1 compute statistics;
Output:
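Once the statistics have been computed, they can be inspected without rerunning the analysis. As a hedged suggestion beyond the original steps, Hive exposes them in the table metadata:

```sql
-- The computed statistics (numRows, rawDataSize, totalSize, etc.)
-- appear in the Table Parameters section of this output.
DESCRIBE FORMATTED stocks1;
```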
7. Counting how often each day, month, and year value occurs in the dates of stocks1, ordered by value. The query splits each date on '-' and groups by the resulting tokens.
select word,count(1) as count from (select explode(split(dt,'-')) as word from stocks1)w group by word order by word;
Output:
8. Observing the difference between each day's closing price (val4) and opening price (val1). A difference below 2 indicates little or no gain, and a negative value a loss of share value, on that particular day.
select dt,(val4-val1) from stocks1 where val4-val1<2 limit 10;
9. Calculating the top 5 dates on which stocks have the highest val5 values.
select dt,val5 from stocks1 order by val5 DESC limit 5;
Output: