Installing Hive on existing Hadoop cluster - dryshliak/hadoop GitHub Wiki
- Existing Hadoop cluster
- Hive release
- MySQL
- Please choose the node that has the fewest Hadoop services installed
- A release can be downloaded from the Apache site http://www.apache.org/dyn/closer.cgi/hive/
cd /home/ubuntu
wget http://apache.ip-connect.vn.ua/hive/hive-2.3.5/apache-hive-2.3.5-bin.tar.gz
tar xvzf apache-hive-2.3.5-bin.tar.gz
mv apache-hive-2.3.5-bin hive
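Before extracting, it is worth checking the tarball against the SHA-256 published on the Apache download page. A minimal sketch; the hash value below is a placeholder you must replace with the real one:

```shell
# sketch: verify the downloaded tarball before extracting it
# EXPECTED_SHA256 is a placeholder -- take the real value from the Apache download page
TARBALL=apache-hive-2.3.5-bin.tar.gz
EXPECTED_SHA256="<sha256 from the Apache download page>"
if [ -f "$TARBALL" ]; then
  ACTUAL=$(sha256sum "$TARBALL" | awk '{print $1}')
  if [ "$ACTUAL" = "$EXPECTED_SHA256" ]; then
    echo "checksum OK -- safe to extract"
  else
    echo "checksum MISMATCH -- do not extract"
  fi
else
  echo "tarball not found: $TARBALL"
fi
```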
- Set up the Hive environment
# run the following as root
sudo su
echo 'HIVE_HOME=/home/ubuntu/hive' >> /etc/environment
echo 'HIVE_CONF_DIR=/home/ubuntu/hive/conf' >> /etc/environment
echo 'export PATH=/home/ubuntu/hive/bin:$PATH' >> /etc/bash.bashrc
echo 'export CLASSPATH=$CLASSPATH:/home/ubuntu/hadoop/lib/*:.' >> /etc/bash.bashrc
echo 'export CLASSPATH=$CLASSPATH:/home/ubuntu/hive/lib/*:.' >> /etc/bash.bashrc
exit
- To verify the environment variables, log in again and run:
echo $HIVE_HOME
echo $HIVE_CONF_DIR
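A slightly stronger check than echoing the variables is to confirm that each one is set and points at a real directory; a small sketch:

```shell
# sketch: confirm both variables are set and point at existing directories
for v in HIVE_HOME HIVE_CONF_DIR; do
  val=$(printenv "$v" || true)
  if [ -z "$val" ]; then
    echo "$v is not set -- did you relogin after editing /etc/environment?"
  elif [ ! -d "$val" ]; then
    echo "$v points at a missing directory: $val"
  else
    echo "$v = $val"
  fi
done
```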
- Modify the hive-env.sh file
cp ~/hive/conf/hive-env.sh.template ~/hive/conf/hive-env.sh
chmod +x ~/hive/conf/hive-env.sh
vi ~/hive/conf/hive-env.sh
Append the following lines to the end of the hive-env.sh file:
echo 'export HADOOP_HOME=/home/ubuntu/hadoop' >> ~/hive/conf/hive-env.sh
echo 'export HIVE_CONF_DIR=/home/ubuntu/hive/conf' >> ~/hive/conf/hive-env.sh
- Install the MySQL database for the Hive metastore, plus the MySQL Java connector. During installation you will be asked to set a password for the database user “root”. Set it and note it down
sudo apt-get install mysql-server -y
sudo apt-get install libmysql-java
- Create a link to, or copy, the MySQL Java connector into the Hive library folder
cp /usr/share/java/mysql-connector-java.jar /home/ubuntu/hive/lib
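As an alternative to copying, a symlink keeps Hive on whatever connector version the distro package ships; a guarded sketch, assuming the paths from the steps above:

```shell
# sketch: symlink instead of copying, so distro package upgrades are picked up
HIVE_LIB="${HIVE_HOME:-/home/ubuntu/hive}/lib"
CONNECTOR=/usr/share/java/mysql-connector-java.jar
if [ -d "$HIVE_LIB" ] && [ -f "$CONNECTOR" ]; then
  ln -sfn "$CONNECTOR" "$HIVE_LIB/mysql-connector-java.jar"
  echo "linked $CONNECTOR into $HIVE_LIB"
else
  echo "skipping: $HIVE_LIB or $CONNECTOR does not exist"
fi
```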
- Start the MySQL service
sudo service mysql start
sudo service mysql status
- Connect to MySQL and create the metastore database for Hive
mysql -u root -p<password>
mysql> CREATE DATABASE metastore;
mysql> USE metastore;
mysql> SOURCE /home/ubuntu/hive/scripts/metastore/upgrade/mysql/hive-schema-2.3.0.mysql.sql
mysql> CREATE USER 'hive'@'localhost' IDENTIFIED BY '<hive password>';
mysql> GRANT ALL ON *.* TO 'hive'@'localhost';
mysql> FLUSH PRIVILEGES;
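To confirm the schema loaded, you can query the metastore's VERSION table, which the schema script populates with the schema version (2.3.0 here). A guarded sketch; HIVE_PASSWORD is a hypothetical variable you would export with the hive user's password from the step above:

```shell
# sketch: the schema script creates a VERSION table recording the schema version
# HIVE_PASSWORD is a hypothetical variable -- export it with the hive user's password
HIVE_PASSWORD="${HIVE_PASSWORD:-}"
if [ -n "$HIVE_PASSWORD" ] && command -v mysql >/dev/null 2>&1; then
  mysql -u hive -p"$HIVE_PASSWORD" -e "SELECT SCHEMA_VERSION FROM metastore.VERSION;"
else
  echo "set HIVE_PASSWORD and install the mysql client before running this check"
fi
```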
- Add or update the properties below in the hive-site.xml file. Note that some of these properties may already exist in the template; in that case delete the existing entries and add the ones below, or else you may get an error.
Don't forget to replace the *** in the "javax.jdo.option.ConnectionPassword" property with the password you set for the hive user earlier
cp ~/hive/conf/hive-default.xml.template ~/hive/conf/hive-site.xml
vi ~/hive/conf/hive-site.xml
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/hive_temp</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.execution.engine</name>
<value>mr</value>
<description>
Expects one of [mr, tez, spark].
Chooses execution engine. Options are: mr (Map reduce, default)</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true&amp;useSSL=false</value>
<description>metadata is stored in a MySQL server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>Username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>***</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive_tmp</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission</description>
</property>
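hive-site.xml must be well-formed XML; note in particular that a literal & in the JDBC URL has to be written as &amp;amp;. A quick lint, assuming xmllint is installed, catches such slips before Hive starts:

```shell
# sketch: lint hive-site.xml -- a raw & in the JDBC URL (instead of &amp;) breaks parsing
SITE="$HOME/hive/conf/hive-site.xml"
if command -v xmllint >/dev/null 2>&1 && [ -f "$SITE" ]; then
  xmllint --noout "$SITE" && echo "hive-site.xml is well-formed"
else
  echo "xmllint or $SITE not available on this node"
fi
```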
- Create the Hive warehouse directories on HDFS
hdfs dfs -mkdir /user/
hdfs dfs -mkdir /user/hive
hdfs dfs -mkdir /user/hive/warehouse
hdfs dfs -mkdir /tmp
hdfs dfs -chmod -R a+rwx /user/hive/warehouse
hdfs dfs -chmod g+w /tmp
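The same layout can be created idempotently with `-mkdir -p`, which also creates missing parents and does not fail when a directory already exists; a guarded sketch:

```shell
# sketch: same directories as above, created idempotently with -p
WAREHOUSE=/user/hive/warehouse
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -mkdir -p "$WAREHOUSE" /tmp
  hdfs dfs -chmod -R a+rwx "$WAREHOUSE"
  hdfs dfs -chmod g+w /tmp
else
  echo "hdfs command not found on this node"
fi
```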
- Now run the hive command and you should see the "hive>" prompt.
hive
hive> show tables;
- Run Metastore and Hive as daemons
mkdir $HIVE_HOME/logs
hive --service metastore --hiveconf hive.log.dir=$HIVE_HOME/logs --hiveconf hive.log.file=metastore.log >/dev/null 2>&1 &
hive --service hiveserver2 --hiveconf hive.log.dir=$HIVE_HOME/logs --hiveconf hive.log.file=hs2.log >/dev/null 2>&1 &
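A quick way to confirm both daemons came up is to check that their default ports (9083 for the metastore, 10000 for HiveServer2) are listening; a sketch using ss:

```shell
# sketch: metastore listens on 9083 and HiveServer2 on 10000 by default
for port in 9083 10000; do
  if command -v ss >/dev/null 2>&1 && ss -ltn 2>/dev/null | grep -q ":$port "; then
    echo "port $port: listening"
  else
    echo "port $port: not listening yet"
  fi
done
```

The daemons can take a minute to bind their ports after launch, so re-run the check before concluding something failed.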
- On the node where you run the Hive services, add the properties below to core-site.xml.
Replace $user_name with the username which will have access to the Hive Beeline command line
<property>
<name>hadoop.proxyuser.$user_name.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.$user_name.groups</name>
<value>*</value>
</property>
- Connect to Hive through Beeline
beeline -u jdbc:hive2://localhost:10000/default -n hive
- In the Beeline terminal, type:
0: jdbc:hive2://localhost:10000/default> show tables;
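Beeline can also run one-off statements non-interactively with its -e option, which is handy for scripting; a guarded sketch using the same connection string:

```shell
# sketch: run a single statement through beeline instead of an interactive session
JDBC_URL=jdbc:hive2://localhost:10000/default
if command -v beeline >/dev/null 2>&1; then
  beeline -u "$JDBC_URL" -n hive -e "SHOW TABLES;"
else
  echo "beeline is not on PATH on this node"
fi
```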
- Web UI for HiveServer2
A Web User Interface (UI) provides configuration, logging, metrics and active session information. The Web UI is available at port 10002 (127.0.0.1:10002) by default.
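A quick reachability probe for the Web UI, assuming the default port 10002 and that curl is installed:

```shell
# sketch: probe the HiveServer2 Web UI on its default port
UI_URL=http://127.0.0.1:10002/
if command -v curl >/dev/null 2>&1; then
  curl -s -o /dev/null --max-time 5 -w "HTTP %{http_code}\n" "$UI_URL" \
    || echo "Web UI not reachable at $UI_URL"
else
  echo "curl is not installed"
fi
```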
- https://cwiki.apache.org/confluence/display/Hive/Setting+up+HiveServer2
- https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Beeline–CommandLineShell