HiveServer2 data cleansing example - TBDSUDC/tdbs-document GitHub Wiki

HiveServer2 data analysis


There are two ways to connect to HiveServer2:

  1. Connect directly to the host and port of a HiveServer2 instance

    DriverManager.getConnection("jdbc:hive2://{hiveserver2 host}:{port}/{database}", "username", "password");

  2. Connect to ZooKeeper to look up the address of a currently available HiveServer2 instance

    DriverManager.getConnection("jdbc:hive2://{zookeeper host}:{port}/{database};serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2", "username", "password");
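
The two URL forms above can be assembled programmatically. The following is a minimal sketch, not part of hive-jdbc: the class `HiveUrls`, its method names, and the `example.com` hosts are hypothetical, and it assumes the ZooKeeper quorum is passed as a comma-separated list of `host:port` entries.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of hive-jdbc) that assembles the two URL forms. */
final class HiveUrls {
    private HiveUrls() {}

    /** Direct form: jdbc:hive2://{host}:{port}/{database} */
    static String direct(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    /**
     * ZooKeeper discovery form. Each quorum entry is "host:port";
     * entries are joined with commas, and the namespace is appended
     * as the zooKeeperNamespace session variable.
     */
    static String viaZooKeeper(List<String> zkQuorum, String database, String namespace) {
        return "jdbc:hive2://" + String.join(",", zkQuorum) + "/" + database
                + ";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=" + namespace;
    }

    public static void main(String[] args) {
        // Host names below are placeholders for illustration only.
        System.out.println(direct("hs2.example.com", 10000, "default"));
        System.out.println(viaZooKeeper(
                Arrays.asList("zk1.example.com:2181", "zk2.example.com:2181"),
                "default", "hiveserver2"));
    }
}
```

Either string can be passed as the first argument of `DriverManager.getConnection`, followed by the user name and password.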

Add the following dependency to the project's POM:

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.2.0-SNAPSHOT-TBDS-4.0.3.3</version>
</dependency>
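
Once the dependency is declared, it can help to confirm at startup that the driver class is actually on the classpath. The following is a small sketch under that assumption; `DriverCheck` and `isOnClasspath` are hypothetical names, and only the driver class name `org.apache.hive.jdbc.HiveDriver` comes from the demo code on this page.

```java
/** Hypothetical startup check; not part of hive-jdbc. */
final class DriverCheck {
    private DriverCheck() {}

    /** Returns true if the named class can be loaded by the current class loader. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing jar, or a jar whose own dependencies are missing.
            return false;
        }
    }

    public static void main(String[] args) {
        String driver = "org.apache.hive.jdbc.HiveDriver";
        if (isOnClasspath(driver)) {
            System.out.println("Hive JDBC driver found: " + driver);
        } else {
            System.err.println("Hive JDBC driver missing; check the hive-jdbc dependency.");
        }
    }
}
```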

Talk is cheap. Show me the code.

package com.tencent.tbds.demo;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * HiveServer2 exposes Hive to external clients through Hive's Thrift service.
 * The Thrift service is not very stable, so build fault tolerance into any
 * client that accesses Hive this way.
 */
public class HiveDemo {
    public static void main(String[] args) {
        if (args.length < 3) {
            System.out.println("usage: java -Djava.ext.dirs=/usr/hdp/2.2.0.0-2041/hive/lib:/usr/hdp/2.2.0.0-2041/hadoop " +
                    "-cp dev-demo-1.0-SNAPSHOT.jar com.tencent.tbds.demo.HiveDemo zkIP:port user userPasswd");
            System.out.println("example: java -Djava.ext.dirs=/usr/hdp/2.2.0.0-2041/hive/lib:/usr/hdp/2.2.0.0-2041/hadoop -cp " +
                    "dev-demo-1.0-SNAPSHOT.jar com.tencent.tbds.demo.HiveDemo ****:2181 demoUser demoUserPassword");
            return;
        }

        // Single-node alternative: connect straight to one HiveServer2 instance.
        // HiveServer2 is prone to OOM errors, so the high-availability URL below is recommended.
        // String url = "jdbc:hive2://****:10000/default";

        // High availability: the client automatically picks an available HiveServer2 through ZooKeeper.
        String url = "jdbc:hive2://" + args[0] + "/default;"
                + "serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2";

        try {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            return;
        }

        // try-with-resources closes the ResultSet, Statement, and Connection, even on failure.
        try (Connection conn = DriverManager.getConnection(url, args[1], args[2]);
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SHOW DATABASES")) {
            System.out.println("Show all databases in hive:");
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}
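
The class comment above recommends fault tolerance because the Thrift service can be unstable. One possible approach is to retry failed calls with a growing delay. The following is a minimal sketch, not the author's method: `Retry` and `withRetries` are hypothetical names, and it assumes every `SQLException` is worth retrying, which may not hold for errors such as bad credentials or invalid SQL.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

/** Hypothetical retry helper; treats every SQLException as transient (an assumption). */
final class Retry {
    private Retry() {}

    /**
     * Runs the action up to maxAttempts times, doubling the delay after each
     * failed attempt. Rethrows the last SQLException if all attempts fail.
     * Exceptions other than SQLException propagate immediately.
     */
    static <T> T withRetries(int maxAttempts, long initialDelayMillis, Callable<T> action)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (SQLException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay); // back off before the next attempt
                    delay *= 2;
                }
            }
        }
        throw last;
    }
}
```

A call such as `Retry.withRetries(3, 500, () -> DriverManager.getConnection(url, user, password))` would retry the connection attempt; with the ZooKeeper URL, each new attempt gives the driver another chance to pick an available HiveServer2 instance. Retrying statements that modify data needs more care, since a retry could repeat work that partly succeeded.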

Demo downloads (the high-availability connection is recommended)

HiveServer2 high availability, connecting through ZooKeeper

HiveServer2 direct JDBC connection
