0) Create a PostgreSQL account (run on the server where PostgreSQL is installed)

Create the account that Hive will use for its metastore.

$ psql postgres
psql (13.5)
Type "help" for help.

postgres=# create user hive with password 'hive';
CREATE ROLE
postgres=# create database hive_meta owner=hive;
CREATE DATABASE
postgres=# create schema authorization hive;
CREATE SCHEMA
postgres=# grant all privileges on all tables in schema hive to hive;
GRANT
postgres=# alter user hive with password 'hive';
ALTER ROLE
hive_meta=> grant all privileges on database hive_meta to hive;
GRANT
postgres=# \l
                              List of databases
   Name    | Owner  | Encoding |  Collate   |   Ctype    | Access privileges
-----------+--------+----------+------------+------------+-------------------
 hive_meta | hive   | UTF8     | ko_KR.utf8 | ko_KR.utf8 |
 postgres  | wasadm | UTF8     | ko_KR.utf8 | ko_KR.utf8 |
 template0 | wasadm | UTF8     | ko_KR.utf8 | ko_KR.utf8 | =c/wasadm        +
           |        |          |            |            | wasadm=CTc/wasadm
 template1 | wasadm | UTF8     | ko_KR.utf8 | ko_KR.utf8 | =c/wasadm        +
           |        |          |            |            | wasadm=CTc/wasadm
(4 rows)

1) Download Hive: http://apache.mirror.cdnetworks.com/hive/hive-3.1.2/

$ cd ~/usr/local
$ wget http://apache.mirror.cdnetworks.com/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
$ tar -xvzf apache-hive-3.1.2-bin.tar.gz
$ cd apache-hive-3.1.2-bin

2) Edit hive-env.sh

$ cd $HIVE_HOME/conf
$ cp hive-env.sh.template hive-env.sh
$ vi hive-env.sh
# Set HADOOP_HOME to point to a specific hadoop install directory
# HADOOP_HOME=${bin}/../../hadoop
HADOOP_HOME=$HADOOP_HOME
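
The HADOOP_HOME=$HADOOP_HOME line above only resolves if the variable is already exported in the login shell. A minimal sketch of the expected environment, assuming the directory layout used throughout this guide (paths are examples; adjust to where you actually unpacked Hadoop and Hive):

```shell
# Example ~/.bashrc entries; paths follow this guide's ~/usr/local layout
export HADOOP_HOME=$HOME/usr/local/hadoop-3.3.1
export HIVE_HOME=$HOME/usr/local/apache-hive-3.1.2-bin
export PATH=$PATH:$HIVE_HOME/bin
```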

3) Create hive-site.xml

$ vi hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
            <name>hive.metastore.local</name>
            <value>false</value>
    </property>
    <property>
            <name>javax.jdo.option.ConnectionURL</name>
            <value>jdbc:postgresql://192.168.10.101:5432/hive_meta</value>
    </property>
    <property>
            <name>javax.jdo.option.ConnectionDriverName</name>
            <value>org.postgresql.Driver</value>
    </property>
    <property>
            <name>javax.jdo.option.ConnectionUserName</name>
            <value>hive</value>
    </property>
    <property>
            <name>javax.jdo.option.ConnectionPassword</name>
            <value>hive</value>
    </property>
   <property>
       <name>hive.metastore.warehouse.dir</name>
	   <value>/user/wasadm/warehouse</value>   <!-- must match the Linux account Hive was installed under -->
   </property>
   <property>
       <name>hive.server2.enable.doAs</name>
    	<!-- value>true</value -->
        <value>false</value>
    	<description>
	      Setting this property to true will have HiveServer2 execute
	      Hive operations as the user making the calls to it.
	    </description>
   </property>
</configuration>
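
Before initializing the schema, it can help to confirm the XML is well-formed and the JDBC settings are what you expect. A small sketch (read_hive_site is a hypothetical helper, not part of Hive) that parses any Hadoop-style *-site.xml, assuming HIVE_HOME is set as above:

```python
# Sanity-check hive-site.xml: parse the Hadoop-style <property> entries
# and print the JDBC settings that schematool will use.
import os
import xml.etree.ElementTree as ET

def read_hive_site(path):
    """Return a {name: value} dict for every <property> in the file."""
    props = {}
    for prop in ET.parse(path).getroot().iter("property"):
        props[prop.findtext("name")] = prop.findtext("value")
    return props

if __name__ == "__main__":
    path = os.path.expandvars("$HIVE_HOME/conf/hive-site.xml")
    if os.path.exists(path):
        conf = read_hive_site(path)
        for key in ("javax.jdo.option.ConnectionURL",
                    "javax.jdo.option.ConnectionDriverName",
                    "javax.jdo.option.ConnectionUserName"):
            print(key, "=", conf.get(key))
```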

4) Initialize the Hive metastore

Hive does not bundle a PostgreSQL JDBC driver, so put one on its classpath before running schematool, or it will fail with a ClassNotFoundException (the driver version below is only an example; use the jar matching your PostgreSQL version):

$ cp postgresql-42.2.24.jar $HIVE_HOME/lib/

$ $HIVE_HOME/bin/schematool -dbType postgres -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/wasadm/usr/local/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/wasadm/usr/local/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:postgresql://10.225.19.104:5432/hive_meta
Metastore Connection Driver :    org.postgresql.Driver
Metastore connection User:       hive
Starting metastore schema initialization to 3.1.0
Initialization script hive-schema-3.1.0.postgres.sql
Initialization script completed
schemaTool completed

5) Verify the installation with the Hive CLI

[wasadm@dwhdfad01 bin]$ ./hive
......
Logging initialized using configuration in jar:file:/home/wasadm/usr/local/apache-hive-3.1.2-bin/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive Session ID = 652a85fa-eca4-4e02-9d2d-70972d9482f2
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>

6) Start the metastore service and HiveServer2

$ nohup $HIVE_HOME/bin/hive --service metastore > /dev/null 2>&1 &
$ nohup $HIVE_HOME/bin/hiveserver2 > /dev/null 2>&1 &
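
Since both services were started with output discarded, a quick way to confirm they came up before trying Beeline is a TCP probe of their default ports, 9083 (metastore) and 10000 (HiveServer2). A sketch (plain socket check only; it does not validate the Thrift protocol, and the ports assume you kept the defaults):

```python
# Probe the default metastore (9083) and HiveServer2 (10000) ports.
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for name, port in [("metastore", 9083), ("hiveserver2", 10000)]:
        print(f"{name}: {'up' if port_open('localhost', port) else 'down'}")
```

If either port shows down, check the Hive logs (by default under /tmp/<user>/hive.log).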

7) Beeline connection test

$ beeline -u jdbc:hive2://dwhdfad01:10000
Connecting to jdbc:hive2://dwhdfad01:10000
2021-11-26 17:20:49,433 INFO jdbc.Utils: Supplied authorities: dwhdfad01:10000
2021-11-26 17:20:49,433 INFO jdbc.Utils: Resolved authority: dwhdfad01:10000
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 2.3.9)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 2.3.9 by Apache Hive
0: jdbc:hive2://dwhdfad01:10000>

Python Hive client

https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-PythonClientDriver
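
Alongside the wiki link above, a minimal PyHive sketch against the HiveServer2 started above (host and port come from the Beeline test; assumes pip install 'pyhive[hive]' and its thrift/sasl dependencies, which this guide does not cover):

```python
# Sketch of a Python client for HiveServer2 via PyHive (assumed installed).

def hs2_url(host="dwhdfad01", port=10000, database="default"):
    """Build the same JDBC-style URL Beeline used, for reference/logging."""
    return f"jdbc:hive2://{host}:{port}/{database}"

def fetch_databases(host="dwhdfad01", port=10000):
    """Open a HiveServer2 session and return the list of databases."""
    from pyhive import hive  # deferred import so the sketch loads without pyhive
    conn = hive.Connection(host=host, port=port, username="wasadm")
    try:
        cur = conn.cursor()
        cur.execute("SHOW DATABASES")
        return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()
```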

Spark-Hive connection

https://stackoverflow.com/questions/31980584/how-to-connect-spark-sql-to-remote-hive-metastore-via-thrift-protocol-with-no
