1. Hive Basic Concepts
1.1 Hive Overview
Concept: The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive.
Essence: Hive translates SQL into MapReduce programs. Its storage system is HDFS, its compute engine is MapReduce, and its resource scheduler is YARN.
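1.2 Hive Pros and Cons
Pros: 1) SQL-like syntax that is easy to pick up; 2) SQL is converted to MapReduce automatically, which lowers the learning cost; 3) well suited to analysis over large data sets; 4) users can define custom functions as needed.
Cons: 1) high execution latency; 2) iterative algorithms cannot be expressed; 3) no advantage on small data sets; 4) the automatically generated MapReduce jobs are not very smart; 5) tuning is difficult and coarse-grained.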
1.3 Hive Architecture

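Thrift Server
An optional Hive component. It is a software framework service that lets clients written in Java, C++, Ruby, and many other languages access Hive remotely and programmatically.
Driver
(1) Parser (SQL Parser): converts the SQL string into an abstract syntax tree (AST), usually with a third-party library such as ANTLR, and then analyzes the AST, for example checking whether the tables and columns exist and whether the SQL is semantically valid.
(2) Compiler (Physical Plan): performs lexical, syntactic, and semantic compilation of the HQL statement (which requires consulting the metadata) and produces an execution plan.
(3) Optimizer (Query Optimizer): optimizes the logical execution plan, for example by pruning unneeded columns and making use of partitions and indexes.
(4) Executor (Execution): converts the optimized logical plan into a runnable physical plan (for Hive, MapReduce or Spark jobs) and submits it to YARN on Hadoop.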
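The plan that the Driver produces for a query can be inspected with HiveQL's EXPLAIN statement. The following is a minimal sketch, assuming the hive CLI is on PATH and that a table named test with an id column exists (as created later in this article); when hive is not available it only prints the command it would run.

```shell
# Write an EXPLAIN query to a temporary script file.
# The table name `test` is taken from the example table in this article.
hql_file=$(mktemp)
cat > "$hql_file" <<'EOF'
EXPLAIN SELECT id FROM test;
EOF

# Run it only if the hive CLI is installed; otherwise show the command.
if command -v hive >/dev/null 2>&1; then
    hive -f "$hql_file"
else
    echo "would run: hive -f $hql_file"
fi
```

The output lists the stages Hive plans to run, which is a convenient way to see what the compiler and optimizer steps described above produced.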
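1.4 Hive vs. Traditional Databases
2. Installing Hive
2.1 Installing MySQL
Why: Hive's default metastore database is Derby. Once a Hive session is open it holds the metastore exclusively and does not share it with other clients, so opening Hive in a second window fails. To lift that limitation we move the Hive metastore to MySQL (installed and removed with rpm below), which supports multiple windows.
Uninstall the preinstalled MariaDB libraries, which conflict with the MySQL packages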
[atguigu@hadoop102 ~]$ rpm -qa|grep mariadb
mariadb-libs-5.5.56-2.el7.x86_64
[atguigu@hadoop102 ~]$ sudo rpm -e --nodeps mariadb-libs
Extract the MySQL installation bundle
[atguigu@hadoop102 software]$ tar -zxvf mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
# The MySQL tar bundle is not gzip-compressed, so the -xvf options are enough
[atguigu@hadoop102 software]$ tar -xvf mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar
mysql-community-embedded-5.7.28-1.el7.x86_64.rpm
mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm
mysql-community-devel-5.7.28-1.el7.x86_64.rpm
mysql-community-embedded-compat-5.7.28-1.el7.x86_64.rpm
mysql-community-libs-5.7.28-1.el7.x86_64.rpm
mysql-community-test-5.7.28-1.el7.x86_64.rpm
mysql-community-common-5.7.28-1.el7.x86_64.rpm
mysql-community-embedded-devel-5.7.28-1.el7.x86_64.rpm
mysql-community-client-5.7.28-1.el7.x86_64.rpm
mysql-community-server-5.7.28-1.el7.x86_64.rpm
[atguigu@hadoop102 software]$ ll
total 1720252
-rw-r--r--. 1 atguigu atguigu 609556480 Oct 17 17:24 mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar
-rw-r--r--. 1 atguigu atguigu 45109364 Sep 30 2019 mysql-community-client-5.7.28-1.el7.x86_64.rpm
-rw-r--r--. 1 atguigu atguigu 318768 Sep 30 2019 mysql-community-common-5.7.28-1.el7.x86_64.rpm
-rw-r--r--. 1 atguigu atguigu 7037096 Sep 30 2019 mysql-community-devel-5.7.28-1.el7.x86_64.rpm
-rw-r--r--. 1 atguigu atguigu 49329100 Sep 30 2019 mysql-community-embedded-5.7.28-1.el7.x86_64.rpm
-rw-r--r--. 1 atguigu atguigu 23354908 Sep 30 2019 mysql-community-embedded-compat-5.7.28-1.el7.x86_64.rpm
-rw-r--r--. 1 atguigu atguigu 136837816 Sep 30 2019 mysql-community-embedded-devel-5.7.28-1.el7.x86_64.rpm
-rw-r--r--. 1 atguigu atguigu 4374364 Sep 30 2019 mysql-community-libs-5.7.28-1.el7.x86_64.rpm
-rw-r--r--. 1 atguigu atguigu 1353312 Sep 30 2019 mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm
-rw-r--r--. 1 atguigu atguigu 208694824 Sep 30 2019 mysql-community-server-5.7.28-1.el7.x86_64.rpm
-rw-r--r--. 1 atguigu atguigu 133129992 Sep 30 2019 mysql-community-test-5.7.28-1.el7.x86_64.rpm
[atguigu@hadoop102 software]$
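The first tar command above fails because -z forces gzip decompression on an archive that is not compressed. As a sketch, the flags can be chosen by testing the file first; the helper name tar_extract_flags is hypothetical, and gzip and tar are assumed to be installed.

```shell
# Hypothetical helper: print the tar flags suited to an archive.
# gzip -t exits 0 only when its input is a valid gzip stream.
tar_extract_flags() {
    if gzip -t < "$1" 2>/dev/null; then
        echo "-zxvf"   # gzip-compressed tarball (.tar.gz / .tgz)
    else
        echo "-xvf"    # plain, uncompressed tar archive
    fi
}

# Usage sketch:
#   tar "$(tar_extract_flags bundle.tar)" bundle.tar
```

Note that gzip -t reads the whole file when it really is gzip data, so on very large archives the check itself takes some time.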
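Install MySQL with RPM
RPM package names and summaries:
mysql-community-server: database server and related tools
mysql-community-client: MySQL client applications and tools
mysql-community-common: library files shared by the server and the client
mysql-community-devel: header and library files for developing MySQL client applications
mysql-community-libs: shared libraries for MySQL client applications
mysql-community-libs-compat: shared compatibility libraries for earlier MySQL installations
mysql-community-embedded: MySQL embedded library
mysql-community-embedded-devel: header and library files for embedded MySQL development
mysql-community-test: test suite for the MySQL server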
[atguigu@hadoop102 software]$ sudo rpm -ivh mysql-community-common-5.7.28-1.el7.x86_64.rpm
warning: mysql-community-common-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:mysql-community-common-5.7.28-1.e################################# [100%]
[atguigu@hadoop102 software]$ sudo rpm -ivh mysql-community-libs-5.7.28-1.el7.x86_64.rpm
warning: mysql-community-libs-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:mysql-community-libs-5.7.28-1.el7################################# [100%]
[atguigu@hadoop102 software]$ sudo rpm -ivh mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm
warning: mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:mysql-community-libs-compat-5.7.2################################# [100%]
[atguigu@hadoop102 software]$ sudo rpm -ivh mysql-community-client-5.7.28-1.el7.x86_64.rpm
warning: mysql-community-client-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:mysql-community-client-5.7.28-1.e################################# [100%]
[atguigu@hadoop102 software]$ sudo rpm -ivh mysql-community-server-5.7.28-1.el7.x86_64.rpm
warning: mysql-community-server-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:mysql-community-server-5.7.28-1.e################################# [100%]
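The five installs above run in the order common, libs, libs-compat, client, server, so that each package finds the ones it depends on already installed. As a sketch, the same sequence can be driven by a loop; the function name is hypothetical and the echo makes it a dry run that only prints the commands.

```shell
# Hypothetical helper: print the rpm commands in the order used above.
mysql_rpm_install_cmds() {
    version="5.7.28-1.el7.x86_64"
    for pkg in common libs libs-compat client server; do
        # echo keeps this a dry run; remove it to actually install.
        echo "sudo rpm -ivh mysql-community-${pkg}-${version}.rpm"
    done
}

mysql_rpm_install_cmds
```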
Configure the installed MySQL
# Check which directory datadir points to
[atguigu@hadoop102 etc]$ cat /etc/my.cnf
# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/5.7/en/server-configuration-defaults.html
[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
# Delete everything under the directory that datadir points to
[atguigu@hadoop102 etc]$ cd /var/lib/mysql
[atguigu@hadoop102 mysql]$ sudo rm -rf ./*
# Initialize the database
[atguigu@hadoop102 mysql]$ sudo mysqld --initialize --user=mysql
# Look up the temporary password generated for the root user
[atguigu@hadoop102 mysql]$ sudo cat /var/log/mysqld.log
2022-01-03T09:05:10.990246Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2022-01-03T09:05:16.710989Z 0 [Warning] InnoDB: New log files created, LSN=45790
2022-01-03T09:05:17.853899Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
2022-01-03T09:05:18.010178Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 4586a786-6c74-11ec-9fb9-000c2955b598.
2022-01-03T09:05:18.042571Z 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
2022-01-03T09:05:18.451056Z 0 [Warning] CA certificate ca.pem is self signed.
2022-01-03T09:05:18.589675Z 1 [Note] A temporary password is generated for root@localhost: )9Shrzd/oirE
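Instead of scanning the whole log by eye, the temporary password can be filtered out. A minimal sketch, assuming the log line has the same shape as the last line above (the password is the final whitespace-separated field); extract_temp_password is a hypothetical helper name.

```shell
# Hypothetical helper: read a MySQL error log on stdin and print the
# most recently generated temporary root password.
extract_temp_password() {
    grep 'temporary password' | tail -n 1 | awk '{print $NF}'
}

# Usage sketch:
#   sudo cat /var/log/mysqld.log | extract_temp_password
```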
# Start the MySQL service
[atguigu@hadoop102 mysql]$ sudo systemctl start mysqld
# Log in to MySQL
[atguigu@hadoop102 mysql]$ mysql -uroot -p')9Shrzd/oirE'
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.28
Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
# Change the root password; otherwise other operations will fail with an error
mysql> set password = password("root");
Query OK, 0 rows affected, 1 warning (0.00 sec)
# Update the user table in the mysql database so that root can connect from any IP
mysql> update mysql.user set host='%' where user='root';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
# Exit MySQL
mysql> quit;
Bye
2.2 Installing and Deploying Hive
# Extract Hive
[atguigu@hadoop102 software]$ tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /opt/module/
# Add the Hive environment variables
[atguigu@hadoop102 software]$ cat /etc/profile.d/my_env.sh
# JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_212
export PATH=$PATH:$JAVA_HOME/bin
# HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
# ZOOKEEPER_HOME
export ZOOKEEPER_HOME=/opt/module/apache-zookeeper-3.5.7-bin
export PATH=$PATH:$ZOOKEEPER_HOME/bin
export PATH=$PATH:$ZOOKEEPER_HOME/sbin
# HIVE_HOME
export HIVE_HOME=/opt/module/apache-hive-3.1.2-bin
export PATH=$PATH:$HIVE_HOME/bin
# Reload the system environment variables
[atguigu@hadoop102 software]$ source /etc/profile
# Resolve the logging JAR conflict
[atguigu@hadoop102 software]$ mv $HIVE_HOME/lib/log4j-slf4j-impl-2.10.0.jar $HIVE_HOME/lib/log4j-slf4j-impl-2.10.0.jar.bak
2.3 Configuring the Hive Metastore to Use MySQL
[atguigu@hadoop102 software]$ cp mysql-connector-java-5.1.37.jar $HIVE_HOME/lib
[atguigu@hadoop102 software]$ cat $HIVE_HOME/conf/hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- JDBC connection URL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop102:3306/metastore?useSSL=false</value>
</property>
<!-- JDBC driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- JDBC connection username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- JDBC connection password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
</property>
<!-- Default Hive warehouse directory on HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<!-- Metastore schema version verification -->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<!-- Metastore event notification API authorization -->
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
</configuration>
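Every entry in hive-site.xml has the same property/name/value shape, so entries can be generated instead of typed by hand. A sketch under that assumption; hive_property is a hypothetical helper, and it does not escape XML special characters such as & in values.

```shell
# Hypothetical helper: print one <property> element for hive-site.xml.
# $1 is the property name, $2 is its value (no XML escaping is done).
hive_property() {
    printf '<property>\n<name>%s</name>\n<value>%s</value>\n</property>\n' "$1" "$2"
}

hive_property javax.jdo.option.ConnectionUserName root
hive_property hive.metastore.warehouse.dir /user/hive/warehouse
```

The printed elements still need to be placed between the configuration tags of the file.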
2.4 Starting Hive
# Log in to MySQL
[atguigu@hadoop102 software]$ mysql -uroot -proot
# Create the Hive metastore database
mysql> create database metastore;
mysql> quit;
# Initialize the Hive metastore schema
[atguigu@hadoop102 software]$ schematool -initSchema -dbType mysql -verbose
# Start the Hadoop cluster
[atguigu@hadoop102 software]$ myclusters.sh start
# Start Hive
[atguigu@hadoop102 hive]$ hive
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/home/atguigu/.local/bin:/home/atguigu/bin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin)
Hive Session ID = 0e138704-f46b-48e6-9b5f-67f1acda92a2
Logging initialized using configuration in jar:file:/opt/module/apache-hive-3.1.2-bin/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Hive Session ID = 41ae884b-9d14-40eb-b53b-f58a1410049c
# List the databases
hive> show databases;
OK
default
Time taken: 1.009 seconds, Fetched: 1 row(s)
# List the tables in the default database
hive> show tables;
OK
Time taken: 0.045 seconds
# Create a table in the default database
hive> create table test (id int);
OK
Time taken: 0.864 seconds
# Insert a row into the table
hive> insert into test values(1);
Query ID = atguigu_20220103174139_894e6a7c-1997-41f5-8abe-34478acad198
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Interrupting... Be patient, this might take some time.
Press Ctrl+C again to kill JVM
Starting Job = job_1641202744100_0001, Tracking URL = http://hadoop103:8088/proxy/application_1641202744100_0001/
Kill Command = /opt/module/hadoop-3.1.3/bin/mapred job -kill job_1641202744100_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2022-01-03 17:42:33,045 Stage-1 map = 0%, reduce = 0%
2022-01-03 17:42:59,762 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.01 sec
2022-01-03 17:43:25,209 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 22.36 sec
MapReduce Total cumulative CPU time: 22 seconds 360 msec
Ended Job = job_1641202744100_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://hadoop102:9820/user/hive/warehouse/test/.hive-staging_hive_2022-01-03_17-41-40_004_8605542926052281340-1/-ext-10000
Loading data to table default.test
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 22.36 sec HDFS Read: 12765 HDFS Write: 199 SUCCESS
Total MapReduce CPU Time Spent: 22 seconds 360 msec
OK
Time taken: 112.741 seconds
# Query the data in the table
hive> select * from test;
OK
1
Time taken: 0.211 seconds, Fetched: 1 row(s)
# Exit Hive
hive> quit;
[atguigu@hadoop102 software]$

Open another window and check that the new table is visible
[atguigu@hadoop102 ~]$ hive
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin:/home/atguigu/.local/bin:/home/atguigu/bin)
Hive Session ID = dcc0168b-a0ec-4510-a48a-861a82019407
Logging initialized using configuration in jar:file:/opt/module/apache-hive-3.1.2-bin/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Hive Session ID = ec1d304b-3988-40e5-8955-54e3bdfd0eb9
hive> show tables;
OK
test
Time taken: 0.836 seconds, Fetched: 1 row(s)
hive> show databases;
OK
default
Time taken: 0.026 seconds, Fetched: 1 row(s)
hive> quit;
2.5 Accessing Hive over JDBC
Add the following settings to hive-site.xml (the first two properties are new)
[atguigu@hadoop102 ~]$ cat $HIVE_HOME/conf/hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- Host that HiveServer2 binds to -->
<property>
<name>hive.server2.thrift.bind.host</name>
<value>hadoop102</value>
</property>
<!-- Port that HiveServer2 listens on -->
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<!-- JDBC connection URL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop102:3306/metastore?useSSL=false</value>
</property>
<!-- JDBC driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- JDBC connection username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- JDBC connection password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
</property>
<!-- Default Hive warehouse directory on HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<!-- Metastore schema version verification -->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<!-- Metastore event notification API authorization -->
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
</configuration>
Start hiveserver2 (it must stay running while in use)
[atguigu@hadoop102 ~]$ hive --service hiveserver2
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin:/home/atguigu/.local/bin:/home/atguigu/bin)
2022-01-03 20:14:16: Starting HiveServer2
Hive Session ID = a60c0358-d232-4d08-8f11-215017b6122c
Hive Session ID = 43853016-ce83-429f-9670-9d2a96c081ea
Hive Session ID = 496671ff-41d6-44f2-8f41-a838572d61cd
Hive Session ID = da1c1cb7-c0c0-49a0-82df-cf32fb6c6cd0
Hive Session ID = c506f9f9-6a36-4b8f-828b-fb5c9723e514
Start the beeline client (this can take a while)
# The user after -n (atguigu) must match the user name in the shell prompt [atguigu@hadoop102 software]
[atguigu@hadoop102 software]$ beeline -u jdbc:hive2://hadoop102:10000 -n atguigu
Connecting to jdbc:hive2://hadoop102:10000
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.2 by Apache Hive
# Beeline prints results in a nicer tabular format
0: jdbc:hive2://hadoop102:10000> show databases;
INFO : Compiling command(queryId=atguigu_20220103202645_ec2b96fc-0e49-4822-9853-0553a563a64c): show databases
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=atguigu_20220103202645_ec2b96fc-0e49-4822-9853-0553a563a64c); Time taken: 1.023 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=atguigu_20220103202645_ec2b96fc-0e49-4822-9853-0553a563a64c): show databases
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=atguigu_20220103202645_ec2b96fc-0e49-4822-9853-0553a563a64c); Time taken: 0.058 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+----------------+
| database_name |
+----------------+
| default |
+----------------+
1 row selected (1.787 seconds)
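Because HiveServer2 needs some time before it accepts connections, beeline can fail if it is started too early. A minimal sketch that polls the port first, assuming bash (it relies on bash's /dev/tcp redirection, which plain sh does not provide); wait_for_port is a hypothetical helper name.

```shell
# Hypothetical helper: poll until host:port accepts TCP connections,
# giving up after max_tries attempts spaced one second apart.
# Requires bash, because /dev/tcp is a bash-specific redirection target.
wait_for_port() {
    local host=$1 port=$2 max_tries=$3 i=0
    while [ "$i" -lt "$max_tries" ]; do
        # The subshell opens a test connection and closes it on exit.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Usage sketch:
#   wait_for_port hadoop102 10000 60 && beeline -u jdbc:hive2://hadoop102:10000 -n atguigu
```

If the host is unreachable rather than merely not listening yet, each attempt may block for the system's TCP connect timeout, so the total wait can exceed max_tries seconds.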
Start hiveserver2 (running in the background)
[atguigu@hadoop102 hive]$ nohup hive --service metastore 2>&1 &
[atguigu@hadoop102 hive]$ nohup hive --service hiveserver2 2>&1 &
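nohup: placed at the start of a command; the process ignores the hangup signal, so it keeps running after the terminal is closed
0: standard input
1: standard output
2: standard error
2>&1: redirects standard error to standard output
&: placed at the end of a command; runs it in the background
These are usually combined as: nohup [command] > file 2>&1 &, which writes the command's output to file and keeps the process running in the background.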
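The effect of combining > file with 2>&1 can be tried without Hive. The sketch below uses a stand-in function named noisy (hypothetical) in place of a real service and shows that both output streams end up in the same file.

```shell
# A stand-in for a service: writes one line to stdout and one to stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

log=$(mktemp)
# "> file" first points stdout at the file; "2>&1" then points stderr
# at wherever stdout currently goes, so both lines land in the file.
noisy > "$log" 2>&1
cat "$log"
```

The order matters: writing 2>&1 before > file would send stderr to the terminal, because at that moment stdout still points there.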
2.6 Common Hive Command-Line Options
[atguigu@hadoop102 bin]$ hive -help
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/home/atguigu/.local/bin:/home/atguigu/bin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin)
Hive Session ID = ab0b4235-084f-4d02-bc00-dc9dbc2a0acf
usage: hive
-d,--define <key=value> Variable substitution to apply to Hive
commands. e.g. -d A=B or --define A=B
--database <databasename> Specify the database to use
-e <quoted-query-string> SQL from command line
-f <filename> SQL from files
-H,--help Print help information
--hiveconf <property=value> Use value for given property
--hivevar <key=value> Variable substitution to apply to Hive
commands. e.g. --hivevar A=B
-i <filename> Initialization SQL file
-S,--silent Silent mode in interactive shell
-v,--verbose Verbose mode (echo executed SQL to the
console)
# Run a SQL statement without entering the interactive Hive shell
[atguigu@hadoop102 bin]$ hive -e "select id from test;"
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/home/atguigu/.local/bin:/home/atguigu/bin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin)
Hive Session ID = b798f0f2-44d9-4429-b699-327a1f527e96
Logging initialized using configuration in jar:file:/opt/module/apache-hive-3.1.2-bin/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive Session ID = 82850e5e-accf-4a6e-9208-eb8b21e596d3
OK
1
Time taken: 3.405 seconds, Fetched: 1 row(s)
[atguigu@hadoop102 datas]$ cat hivef.sql
select * from test;
# Run the SQL statements stored in a script file
[atguigu@hadoop102 datas]$ hive -f /opt/module/apache-hive-3.1.2-bin/datas/hivef.sql
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/home/atguigu/.local/bin:/home/atguigu/bin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin)
Hive Session ID = 6c763bda-fb28-4d62-bdf0-54a2d9d3c31e
Logging initialized using configuration in jar:file:/opt/module/apache-hive-3.1.2-bin/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive Session ID = 01eb09c5-52df-4579-bdc1-1f8ddd37eaa7
OK
1
Time taken: 2.594 seconds, Fetched: 1 row(s)
# Browse the HDFS file system from inside Hive
hive> dfs -ls /;
Found 10 items
-rw-r--r-- 2 atguigu supergroup 5 2021-12-14 23:17 /dage.txt
-rw-r--r-- 3 atguigu supergroup 12562 2021-12-17 21:24 /fsimage.xml
drwxr-xr-x - atguigu supergroup 0 2021-12-12 22:10 /hahahha
drwxr-xr-x - atguigu supergroup 0 2021-12-12 22:09 /input
drwxr-xr-x - atguigu supergroup 0 2021-12-26 23:19 /output
drwxr-xr-x - atguigu supergroup 0 2021-12-26 23:23 /output1
drwxr-xr-x - atguigu supergroup 0 2021-12-28 20:46 /output2
drwxr-xr-x - atguigu supergroup 0 2021-12-14 21:21 /sanguo
drwxrwx--- - atguigu supergroup 0 2022-01-03 17:38 /tmp
drwxr-xr-x - atguigu supergroup 0 2022-01-03 17:41 /user
# View the history of all commands entered in Hive
[atguigu@hadoop102 ~]$ cd;pwd
/home/atguigu
[atguigu@hadoop102 ~]$ cat .hivehistory
show tables;
show databases;
quit;
quit
;
dfs -ls /;
cat .hivehistory
;
dfs -ls /;
quit;
2.7 Common Hive Property Settings
Print the current database and column headers in the Hive CLI
[atguigu@hadoop102 ~]$ cat $HIVE_HOME/conf/hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- Print the current database and column headers in the Hive CLI -->
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
<!-- Host that HiveServer2 binds to -->
<property>
<name>hive.server2.thrift.bind.host</name>
<value>hadoop102</value>
</property>
<!-- Port that HiveServer2 listens on -->
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<!-- JDBC connection URL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop102:3306/metastore?useSSL=false</value>
</property>
<!-- JDBC driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- JDBC connection username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- JDBC connection password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
</property>
<!-- Default Hive warehouse directory on HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<!-- Metastore schema version verification -->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<!-- Metastore event notification API authorization -->
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
</configuration>
[atguigu@hadoop102 ~]$ hive
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/home/atguigu/.local/bin:/home/atguigu/bin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-zookeeper-3.5.7-bin/bin:/opt/module/apache-zookeeper-3.5.7-bin/sbin:/opt/module/jdk1.8.0_212/bin:/opt/module/hadoop-3.1.3/bin:/opt/module/hadoop-3.1.3/sbin:/opt/module/apache-hive-3.1.2-bin/bin)
Hive Session ID = 2974245c-c05f-49c0-aeff-dad02314e8fc
Logging initialized using configuration in jar:file:/opt/module/apache-hive-3.1.2-bin/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Hive Session ID = ac9d8985-4539-4103-bec5-14f6e03676bb
hive (default)> select * FROM test;
OK
test.id
1
Time taken: 2.317 seconds, Fetched: 1 row(s)
Configuring the Hive runtime log
[atguigu@hadoop102 ~]$ mv $HIVE_HOME/conf/hive-log4j2.properties.template $HIVE_HOME/conf/hive-log4j2.properties
[atguigu@hadoop102 ~]$ cat $HIVE_HOME/conf/hive-log4j2.properties
# By default Hive writes its log to /tmp/atguigu/hive.log (a directory named after the current user)
# property.hive.log.dir = ${sys:java.io.tmpdir}/${sys:user.name}
property.hive.log.dir = /opt/module/apache-hive-3.1.2-bin/logs
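The same edit can be scripted. A minimal sketch, assuming the properties file contains an uncommented property.hive.log.dir line; set_hive_log_dir is a hypothetical helper, and directory paths containing | or & would need escaping because they are special in the sed expression.

```shell
# Hypothetical helper: set property.hive.log.dir in a properties file.
# $1 is the file, $2 is the new log directory.
# The result is built in a temp file and copied back, which avoids
# sed -i (its syntax differs between GNU and BSD sed) and keeps the
# original file's permissions.
set_hive_log_dir() {
    file=$1; dir=$2
    tmp=$(mktemp)
    sed "s|^property\.hive\.log\.dir *=.*|property.hive.log.dir = $dir|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage sketch:
#   set_hive_log_dir "$HIVE_HOME/conf/hive-log4j2.properties" /opt/module/apache-hive-3.1.2-bin/logs
```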
Three Ways to Configure Hive Parameters
