I. Hadoop Course

2021-07-19 06:21:00 Other

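2.1 Initial Setup

The platform has already prepared the initial environment, but you should still understand how it is configured.

1. Change the hostname (using the master node as an example)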

[ec2-user@ip-172-31-32-47 ~]$ sudo vi /etc/hostname 
# Delete everything in the file and put master on the first line as the new hostname.
# Reboot the virtual machine so the change takes effect.
[ec2-user@ip-172-31-32-47 ~]$ sudo reboot

2. Edit the hosts mapping (using the master node as an example)

# Check the IP address of every node
[ec2-user@master ~]$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9001
        inet 172.31.32.47  netmask 255.255.240.0  broadcast 172.31.47.255
        inet6 fe80::8b2:80ff:fe01:e5c2  prefixlen 64  scopeid 0x20<link>
        ether 0a:b2:80:01:e5:c2  txqueuelen 1000  (Ethernet)
        RX packets 3461  bytes 687720 (671.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3262  bytes 544011 (531.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
[ec2-user@slave1 ~]$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9001
        inet 172.31.36.81  netmask 255.255.240.0  broadcast 172.31.47.255
        inet6 fe80::87d:36ff:fe72:bc0c  prefixlen 64  scopeid 0x20<link>
        ether 0a:7d:36:72:bc:0c  txqueuelen 1000  (Ethernet)
        RX packets 2195  bytes 543199 (530.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2178  bytes 361053 (352.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
[ec2-user@slave2 ~]$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9001
        inet 172.31.46.142  netmask 255.255.240.0  broadcast 172.31.47.255
        inet6 fe80::850:68ff:fe8c:6c5e  prefixlen 64  scopeid 0x20<link>
        ether 0a:50:68:8c:6c:5e  txqueuelen 1000  (Ethernet)
        RX packets 2284  bytes 547630 (534.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2241  bytes 375782 (366.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

# Write the entries into the hosts file in the format: IP hostname
[ec2-user@master ~]$ sudo vi /etc/hosts
# Check the result. Note: the hosts file must be edited on every node.
[ec2-user@master ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost6 localhost6.localdomain6
172.31.32.47 master
172.31.36.81 slave1
172.31.46.142 slave2
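
Editing the hosts file by hand on three nodes is easy to get wrong. The following is a minimal sketch of an idempotent helper, run here against a scratch file rather than /etc/hosts. The function name add_host_entry and the scratch-file demo are illustrative assumptions, not part of the course material.

```shell
#!/usr/bin/env bash
# Sketch: append "IP hostname" lines to a hosts-style file only when the
# hostname is not already mapped. add_host_entry is a hypothetical helper.

add_host_entry() {
    local file=$1 ip=$2 name=$3
    # Skip if a non-comment line already lists this hostname as a field.
    if awk -v n="$name" '!/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Demo on a scratch file, using the addresses shown above.
hosts_copy=$(mktemp)
printf '127.0.0.1   localhost\n' > "$hosts_copy"
add_host_entry "$hosts_copy" 172.31.32.47 master
add_host_entry "$hosts_copy" 172.31.36.81 slave1
add_host_entry "$hosts_copy" 172.31.46.142 slave2
add_host_entry "$hosts_copy" 172.31.32.47 master   # second call is a no-op
cat "$hosts_copy"
```

Once the output on the scratch file looks right, the same function could be pointed at /etc/hosts from a root shell on each node.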

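2.2 Installing the Java Environment

First, why install the JDK? The JDK is the software development kit for the Java language and the core of Java development: it contains the Java runtime environment (the JVM plus the Java class libraries) and the Java tools. Hadoop is written in Java, so every node needs it.

1. Extract JDK 1.8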

# Extract the JDK to the target path
[ec2-user@master ~]$ sudo tar -zxvf hadoop/jdk-8u144-linux-x64.tar.gz -C /usr/local/src/
# Check that the extracted JDK directory is in the target directory
[ec2-user@master ~]$ ls /usr/local/src/
jdk1.8.0_144

2. Rename it to jdk

[ec2-user@master ~]$ cd /usr/local/src/
[ec2-user@master src]$ ls
jdk1.8.0_144
[ec2-user@master src]$ sudo mv jdk1.8.0_144/ jdk
[ec2-user@master src]$ ls
jdk

3. Add environment variables (all nodes), using master as an example

[ec2-user@master src]$ sudo vi /etc/profile
# Append the following to the end of the file
export JAVA_HOME=/usr/local/src/jdk
export PATH=$PATH:$JAVA_HOME/bin
# Reload the environment variables
[ec2-user@master src]$ source /etc/profile

4. Check the JDK version to verify the installation

[ec2-user@master src]$ java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

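5. Change permissions (all nodes, using master as an example)

This lab runs as an ordinary user, but only root can write to /usr/local/src/. Unless the ownership is changed, distributing files will fail with a permission error.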

[ec2-user@master ~]$ ll /usr/local/
total 0
drwxr-xr-x 2 root root  6 Apr  9  2019 bin
drwxr-xr-x 2 root root  6 Apr  9  2019 etc
drwxr-xr-x 2 root root  6 Apr  9  2019 games
drwxr-xr-x 2 root root  6 Apr  9  2019 include
drwxr-xr-x 2 root root  6 Apr  9  2019 lib
drwxr-xr-x 2 root root  6 Apr  9  2019 lib64
drwxr-xr-x 2 root root  6 Apr  9  2019 libexec
drwxr-xr-x 2 root root  6 Apr  9  2019 sbin
drwxr-xr-x 5 root root 49 Mar  4 20:51 share
drwxr-xr-x 4 root root 31 Mar 19 06:54 src
# Set the owner and group of /usr/local/src/ and everything under it to ec2-user
[ec2-user@master ~]$ sudo chown -R ec2-user:ec2-user /usr/local/src/
# Check the owner and group of /usr/local/src/ again
[ec2-user@master ~]$ ll /usr/local/
total 0
drwxr-xr-x 2 root     root      6 Apr  9  2019 bin
drwxr-xr-x 2 root     root      6 Apr  9  2019 etc
drwxr-xr-x 2 root     root      6 Apr  9  2019 games
drwxr-xr-x 2 root     root      6 Apr  9  2019 include
drwxr-xr-x 2 root     root      6 Apr  9  2019 lib
drwxr-xr-x 2 root     root      6 Apr  9  2019 lib64
drwxr-xr-x 2 root     root      6 Apr  9  2019 libexec
drwxr-xr-x 2 root     root      6 Apr  9  2019 sbin
drwxr-xr-x 5 root     root     49 Mar  4 20:51 share
drwxr-xr-x 4 ec2-user ec2-user 31 Mar 19 06:54 src

6. Distribute to the other nodes

[ec2-user@master ~]$ scp -r /usr/local/src/jdk/ slave1:/usr/local/src/
[ec2-user@master ~]$ scp -r /usr/local/src/jdk/ slave2:/usr/local/src/
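
The two scp commands above can be folded into a loop so that adding a worker means adding one name. This is a minimal sketch that assumes SSH access from master to each worker already works, as the commands above do. The function name distribute and the DRY_RUN switch are hypothetical conveniences, not part of the original steps.

```shell
#!/usr/bin/env bash
# Sketch: copy one directory to every worker named on the command line.
# distribute is a hypothetical helper. DRY_RUN=1 (the default here) only
# prints the scp commands so they can be inspected before anything is copied.

DRY_RUN=${DRY_RUN:-1}

distribute() {
    local src=$1 dest=$2 host
    shift 2
    for host in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "scp -r $src $host:$dest"
        else
            scp -r "$src" "$host:$dest" || return 1
        fi
    done
}

distribute /usr/local/src/jdk/ /usr/local/src/ slave1 slave2
```

Setting DRY_RUN=0 before calling the function performs the actual copy.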

2.3 Installing the Hadoop Cluster

1. Extract

[ec2-user@master src]$ tar -zxvf /home/ec2-user/hadoop/hadoop-2.9.1.tar.gz -C /usr/local/src/
[ec2-user@master src]$ ls
hadoop-2.9.1  jdk

2. Rename it to hadoop

[ec2-user@master src]$ pwd
/usr/local/src
[ec2-user@master src]$ mv hadoop-2.9.1/ hadoop
[ec2-user@master src]$ ls
hadoop  jdk

3. Add environment variables (all nodes), using master as an example

[ec2-user@master ~]$ sudo vi /etc/profile
# Append the following to the end of the file
export HADOOP_HOME=/usr/local/src/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_CLASSPATH=/usr/local/src/hadoop/lib/*
# Reload the environment variables
[ec2-user@master ~]$ source /etc/profile

4. Edit the core-site.xml configuration file

[ec2-user@master ~]$ cd /usr/local/src/hadoop/etc/hadoop/
[ec2-user@master hadoop]$ vi core-site.xml

Add the following inside the <configuration></configuration> tags:

	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://master:9000</value>
	</property>

	<property>
		<name>hadoop.tmp.dir</name>
		<value>/usr/local/src/hadoop/tmp</value>
	</property>
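
To show where the two properties sit in the finished file, here is a minimal sketch that writes a complete core-site.xml into a scratch directory and reads one value back. The names conf_dir and get_property are hypothetical, and the lookup relies on the one-tag-per-line layout used here; it is not a general XML parser.

```shell
#!/usr/bin/env bash
# Sketch: write a complete core-site.xml with the two properties from this
# step into a scratch directory, then read one value back.
# The real file lives in /usr/local/src/hadoop/etc/hadoop/.

conf_dir=$(mktemp -d)

cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://master:9000</value>
	</property>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/usr/local/src/hadoop/tmp</value>
	</property>
</configuration>
EOF

# Print the <value> found after the line holding the given <name>.
get_property() {
    local file=$1 name=$2
    awk -v n="<name>$name</name>" '
        index($0, n) { want = 1; next }
        want && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, ""); print; exit
        }' "$file"
}

get_property "$conf_dir/core-site.xml" fs.defaultFS
```

On a node where Hadoop is installed, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves, which is the more reliable check.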

5. Edit the hdfs-site.xml configuration file

[ec2-user@master hadoop]$ pwd
/usr/local/src/hadoop/etc/hadoop
[ec2-user@master hadoop]$ vi hdfs-site.xml

Add the following inside the <configuration></configuration> tags:

<property>
	<name>dfs.replication</name>
	<value>3</value>
</property>

<!-- Host for the Hadoop secondary NameNode -->
<property>
	<name>dfs.namenode.secondary.http-address</name>
	<value>slave1:50090</value>
</property>

<property>
	<name>dfs.namenode.name.dir</name>
	<value>/usr/local/src/hadoop/tmp/dfs/name</value>
</property>

<property>
	<name>dfs.datanode.data.dir</name>
	<value>/usr/local/src/hadoop/tmp/dfs/data</value>
</property>

<property>
	<name>dfs.webhdfs.enabled</name>
	<value>true</value>
</property>

6. Edit the yarn-site.xml configuration file

[ec2-user@master hadoop]$ pwd
/usr/local/src/hadoop/etc/hadoop
[ec2-user@master hadoop]$ vi yarn-site.xml

Add the following inside the <configuration></configuration> tags:

	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>

	<property>
		<name>yarn.resourcemanager.hostname</name>
		<value>master</value>
	</property>

	<property>
		<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
		<value>org.apache.hadoop.mapred.ShuffleHandler</value>
	</property>

7. Edit the mapred-site.xml configuration file

[ec2-user@master hadoop]$ pwd
/usr/local/src/hadoop/etc/hadoop
[ec2-user@master hadoop]$ cp mapred-site.xml.template mapred-site.xml
[ec2-user@master hadoop]$ vi mapred-site.xml

Add the following inside the <configuration></configuration> tags:

	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>

8. Edit the hadoop-env.sh configuration file

[ec2-user@master hadoop]$ pwd
/usr/local/src/hadoop/etc/hadoop
[ec2-user@master hadoop]$ vi hadoop-env.sh

Set the JDK path:

export JAVA_HOME=/usr/local/src/jdk

Note: adjust this to match your own path.

9. Edit the slaves configuration file

[ec2-user@master hadoop]$ pwd
/usr/local/src/hadoop/etc/hadoop
[ec2-user@master hadoop]$ vi slaves 
[ec2-user@master hadoop]$ cat slaves 
slave1
slave2
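
Every name in the slaves file must resolve through the hosts mapping from section 2.1, or the start scripts cannot reach that worker. A minimal sketch of a cross-check follows. The function name unmapped_workers and the scratch files are assumptions for illustration, and slave3 is a deliberately unmapped name added to show the output.

```shell
#!/usr/bin/env bash
# Sketch: report worker names from a slaves file that have no entry in a
# hosts file. unmapped_workers is a hypothetical helper.

unmapped_workers() {
    local slaves_file=$1 hosts_file=$2 worker
    while IFS= read -r worker || [ -n "$worker" ]; do
        [ -z "$worker" ] && continue
        # A worker counts as mapped if it appears as a hostname field.
        if ! awk -v w="$worker" '!/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == w) ok = 1 } END { exit !ok }' "$hosts_file"; then
            echo "$worker"
        fi
    done < "$slaves_file"
}

# Demo with scratch files that mirror the configuration above,
# plus one unmapped name (slave3) to show what a failure looks like.
tmp=$(mktemp -d)
printf 'slave1\nslave2\nslave3\n' > "$tmp/slaves"
printf '172.31.32.47 master\n172.31.36.81 slave1\n172.31.46.142 slave2\n' > "$tmp/hosts"
unmapped_workers "$tmp/slaves" "$tmp/hosts"
```

On the master node the same function could be called with /usr/local/src/hadoop/etc/hadoop/slaves and /etc/hosts; empty output means every worker is mapped.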

10. Distribute to the other nodes

[ec2-user@master hadoop]$ cd /usr/local/src/
[ec2-user@master src]$ scp -r hadoop/ slave1:/usr/local/src/
[ec2-user@master src]$ scp -r hadoop/ slave2:/usr/local/src/

11. Format the NameNode on the NameNode host

[ec2-user@master src]$ hdfs namenode -format

12. Start the Hadoop cluster

# Start the Hadoop cluster on the NameNode host
[ec2-user@master src]$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
The authenticity of host 'master (172.31.32.47)' can't be established.
ECDSA key fingerprint is SHA256:Tueyo4xR8lsxmdA11GlXAO3w44n6T75dYHe9flk8Y70.
ECDSA key fingerprint is MD5:22:9b:6d:f2:f3:11:a2:6d:4d:dd:ec:25:56:3b:2d:b2.
Are you sure you want to continue connecting (yes/no)? yes
master: Warning: Permanently added 'master,172.31.32.47' (ECDSA) to the list of known hosts.
master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-ec2-user-namenode-master.out
slave2: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-ec2-user-datanode-slave2.out
slave1: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-ec2-user-datanode-slave1.out
Starting secondary namenodes [slave1]
slave1: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-ec2-user-secondarynamenode-slave1.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-ec2-user-resourcemanager-master.out
slave1: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-ec2-user-nodemanager-slave1.out
slave2: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-ec2-user-nodemanager-slave2.out
# Use jps to check the running processes
[ec2-user@master src]$ jps
31522 Jps
31256 ResourceManager
30973 NameNode
[ec2-user@master src]$ ssh slave1
Last login: Fri Mar 19 06:15:47 2021 from 219.153.251.37

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
[ec2-user@slave1 ~]$ jps
29424 DataNode
29635 NodeManager
29544 SecondaryNameNode
29789 Jps
[ec2-user@slave1 ~]$ ssh slave2
Last login: Fri Mar 19 06:15:57 2021 from 219.153.251.37

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
[ec2-user@slave2 ~]$ jps
29633 Jps
29479 NodeManager
29354 DataNode
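
Reading jps output by eye on three nodes gets tedious. The following is a minimal sketch that takes jps output on stdin and prints any expected daemon that is missing. The function name missing_daemons is a hypothetical helper, and the sample text is the master output shown above.

```shell
#!/usr/bin/env bash
# Sketch: given jps output on stdin, print each expected daemon that is
# missing. missing_daemons is a hypothetical helper.

missing_daemons() {
    local output daemon
    output=$(cat)
    for daemon in "$@"; do
        # jps prints "<pid> <MainClass>"; compare the class name exactly so
        # that NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$output" | awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
            echo "$daemon"
        fi
    done
}

# Demo with the master output shown above.
sample='31522 Jps
31256 ResourceManager
30973 NameNode'
printf '%s\n' "$sample" | missing_daemons NameNode ResourceManager
printf '%s\n' "$sample" | missing_daemons DataNode NodeManager
```

The first call prints nothing because both master daemons are present; the second lists DataNode and NodeManager, which run on the workers. On a live node, pipe the real command instead: `jps | missing_daemons NameNode ResourceManager`.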

13. Check the Hadoop cluster status

[ec2-user@master ~]$ hdfs dfsadmin -report
Configured Capacity: 17154662400 (15.98 GB)
Present Capacity: 11389693952 (10.61 GB)
DFS Remaining: 11389685760 (10.61 GB)
DFS Used: 8192 (8 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (2):

Name: 172.31.36.81:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 8577331200 (7.99 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 2882510848 (2.68 GB)
DFS Remaining: 5694816256 (5.30 GB)
DFS Used%: 0.00%
DFS Remaining%: 66.39%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Fri Mar 19 07:45:06 UTC 2021
Last Block Report: Fri Mar 19 07:41:00 UTC 2021


Name: 172.31.46.142:50010 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 8577331200 (7.99 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 2882457600 (2.68 GB)
DFS Remaining: 5694869504 (5.30 GB)
DFS Used%: 0.00%
DFS Remaining%: 66.39%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Fri Mar 19 07:45:06 UTC 2021
Last Block Report: Fri Mar 19 07:41:00 UTC 2021
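
The key figure in this report is the number of live DataNodes, which should match the number of entries in the slaves file. A minimal sketch for pulling that number out of the report follows; the function name live_datanodes is a hypothetical helper, and the demo input is an excerpt of the report above.

```shell
#!/usr/bin/env bash
# Sketch: extract the live DataNode count from `hdfs dfsadmin -report`
# output read on stdin. live_datanodes is a hypothetical helper.

live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}

# Demo with an excerpt of the report above.
printf 'Pending deletion blocks: 0\n\nLive datanodes (2):\n\nName: 172.31.36.81:50010 (slave1)\n' | live_datanodes
```

On the master node, `hdfs dfsadmin -report | live_datanodes` should print 2 for the cluster described here.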

2.4 Installing Hive

1. Install MySQL

Before installing Hive, we need to install the MySQL database, which stores Hive's metadata.

1) Download the MySQL repository package

[ec2-user@master ~]$ sudo wget http://dev.mysql.com/get/mysql57-community-release-el7-8.noarch.rpm

2) Install the MySQL repository

[ec2-user@master ~]$ sudo yum localinstall mysql57-community-release-el7-8.noarch.rpm

3) Check that the MySQL repository was installed

[ec2-user@master ~]$ sudo yum repolist enabled | grep "mysql.*.community.*"
mysql-connectors-community/x86_64     MySQL Connectors Community          146+39
mysql-tools-community/x86_64          MySQL Tools Community                  123
mysql57-community/x86_64              MySQL 5.7 Community Server             484

4) Install MySQL

[ec2-user@master ~]$ sudo yum install mysql-community-server

5) Start the MySQL service and check its status

[ec2-user@master ~]$ sudo systemctl start mysqld
[ec2-user@master ~]$ sudo systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2021-03-19 07:56:43 UTC; 1s ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
  Process: 31978 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
  Process: 31927 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
 Main PID: 31981 (mysqld)
   CGroup: /system.slice/mysqld.service
           └─31981 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid

Mar 19 07:56:39 master systemd[1]: Starting MySQL Server...
Mar 19 07:56:43 master systemd[1]: Started MySQL Server.

6) Look up the initial MySQL password

[ec2-user@master ~]$ sudo grep "password" /var/log/mysqld.log
2021-03-19T07:56:41.030922Z 1 [Note] A temporary password is generated for root@localhost: v=OKXu0laSo;

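7) Change the MySQL login password

Copy the initial password found in the previous step, paste it when MySQL prompts for a password, and press Enter to reach the MySQL command line.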

[ec2-user@master ~]$ sudo mysql -uroot -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 5.7.33

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 

Change the password, setting the MySQL login password to 1234:

mysql> set password for 'root'@'localhost'=password('1234');
ERROR 1819 (HY000): Your password does not satisfy the current policy requirements

As the error shows, MySQL rejects a new password that is too simple.

In that case we need to relax the password policy:

mysql> set global validate_password_policy=0;
Query OK, 0 rows affected (0.00 sec)

mysql> set global validate_password_length=1;
Query OK, 0 rows affected (0.00 sec)

Set the password again:

mysql> set password for 'root'@'localhost'=password('1234');
Query OK, 0 rows affected, 1 warning (0.00 sec)

8) Enable remote login

Exit MySQL first, then log in again with the new password.

[ec2-user@master ~]$ mysql -uroot -p1234
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.7.33 MySQL Community Server (GPL)

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 

Create the user:

mysql> create user 'root'@'172.%.%.%' identified by '1234';
Query OK, 0 rows affected (0.00 sec)

Allow remote connections:

mysql> grant all privileges on *.* to 'root'@'172.%.%.%' with grant option;
Query OK, 0 rows affected (0.00 sec)

Reload the privileges:

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

MySQL is now installed.

2. Extract Hive to the target location

[ec2-user@master ~]$ tar -zxvf hadoop/apache-hive-1.1.0-bin.tar.gz -C /usr/local/src/

3. Rename it to hive

[ec2-user@master ~]$ cd /usr/local/src/
[ec2-user@master src]$ ls
apache-hive-1.1.0-bin  hadoop  jdk
[ec2-user@master src]$ mv apache-hive-1.1.0-bin/ hive
[ec2-user@master src]$ ls
hadoop  hive  jdk

4. Add environment variables

[ec2-user@master src]$ sudo vi /etc/profile
# Append the following to the end of the file
export HIVE_HOME=/usr/local/src/hive
export PATH=$PATH:$HIVE_HOME/bin
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/src/hive/lib/*
# Reload the environment variables
[ec2-user@master src]$ source /etc/profile

5. Edit the hive-site.xml configuration file

[ec2-user@master src]$ cd hive/conf/
# Create the hive-site.xml file
[ec2-user@master conf]$ touch hive-site.xml
[ec2-user@master conf]$ vi hive-site.xml

Add the following to hive-site.xml:

<configuration>
<property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
</property>

<property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>

<property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
</property>

<property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
</property>

<property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>1234</value>
</property>
</configuration>

Note: change the MySQL password to the one you set.

6. Edit the hive-env.sh configuration file

[ec2-user@master conf]$ pwd
/usr/local/src/hive/conf
[ec2-user@master conf]$ cp hive-env.sh.template hive-env.sh
[ec2-user@master conf]$ vi hive-env.sh
# Add the following settings
export HADOOP_HOME=/usr/local/src/hadoop
export HIVE_CONF_DIR=/usr/local/src/hive/conf

7. Add the MySQL connector

Put the MySQL driver in Hive's lib directory.

[ec2-user@master conf]$ cp /home/ec2-user/hadoop/mysql-connector-java-5.1.44-bin.jar $HIVE_HOME/lib
[ec2-user@master conf]$ ls $HIVE_HOME/lib/mysql-connector-java-5.1.44-bin.jar 
/usr/local/src/hive/lib/mysql-connector-java-5.1.44-bin.jar

8. Start the Hadoop cluster (Hive needs the HDFS distributed file system to store its data)

Skip this step if Hadoop is already running.

start-all.sh

9. Initialize the Hive database in MySQL

[ec2-user@master conf]$ schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-1.1.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-1.1.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Metastore connection URL:	 jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&useSSL=false
Metastore Connection Driver :	 com.mysql.jdbc.Driver
Metastore connection User:	 root
Starting metastore schema initialization to 1.1.0
Initialization script hive-schema-1.1.0.mysql.sql
Initialization script completed
schemaTool completed

10. Start Hive and test it

[ec2-user@master conf]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-1.1.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-1.1.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-1.1.0.jar!/hive-log4j.properties
hive> show databases;
OK
default
Time taken: 0.587 seconds, Fetched: 1 row(s)

Hive is now installed.

2.5 Installing Sqoop

1. Extract

[ec2-user@master ~]$ tar -zxvf hadoop/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /usr/local/src/

2. Rename it to sqoop

[ec2-user@master ~]$ cd /usr/local/src/
[ec2-user@master src]$ ls
hadoop  hive  jdk  sqoop-1.4.7.bin__hadoop-2.6.0
[ec2-user@master src]$ mv sqoop-1.4.7.bin__hadoop-2.6.0/ sqoop
[ec2-user@master src]$ ls
hadoop  hive  jdk  sqoop

3. Add environment variables

[ec2-user@master src]$ sudo vi /etc/profile
# Add the following lines
export SQOOP_HOME=/usr/local/src/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
# Reload the environment variables
[ec2-user@master src]$ source /etc/profile

4. Edit the sqoop-env.sh configuration file

[ec2-user@master src]$ cd sqoop/conf/
[ec2-user@master conf]$ mv sqoop-env-template.sh sqoop-env.sh
[ec2-user@master conf]$ vi sqoop-env.sh

Change the following settings to match your own environment:

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/usr/local/src/hadoop

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/usr/local/src/hadoop

#Set the path to where bin/hive is available
export HIVE_HOME=/usr/local/src/hive

5. Put the MySQL driver in Sqoop's lib directory

[ec2-user@master conf]$ cp /home/ec2-user/hadoop/mysql-connector-java-5.1.44-bin.jar $SQOOP_HOME/lib
[ec2-user@master conf]$ ls $SQOOP_HOME/lib/mysql-connector-java-5.1.44-bin.jar
/usr/local/src/sqoop/lib/mysql-connector-java-5.1.44-bin.jar

6. Verify that Sqoop is configured correctly

[ec2-user@master conf]$ sqoop help
Warning: /usr/local/src/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/src/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
21/03/19 08:53:06 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.
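
The help output only confirms that Sqoop starts. A natural follow-up, not part of the original steps, is to list the databases on the MySQL server set up in section 2.4, which exercises both the connector jar and the remote-login grant. The sketch below only prints the command; the function name sqoop_list_databases_cmd is a hypothetical helper, and the host and user are the ones assumed throughout this course.

```shell
#!/usr/bin/env bash
# Sketch: build a `sqoop list-databases` command for the MySQL instance
# configured earlier. sqoop_list_databases_cmd is a hypothetical helper that
# only prints the command; run the printed line on the master node.

sqoop_list_databases_cmd() {
    local host=$1 user=$2
    # -P makes Sqoop prompt for the password instead of taking it on the
    # command line.
    printf 'sqoop list-databases --connect jdbc:mysql://%s:3306/ --username %s -P\n' "$host" "$user"
}

sqoop_list_databases_cmd master root
```

If the connection works, the database list should include the hive database created by schematool.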
