I. Installing Docker
1. Download the offline package
Index of linux/static/stable/x86_64/
2. Extract the archive
tar -xzvf docker-18.06.3-ce.tgz
(The "ce" suffix denotes the free Community Edition; for details see the article "docker帶ce和不帶ce的區別" on PHP中文網.)
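3. Copy the extracted binaries to /usr/bin (the archive unpacks into a directory named docker, and the unit file in step 4 starts /usr/bin/dockerd, so the binaries need to be on that path)
cp docker/* /usr/bin/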
4. Register Docker as a system service
Create the docker.service file with vim /usr/lib/systemd/system/docker.service, copy the content below into it, then save and exit.
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
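The unit file only works if the program named on its ExecStart= line is actually installed at that path. The following is a minimal sketch of a pre-flight check, assuming a simple ExecStart= line without systemd prefix characters; the helper names unit_exec_path and check_unit_binary are hypothetical and not part of Docker or systemd.

```shell
# Hypothetical helpers: verify that a unit file points at a real executable.

# Print the program path from the first ExecStart= line of a unit file.
unit_exec_path() {
  # $1: path to the unit file
  sed -n 's/^ExecStart=\([^ ]*\).*/\1/p' "$1" | head -n 1
}

# Succeed only if that program exists and is executable.
check_unit_binary() {
  # $1: path to the unit file
  bin=$(unit_exec_path "$1")
  if [ -z "$bin" ]; then
    echo "no ExecStart= line found in $1" >&2
    return 2
  fi
  if [ -x "$bin" ]; then
    echo "ok: $bin"
  else
    echo "missing or not executable: $bin" >&2
    return 1
  fi
}

# Example: check_unit_binary /usr/lib/systemd/system/docker.service
```

After a unit file is added or changed, systemd generally needs `systemctl daemon-reload` before it will pick up the new definition.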
5. Start the Docker service
systemctl start docker
6. Check the Docker service status
systemctl status docker
7. Enable start on boot
systemctl enable docker.service
Reference article: "離線部署docker" (千里之行始于足下, CSDN blog)
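II. Deploying Hadoop with Docker
Deploying Hadoop requires the JDK and SSH packages, which you need to install yourself beforehand.
1. Download the Hadoop package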
Index of /dist/hadoop/common
Note: choose whichever version suits your environment.
2. Extract the offline package
tar -xzvf hadoop-2.7.2.tar.gz
3. Copy the extracted directory to /usr/local (as /usr/local/hadoop, the path used in the following steps)
cp -r hadoop-2.7.2 /usr/local/hadoop
4. Configure the JAVA_HOME variable
Go to the Hadoop configuration directory /usr/local/hadoop/etc/hadoop, edit hadoop-env.sh, and add the JAVA_HOME environment variable:
export JAVA_HOME="<JDK root directory>"
Save and exit.
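A wrong JAVA_HOME is a common reason for Hadoop scripts failing at startup, so it can be worth checking the directory before writing it into hadoop-env.sh. This is a small sketch that only tests for an executable bin/java under the given directory; the function name check_java_home is hypothetical.

```shell
# Hypothetical helper: succeed if the directory looks like a JDK/JRE root,
# i.e. it contains an executable bin/java.
check_java_home() {
  # $1: candidate JAVA_HOME directory
  if [ -x "$1/bin/java" ]; then
    echo "ok: $1/bin/java"
  else
    echo "no executable bin/java under '$1'" >&2
    return 1
  fi
}

# Example: check_java_home "$JAVA_HOME"
```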
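5. The three Hadoop run modes
5.1 Standalone mode (default configuration)

By default, Hadoop is configured to run in non-distributed mode as a single Java process, which is convenient for development and debugging.
This mode needs no changes to the configuration files, so you can run the MapReduce example bundled with Hadoop directly as a test. The examples jar is named after the Hadoop release (3.3.1 in the commands below), so adjust the file name to match the version you installed.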
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar grep input output 'dfs[a-z.]+'
$ cat output/*
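The bundled grep example counts every match of the regular expression in the input files and lists the matches by frequency. As a rough local cross-check, the same counts can be produced with standard text tools. This is only a sketch of the equivalent computation, not how Hadoop implements the job; the function name local_grep_count is hypothetical, and the output formatting (count, tab, match) is chosen here for illustration.

```shell
# Hypothetical helper: count regex matches across files, most frequent first.
local_grep_count() {
  # $1: extended regular expression; remaining args: input files
  pattern=$1
  shift
  grep -ohE -- "$pattern" "$@" \
    | sort \
    | uniq -c \
    | sort -k1,1nr -k2 \
    | awk '{print $1 "\t" $2}'
}

# Example: local_grep_count 'dfs[a-z.]+' input/*.xml
```

If the counts differ from the contents of output/, the input directory or the pattern passed to the job is the first thing to check.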
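5.2 Pseudo-distributed mode
Hadoop can also run on a single node in pseudo-distributed mode, where each Hadoop daemon runs in a separate Java process.
Edit the configuration files
Hadoop/etc/hadoop/core-site.xml
Link: detailed core-site.xml configuration parameter reference
fs.defaultFS: the address of the Hadoop file system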
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Hadoop/etc/hadoop/hdfs-site.xml:
Link: detailed hdfs-site.xml configuration parameter reference
dfs.replication: the number of replicas per data block, set to 1
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
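Both files above have the same shape: a configuration element holding a single property. If you prefer to script the change rather than edit by hand, something along these lines would write that shape. It is a sketch with a hypothetical function name (write_single_property); it overwrites the target file and does no XML escaping, so back up any existing configuration first.

```shell
# Hypothetical helper: write a Hadoop-style XML file with exactly one property.
# Overwrites the target file; values are inserted as-is (no XML escaping).
write_single_property() {
  # $1: output file, $2: property name, $3: property value
  cat > "$1" <<EOF
<configuration>
<property>
<name>$2</name>
<value>$3</value>
</property>
</configuration>
EOF
}

# Examples, run from the Hadoop/etc/hadoop directory:
#   write_single_property core-site.xml fs.defaultFS hdfs://localhost:9000
#   write_single_property hdfs-site.xml dfs.replication 1
```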
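Set up passwordless SSH login
Use the following command to check whether passwordless login is already configured:
$ ssh localhost
If it is not, configure it with the following commands: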
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
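Running the `cat ... >>` line more than once appends the same key again each time. A minimal sketch of an idempotent variant is shown here, assuming the public key file holds a single line; the function name authorize_key_once is hypothetical.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already present, then tighten permissions.
authorize_key_once() {
  # $1: public key file, $2: authorized_keys file
  key=$(cat "$1") || return 1
  touch "$2" || return 1
  if ! grep -qxF -- "$key" "$2"; then
    printf '%s\n' "$key" >> "$2"
  fi
  chmod 0600 "$2"
}

# Example: authorize_key_once ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

For a non-interactive check afterwards, `ssh -o BatchMode=yes localhost true` should exit with status 0 only when no password prompt is needed.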
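Starting Hadoop
1. Format the file system
$ bin/hdfs namenode -format
2. Start the NameNode and DataNode daemons
$ sbin/start-dfs.sh
By default, Hadoop writes its logs to the Hadoop/logs directory.
3. Browse the NameNode web interface; the default address is http://localhost:9870 in Hadoop 3.x (Hadoop 2.x uses port 50070).
4. Create the HDFS directories (directories in the Hadoop file system) required to run MapReduce jobs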
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
5. Copy the input files into HDFS
$ bin/hdfs dfs -mkdir input
$ bin/hdfs dfs -put etc/hadoop/*.xml input
6. Run the Hadoop examples (adjust the jar version to match your installation)
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar grep input output 'dfs[a-z.]+'
7. Examine the output files: copy the output directory from HDFS to the local file system and inspect it
$ bin/hdfs dfs -get output output
$ cat output/*
Or view the files directly on HDFS
$ bin/hdfs dfs -cat output/*
8. Stop the Hadoop daemons
$ sbin/stop-dfs.sh
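While the daemons are running (between steps 2 and 8), the JDK's jps tool lists the Java processes on the machine, which gives a quick way to confirm that start-dfs.sh brought up everything it should. The sketch below reads jps output on standard input and reports any expected daemon that is absent; the function name missing_daemons is hypothetical, and it assumes the usual two-column "pid name" format of jps.

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected
# daemon name that is not running. Returns non-zero if any is missing.
missing_daemons() {
  # args: expected process names, e.g. NameNode DataNode
  running=$(awk '{print $2}')
  rc=0
  for name in "$@"; do
    # -x matches whole lines, so NameNode does not match SecondaryNameNode
    if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
      echo "$name"
      rc=1
    fi
  done
  return $rc
}

# Example: jps | missing_daemons NameNode DataNode SecondaryNameNode
```

If a name is printed, the matching log file under Hadoop/logs is the place to look for the startup error.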
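Running YARN (the resource negotiator) on a single node
In pseudo-distributed mode, a MapReduce job can also run on YARN. This requires setting a few parameters and additionally starting the ResourceManager and NodeManager daemons.
1. Edit the configuration files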
Hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
Hadoop/etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
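2. Start the ResourceManager and NodeManager daemons
$ sbin/start-yarn.sh
3. Browse the ResourceManager web interface; the default address is http://localhost:8088/
4. Run a MapReduce job
The commands are the same as the ones used above to run a MapReduce job with Hadoop.
5. Stop the ResourceManager and NodeManager daemons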
$ sbin/stop-yarn.sh
Please credit the source when reposting. Link to this article: https://www.uj5u.com/qita/337637.html
