Hadoop (CDH) Distributed Cluster Installation Notes (Tested First-Hand)
2018-03-02 11:18
Before building the Hadoop cluster, prepare three nodes with Linux already installed: master, slave1, and slave2. The Linux installation itself is not covered here.
1. Cluster Plan
1.1 Host plan

Daemon          | master | slave1 | slave2
NameNode        | yes    | yes    |
DataNode        | yes    | yes    | yes
ResourceManager | yes    | yes    |
NodeManager     | yes    | yes    | yes
JournalNode     | yes    | yes    | yes
ZooKeeper       | yes    | yes    | yes

1.2 Software plan

Software  | Version                          | Bits
JDK       | 1.8                              | 64
CentOS    | 6.5                              | 64
ZooKeeper | zookeeper-3.4.5-cdh5.13.0.tar.gz |
Hadoop    | hadoop-2.6.0-cdh5.13.0.tar.gz    |

1.3 User plan

Node   | Group | User
master | cdh   | cdh
slave1 | cdh   | cdh
slave2 | cdh   | cdh

1.4 Directory plan

Name               | Path
Software directory | /home/cdh/app
Script directory   | /home/cdh/tools
Data directory     | /home/cdh/data
2. Environment Preparation
2.1 Clock synchronization
Set a uniform time zone on every node:
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
Synchronize the clocks with NTP (Network Time Protocol):
yum install ntp    # install the NTP tools
ntpdate pool.ntp.org    # synchronize the clock once
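To keep the clocks from drifting apart again, the one-off sync can be scheduled periodically. A minimal sketch, assuming root's crontab and the same public NTP pool (the schedule is illustrative):
# crontab -e (as root), then add:
# re-sync against pool.ntp.org every day at 00:30
30 0 * * * /usr/sbin/ntpdate pool.ntp.org >> /var/log/ntpdate.log 2>&1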
2.2 Hosts file configuration
Map each node's IP address to its hostname on every node in the cluster:
192.168.8.130 master
192.168.8.131 slave1
192.168.8.132 slave2
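One way to apply this mapping is to append the entries to /etc/hosts on each node. A sketch, run as root, assuming the entries are not already present:
cat >> /etc/hosts <<'EOF'
192.168.8.130 master
192.168.8.131 slave1
192.168.8.132 slave2
EOF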
2.3 Disable the firewall
Check the firewall status:
service iptables status
Disable the firewall permanently (takes effect from the next boot):
chkconfig iptables off
Stop the firewall for the current session:
service iptables stop
2.4 Passwordless SSH login
First configure passwordless SSH on each node individually. Switch to the user's home directory and generate a key pair:
mkdir .ssh
ssh-keygen -t rsa
Enter the .ssh directory and register the node's own public key:
cd .ssh
cat id_rsa.pub >> authorized_keys
Go back to the home directory, fix the permissions, and test the local login:
chmod 700 .ssh
chmod 600 .ssh/*
ssh master
Next, append the public keys (id_rsa.pub) of slave1 and slave2 to the authorized_keys file on master. On each slave run:
cat ~/.ssh/id_rsa.pub | ssh cdh@master 'cat >> ~/.ssh/authorized_keys'
Then distribute master's authorized_keys file to slave1 and slave2:
scp -r authorized_keys cdh@slave1:~/.ssh/
scp -r authorized_keys cdh@slave2:~/.ssh/
After this, master, slave1, and slave2 can all reach each other over SSH without a password.
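A quick way to verify the result from master; a sketch in which each command should print the remote hostname without prompting for a password:
for h in master slave1 slave2; do
  ssh cdh@"$h" hostname   # no password prompt means the key setup worked
done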
2.5 Cluster script tools
Create the /home/cdh/tools script directory and upload the following scripts (see the local directory) into it:
mkdir /home/cdh/tools
deploy.conf deploy.sh runRemoteCmd.sh
Make the scripts executable:
chmod u+x deploy.sh
chmod u+x runRemoteCmd.sh
Add the script directory to the PATH:
vi ~/.bashrc
PATH=/home/cdh/tools:$PATH
export PATH
Create the required directories on all nodes in one step (a sketch of these scripts follows below):
runRemoteCmd.sh "mkdir /home/cdh/app" all
runRemoteCmd.sh "mkdir /home/cdh/data" all
3. JDK Installation
3.1 Download jdk-8u51-linux-x64.tar.gz and upload it to /home/cdh/app on the master node
3.2 Unpack it: tar -zxvf jdk-8u51-linux-x64.tar.gz
3.3 Create a symlink: ln -s jdk1.8.0_51 jdk
3.4 Configure the environment variables
vi ~/.bashrc
JAVA_HOME=/home/cdh/app/jdk
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:/home/cdh/tools:$PATH
export JAVA_HOME CLASSPATH PATH
3.5 Save and apply the configuration: source ~/.bashrc
3.6 Check that the JDK is installed: java -version
3.7 Distribute the jdk1.8.0_51 directory to slave1 and slave2
deploy.sh jdk1.8.0_51 /home/cdh/app/ slave
Then repeat the symlink and environment-variable steps there, making sure the JDK works on every node.
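The per-node check can also be done in one step from master. A small sketch using the absolute path, so it works even before ~/.bashrc has been reloaded on the slaves (note that java -version prints to stderr):
runRemoteCmd.sh "/home/cdh/app/jdk/bin/java -version" all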
4. ZooKeeper Installation
4.1 Download zookeeper-3.4.5-cdh5.13.0.tar.gz and upload it to /home/cdh/app on the master node
4.2 Unpack it: tar -zxvf zookeeper-3.4.5-cdh5.13.0.tar.gz
4.3 Create a symlink: ln -s zookeeper-3.4.5-cdh5.13.0 zookeeper
4.4 Edit the zoo.cfg configuration file
Copy the sample configuration:
cp zoo_sample.cfg zoo.cfg
Then edit zoo.cfg (the full file is in the local directory; a sketch follows below).
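A minimal zoo.cfg sketch that is consistent with the data/log directories and myid values used in the next steps (the tick/limit values are ordinary defaults, not taken from the original file):
# zoo.cfg (sketch)
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/home/cdh/data/zookeeper/zkdata
dataLogDir=/home/cdh/data/zookeeper/zkdatalog
# server.<myid>=<host>:<quorum-port>:<election-port>
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888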
4.5 Distribute the whole ZooKeeper installation directory to slave1 and slave2
deploy.sh zookeeper-3.4.5-cdh5.13.0 /home/cdh/app/ slave
and create the symlink on each of them:
ln -s zookeeper-3.4.5-cdh5.13.0 zookeeper
4.6 Create the data and log directories from zoo.cfg on all nodes
runRemoteCmd.sh "mkdir -p /home/cdh/data/zookeeper/zkdata" all
runRemoteCmd.sh "mkdir -p /home/cdh/data/zookeeper/zkdatalog" all
4.7 On master, slave1, and slave2, enter /home/cdh/data/zookeeper/zkdata and create a file named myid, filled with 1, 2, and 3 respectively
[cdh@master zkdata]$ vi myid
[cdh@master zkdata]$ cat myid
1
[cdh@slave1 zkdata]$ vi myid
[cdh@slave1 zkdata]$ cat myid
2
[cdh@slave2 zkdata]$ vi myid
[cdh@slave2 zkdata]$ cat myid
3
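Equivalently, the three myid files can be written from master in a single loop. A sketch, assuming the passwordless SSH set up earlier:
i=1
for h in master slave1 slave2; do
  ssh cdh@"$h" "echo $i > /home/cdh/data/zookeeper/zkdata/myid"   # writes 1/2/3 in turn
  i=$((i+1))
done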
4.8 Configure the ZooKeeper environment variables on every node
vi ~/.bashrc
JAVA_HOME=/home/cdh/app/jdk
ZOOKEEPER_HOME=/home/cdh/app/zookeeper
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:/home/cdh/tools:$ZOOKEEPER_HOME/bin:$PATH
export JAVA_HOME ZOOKEEPER_HOME CLASSPATH PATH
Save and apply: source ~/.bashrc
4.9 Test ZooKeeper
Start ZooKeeper on all nodes:
runRemoteCmd.sh "/home/cdh/app/zookeeper/bin/zkServer.sh start" all
Check the ZooKeeper processes:
runRemoteCmd.sh "jps" all
Check the ZooKeeper status (in a healthy three-node ensemble, one node reports leader and the other two follower):
runRemoteCmd.sh "/home/cdh/app/zookeeper/bin/zkServer.sh status" all
5. HDFS Installation
5.1 Download hadoop-2.6.0-cdh5.13.0.tar.gz and upload it to /home/cdh/app on the master node
5.2 Unpack it: tar -zxvf hadoop-2.6.0-cdh5.13.0.tar.gz
5.3 Create a symlink: ln -s hadoop-2.6.0-cdh5.13.0 hadoop
5.4 Edit the HDFS configuration files: core-site.xml, hdfs-site.xml, slaves, hadoop-env.sh (the full files are in the local directory; a sketch of the HA-related settings follows below)
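The steps in 5.7 assume a NameNode HA setup with automatic failover (ZKFC, JournalNodes, a standby NameNode on slave1). A minimal sketch of the settings involved; the nameservice ID cluster1, the NameNode IDs nn1/nn2, and the ports are illustrative, not taken from the original files:
<!-- core-site.xml (sketch) -->
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://cluster1</value></property>
  <property><name>ha.zookeeper.quorum</name><value>master:2181,slave1:2181,slave2:2181</value></property>
</configuration>
<!-- hdfs-site.xml (sketch) -->
<configuration>
  <property><name>dfs.nameservices</name><value>cluster1</value></property>
  <property><name>dfs.ha.namenodes.cluster1</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.cluster1.nn1</name><value>master:8020</value></property>
  <property><name>dfs.namenode.rpc-address.cluster1.nn2</name><value>slave1:8020</value></property>
  <!-- shared edit log stored on the three JournalNodes -->
  <property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://master:8485;slave1:8485;slave2:8485/cluster1</value></property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
  <property><name>dfs.client.failover.proxy.provider.cluster1</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <!-- some fencing method must be configured for QJM-based HA -->
  <property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
</configuration>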
5.5 Distribute the whole hadoop installation directory to slave1 and slave2
deploy.sh hadoop-2.6.0-cdh5.13.0 /home/cdh/app/ slave
and create the symlink on slave1 and slave2:
ln -s hadoop-2.6.0-cdh5.13.0 hadoop
5.6 Configure the hadoop environment variables
vi ~/.bashrc
JAVA_HOME=/home/cdh/app/jdk
ZOOKEEPER_HOME=/home/cdh/app/zookeeper
HADOOP_HOME=/home/cdh/app/hadoop
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:/home/cdh/tools:$ZOOKEEPER_HOME/bin:$HADOOP_HOME/bin:$PATH
export JAVA_HOME ZOOKEEPER_HOME HADOOP_HOME CLASSPATH PATH
Save and apply:
source ~/.bashrc
5.7 Test HDFS
5.7.1 Start ZooKeeper on all nodes:
runRemoteCmd.sh "/home/cdh/app/zookeeper/bin/zkServer.sh start" all
5.7.2 Start the JournalNode on all nodes:
runRemoteCmd.sh "/home/cdh/app/hadoop/sbin/hadoop-daemon.sh start journalnode" all
5.7.3 Format the NameNode on master:
bin/hdfs namenode -format
5.7.4 Format the ZKFC state on master:
bin/hdfs zkfc -formatZK
5.7.5 Start the NameNode on master (in the foreground):
bin/hdfs namenode
5.7.6 On slave1, sync the metadata from the master NameNode:
bin/hdfs namenode -bootstrapStandby
5.7.7 Press Ctrl+C to stop the foreground NameNode process on master
5.7.8 Stop the JournalNode on all nodes:
runRemoteCmd.sh "/home/cdh/app/hadoop/sbin/hadoop-daemon.sh stop journalnode" all
5.7.9 Start HDFS with a single command: sbin/start-dfs.sh
5.7.10 Stop HDFS with a single command: sbin/stop-dfs.sh
5.7.11 View HDFS in the web UI: http://master:50070
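With automatic failover enabled, the active/standby roles of the two NameNodes can also be checked from the command line. A sketch, assuming the NameNode IDs nn1 and nn2 from the configuration sketch above:
bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2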
6. YARN Installation
6.1 On master, edit the YARN configuration files yarn-site.xml and mapred-site.xml (the full files are in the local directory; a sketch of the HA-related settings follows below)
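Steps 6.4 and 6.6 assume ResourceManager HA across master and slave1. A minimal sketch of the settings involved; the cluster ID and anything not shown elsewhere in this post are illustrative:
<!-- yarn-site.xml (sketch) -->
<configuration>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>yarn-cluster</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>master</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>slave1</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>master:2181,slave1:2181,slave2:2181</value></property>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
</configuration>
<!-- mapred-site.xml (sketch) -->
<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
</configuration>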
6.2 Distribute yarn-site.xml and mapred-site.xml to slave1 and slave2
deploy.sh mapred-site.xml /home/cdh/app/hadoop-2.6.0-cdh5.13.0/etc/hadoop slave
deploy.sh yarn-site.xml /home/cdh/app/hadoop-2.6.0-cdh5.13.0/etc/hadoop slave
6.3 Start YARN on master: sbin/start-yarn.sh
6.4 Start the second ResourceManager on slave1: sbin/yarn-daemon.sh start resourcemanager
6.5 View YARN in the web UI: http://master:8088
6.6 Check the ResourceManager states (one should be active, the other standby):
bin/yarn rmadmin -getServiceState rm1
bin/yarn rmadmin -getServiceState rm2
6.7 Run the WordCount test
6.7.1 Create a file djt.txt in a local directory on master
[cdh@master hadoop]$ cat djt.txt
hadoop spark
hadoop spark
hadoop spark
6.7.2 Create a /djt directory in HDFS
[cdh@master hadoop]$ bin/hdfs dfs -mkdir /djt
[cdh@master hadoop]$ bin/hdfs dfs -ls /
6.7.3 Upload djt.txt to the /djt directory
bin/hdfs dfs -put djt.txt /djt/
bin/hdfs dfs -ls /djt
6.7.4 Run WordCount
bin/hadoop jar share/hadoop/mapreduce2/hadoop-mapreduce-examples-2.6.0-cdh5.13.0.jar wordcount /djt/djt.txt /djt/output
6.7.5 Check the result
bin/hdfs dfs -cat /djt/output/*
hadoop 3
spark 3
For the complete installation packages and configuration files, join QQ group 695520445.