Deploying a Hadoop High-Availability (HA) Cluster on CentOS 7.4 (Alibaba Cloud)
Software Preparation
Components
Component | Package | URL |
---|---|---|
CentOS | CentOS 7.4 (64-bit) | 3 Alibaba Cloud ECS instances, pay-as-you-go, released when finished; password: Root@123! |
JDK | jdk-8u151-linux-x64.tar.gz | https://www.oracle.com/technetwork/java/javase/downloads/index.html |
Zookeeper | zookeeper-3.4.5-cdh5.15.1.tar.gz | http://archive.cloudera.com/cdh5/cdh/5/zookeeper-3.4.5-cdh5.15.1.tar.gz |
Hadoop | hadoop-2.6.0-cdh5.15.1.tar.gz | http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.15.1.tar.gz |
Host Planning
Services | Host | IP | Configuration |
---|---|---|---|
QuorumPeerMain, JournalNode, NameNode, DataNode, DFSZKFailoverController, ResourceManager, NodeManager, JobHistoryServer | hadoop001 | 172.26.239.216 | 2 cores, 4 GB RAM, 2 Mbps bandwidth, 40 GB system disk (included) |
QuorumPeerMain, JournalNode, NameNode, DataNode, DFSZKFailoverController, ResourceManager, NodeManager | hadoop002 | 172.26.239.214 | 2 cores, 4 GB RAM, 2 Mbps bandwidth, 40 GB system disk (included) |
QuorumPeerMain, JournalNode, DataNode, NodeManager | hadoop003 | 172.26.239.215 | 2 cores, 4 GB RAM, 2 Mbps bandwidth, 40 GB system disk (included) |
Purchased Alibaba Cloud instances with this configuration: 3 servers at ¥1.831/hour.
Since these are cloud hosts, networking and time synchronization do not need to be configured manually; Alibaba Cloud handles them.
For a temporary environment you can open all ports (1/65535) in the security group, with authorization object 0.0.0.0/0.
Directory Planning
Name | Path | Notes |
---|---|---|
JAVA_HOME | /usr/java/jdk1.8.0_151 | create /usr/java manually |
$ZOOKEEPER_HOME | /opt/app/zookeeper-3.4.5-cdh5.15.1 | |
data | $ZOOKEEPER_HOME/data | create manually |
$HADOOP_HOME | /opt/app/hadoop-2.6.0-cdh5.15.1 | |
data | $HADOOP_HOME/data | |
log | $HADOOP_HOME/logs | |
hadoop.tmp.dir | $HADOOP_HOME/tmp | create manually, permissions 777, owner hadoop:hadoop |
software | /opt/app/software | directory for uploaded software packages |
Environment Preparation
Firewall (all 3 nodes, as root)
Check the firewall status (run as root)
[root@hadoop001 ~]# firewall-cmd --state
not running
[root@hadoop001 ~]#
Stop the firewall
If the firewall is running, stop it:
[root@hadoop001 ~]# systemctl stop firewalld
Disable the firewall from starting at boot
[root@hadoop001 ~]# systemctl disable firewalld
SELinux (all 3 nodes, as root)
Check the SELinux status
Use the getenforce or sestatus command
[root@hadoop001 ~]# getenforce
Disabled
[root@hadoop001 ~]#
Disable SELinux
Temporary (setenforce 0)
[root@hadoop001 ~]# setenforce 0    ### switch SELinux to permissive mode; disables SELinux temporarily, lost after reboot
[root@hadoop001 ~]# setenforce 1    ### switch SELinux back to enforcing mode; re-enables SELinux temporarily, lost after reboot
Permanent: edit /etc/selinux/config
[root@hadoop001 ~]# vim /etc/selinux/config
SELINUX=disabled
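If you prefer not to edit the file interactively, an equivalent one-liner sketch (it keeps a .bak backup of the original file) makes the same change and verifies it:
[root@hadoop001 ~]# sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
[root@hadoop001 ~]# grep ^SELINUX= /etc/selinux/config    # should print SELINUX=disabled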
Create the User (all 3 nodes, as root)
Create the hadoop user as root
[root@hadoop001 ~]# useradd hadoop
Set the hadoop user's password to hadoop
[root@hadoop001 ~]# echo hadoop | passwd --stdin hadoop
The output looks like this:
[root@hadoop001 ~]# echo hadoop | passwd --stdin hadoop
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.
[root@hadoop001 ~]#
Install lrzsz (all 3 nodes)
[root@hadoop001 ~]# yum -y install lrzsz
Create Directories (all 3 nodes, as root)
Create the directory as root
[root@hadoop001 ~]# mkdir -p /opt/app/software
Change the owner of /opt/app
[root@hadoop001 ~]# chown -R hadoop:hadoop /opt/app
Switch to the hadoop user (the "-" loads the environment variables and changes to hadoop's home directory)
[root@hadoop001 ~]# su - hadoop
[hadoop@hadoop001 ~]$
Configure the hosts File
Open /etc/hosts as root
[root@hadoop001 ~]# vim /etc/hosts
Add the following entries to the hosts file
172.26.239.216 hadoop001
172.26.239.214 hadoop002
172.26.239.215 hadoop003
Sync it to the other two hosts with scp
[root@hadoop001 ~]# scp /etc/hosts root@172.26.239.215:/etc/
[root@hadoop001 ~]# scp /etc/hosts root@172.26.239.214:/etc/
Passwordless SSH
ssh-keygen creates a public key and a private key: ssh-keygen -t [rsa|dsa] (rsa by default)
[hadoop@hadoop001 ~]$ ssh-keygen
ssh-copy-id copies the local host's public key into the remote host's authorized_keys file.
It also sets appropriate permissions on the remote user's home directory, ~/.ssh, and ~/.ssh/authorized_keys.
[hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop001
[hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop002
[hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop003
Output
[hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop002
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'hadoop002 (172.26.239.214)' can't be established.
ECDSA key fingerprint is SHA256:WuDCjCFcqjYk/C4Wgop9M6rIbkmnE4gn6mEHMVnBcWk.
ECDSA key fingerprint is MD5:f5:1e:b4:52:47:19:d6:ce:2b:31:a0:b4:48:ee:d2:f2.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@hadoop002's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop@hadoop002'"
and check to make sure that only the key(s) you wanted were added.
[hadoop@hadoop001 ~]$
Log in from the client to each server; no password should be required
[hadoop@hadoop001 ~]$ ssh hadoop@hadoop001 ls
[hadoop@hadoop001 ~]$ ssh hadoop@hadoop002 ls
[hadoop@hadoop001 ~]$ ssh hadoop@hadoop003 ls
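The same check can be scripted; this small loop (a sketch, assuming the hadoop user and the three hostnames above) confirms every node is reachable without a password:
[hadoop@hadoop001 ~]$ for h in hadoop001 hadoop002 hadoop003; do ssh hadoop@$h hostname; done
Each iteration should print the remote hostname without prompting for a password.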
Upload the Software Packages
As the hadoop user, upload the downloaded packages to /opt/app/software.
Change to the software directory
[hadoop@hadoop001 ~]$ cd /opt/app/software
Upload the packages with the rz command
[hadoop@hadoop001 software]$ rz
Sync the packages to the other two hosts with scp
[hadoop@hadoop001 ~]$ scp -r /opt/app/software/* hadoop@hadoop002:/opt/app/software/
[hadoop@hadoop001 ~]$ scp -r /opt/app/software/* hadoop@hadoop003:/opt/app/software/
Install the JDK
Create /usr/java as root
[root@hadoop001 ~]# mkdir /usr/java
Change the directory owner
[root@hadoop001 ~]# chown -R hadoop:hadoop /usr/java
Extract the JDK as the hadoop user
[hadoop@hadoop001 software]$ tar -zxvf /opt/app/software/jdk-8u151-linux-x64.tar.gz -C /usr/java/
Edit the environment variables as root
[root@hadoop001 ~]# vim /etc/profile
###ADD JDK environment
export JAVA_HOME=/usr/java/jdk1.8.0_151
export PATH=$JAVA_HOME/bin:$PATH
Apply the environment variables with source
[root@hadoop001 ~]# source /etc/profile
Verify the JDK
[hadoop@hadoop001 ~]$ java -version
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
[hadoop@hadoop001 ~]$
Sync to the other two hosts
[root@hadoop001 ~]# scp /etc/profile root@hadoop002:/etc/profile
[root@hadoop001 ~]# scp /etc/profile root@hadoop003:/etc/profile
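To confirm the profile change took effect everywhere (the JDK itself must of course be present on each node), a scripted check such as this sketch can be run from hadoop001:
[hadoop@hadoop001 ~]$ for h in hadoop001 hadoop002 hadoop003; do echo "== $h =="; ssh hadoop@$h "source /etc/profile; java -version"; done
Every host should report java version "1.8.0_151".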
Cluster Installation and Deployment
Install Zookeeper
Extract Zookeeper (all 3 nodes)
[hadoop@hadoop001 software]$ tar -zxvf /opt/app/software/zookeeper-3.4.5-cdh5.15.1.tar.gz -C /opt/app
Edit the configuration file
Create /opt/app/zookeeper-3.4.5-cdh5.15.1/conf/zoo.cfg by copying the sample file
[hadoop@hadoop001 ~]$ cd /opt/app/zookeeper-3.4.5-cdh5.15.1/conf
[hadoop@hadoop001 conf]$ cp zoo_sample.cfg zoo.cfg
Edit zoo.cfg
[hadoop@hadoop001 conf]$ vim zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
## This directory holds ZooKeeper's data and the myid file. Create it manually if it does not exist.
dataDir=/opt/app/zookeeper-3.4.5-cdh5.15.1/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
## The number after "server." is the value stored in that host's myid file and must match the host exactly.
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
## A production cluster needs at least 3 ZooKeeper servers; larger clusters can grow to a bigger odd number.
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
Key parameters
## This directory holds ZooKeeper's data and the myid file. Create it manually if it does not exist.
dataDir=/opt/app/zookeeper-3.4.5-cdh5.15.1/data
## The number after "server." is the value stored in that host's myid file and must match the host exactly.
## A production cluster needs at least 3 ZooKeeper servers; larger clusters can grow to a bigger odd number.
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
Sync the configuration file
[hadoop@hadoop001 conf]$ scp zoo.cfg hadoop002:/opt/app/zookeeper-3.4.5-cdh5.15.1/conf
[hadoop@hadoop001 conf]$ scp zoo.cfg hadoop003:/opt/app/zookeeper-3.4.5-cdh5.15.1/conf
Create the ZooKeeper Data Directory (all 3 nodes)
[hadoop@hadoop001 conf]$ mkdir /opt/app/zookeeper-3.4.5-cdh5.15.1/data
Configure /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid
[hadoop@hadoop001 conf]$ cd /opt/app/zookeeper-3.4.5-cdh5.15.1/data
[hadoop@hadoop001 data]$ echo 1 > /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid
[hadoop@hadoop001 data]$ ssh hadoop002 " echo 2 > /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid "
[hadoop@hadoop001 data]$ ssh hadoop003 " echo 3 > /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid "
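Before starting ZooKeeper it is worth double-checking that each node carries the expected myid; a small sketch relying on the passwordless SSH set up earlier:
[hadoop@hadoop001 data]$ for h in hadoop001 hadoop002 hadoop003; do echo -n "$h: "; ssh $h cat /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid; done
The values should be 1, 2 and 3 for hadoop001, hadoop002 and hadoop003 respectively, matching the server.N entries in zoo.cfg.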
Install Hadoop
Extract Hadoop (all 3 nodes)
[hadoop@hadoop001 ~]$ tar -zxvf /opt/app/software/hadoop-2.6.0-cdh5.15.1.tar.gz -C /opt/app
Configure Environment Variables
[root@hadoop001 ~]# vim /etc/profile
###ADD JDK environment
export JAVA_HOME=/usr/java/jdk1.8.0_151
export PATH=$JAVA_HOME/bin:$PATH
###ADD ZOOKEEPER environment
export ZOOKEEPER_HOME=/opt/app/zookeeper-3.4.5-cdh5.15.1
export PATH=$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/sbin:$PATH
###ADD HADOOP environment
export HADOOP_HOME=/opt/app/hadoop-2.6.0-cdh5.15.1
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
###ADD CLASSPATH
CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME/lib:$CLASSPATH
Sync the Environment Variable File /etc/profile
[root@hadoop001 ~]# scp /etc/profile root@hadoop002:/etc/profile
[root@hadoop001 ~]# scp /etc/profile root@hadoop003:/etc/profile
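After sourcing /etc/profile on each node, a quick sanity check (a sketch) confirms the new variables resolve correctly:
[hadoop@hadoop001 ~]$ source /etc/profile
[hadoop@hadoop001 ~]$ hadoop version | head -1    # should report Hadoop 2.6.0-cdh5.15.1
[hadoop@hadoop001 ~]$ which zkServer.sh           # should resolve under /opt/app/zookeeper-3.4.5-cdh5.15.1/bin
Repeat the same commands on hadoop002 and hadoop003.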
Edit the Configuration Files
The following files all live under /opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop.
hadoop-env
[hadoop@hadoop001 hadoop]$ vim hadoop-env.sh
...
export JAVA_HOME="/usr/java/jdk1.8.0_151"
...
core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- fs.defaultFS specifies the NameNode URI (the nameservice); YARN also relies on it -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hdp</value>
    </property>
    <!--============================== Trash mechanism ======================================= -->
    <property>
        <!-- How often the checkpointer running on the NameNode creates checkpoints from the Current folder; default 0 means it follows fs.trash.interval -->
        <name>fs.trash.checkpoint.interval</name>
        <value>0</value>
    </property>
    <property>
        <!-- How many minutes before checkpoint directories under .Trash are deleted; the server-side setting takes priority over the client; default 0 means never delete -->
        <name>fs.trash.interval</name>
        <value>1440</value>
    </property>
    <!-- Hadoop temporary directory. hadoop.tmp.dir is the base setting the filesystem depends on; many paths derive from it. If hdfs-site.xml does not configure the namenode and datanode directories, they default to locations under this path -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/app/hadoop-2.6.0-cdh5.15.1/tmp</value>
    </property>
    <!-- ZooKeeper quorum address -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <!-- ZooKeeper session timeout in milliseconds -->
    <property>
        <name>ha.zookeeper.session-timeout.ms</name>
        <value>2000</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>io.compression.codecs</name>
        <value>org.apache.hadoop.io.compress.GzipCodec,
            org.apache.hadoop.io.compress.DefaultCodec,
            org.apache.hadoop.io.compress.BZip2Codec,
            org.apache.hadoop.io.compress.SnappyCodec
        </value>
    </property>
</configuration>
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- HDFS superuser group -->
    <property>
        <name>dfs.permissions.superusergroup</name>
        <value>hadoop</value>
    </property>
    <!-- Enable WebHDFS -->
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/name</value>
        <description>Local directory where the namenode stores the name table (fsimage); adjust as needed</description>
    </property>
    <property>
        <name>dfs.namenode.edits.dir</name>
        <value>${dfs.namenode.name.dir}</value>
        <description>Local directory where the namenode stores the transaction file (edits); adjust as needed</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/data</value>
        <description>Local directory where the datanode stores blocks; adjust as needed</description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- Block size 256 MB (default 128 MB) -->
    <property>
        <name>dfs.blocksize</name>
        <value>268435456</value>
    </property>
    <!--======================================================================= -->
    <!-- HDFS high-availability configuration -->
    <!-- The HDFS nameservice is hdp; must match core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>hdp</value>
    </property>
    <property>
        <!-- NameNode IDs; this version supports at most two NameNodes -->
        <name>dfs.ha.namenodes.hdp</name>
        <value>nn1,nn2</value>
    </property>
    <!-- HDFS HA: dfs.namenode.rpc-address.[nameservice ID] - RPC addresses -->
    <property>
        <name>dfs.namenode.rpc-address.hdp.nn1</name>
        <value>hadoop001:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.hdp.nn2</name>
        <value>hadoop002:8020</value>
    </property>
    <!-- HDFS HA: dfs.namenode.http-address.[nameservice ID] - HTTP addresses -->
    <property>
        <name>dfs.namenode.http-address.hdp.nn1</name>
        <value>hadoop001:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.hdp.nn2</name>
        <value>hadoop002:50070</value>
    </property>
    <!--================== NameNode edit-log sharing ============================================ -->
    <!-- Guarantees metadata recovery -->
    <property>
        <name>dfs.journalnode.http-address</name>
        <value>0.0.0.0:8480</value>
    </property>
    <property>
        <name>dfs.journalnode.rpc-address</name>
        <value>0.0.0.0:8485</value>
    </property>
    <property>
        <!-- JournalNode servers used by the QuorumJournalManager to store the edit log -->
        <!-- Format: qjournal://<host1:port1>;<host2:port2>;<host3:port3>/<journalId>; the port matches dfs.journalnode.rpc-address -->
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/hdp</value>
    </property>
    <property>
        <!-- Local directory where the JournalNode stores its data -->
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/jn</value>
    </property>
    <!--================== Client failover ============================================ -->
    <property>
        <!-- Strategy DataNodes and clients use to identify the active NameNode -->
        <!-- Implementation class for automatic failover -->
        <name>dfs.client.failover.proxy.provider.hdp</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!--================== NameNode fencing =============================================== -->
    <!-- Prevents the stopped NameNode from coming back after a failover and producing two active services -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
        <!-- Milliseconds after which fencing is considered to have failed -->
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    <!--================== NameNode automatic failover via ZKFC and ZooKeeper ====================== -->
    <!-- Enable ZooKeeper-based automatic failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- List of datanodes allowed to connect to the namenode -->
    <property>
        <name>dfs.hosts</name>
        <value>/opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop/slaves</value>
    </property>
</configuration>
yarn-env
[hadoop@hadoop001 hadoop]$ vim yarn-env.sh
...
export YARN_LOG_DIR="/opt/app/hadoop-2.6.0-cdh5.15.1/logs"
...
mapred-site.xml
[hadoop@hadoop001 hadoop]$ cd /opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
[hadoop@hadoop001 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@hadoop001 hadoop]$ vim mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- MapReduce applications run on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- JobHistory Server ============================================================== -->
    <!-- MapReduce JobHistory Server address, default port 10020 -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop001:10020</value>
    </property>
    <!-- MapReduce JobHistory Server web UI address, default port 19888 -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop001:19888</value>
    </property>
    <!-- Compress map output with Snappy -->
    <property>
        <name>mapreduce.map.output.compress</name>
        <value>true</value>
    </property>
    <property>
        <name>mapreduce.map.output.compress.codec</name>
        <value>org.apache.hadoop.io.compress.SnappyCodec</value>
    </property>
</configuration>
yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
    <!-- NodeManager configuration ================================================= -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.nodemanager.localizer.address</name>
        <value>0.0.0.0:23344</value>
        <description>Address where the localizer IPC is.</description>
    </property>
    <property>
        <name>yarn.nodemanager.webapp.address</name>
        <value>0.0.0.0:23999</value>
        <description>NM Webapp address.</description>
    </property>
    <!-- HA configuration =============================================================== -->
    <!-- Resource Manager Configs -->
    <property>
        <name>yarn.resourcemanager.connect.retry-interval.ms</name>
        <value>2000</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Enable embedded automatic failover; in an HA setup it works with the ZKRMStateStore to handle fencing -->
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
        <value>true</value>
    </property>
    <!-- Cluster ID, so that HA elections refer to the right cluster -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yarn-cluster</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <!-- The RM's own id can be specified explicitly on each ResourceManager node (optional)
    <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm2</value>
    </property>
    -->
    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
        <value>5000</value>
    </property>
    <!-- ZKRMStateStore configuration -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk.state-store.address</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <!-- RPC address clients use to reach the RM (applications manager interface) -->
    <property>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>hadoop001:23140</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address.rm2</name>
        <value>hadoop002:23140</value>
    </property>
    <!-- RPC address application masters use to reach the RM (scheduler interface) -->
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>hadoop001:23130</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>hadoop002:23130</value>
    </property>
    <!-- RM admin interface -->
    <property>
        <name>yarn.resourcemanager.admin.address.rm1</name>
        <value>hadoop001:23141</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address.rm2</name>
        <value>hadoop002:23141</value>
    </property>
    <!-- RPC port NodeManagers use to reach the RM -->
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
        <value>hadoop001:23125</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
        <value>hadoop002:23125</value>
    </property>
    <!-- RM web application addresses -->
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadoop001:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadoop002:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.https.address.rm1</name>
        <value>hadoop001:23189</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.https.address.rm2</name>
        <value>hadoop002:23189</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://hadoop001:19888/jobhistory/logs</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>1024</value>
        <description>Minimum memory a single container can request; default 1024 MB</description>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>2048</value>
        <description>Maximum memory a single container can request; default 8192 MB</description>
    </property>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>2</value>
    </property>
</configuration>
slaves
[hadoop@hadoop001 hadoop]$ vim slaves
hadoop001
hadoop002
hadoop003
Sync the Configuration Files
[hadoop@hadoop001 hadoop]$ scp core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml hadoop-env.sh slaves hadoop002:/opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
[hadoop@hadoop001 hadoop]$ scp core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml hadoop-env.sh slaves hadoop003:/opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
Create the Temporary Directory (all 3 nodes)
[hadoop@hadoop001 hadoop-2.6.0-cdh5.15.1]$ mkdir /opt/app/hadoop-2.6.0-cdh5.15.1/tmp
[hadoop@hadoop001 hadoop-2.6.0-cdh5.15.1]$ chmod 777 /opt/app/hadoop-2.6.0-cdh5.15.1/tmp
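Since this directory is needed on all three nodes, it can also be created remotely from hadoop001; a sketch relying on the passwordless SSH configured earlier:
[hadoop@hadoop001 ~]$ for h in hadoop002 hadoop003; do ssh $h "mkdir -p /opt/app/hadoop-2.6.0-cdh5.15.1/tmp && chmod 777 /opt/app/hadoop-2.6.0-cdh5.15.1/tmp"; done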
Start the Cluster Services
ZooKeeper
Start the ZooKeeper service (all 3 nodes)
[hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkServer.sh start
JMX enabled by default
Using config: /opt/app/zookeeper-3.4.5-cdh5.15.1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@hadoop001 ~]$
Check the ZooKeeper status (all 3 nodes)
[hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkServer.sh status
JMX enabled by default
Using config: /opt/app/zookeeper-3.4.5-cdh5.15.1/bin/../conf/zoo.cfg
Mode: follower
[hadoop@hadoop001 ~]$
Check the ZooKeeper process with jps (all 3 nodes)
[hadoop@hadoop001 ~]$ jps
20483 QuorumPeerMain
20516 Jps
[hadoop@hadoop001 ~]$
Verify the ZooKeeper service
[hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] quit
Quitting...
[hadoop@hadoop001 ~]$
Start HDFS
Format the znode that HDFS HA uses in ZooKeeper
[hadoop@hadoop001 ~]$ hdfs zkfc -formatZK
View the ZooKeeper data
[hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hadoop-ha]
[zk: localhost:2181(CONNECTED) 1] quit
Quitting...
[hadoop@hadoop001 ~]$
Start the JournalNode service (all 3 nodes)
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-journalnode-hadoop001.out
[hadoop@hadoop001 ~]$
Check the JournalNode process with jps (all 3 nodes)
[hadoop@hadoop001 ~]$ jps
21107 JournalNode
21156 Jps
20893 QuorumPeerMain
[hadoop@hadoop001 ~]$
Format and Start the First NameNode (hadoop001)
[hadoop@hadoop001 ~]$ hdfs namenode -format    ## format the NameNode metadata on this node
Initialize the JournalNode data; this step is required for HA
[hadoop@hadoop001 ~]$ hdfs namenode -initializeSharedEdits
18/11/27 01:51:28 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
Re-format filesystem in QJM to [172.26.239.216:8485, 172.26.239.214:8485, 172.26.239.215:8485] ? (Y or N) Y
18/11/27 01:51:39 INFO namenode.FileJournalManager: Recovering unfinalized segments in /opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/name/current
18/11/27 01:51:39 INFO client.QuorumJournalManager: Starting recovery process for unclosed journal segments...
18/11/27 01:51:39 INFO client.QuorumJournalManager: Successfully started new epoch 1
18/11/27 01:51:39 INFO util.ExitUtil: Exiting with status 0
18/11/27 01:51:39 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop001/172.26.239.216
************************************************************/
[hadoop@hadoop001 ~]$
Start the NameNode service on this node
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-namenode-hadoop001.out
[hadoop@hadoop001 ~]$
Check the NameNode process with jps
[hadoop@hadoop001 ~]$ jps
21107 JournalNode
21350 Jps
21276 NameNode
20893 QuorumPeerMain
[hadoop@hadoop001 ~]$
Format and Start the Second NameNode (hadoop002)
[hadoop@hadoop002 ~]$ hdfs namenode -bootstrapStandby    # hadoop001 has already been formatted; copy its metadata to hadoop002
Start the NameNode service on this node (hadoop002)
[hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-namenode-hadoop002.out
[hadoop@hadoop002 ~]$
Check the NameNode process with jps (hadoop002)
[hadoop@hadoop002 ~]$ jps
20690 QuorumPeerMain
20788 JournalNode
21017 Jps
20923 NameNode
[hadoop@hadoop002 ~]$
Start the DataNode services (from hadoop001)
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemons.sh start datanode
Check the DataNode process with jps (all 3 nodes)
[hadoop@hadoop001 ~]$ jps
21857 Jps
21107 JournalNode
21766 DataNode
21276 NameNode
20893 QuorumPeerMain
[hadoop@hadoop001 ~]$
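The DataNodes that have registered with the NameNodes can also be listed from hadoop001; a sketch (the exact report format can vary slightly by version):
[hadoop@hadoop001 ~]$ hdfs dfsadmin -report | grep -E "Live datanodes|^Name:"
All three hosts should appear as live datanodes.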
Start the ZKFailoverController (hadoop001, hadoop002)
hadoop001
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start zkfc    # run on every NameNode host
starting zkfc, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-zkfc-hadoop001.out
[hadoop@hadoop001 ~]$
hadoop002
[hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start zkfc    # run on every NameNode host
starting zkfc, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-zkfc-hadoop002.out
[hadoop@hadoop002 ~]$
Check DFSZKFailoverController
[hadoop@hadoop001 ~]$ jps
21107 JournalNode
21766 DataNode
21276 NameNode
20893 QuorumPeerMain
21950 DFSZKFailoverController
22030 Jps
[hadoop@hadoop001 ~]$
Check via the Web UI
Visit http://39.98.44.126:50070 (hadoop001)
One NameNode is active and the other is standby.
Visit http://39.98.37.133:50070 (hadoop002)
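The NameNode HA state can also be queried from the command line; a sketch using the nn1/nn2 IDs defined in hdfs-site.xml:
[hadoop@hadoop001 ~]$ hdfs haadmin -getServiceState nn1
[hadoop@hadoop001 ~]$ hdfs haadmin -getServiceState nn2
One of the two should report active and the other standby.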
Start YARN
Start YARN on hadoop001
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-resourcemanager-hadoop001.out
hadoop002: starting nodemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop002.out
hadoop003: starting nodemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop003.out
hadoop001: starting nodemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop001.out
[hadoop@hadoop001 ~]$
Check the YARN processes with jps (all 3 nodes)
[hadoop@hadoop001 ~]$ jps
22624 Jps
21107 JournalNode
22212 ResourceManager
21766 DataNode
22310 NodeManager
21276 NameNode
20893 QuorumPeerMain
21950 DFSZKFailoverController
[hadoop@hadoop001 ~]$
Start the ResourceManager on hadoop002
[hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-resourcemanager-hadoop002.out
[hadoop@hadoop002 ~]$
Check the ResourceManager process with jps
[hadoop@hadoop002 ~]$ jps
20690 QuorumPeerMain
20788 JournalNode
21908 Jps
21399 DFSZKFailoverController
20923 NameNode
21675 NodeManager
21117 DataNode
21853 ResourceManager
[hadoop@hadoop002 ~]$
Check via the Web UI
Visit http://39.98.44.126:8088 (hadoop001)
One ResourceManager is active and the other is standby.
Visit http://39.98.37.133:8088/cluster/cluster (hadoop002)
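As with HDFS, the ResourceManager HA state can be checked from the CLI; a sketch using the rm1/rm2 IDs from yarn-site.xml:
[hadoop@hadoop001 ~]$ yarn rmadmin -getServiceState rm1
[hadoop@hadoop001 ~]$ yarn rmadmin -getServiceState rm2
One should report active and the other standby.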
Start the JobHistory Server
Start the jobhistory service on hadoop001
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/mapred-hadoop-historyserver-hadoop001.out
[hadoop@hadoop001 ~]$
Check the JobHistoryServer process with jps
[hadoop@hadoop001 ~]$ jps
22785 Jps
21107 JournalNode
22212 ResourceManager
21766 DataNode
22310 NodeManager
22680 JobHistoryServer
21276 NameNode
20893 QuorumPeerMain
21950 DFSZKFailoverController
[hadoop@hadoop001 ~]$
Check via the Web UI
Visit the jobhistory server's web UI to view job status
http://39.98.44.126:19888 (hadoop001)
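To exercise the whole stack end to end, you can submit the bundled pi example; this is a sketch, and the examples jar is assumed to live under $HADOOP_HOME/share/hadoop/mapreduce in the CDH tarball (adjust the path if your layout differs):
[hadoop@hadoop001 ~]$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10
The job should run on YARN, print an estimate of pi, and then appear in the JobHistory web UI above.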
Stop the Cluster Services
Stop the YARN-related Services
Stop the historyserver on hadoop001
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh stop historyserver
stopping historyserver
[hadoop@hadoop001 ~]$
Stop YARN on hadoop001
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/stop-yarn.sh    # stops the resourcemanager on hadoop001 and all nodemanagers
stopping yarn daemons
stopping resourcemanager
hadoop003: stopping nodemanager
hadoop001: stopping nodemanager
hadoop002: stopping nodemanager
no proxyserver to stop
[hadoop@hadoop001 ~]$
Stop the ResourceManager on hadoop002
[hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/yarn-daemon.sh stop resourcemanager
stopping resourcemanager
[hadoop@hadoop002 ~]$
Stop the HDFS-related Services
Stop the namenode, datanode, journalnode, and zkfc services
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/stop-dfs.sh
18/11/27 03:32:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [hadoop001 hadoop002]
hadoop001: stopping namenode
hadoop002: stopping namenode
hadoop003: stopping datanode
hadoop001: stopping datanode
hadoop002: stopping datanode
Stopping journal nodes [hadoop001 hadoop002 hadoop003]
hadoop002: stopping journalnode
hadoop003: stopping journalnode
hadoop001: stopping journalnode
18/11/27 03:33:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
hadoop001: stopping zkfc
hadoop002: stopping zkfc
[hadoop@hadoop001 ~]$
Stop the ZooKeeper Services
ZooKeeper (all 3 nodes)
[hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkServer.sh stop
JMX enabled by default
Using config: /opt/app/zookeeper-3.4.5-cdh5.15.1/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[hadoop@hadoop001 ~]$
Check processes with jps (all 3 nodes)
[hadoop@hadoop001 ~]$ jps
23881 Jps
[hadoop@hadoop001 ~]$