2018-07-17 Issue: Hadoop HA Installation and Configuration (Part 2)

2018-07-17 09:17
(11) Start HDFS and YARN
--Start HDFS
--Run on either hadoop-namenode01 or hadoop-namenode02
[root@hadoop-namenode01 sbin]# pwd
/usr/local/apps/hadoop-2.4.1/sbin
[root@hadoop-namenode01 sbin]# ./start-dfs.sh
Starting namenodes on [hadoop-namenode01 hadoop-namenode02]
--Two NameNodes are started
hadoop-namenode01: starting namenode, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-namenode-hadoop-namenode01.out
hadoop-namenode02: starting namenode, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-namenode-hadoop-namenode02.out
--Three DataNodes are started
hadoop-datanode02: starting datanode, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-datanode-hadoop-datanode02.out
hadoop-datanode01: starting datanode, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-datanode-hadoop-datanode01.out
hadoop-datanode03: starting datanode, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-datanode-hadoop-datanode03.out
--Three JournalNodes would be started; they are already running here, so the "Stop it first" messages can be ignored
Starting journal nodes [hadoop-zknode01 hadoop-zknode02 hadoop-zknode03]
hadoop-zknode01: journalnode running as process 25652. Stop it first.
hadoop-zknode03: journalnode running as process 4209. Stop it first.
hadoop-zknode02: journalnode running as process 4128. Stop it first.
--Two ZKFCs (ZK Failover Controllers) are started
Starting ZK Failover Controllers on NN hosts [hadoop-namenode01 hadoop-namenode02]
hadoop-namenode02: starting zkfc, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-zkfc-hadoop-namenode02.out
hadoop-namenode01: starting zkfc, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-zkfc-hadoop-namenode01.out
[root@hadoop-namenode01 sbin]# jps
26512 Jps
26415 DFSZKFailoverController
26132 NameNode
[root@hadoop-namenode02 current]# jps
25599 NameNode
25690 DFSZKFailoverController
--Start YARN
--Run on hadoop-resourcemanager01
[root@hadoop-resourcemanager01 ~]# cd /usr/local/apps/hadoop-2.4.1/sbin/
[root@hadoop-resourcemanager01 sbin]# ./start-yarn.sh
[root@hadoop-resourcemanager01 sbin]# jps
25989 Jps
25726 ResourceManager
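--Also start the ResourceManager process on the second node [due to a limitation, start-yarn.sh on the first node does not start the ResourceManager on the second node]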
[root@hadoop-resourcemanager02 sbin]# ./yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /usr/local/apps/hadoop-2.4.1/logs/yarn-root-resourcemanager-hadoop-resourcemanager02.out
[root@hadoop-resourcemanager02 sbin]# jps
25845 Jps
25616 ResourceManager
[root@hadoop-datanode01 ~]# jps
25596 NodeManager
25698 Jps
25434 DataNode
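The jps checks above can be scripted instead of being read by eye. The following is a minimal sketch, assuming jps prints one "<pid> <class name>" pair per line as in the transcripts above; missing_daemons is a hypothetical helper name, not part of Hadoop.

```shell
# Sketch: list the expected daemons that do NOT appear in jps output.
# missing_daemons is a hypothetical helper, not a Hadoop command.
# It reads a jps listing on stdin and takes the expected class names as arguments.
missing_daemons() (
    jps_output=$(cat)                      # keep the listing so it can be scanned once per name
    for name in "$@"; do
        # jps prints "<pid> <main class>"; compare column 2 exactly
        if ! printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "$name"
        fi
    done
)

# Example usage on a NameNode host (prints nothing when both daemons are up):
#   jps | missing_daemons NameNode DFSZKFailoverController
# Example usage on a DataNode host:
#   jps | missing_daemons DataNode NodeManager
```

An empty result means every expected daemon is running on that host; any printed name is a daemon that still has to be started.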
(12) Check the NameNode roles
http://192.168.1.31:50070





This shows that hadoop-namenode01 is currently the active node
http://192.168.1.32:50070



This shows that hadoop-namenode02 is the backup node, in the standby state
(13) Check the YARN node roles
http://192.168.1.41:8088



This shows that hadoop-resourcemanager01 is active
http://192.168.1.42:8088



This shows that hadoop-resourcemanager02 is standby
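The same role checks can also be made from the command line with hdfs haadmin -getServiceState and yarn rmadmin -getServiceState, each of which prints active or standby for one service ID. The sketch below assumes the service IDs are nn1/nn2 and rm1/rm2; the real IDs are whatever was set in dfs.ha.namenodes.<nameservice> and yarn.resourcemanager.ha.rm-ids, and exactly_one_active is a hypothetical helper name.

```shell
# Sketch: verify that exactly one member of an HA pair is active.
# exactly_one_active is a hypothetical helper, not a Hadoop command.
# It reads "id state" lines on stdin and succeeds only if exactly one
# state is "active" and all the others are "standby".
exactly_one_active() (
    awk '
        NF == 0         { next }             # ignore blank lines
        $2 == "active"  { active++; next }
        $2 == "standby" { next }
                        { bad++ }            # any other state (or an error message)
        END { exit !(active == 1 && bad == 0) }
    '
)

# Example usage (nn1/nn2 and rm1/rm2 are assumed service IDs):
#   for id in nn1 nn2; do echo "$id $(hdfs haadmin -getServiceState "$id")"; done | exactly_one_active
#   for id in rm1 rm2; do echo "$id $(yarn rmadmin -getServiceState "$id")"; done | exactly_one_active
```

A non-zero exit status means the pair is in an unexpected state, for example both standby or one node unreachable.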
VII. Cluster HA Tests
1. NameNode HA test
----Scenario A:
Step 1: Check the Active node





The figure above shows that the Active node is currently 192.168.1.31 and that 192.168.1.32 is Standby
Step 2: Simulate a failure of 192.168.1.31 by killing the NameNode process on it directly
[root@hadoop-namenode01 hadoop]# jps
26979 Jps
26415 DFSZKFailoverController
26132 NameNode
[root@hadoop-namenode01 hadoop]# kill -9 26132
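When the test is repeated, the PID changes every time, so it may be convenient to look it up rather than copy it from the jps listing. A minimal sketch, assuming the jps output format shown above; pid_of is a hypothetical helper name.

```shell
# Sketch: print the PID of a daemon, given jps output on stdin.
# pid_of is a hypothetical helper, not a Hadoop or JDK command.
pid_of() {
    # jps prints "<pid> <main class>"; print column 1 where column 2 matches exactly
    awk -v n="$1" '$2 == n { print $1 }'
}

# Example usage on the Active NameNode host:
#   kill -9 "$(jps | pid_of NameNode)"
```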
Step 3: Check the state of 192.168.1.32



After the NameNode process on 192.168.1.31 is killed, the NameNode on 192.168.1.32 becomes active, so the failover succeeded.
Step 4: Start the NameNode on 192.168.1.31 again
[root@hadoop-namenode01 sbin]# ./hadoop-daemon.sh start namenode
[root@hadoop-namenode01 sbin]# jps
26415 DFSZKFailoverController
27044 NameNode



After it starts, it comes up in the standby state

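----Scenario B:
Simulate a power failure on the Active node
Step 1: Power off the currently Active node, 192.168.1.32, directly
Step 2: Observe how long the Standby node takes to become Active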



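Observation: the standby node becomes Active only after a delay, noticeably longer than the switchover after killing the NameNode process directly. The reason is that after the poweroff, zkfc tries to reach the failed node over ssh and gets no response. Once the 30-second timeout specified in the configuration file has expired, it runs the custom shell fencing script /bin/true, and only after that script returns successfully does it switch the standby node to Active.
In this case the Standby node can also fail over normally.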
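The delay comes from the order in which fencing methods are tried: each configured method is attempted in turn, and the failover proceeds only after one of them reports success. The sketch below illustrates that ordering only; it is not Hadoop's implementation, try_fencing is a hypothetical helper name, and the command false merely stands in for an ssh fencing attempt that times out against a powered-off host.

```shell
# Sketch: try fencing methods in order and stop at the first one that succeeds.
# try_fencing is a hypothetical helper illustrating the ordering described above;
# it is not how the ZKFC is actually implemented.
# Each argument is one fencing command line.
try_fencing() (
    for method in "$@"; do
        if sh -c "$method" >/dev/null 2>&1; then
            echo "fenced by: $method"
            exit 0                          # success: the standby may now become active
        fi
        echo "fencing method failed: $method" >&2
    done
    exit 1                                  # nothing succeeded: do not fail over
)

# Example usage: "false" stands in for the ssh attempt that times out,
# /bin/true is the fallback script mentioned in the text.
#   try_fencing false /bin/true
```

Because /bin/true always succeeds, it guarantees that the failover completes, but it does not actually isolate the old Active node, which is worth keeping in mind when reusing this configuration.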

package hdfsutil;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HAHdfsTest {
    public static void main(String[] args) throws IOException {
        /*
         * In HA mode, core-site.xml and hdfs-site.xml must be placed in the
         * source directory (so that they are on the classpath); all parameters
         * in them are loaded automatically.
         * The configuration files refer to hosts by name, so the hosts
         * mapping must be configured on the client machine.
         */
        Configuration conf = new Configuration();
        // conf.set("fs.defaultFS", "hdfs://ns1/");
        /*
         * To avoid errors caused by jar conflicts:
         *   conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
         *   conf.set("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem");
         * or add the following to core-site.xml:
         *   <property>
         *     <name>fs.hdfs.impl</name>
         *     <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
         *   </property>
         *   <property>
         *     <name>fs.file.impl</name>
         *     <value>org.apache.hadoop.fs.LocalFileSystem</value>
         *   </property>
         */
        // try-with-resources closes the FileSystem even if the copy fails
        try (FileSystem fs = FileSystem.get(conf)) {
            // args[0]: local source path, args[1]: HDFS destination path
            fs.copyFromLocalFile(new Path(args[0]), new Path(args[1]));
            System.out.println("Upload Complete!");
        }
    }
}

Step 2: While the file upload is running, simulate a failure of the Active node
--Run the file upload
[root@hadoop-zkfcnode01 ~]# java -jar hdfsha.jar "/root/jdk-7u65-linux-i586.tar.gz" "/"
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Upload Complete!
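--Kill the NameNode that is in the active state
Observation: after the active NameNode is killed, the standby takes over immediately and becomes active, and the file upload still completes successfully.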
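"Upload Complete!" only shows that the client returned without an exception. To confirm that the file in HDFS is intact after the failover, its bytes can be compared with the local copy. A minimal sketch, assuming the upload to "/" placed the file at /jdk-7u65-linux-i586.tar.gz; same_content is a hypothetical helper name.

```shell
# Sketch: succeed only if two files (or a file and stdin, given as "-")
# contain identical bytes. same_content is a hypothetical helper.
same_content() {
    cmp -s "$1" "$2"
}

# Example usage (the HDFS path is an assumption based on the upload to "/"):
#   hdfs dfs -cat /jdk-7u65-linux-i586.tar.gz | same_content /root/jdk-7u65-linux-i586.tar.gz - \
#       && echo "upload verified"
```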