Hadoop HA on YARN: Cluster Configuration
2017-10-19 12:00
Cluster Setup
Because only a few servers are available, each one runs quite a few processes:
Hostname | Installed software | Running processes |
hadoop001 | Hadoop, ZooKeeper | NameNode, DFSZKFailoverController, ResourceManager, DataNode, NodeManager, QuorumPeerMain, JournalNode |
hadoop002 | Hadoop, ZooKeeper | NameNode, DFSZKFailoverController, ResourceManager, DataNode, NodeManager, QuorumPeerMain, JournalNode |
hadoop003 | Hadoop, ZooKeeper | DataNode, NodeManager, QuorumPeerMain |
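Notes [2]:
In Hadoop 2.x, an HA HDFS cluster usually has two NameNodes: one in the active state and one in the standby state. The active NameNode serves client requests. The standby NameNode serves no requests; it only keeps its state in sync with the active NameNode so that it can take over quickly if the active one fails.
Hadoop 2.0 officially provides two HDFS HA solutions: NFS and QJM (Quorum Journal Manager, proposed by Cloudera, which works on a principle similar to ZooKeeper). I use QJM here. The active and standby NameNodes synchronize metadata through a group of JournalNodes, and a record counts as written once it has been successfully written to a majority of the JournalNodes. An odd number of JournalNodes is usually configured.
The installation of the JDK, Hadoop, and ZooKeeper, and the environment variable setup, are omitted here.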
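The majority rule is the reason an odd number of JournalNodes is the usual choice. The following is a minimal Python sketch of the arithmetic only; the function name `journal_quorum` is made up for illustration and is not part of Hadoop.

```python
from typing import Tuple


def journal_quorum(num_journal_nodes: int) -> Tuple[int, int]:
    """Return (acks needed for a write, JournalNode failures tolerated)."""
    if num_journal_nodes < 1:
        raise ValueError("need at least one JournalNode")
    majority = num_journal_nodes // 2 + 1       # strict majority of the group
    tolerated = num_journal_nodes - majority    # nodes that may be down
    return majority, tolerated


for n in (3, 4, 5):
    majority, tolerated = journal_quorum(n)
    print(f"{n} JournalNodes: a write needs {majority} acks, "
          f"{tolerated} node(s) may fail")
```

Going from 3 to 4 JournalNodes raises the required number of acknowledgements without tolerating any additional failure, so the extra even node buys nothing.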
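Passwordless Login
Pay close attention to the passwordless SSH login configuration. First generate a key pair:
ssh-keygen -t rsa
This creates two files, id_rsa and id_rsa.pub, in the ~/.ssh/ directory.
To log in from hadoop001 to hadoop002 without a password, run the following on hadoop001:
ssh-copy-id -i ~/.ssh/id_rsa.pub [username]@hadoop002
To allow passwordless login between any pair of machines, run the command above two more times on hadoop001 (changing the hostname after the @ to hadoop001 and hadoop003), then copy the resulting authorized_keys file to all nodes. Logins that start from hadoop002 or hadoop003 also require a key pair on those nodes whose public key is listed in authorized_keys.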
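The three ssh-copy-id invocations differ only in the target host, so they can be generated rather than typed. The sketch below only builds the command strings and executes nothing; the helper name `ssh_copy_id_commands` and the account name "hadoop" are placeholders, not taken from the original setup.

```python
from typing import List


def ssh_copy_id_commands(user: str, hosts: List[str]) -> List[str]:
    """Build one ssh-copy-id command per target host (nothing is executed)."""
    return [f"ssh-copy-id -i ~/.ssh/id_rsa.pub {user}@{host}" for host in hosts]


# "hadoop" is a placeholder account name; substitute the real user.
for command in ssh_copy_id_commands("hadoop", ["hadoop001", "hadoop002", "hadoop003"]):
    print(command)
```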
Hadoop Configuration
core-site.xml
<configuration>
  <!-- Default file system: the logical HDFS nameservice (must match dfs.nameservices in hdfs-site.xml) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://appcluster</value>
  </property>
  <!-- Hadoop temporary directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/storage/tmp</value>
  </property>
  <!-- ZooKeeper quorum addresses -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
  </property>
  <!-- ZooKeeper session timeout in milliseconds -->
  <property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>2000</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <!-- Storage location of the NameNode namespace -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data/hadoop/storage/hdfs/name</value>
  </property>
  <!-- Storage location of DataNode data -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/hadoop/storage/hdfs/data</value>
  </property>
  <!-- Number of block replicas -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!-- The HDFS nameservice is appcluster; it must match the value in core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>appcluster</value>
  </property>
  <!-- appcluster has two NameNodes: nn1 and nn2 -->
  <property>
    <name>dfs.ha.namenodes.appcluster</name>
    <value>nn1,nn2</value>
  </property>
  <!-- RPC address of nn1 -->
  <property>
    <name>dfs.namenode.rpc-address.appcluster.nn1</name>
    <value>hadoop001:8020</value>
  </property>
  <!-- RPC address of nn2 -->
  <property>
    <name>dfs.namenode.rpc-address.appcluster.nn2</name>
    <value>hadoop002:8020</value>
  </property>
  <!-- HTTP address of nn1 -->
  <property>
    <name>dfs.namenode.http-address.appcluster.nn1</name>
    <value>hadoop001:50070</value>
  </property>
  <!-- HTTP address of nn2 -->
  <property>
    <name>dfs.namenode.http-address.appcluster.nn2</name>
    <value>hadoop002:50070</value>
  </property>
  <!-- Where the NameNodes' shared edit log is stored on the JournalNodes -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/appcluster</value>
  </property>
  <!-- Enable automatic failover for the appcluster nameservice -->
  <property>
    <name>dfs.ha.automatic-failover.enabled.appcluster</name>
    <value>true</value>
  </property>
  <!-- Class that clients use to locate the active NameNode after a failover -->
  <property>
    <name>dfs.client.failover.proxy.provider.appcluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing method -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <!-- sshfence requires passwordless SSH login -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/[username]/.ssh/id_rsa</value>
  </property>
  <!-- Local directory where each JournalNode stores its edits -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/data/hadoop/tmp/journal</value>
  </property>
</configuration>
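Several values in core-site.xml and hdfs-site.xml have to agree with each other: the nameservice in fs.defaultFS must appear in dfs.nameservices, and every NameNode ID needs its own RPC address. The following is a rough Python sketch of such a consistency check, using only the standard library. The helper names `load_properties` and `check_ha_config` are illustrative and not part of Hadoop, and the embedded XML is a trimmed copy of the settings above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List
from urllib.parse import urlparse


def load_properties(xml_text: str) -> Dict[str, str]:
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }


def check_ha_config(core: Dict[str, str], hdfs: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the files agree."""
    problems = []
    nameservice = urlparse(core.get("fs.defaultFS", "")).netloc
    services = [s.strip() for s in hdfs.get("dfs.nameservices", "").split(",") if s.strip()]
    if nameservice not in services:
        problems.append(f"fs.defaultFS nameservice {nameservice!r} not in dfs.nameservices")
    for ns in services:
        ids = [n.strip() for n in hdfs.get(f"dfs.ha.namenodes.{ns}", "").split(",") if n.strip()]
        if len(ids) < 2:
            problems.append(f"dfs.ha.namenodes.{ns} should list at least two NameNodes")
        for nn in ids:
            key = f"dfs.namenode.rpc-address.{ns}.{nn}"
            if key not in hdfs:
                problems.append(f"missing {key}")
    return problems


CORE_SITE = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://appcluster</value></property>
</configuration>"""

HDFS_SITE = """<configuration>
  <property><name>dfs.nameservices</name><value>appcluster</value></property>
  <property><name>dfs.ha.namenodes.appcluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.appcluster.nn1</name><value>hadoop001:8020</value></property>
  <property><name>dfs.namenode.rpc-address.appcluster.nn2</name><value>hadoop002:8020</value></property>
</configuration>"""

print(check_ha_config(load_properties(CORE_SITE), load_properties(HDFS_SITE)))
```

In practice the two strings would be read from the files under the Hadoop configuration directory; they are inlined here only to keep the sketch self-contained.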
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- MapReduce JobHistory Server address (default port 10020) -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>0.0.0.0:10020</value>
  </property>
  <!-- MapReduce JobHistory Server web UI address (default port 19888) -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
  </property>
</configuration>
yarn-site.xml
<?xml version="1.0"?>
<configuration>
  <!-- Interval between reconnection attempts after losing contact with the RM (ms) -->
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
  </property>
  <!-- Enable ResourceManager HA (default: false) -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Logical IDs of the ResourceManagers -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
  </property>
  <!-- Enable automatic failover -->
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop001</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop002</value>
  </property>
  <!-- Use rm1 on hadoop001 and rm2 on hadoop002. Note: configuration files are usually copied to the other machines as they are, but this value must be changed on the other ResourceManager host. -->
  <property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm1</value>
    <description>If we want to launch more than one RM on a single node, we need this configuration</description>
  </property>
  <!-- Enable ResourceManager state recovery -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- ZooKeeper connection address -->
  <property>
    <name>yarn.resourcemanager.zk-state-store.address</name>
    <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>appcluster-yarn</value>
  </property>
  <!-- How long the MapReduce AM waits before reconnecting to the scheduler (ms) -->
  <property>
    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
    <value>5000</value>
  </property>
  <!-- Addresses for rm1 -->
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>hadoop001:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>hadoop001:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>hadoop001:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>hadoop001:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>hadoop001:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.admin.address.rm1</name>
    <value>hadoop001:23142</value>
  </property>
  <!-- Addresses for rm2 -->
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>hadoop002:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>hadoop002:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>hadoop002:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>hadoop002:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>hadoop002:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.admin.address.rm2</name>
    <value>hadoop002:23142</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- The key must use the same service name as yarn.nodemanager.aux-services -->
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/data/hadoop/yarn/local</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/data/hadoop/yarn/log</value>
  </property>
  <property>
    <name>mapreduce.shuffle.port</name>
    <value>23080</value>
  </property>
  <!-- Client failover proxy provider class -->
  <property>
    <name>yarn.client.failover-proxy-provider</name>
    <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
    <value>/yarn-leader-election</value>
    <description>Optional setting. The default value is /yarn-leader-election</description>
  </property>
</configuration>
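Because yarn.resourcemanager.ha.id is the one value that differs between the two ResourceManager hosts, it is easy to forget after copying yarn-site.xml to the second machine. One possible way to avoid that is to derive the ID from the yarn.resourcemanager.hostname.* entries already in the file. The following Python sketch does this with the standard library only; the function names `rm_id_for_host` and `set_ha_id` are hypothetical, and the inlined XML is a trimmed copy of the configuration above.

```python
import xml.etree.ElementTree as ET


def rm_id_for_host(root: ET.Element, hostname: str) -> str:
    """Find the rm-id whose yarn.resourcemanager.hostname.<id> equals hostname."""
    prefix = "yarn.resourcemanager.hostname."
    for prop in root.iter("property"):
        name = prop.findtext("name", "").strip()
        value = prop.findtext("value", "").strip()
        if name.startswith(prefix) and value == hostname:
            return name[len(prefix):]
    raise KeyError(f"{hostname} is not configured as a ResourceManager")


def set_ha_id(xml_text: str, hostname: str) -> str:
    """Return yarn-site.xml text with yarn.resourcemanager.ha.id set for hostname."""
    root = ET.fromstring(xml_text)
    rm_id = rm_id_for_host(root, hostname)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "yarn.resourcemanager.ha.id":
            prop.find("value").text = rm_id
    return ET.tostring(root, encoding="unicode")


YARN_SITE = """<configuration>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>hadoop001</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>hadoop002</value></property>
  <property><name>yarn.resourcemanager.ha.id</name><value>rm1</value></property>
</configuration>"""

# Text that would be written to yarn-site.xml on hadoop002.
print(set_ha_id(YARN_SITE, "hadoop002"))
```

Note that the default ElementTree parser discards XML comments, so a rewritten file loses the explanatory comments; editing the single value by hand on hadoop002 remains the simplest option for a two-node setup.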
hadoop-env.sh & mapred-env.sh & yarn-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_60
export CLASS_PATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export HADOOP_HOME=/data/hadoop-2.6.0
export HADOOP_PID_DIR=/data/hadoop/pids
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
# A space is required between the existing options and the new -D flag
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin