
Deploying a Hadoop High-Availability (HA) Cluster on CentOS 7.4 (Alibaba Cloud)

2018-11-29 16:55

Contents

  • Environment preparation
  • SELinux (all 3 hosts, root)
  • Create the user (all 3 hosts, root)
  • Install lrzsz (all 3 hosts)
  • Create directories (all 3 hosts, root)
  • Configure the hosts file
  • SSH mutual trust
  • Upload the software packages
  • Install the JDK
  • Cluster installation and deployment
  • Install Hadoop
  • Sync the configuration files
  • Create the temp directory (all 3 hosts)
  • Start the cluster services
  • Start HDFS
  • Start the JournalNode service (all 3 hosts)
  • Format and start the first NameNode (hadoop001)
  • Format and start the second NameNode (hadoop002)
  • Start the DataNode services (hadoop001)
  • Start the ZooKeeperFailoverController (hadoop001, hadoop002)
  • Check the web UI
  • Start YARN
  • Start the ResourceManager on hadoop002
  • Check the web UI
  • Start the JobHistory server
  • Check the web UI
  • Stop the cluster services
  • Stop the YARN services
  • Stop the HDFS services
  • Stop the Zookeeper services
  • Check processes with jps (all 3 hosts)
  • Software preparation

    Components

    Component  Package                            Source
    CentOS     CentOS 7.4 (64-bit)                3 Alibaba Cloud servers, pay-as-you-go, released after use; password: Root@123!
    JDK        jdk-8u151-linux-x64.tar.gz         https://www.oracle.com/technetwork/java/javase/downloads/index.html
    Zookeeper  zookeeper-3.4.5-cdh5.15.1.tar.gz   http://archive.cloudera.com/cdh5/cdh/5/zookeeper-3.4.5-cdh5.15.1.tar.gz
    Hadoop     hadoop-2.6.0-cdh5.15.1.tar.gz      http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.15.1.tar.gz

    Host plan

    All three hosts: 2 cores, 4 GB RAM, 2 Mbps bandwidth, 40 GB system disk (included).

    hadoop001 (172.26.239.216): QuorumPeerMain, JournalNode, NameNode, DataNode,
        DFSZKFailoverController, ResourceManager, NodeManager, JobHistoryServer
    hadoop002 (172.26.239.214): QuorumPeerMain, JournalNode, NameNode, DataNode,
        DFSZKFailoverController, ResourceManager, NodeManager
    hadoop003 (172.26.239.215): QuorumPeerMain, JournalNode, DataNode, NodeManager

    This Alibaba Cloud configuration (3 servers) costs ¥1.831/hour.
    On cloud hosts the network environment and time synchronization are handled by Alibaba Cloud, so there is nothing to set up for those.


    For a temporary environment you can open all ports (1/65535) in the security group, with authorization object 0.0.0.0/0.

    Directory plan

    Name             Path                                 Notes
    JAVA_HOME        /usr/java/jdk1.8.0_151               create /usr/java manually
    $ZOOKEEPER_HOME  /opt/app/zookeeper-3.4.5-cdh5.15.1
    data             $ZOOKEEPER_HOME/data                 create manually
    $HADOOP_HOME     /opt/app/hadoop-2.6.0-cdh5.15.1
    data             $HADOOP_HOME/data
    log              $HADOOP_HOME/logs
    hadoop.tmp.dir   $HADOOP_HOME/tmp                     create manually; mode 777, owner hadoop:hadoop
    software         /opt/app/software                    where the software packages are stored

    Environment preparation

    Firewall (all 3 hosts, root)

    Check the firewall status (as root)

    [root@hadoop001 ~]# firewall-cmd --state
    not running
    [root@hadoop001 ~]#

    Stop the firewall

    If the firewall is running, stop it:

    [root@hadoop001 ~]# systemctl stop firewalld

    Disable the firewall from starting at boot

    [root@hadoop001 ~]# systemctl disable firewalld
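
    To apply this to all three hosts in one pass, a minimal sketch over the internal IPs from the host plan works (it assumes you can SSH as root; you will be prompted for each password until SSH trust is configured below):

    for h in 172.26.239.216 172.26.239.214 172.26.239.215; do
      ssh root@$h "systemctl stop firewalld; systemctl disable firewalld; firewall-cmd --state"
    done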

    SELinux (all 3 hosts, root)

    Check the SELinux status

    Use the getenforce or sestatus command:

    [root@hadoop001 ~]# getenforce
    Disabled
    [root@hadoop001 ~]#

    Disable SELinux

    Temporarily (setenforce):

    [root@hadoop001 ~]# setenforce 0 ### set SELinux to permissive mode; temporarily disables SELinux, reverts after a reboot
    [root@hadoop001 ~]# setenforce 1 ### set SELinux to enforcing mode; temporarily re-enables SELinux, reverts after a reboot

    Permanently, by editing /etc/selinux/config:

    [root@hadoop001 ~]# vim /etc/selinux/config
    SELINUX=disabled

    Create the user (all 3 hosts, root)

    Create the hadoop user as root

    [root@hadoop001 ~]# useradd hadoop

    Set the hadoop user's password to hadoop

    [root@hadoop001 ~]# echo hadoop | passwd --stdin hadoop

    The output looks like:

    [root@hadoop001 ~]# echo hadoop | passwd --stdin hadoop
    Changing password for user hadoop.
    passwd: all authentication tokens updated successfully.
    [root@hadoop001 ~]#
    

    Install lrzsz (all 3 hosts)

    [root@hadoop001 ~]# yum -y install lrzsz

    Create directories (all 3 hosts, root)

    Create the directory as root

    [root@hadoop001 ~]# mkdir -p /opt/app/software

    Change the owner of /opt/app

    [root@hadoop001 ~]# chown -R hadoop:hadoop /opt/app

    Switch to the hadoop user (the "-" loads the user's environment and changes to hadoop's home directory)

    [root@hadoop001 ~]# su - hadoop
    [hadoop@hadoop001 ~]$

    Configure the hosts file

    Open /etc/hosts as root

    [root@hadoop001 ~]# vim /etc/hosts

    Edit the hosts file, adding the following entries:

    172.26.239.216 hadoop001
    172.26.239.214 hadoop002
    172.26.239.215 hadoop003

    Sync it to the other two hosts with scp

    [root@hadoop001 ~]# scp /etc/hosts root@172.26.239.215:/etc/
    [root@hadoop001 ~]# scp /etc/hosts root@172.26.239.214:/etc/
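
    As a quick sanity check (a minimal sketch), confirm each hostname now resolves and answers from hadoop001:

    for h in hadoop001 hadoop002 hadoop003; do
      ping -c 1 $h > /dev/null && echo "$h OK" || echo "$h FAILED"
    done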

    SSH mutual trust

    ssh-keygen creates a public key (Public Key) and a private key (Private Key): ssh-keygen -t [rsa|dsa]; the default type is rsa.

    [hadoop@hadoop001 ~]$ ssh-keygen

    ssh-copy-id copies the local host's public key into the remote host's authorized_keys file.
    It also sets appropriate permissions on the remote user's home directory, ~/.ssh, and ~/.ssh/authorized_keys.

    [hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop001
    [hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop002
    [hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop003

    Sample output

    [hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop002
    /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
    The authenticity of host 'hadoop002 (172.26.239.214)' can't be established.
    ECDSA key fingerprint is SHA256:WuDCjCFcqjYk/C4Wgop9M6rIbkmnE4gn6mEHMVnBcWk.
    ECDSA key fingerprint is MD5:f5:1e:b4:52:47:19:d6:ce:2b:31:a0:b4:48:ee:d2:f2.
    Are you sure you want to continue connecting (yes/no)? yes
    /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
    /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
    hadoop@hadoop002's password:
    
    Number of key(s) added: 1
    
    Now try logging into the machine, with:   "ssh 'hadoop@hadoop002'"
    and check to make sure that only the key(s) you wanted were added.
    
    [hadoop@hadoop001 ~]$

    Log in from the client to each server to confirm no password is required

    [hadoop@hadoop001 ~]$ ssh hadoop@hadoop001 ls
    [hadoop@hadoop001 ~]$ ssh hadoop@hadoop002 ls
    [hadoop@hadoop001 ~]$ ssh hadoop@hadoop003 ls

    Upload the software packages

    As the hadoop user, upload the downloaded packages to /opt/app/software.
    Change to the software directory:

    [hadoop@hadoop001 ~]$ cd /opt/app/software

    Upload the packages with rz:

    [hadoop@hadoop001 software]$ rz

    Sync the packages to the other two hosts with scp:

    [hadoop@hadoop001 ~]$ scp -r /opt/app/software/* hadoop@hadoop002:/opt/app/software/
    [hadoop@hadoop001 ~]$ scp -r /opt/app/software/* hadoop@hadoop003:/opt/app/software/

    Install the JDK

    Create /usr/java as root

    [root@hadoop001 ~]# mkdir /usr/java

    Change the directory owner

    [root@hadoop001 ~]# chown -R hadoop:hadoop /usr/java

    Extract the JDK as the hadoop user

    [hadoop@hadoop001 software]$ tar -zxvf /opt/app/software/jdk-8u151-linux-x64.tar.gz -C /usr/java/

    Edit the environment variables as root

    [root@hadoop001 ~]# vim /etc/profile
    ###ADD JDK environment
    export JAVA_HOME=/usr/java/jdk1.8.0_151
    export PATH=$JAVA_HOME/bin:$PATH

    source the file so the variables take effect

    [root@hadoop001 ~]# source /etc/profile

    Verify the JDK

    [hadoop@hadoop001 ~]$ java -version
    java version "1.8.0_151"
    Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
    Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
    [hadoop@hadoop001 ~]$

    Sync to the other two hosts

    [root@hadoop001 ~]# scp /etc/profile root@hadoop002:/etc/profile
    [root@hadoop001 ~]# scp /etc/profile root@hadoop003:/etc/profile
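
    A minimal sketch to confirm the JDK and /etc/profile are in place on every host (it assumes the JDK tarball was extracted on each of them):

    for h in hadoop001 hadoop002 hadoop003; do
      echo "== $h =="; ssh $h "source /etc/profile; java -version" 2>&1 | head -1
    done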

    Cluster installation and deployment

    Install Zookeeper

    Extract Zookeeper (all 3 hosts)

    [hadoop@hadoop001 software]$ tar -zxvf /opt/app/software/zookeeper-3.4.5-cdh5.15.1.tar.gz -C /opt/app

    Edit the configuration file

    Copy the sample to create /opt/app/zookeeper-3.4.5-cdh5.15.1/conf/zoo.cfg

    [hadoop@hadoop001 ~]$ cd /opt/app/zookeeper-3.4.5-cdh5.15.1/conf
    [hadoop@hadoop001 conf]$ cp zoo_sample.cfg zoo.cfg

    Edit zoo.cfg

    [hadoop@hadoop001 conf]$ vim zoo.cfg
    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    # do not use /tmp for storage, /tmp here is just
    # example sakes.
    ## this directory holds zookeeper's data and the myid file; create it manually if it does not exist
    dataDir=/opt/app/zookeeper-3.4.5-cdh5.15.1/data
    # the port at which the clients will connect
    clientPort=2181
    # the maximum number of client connections.
    # increase this if you need to handle more clients
    #maxClientCnxns=60
    ## the number after "server." must match the value in that host's myid file
    server.1=hadoop001:2888:3888
    server.2=hadoop002:2888:3888
    server.3=hadoop003:2888:3888
    ## production needs at least 3 zookeeper servers; larger clusters can grow the ensemble, keeping an odd count
    #
    # Be sure to read the maintenance section of the
    # administrator guide before turning on autopurge.
    #
    # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
    #
    # The number of snapshots to retain in dataDir
    #autopurge.snapRetainCount=3
    # Purge task interval in hours
    # Set to "0" to disable auto purge feature
    #autopurge.purgeInterval=1

    Key parameters

    ## this directory holds zookeeper's data and the myid file; create it manually if it does not exist
    dataDir=/opt/app/zookeeper-3.4.5-cdh5.15.1/data
    ## the number after "server." must match the value in that host's myid file
    ## production needs at least 3 zookeeper servers; larger clusters can grow the ensemble, keeping an odd count
    server.1=hadoop001:2888:3888
    server.2=hadoop002:2888:3888
    server.3=hadoop003:2888:3888

    Sync the configuration file

    [hadoop@hadoop001 conf]$ scp zoo.cfg hadoop002:/opt/app/zookeeper-3.4.5-cdh5.15.1/conf
    [hadoop@hadoop001 conf]$ scp zoo.cfg hadoop003:/opt/app/zookeeper-3.4.5-cdh5.15.1/conf

    Create the zookeeper data directory (all 3 hosts)

    [hadoop@hadoop001 conf]$ mkdir /opt/app/zookeeper-3.4.5-cdh5.15.1/data

    Write /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid on each host

    [hadoop@hadoop001 conf]$ cd /opt/app/zookeeper-3.4.5-cdh5.15.1/data
    [hadoop@hadoop001 data]$ echo 1 > /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid
    [hadoop@hadoop001 data]$ ssh hadoop002 " echo 2 > /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid "
    [hadoop@hadoop001 data]$ ssh hadoop003 " echo 3 > /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid "
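
    A minimal sanity check (a sketch): each host should report its own id.

    for h in hadoop001 hadoop002 hadoop003; do
      echo -n "$h: "; ssh $h cat /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid
    done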

    Install Hadoop

    Extract Hadoop (all 3 hosts)

    [hadoop@hadoop001 ~]$ tar -zxvf /opt/app/software/hadoop-2.6.0-cdh5.15.1.tar.gz -C /opt/app

    Configure the environment variables

    [root@hadoop001 ~]# vim /etc/profile
    ###ADD JDK environment
    export JAVA_HOME=/usr/java/jdk1.8.0_151
    export PATH=$JAVA_HOME/bin:$PATH
###ADD ZOOKEEPER environment
    export ZOOKEEPER_HOME=/opt/app/zookeeper-3.4.5-cdh5.15.1
    export PATH=$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/sbin:$PATH
    
    ###ADD HADOOP environment
    export HADOOP_HOME=/opt/app/hadoop-2.6.0-cdh5.15.1
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
    
    ###ADD CLASSPATH
    CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME/lib:$CLASSPATH
    

    Sync the environment file /etc/profile

    [root@hadoop001 ~]# scp /etc/profile root@hadoop002:/etc/profile
    [root@hadoop001 ~]# scp /etc/profile root@hadoop003:/etc/profile

    Edit the configuration files

    hadoop-env

    [hadoop@hadoop001 hadoop]$ vim hadoop-env.sh
    ...
    export JAVA_HOME="/usr/java/jdk1.8.0_151"
    ...

    core-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at
    
    http://www.apache.org/licenses/LICENSE-2.0
    
    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    <configuration>
    <!-- YARN needs fs.defaultFS to specify the NameNode URI -->
    <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hdp</value>
    </property>
    <!--============================== Trash mechanism ======================================= -->
    <property>
    <!-- how often the checkpointer running on the NameNode creates a checkpoint from Current; default 0 means it follows fs.trash.interval -->
    <name>fs.trash.checkpoint.interval</name>
    <value>0</value>
    </property>
    <property>
    <!-- minutes after which checkpoints under .Trash are deleted; the server-side value takes precedence over the client's; default 0 disables the trash -->
    <name>fs.trash.interval</name>
    <value>1440</value>
    </property>
    
    <!-- hadoop.tmp.dir is the base directory the Hadoop filesystem relies on; many other paths derive from it. If hdfs-site.xml does not configure the namenode and datanode storage locations, they default to this path -->
    <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/app/hadoop-2.6.0-cdh5.15.1/tmp</value>
    </property>
    
    <!-- zookeeper quorum address -->
    <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <!-- ZooKeeper session timeout, in milliseconds -->
    <property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>2000</value>
    </property>
    
    <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
    </property>
    <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
    </property>
    <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,
    org.apache.hadoop.io.compress.DefaultCodec,
    org.apache.hadoop.io.compress.BZip2Codec,
    org.apache.hadoop.io.compress.SnappyCodec
    </value>
    </property>
    </configuration>
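
    Once the cluster is up (later in this guide), the trash settings above can be seen in action; a minimal sketch:

    hdfs dfs -touchz /tmp/demo.txt
    hdfs dfs -rm /tmp/demo.txt                      # moved to trash, not deleted
    hdfs dfs -ls /user/hadoop/.Trash/Current/tmp    # kept here for fs.trash.interval (1440 min = 24 h)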

    hdfs-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at
    
    http://www.apache.org/licenses/LICENSE-2.0
    
    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
    <!-- HDFS superuser group -->
    <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hadoop</value>
    </property>
    
    <!-- enable WebHDFS -->
    <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
    </property>
    <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/name</value>
    <description>local directory where the namenode stores the name table (fsimage) (change as needed)</description>
    </property>
    <property>
    <name>dfs.namenode.edits.dir</name>
    <value>${dfs.namenode.name.dir}</value>
    <description>local directory where the namenode stores transaction files (edits) (change as needed)</description>
    </property>
    <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/data</value>
    <description>local directory where the datanode stores blocks (change as needed)</description>
    </property>
    <property>
    <name>dfs.replication</name>
    <value>3</value>
    </property>
    <!-- block size 256M (default 128M) -->
    <property>
    <name>dfs.blocksize</name>
    <value>268435456</value>
    </property>
    <!--======================================================================= -->
    <!-- HDFS high-availability configuration -->
    <!-- the HDFS nameservice is hdp; must match core-site.xml -->
    <property>
    <name>dfs.nameservices</name>
    <value>hdp</value>
    </property>
    <property>
    <!-- NameNode IDs; this version supports at most two NameNodes -->
    <name>dfs.ha.namenodes.hdp</name>
    <value>nn1,nn2</value>
    </property>
    
    <!-- HDFS HA: dfs.namenode.rpc-address.[nameservice ID] RPC address -->
    <property>
    <name>dfs.namenode.rpc-address.hdp.nn1</name>
    <value>hadoop001:8020</value>
    </property>
    <property>
    <name>dfs.namenode.rpc-address.hdp.nn2</name>
    <value>hadoop002:8020</value>
    </property>
    
    <!-- HDFS HA: dfs.namenode.http-address.[nameservice ID] HTTP address -->
    <property>
    <name>dfs.namenode.http-address.hdp.nn1</name>
    <value>hadoop001:50070</value>
    </property>
    <property>
    <name>dfs.namenode.http-address.hdp.nn2</name>
    <value>hadoop002:50070</value>
    </property>
    
    <!--================== NameNode editlog sync ============================================ -->
    <!-- guarantees the edit log can be recovered -->
    <property>
    <name>dfs.journalnode.http-address</name>
    <value>0.0.0.0:8480</value>
    </property>
    <property>
    <name>dfs.journalnode.rpc-address</name>
    <value>0.0.0.0:8485</value>
    </property>
    <property>
    <!-- JournalNode quorum addresses; QuorumJournalManager stores the editlog here -->
    <!-- format: qjournal://<host1:port1>;<host2:port2>;<host3:port3>/<journalId>; the port matches dfs.journalnode.rpc-address -->
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/hdp</value>
    </property>
    
    <property>
    <!-- directory where the JournalNode stores its data -->
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/jn</value>
    </property>
    <!--================== Client failover ============================================ -->
    <property>
    <!-- how DataNodes and clients identify and select the active NameNode -->
    <!-- the implementation used for automatic failover -->
    <name>dfs.client.failover.proxy.provider.hdp</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!--================== NameNode fencing =============================================== -->
    <!-- after a failover, prevent the stopped NameNode from coming back as active and creating two active services -->
    <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
    </property>
    <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
    <!-- milliseconds after which fencing is considered failed -->
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
    </property>
    
    <!--================== NameNode auto failover via ZKFC and Zookeeper ====================== -->
    <!-- enable Zookeeper-based automatic failover -->
    <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
    </property>
    <!-- file listing the datanodes allowed to connect to the namenode -->
    <property>
    <name>dfs.hosts</name>
    <value>/opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop/slaves</value>
    </property>
    </configuration>

    yarn-env

    [hadoop@hadoop001 hadoop]$ vim yarn-env.sh
    ...
export YARN_LOG_DIR="/opt/app/hadoop-2.6.0-cdh5.15.1/logs"
    ...

    mapred-site.xml

    [hadoop@hadoop001 hadoop]$ cd /opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
    [hadoop@hadoop001 hadoop]$ cp mapred-site.xml.template mapred-site.xml
    [hadoop@hadoop001 hadoop]$ vim mapred-site.xml
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at
    
    http://www.apache.org/licenses/LICENSE-2.0
    
    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
    <!-- run MapReduce applications on YARN -->
    <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    </property>
    <!-- JobHistory Server ============================================================== -->
    <!-- MapReduce JobHistory Server address; default port 10020 -->
    <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop001:10020</value>
    </property>
    <!-- MapReduce JobHistory Server web UI address; default port 19888 -->
    <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop001:19888</value>
    </property>
    
    <!-- compress map output with snappy -->
    <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
    </property>
    
    <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
    </property>
    
    </configuration>

    yarn-site.xml
    
    <?xml version="1.0"?>
    <!--
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at
    
    http://www.apache.org/licenses/LICENSE-2.0
    
    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License. See accompanying LICENSE file.
    -->
    
    <configuration>
    <!-- NodeManager configuration ================================================= -->
    <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>
    <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
    <name>yarn.nodemanager.localizer.address</name>
    <value>0.0.0.0:23344</value>
    <description>Address where the localizer IPC is.</description>
    </property>
    <property>
    <name>yarn.nodemanager.webapp.address</name>
    <value>0.0.0.0:23999</value>
    <description>NM Webapp address.</description>
    </property>
    
    <!-- HA configuration =============================================================== -->
    <!-- Resource Manager Configs -->
    <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
    </property>
    <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
    </property>
    <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
    </property>
    <!-- enable embedded automatic failover; in an HA setup this works with ZKRMStateStore to handle fencing -->
    <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
    </property>
    <!-- cluster name, so HA elections stay within the right cluster -->
    <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
    </property>
    <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
    </property>
    
    <!-- optionally, the RM id can be pinned explicitly per node:
    <property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm2</value>
    </property>
    -->
    
    <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>
    <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
    </property>
    <property>
    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
    <value>5000</value>
    </property>
    <!-- ZKRMStateStore configuration -->
    <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <property>
    <name>yarn.resourcemanager.zk.state-store.address</name>
    <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <!-- RPC address clients use to reach the RM (applications manager interface) -->
    <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>hadoop001:23140</value>
    </property>
    <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>hadoop002:23140</value>
    </property>
    <!-- RPC address AMs use to reach the RM (scheduler interface) -->
    <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>hadoop001:23130</value>
    </property>
    <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>hadoop002:23130</value>
    </property>
    <!-- RM admin interface -->
    <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>hadoop001:23141</value>
    </property>
    <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>hadoop002:23141</value>
    </property>
    <!-- RPC port NodeManagers use to reach the RM -->
    <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>hadoop001:23125</value>
    </property>
    <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>hadoop002:23125</value>
    </property>
    <!-- RM web application addresses -->
    <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>hadoop001:8088</value>
    </property>
    <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>hadoop002:8088</value>
    </property>
    <property>
    <name>yarn.resourcemanager.webapp.https.address.rm1</name>
    <value>hadoop001:23189</value>
    </property>
    <property>
    <name>yarn.resourcemanager.webapp.https.address.rm2</name>
    <value>hadoop002:23189</value>
    </property>
    
    <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
    </property>
    <property>
    <name>yarn.log.server.url</name>
    <value>http://hadoop001:19888/jobhistory/logs</value>
    </property>
    
    <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
    </property>
    <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
    <description>minimum memory a single task can request; default 1024 MB</description>
    </property>
    
    <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
    <description>maximum memory a single task can request; default 8192 MB</description>
    </property>
    
    <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>2</value>
    </property>
    
    </configuration>

    slaves

    [hadoop@hadoop001 hadoop]$ vim slaves
    hadoop001
    hadoop002
    hadoop003

    Sync the configuration files

    [hadoop@hadoop001 hadoop]$ scp core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml hadoop-env.sh slaves hadoop002:/opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
    [hadoop@hadoop001 hadoop]$ scp core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml hadoop-env.sh slaves hadoop003:/opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
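
    A minimal sketch to confirm the synced files are identical everywhere (compare the md5 hashes by eye):

    cd /opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
    for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml hadoop-env.sh slaves; do
      md5sum $f
      ssh hadoop002 md5sum /opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop/$f
      ssh hadoop003 md5sum /opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop/$f
    done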

    Create the temp directory (all 3 hosts)

    [hadoop@hadoop001 hadoop-2.6.0-cdh5.15.1]$ mkdir /opt/app/hadoop-2.6.0-cdh5.15.1/tmp
    [hadoop@hadoop001 hadoop-2.6.0-cdh5.15.1]$ chmod 777 /opt/app/hadoop-2.6.0-cdh5.15.1/tmp

    Start the cluster services

    Zookeeper

    Start the zookeeper service (all 3 hosts)

    [hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkServer.sh start
    JMX enabled by default
    Using config: /opt/app/zookeeper-3.4.5-cdh5.15.1/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@hadoop001 ~]$

    Check the zookeeper status (all 3 hosts)

    [hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkServer.sh status
    JMX enabled by default
    Using config: /opt/app/zookeeper-3.4.5-cdh5.15.1/bin/../conf/zoo.cfg
    Mode: follower
    [hadoop@hadoop001 ~]$

    Check the zookeeper process with jps (all 3 hosts)

    [hadoop@hadoop001 ~]$ jps
    20483 QuorumPeerMain
    20516 Jps
    [hadoop@hadoop001 ~]$

    Verify the zookeeper service

    [hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkCli.sh
    [zk: localhost:2181(CONNECTED) 0] ls /
    [zookeeper]
    [zk: localhost:2181(CONNECTED) 1] quit
    Quitting...
    [hadoop@hadoop001 ~]$

    Start HDFS

    Format HDFS's storage znode in zookeeper

    [hadoop@hadoop001 ~]$ hdfs zkfc -formatZK

    Inspect the zookeeper znodes

    [hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkCli.sh
    [zk: localhost:2181(CONNECTED) 0] ls /
    [zookeeper, hadoop-ha]
    [zk: localhost:2181(CONNECTED) 1] quit
    Quitting...
    
    [hadoop@hadoop001 ~]$
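
    As an extra check (a sketch; zkCli.sh in this version also accepts a one-shot command), the new znode should contain the nameservice:

    [hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkCli.sh -server hadoop001:2181 ls /hadoop-ha
    # the last line of output should be: [hdp]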

    Start the JournalNode service (all 3 hosts)

    [hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode
    starting journalnode, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-journalnode-hadoop001.out
    [hadoop@hadoop001 ~]$

    Check the JournalNode process with jps (all 3 hosts)

    [hadoop@hadoop001 ~]$ jps
    21107 JournalNode
    21156 Jps
    20893 QuorumPeerMain
    [hadoop@hadoop001 ~]$

    Format and start the first NameNode (hadoop001)

    [hadoop@hadoop001 ~]$ hdfs namenode -format   ## format this node's namenode metadata

    Initialize the journalnode data; this step is specific to HA

    [hadoop@hadoop001 ~]$ hdfs namenode -initializeSharedEdits
    18/11/27 01:51:28 INFO namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    
    Re-format filesystem in QJM to [172.26.239.216:8485, 172.26.239.214:8485, 172.26.239.215:8485] ? (Y or N) Y
    18/11/27 01:51:39 INFO namenode.FileJournalManager: Recovering unfinalized segments in /opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/name/current
    18/11/27 01:51:39 INFO client.QuorumJournalManager: Starting recovery process for unclosed journal segments...
    18/11/27 01:51:39 INFO client.QuorumJournalManager: Successfully started new epoch 1
    18/11/27 01:51:39 INFO util.ExitUtil: Exiting with status 0
    18/11/27 01:51:39 INFO namenode.NameNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at hadoop001/172.26.239.216
    ************************************************************/
    [hadoop@hadoop001 ~]$

    Start the namenode service on this node

    [hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
    starting namenode, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-namenode-hadoop001.out
    [hadoop@hadoop001 ~]$

    Check the namenode process with jps

    [hadoop@hadoop001 ~]$ jps
    21107 JournalNode
    21350 Jps
    21276 NameNode
    20893 QuorumPeerMain
    [hadoop@hadoop001 ~]$

    Format and start the second NameNode (hadoop002)

    [hadoop@hadoop002 ~]$ hdfs namenode -bootstrapStandby  # hadoop001 is already formatted; this copies its metadata to hadoop002

    Start the namenode service on this node (hadoop002)

    [hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
    starting namenode, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-namenode-hadoop002.out
    [hadoop@hadoop002 ~]$

    Check the namenode process with jps (hadoop002)

    [hadoop@hadoop002 ~]$ jps
    20690 QuorumPeerMain
    20788 JournalNode
    21017 Jps
    20923 NameNode
    [hadoop@hadoop002 ~]$

    Start the DataNode services (run on hadoop001; hadoop-daemons.sh, plural, starts a datanode on every host in slaves)

    [hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemons.sh start datanode

    Check the datanode process with jps (all 3 hosts)

    [hadoop@hadoop001 ~]$ jps
    21857 Jps
    21107 JournalNode
    21766 DataNode
    21276 NameNode
    20893 QuorumPeerMain
    [hadoop@hadoop001 ~]$
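
    A quick sketch to confirm all three datanodes registered with HDFS:

    [hadoop@hadoop001 ~]$ hdfs dfsadmin -report | grep -E "Live datanodes|^Name:"
    # expected: "Live datanodes (3):" followed by one Name: line per host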

    Start the ZooKeeperFailoverController (hadoop001, hadoop002)

    hadoop001

    [hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start zkfc  # run on every namenode host
    starting zkfc, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-zkfc-hadoop001.out
    [hadoop@hadoop001 ~]$

    hadoop002

    [hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start zkfc  # run on every namenode host
    starting zkfc, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-zkfc-hadoop002.out
    [hadoop@hadoop002 ~]$

    Check the DFSZKFailoverController process

    [hadoop@hadoop001 ~]$ jps
    21107 JournalNode
    21766 DataNode
    21276 NameNode
    20893 QuorumPeerMain
    21950 DFSZKFailoverController
    22030 Jps
    [hadoop@hadoop001 ~]$

    Check the web UI

    Visit http://39.98.44.126:50070 (hadoop001)

    One NameNode shows as active and the other as standby.

    Visit http://39.98.37.133:50070 (hadoop002)
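
    The same check from the command line, using the NameNode ids from hdfs-site.xml (which one is active may vary):

    [hadoop@hadoop001 ~]$ hdfs haadmin -getServiceState nn1
    active
    [hadoop@hadoop001 ~]$ hdfs haadmin -getServiceState nn2
    standby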

    Start YARN

    Start YARN on hadoop001

    [hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/start-yarn.sh
    starting yarn daemons
    starting resourcemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-resourcemanager-hadoop001.out
    hadoop002: starting nodemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop002.out
    hadoop003: starting nodemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop003.out
    hadoop001: starting nodemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop001.out
    [hadoop@hadoop001 ~]$

    Check the yarn processes with jps (all 3 hosts)

    [hadoop@hadoop001 ~]$ jps
    22624 Jps
    21107 JournalNode
    22212 ResourceManager
    21766 DataNode
    22310 NodeManager
    21276 NameNode
    20893 QuorumPeerMain
    21950 DFSZKFailoverController
    [hadoop@hadoop001 ~]$

    Start the ResourceManager on hadoop002

    [hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager
    starting resourcemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-resourcemanager-hadoop002.out
    [hadoop@hadoop002 ~]$

    Check the resourcemanager process with jps

    [hadoop@hadoop002 ~]$ jps
    20690 QuorumPeerMain
    20788 JournalNode
    21908 Jps
    21399 DFSZKFailoverController
    20923 NameNode
    21675 NodeManager
    21117 DataNode
    21853 ResourceManager
    [hadoop@hadoop002 ~]$

    Check the web UI

    Visit http://39.98.44.126:8088 (hadoop001)

    One ResourceManager is active and the other standby; the standby's web UI redirects to the active one, which is why hadoop002 is checked via the /cluster/cluster page.

    Visit http://39.98.37.133:8088/cluster/cluster (hadoop002)
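
    The same check from the command line, using the RM ids from yarn-site.xml (which one is active may vary):

    [hadoop@hadoop001 ~]$ yarn rmadmin -getServiceState rm1
    active
    [hadoop@hadoop001 ~]$ yarn rmadmin -getServiceState rm2
    standby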

    Start the JobHistory server

    Start the jobhistory service on hadoop001

    [hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
    starting historyserver, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/mapred-hadoop-historyserver-hadoop001.out
    [hadoop@hadoop001 ~]$

    Check the JobHistoryServer process with jps

    [hadoop@hadoop001 ~]$ jps
    22785 Jps
    21107 JournalNode
    22212 ResourceManager
    21766 DataNode
    22310 NodeManager
    22680 JobHistoryServer
    21276 NameNode
    20893 QuorumPeerMain
    21950 DFSZKFailoverController
    [hadoop@hadoop001 ~]$

    Check the web UI

    Visit the jobhistory server's web UI to check job status

    http://39.98.44.126:19888 (hadoop001)
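
    To give the cluster and the JobHistory server a job to show, you can run the bundled pi example (a sketch; the examples jar path below is the usual CDH tarball layout, adjust if yours differs):

    [hadoop@hadoop001 ~]$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10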

    Stop the cluster services

    Stop the YARN services

    Stop the historyserver on hadoop001

    [hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh stop historyserver
    stopping historyserver
    [hadoop@hadoop001 ~]$

    Stop the yarn daemons on hadoop001

    [hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/stop-yarn.sh  # stops the resourcemanager on hadoop001 and all nodemanagers
    stopping yarn daemons
    stopping resourcemanager
    hadoop003: stopping nodemanager
    hadoop001: stopping nodemanager
    hadoop002: stopping nodemanager
    no proxyserver to stop
    [hadoop@hadoop001 ~]$

    Stop the resourcemanager on hadoop002

    [hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/yarn-daemon.sh stop resourcemanager
    stopping resourcemanager
    [hadoop@hadoop002 ~]$

    Stop the HDFS services

    Stop the namenode, datanode, journalnode, and zkfc services

    [hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/stop-dfs.sh
    18/11/27 03:32:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Stopping namenodes on [hadoop001 hadoop002]
    hadoop001: stopping namenode
    hadoop002: stopping namenode
    hadoop003: stopping datanode
    hadoop001: stopping datanode
    hadoop002: stopping datanode
    Stopping journal nodes [hadoop001 hadoop002 hadoop003]
    hadoop002: stopping journalnode
    hadoop003: stopping journalnode
    hadoop001: stopping journalnode
    18/11/27 03:33:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Stopping ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
    hadoop001: stopping zkfc
    hadoop002: stopping zkfc
    [hadoop@hadoop001 ~]$

    Stop the Zookeeper services

    Zookeeper (all 3 hosts)

    [hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkServer.sh stop
    JMX enabled by default
    Using config: /opt/app/zookeeper-3.4.5-cdh5.15.1/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED
    [hadoop@hadoop001 ~]$

    Check processes with jps (all 3 hosts)

    [hadoop@hadoop001 ~]$ jps
    23881 Jps
    [hadoop@hadoop001 ~]$
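
    When bringing the cluster back up later, the order matters: Zookeeper first, then HDFS, then YARN, then JobHistory (stop in the reverse order). A minimal sketch of a start script, run as hadoop on hadoop001 and assuming the paths and SSH trust set up in this guide:

    #!/bin/bash
    # start-cluster.sh -- sketch of the startup order for this guide's layout
    ZK=/opt/app/zookeeper-3.4.5-cdh5.15.1
    HD=/opt/app/hadoop-2.6.0-cdh5.15.1

    # 1. Zookeeper on all three hosts
    for h in hadoop001 hadoop002 hadoop003; do ssh $h $ZK/bin/zkServer.sh start; done

    # 2. HDFS: with HA configured, start-dfs.sh brings up the namenodes, datanodes,
    #    journalnodes, and zkfc daemons in one go
    $HD/sbin/start-dfs.sh

    # 3. YARN on hadoop001, plus the standby resourcemanager on hadoop002
    $HD/sbin/start-yarn.sh
    ssh hadoop002 $HD/sbin/yarn-daemon.sh start resourcemanager

    # 4. JobHistory server
    $HD/sbin/mr-jobhistory-daemon.sh start historyserver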