
Hadoop Basics Tutorial - Chapter 9 HA High Availability (9.3 Running HDFS in High Availability Mode) (Draft)


Chapter 9 HA High Availability

9.3 Running HDFS in High Availability Mode

9.3.1 HA Node Planning

Node     IP               Zookeeper   NameNode   JournalNode   DataNode
node1    192.168.80.131   Y           Y          Y             Y
node2    192.168.80.132   Y           Y          Y             Y
node3    192.168.80.133   Y           -          Y             Y
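
All three nodes run ZooKeeper, and the HA steps below assume the quorum is already up. A quick check (a sketch, assuming zkServer.sh is on the PATH on every node, as in the earlier chapters):

# Run on each of node1, node2 and node3; one node should report
# "Mode: leader" and the other two "Mode: follower".
zkServer.sh status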

9.3.2 Starting the JournalNodes

During the initial HDFS format, the NameNode communicates with the JournalNodes, so the JournalNodes on all three nodes must be started first.

On node1, execute
hadoop-daemons.sh start journalnode


[root@node1 ~]# hadoop-daemons.sh start journalnode
node1: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-node1.out
node2: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-node2.out
node3: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-node3.out
[root@node1 ~]#


Notice that the JournalNodes on node2 and node3 were started along with it:

[root@node1 ~]# jps
2146 QuorumPeerMain
2441 Jps
2333 JournalNode


[root@node2 ~]# jps
2372 JournalNode
2458 Jps
2143 QuorumPeerMain


[root@node3 ~]# jps
2400 Jps
2133 QuorumPeerMain
2318 JournalNode


Tip: you can also start the JournalNode on a single node with the command
hadoop-daemon.sh start journalnode
Note that this is hadoop-daemon.sh, not hadoop-daemons.sh; it does not also start the JournalNodes on the other nodes.

[root@node1 ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-node1.out
[root@node1 ~]# jps
2309 JournalNode
2358 Jps
2142 QuorumPeerMain
[root@node1 ~]#


[root@node2 ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-node2.out
[root@node2 ~]# jps
2321 JournalNode
2370 Jps
2140 QuorumPeerMain
[root@node2 ~]#


[root@node3 ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-node3.out
[root@node3 ~]# jps
2138 QuorumPeerMain
2254 JournalNode
2303 Jps
[root@node3 ~]#
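
For reference, the cluster-wide script is essentially an ssh loop over the hosts listed in etc/hadoop/slaves. A sketch of what hadoop-daemons.sh start journalnode boils down to (hostnames taken from the planning table above):

# Run the single-node script on every listed host over ssh.
for host in node1 node2 node3; do
  ssh "$host" "/opt/hadoop-2.7.3/sbin/hadoop-daemon.sh start journalnode"
done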


9.3.3 Formatting the NameNode

Format on one of the NameNodes (either will do); here we format the NameNode on node1.

[root@node1 ~]# hdfs namenode -format
17/07/22 06:02:16 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = node1/192.168.80.131
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.7.3
STARTUP_MSG:   classpath = /opt/hadoop-2.7.3/etc/hadoop:/opt/hadoop-2.7.3/share/hadoop/common/lib/jsr305-3.0.0.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/commons-cli-1.2.jar:/opt/hadoop-

.....

2.7.3/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/lib/guice-3.0.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/lib/javax.inject-1.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/lib/junit-4.11.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.3-tests.jar:/opt/hadoop-2.7.3/contrib/capacity-scheduler/*.jar
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff; compiled by 'root' on 2016-08-18T01:41Z
STARTUP_MSG:   java = 1.8.0_112
************************************************************/
17/07/22 06:02:16 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
17/07/22 06:02:16 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-a421f4a5-32bc-4937-b89e-10ee124c71bc
17/07/22 06:02:18 INFO namenode.FSNamesystem: No KeyProvider found.
17/07/22 06:02:18 INFO namenode.FSNamesystem: fsLock is fair:true
17/07/22 06:02:18 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
17/07/22 06:02:18 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
17/07/22 06:02:18 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
17/07/22 06:02:18 INFO blockmanagement.BlockManager: The block deletion will start around 2017 Jul 22 06:02:18
17/07/22 06:02:18 INFO util.GSet: Computing capacity for map BlocksMap
17/07/22 06:02:18 INFO util.GSet: VM type       = 64-bit
17/07/22 06:02:18 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
17/07/22 06:02:18 INFO util.GSet: capacity      = 2^21 = 2097152 entries
17/07/22 06:02:18 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
17/07/22 06:02:18 INFO blockmanagement.BlockManager: defaultReplication         = 3
17/07/22 06:02:18 INFO blockmanagement.BlockManager: maxReplication             = 512
17/07/22 06:02:18 INFO blockmanagement.BlockManager: minReplication             = 1
17/07/22 06:02:18 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
17/07/22 06:02:18 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
17/07/22 06:02:18 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
17/07/22 06:02:18 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
17/07/22 06:02:18 INFO namenode.FSNamesystem: fsOwner             = root (auth:SIMPLE)
17/07/22 06:02:18 INFO namenode.FSNamesystem: supergroup          = supergroup
17/07/22 06:02:18 INFO namenode.FSNamesystem: isPermissionEnabled = true
17/07/22 06:02:18 INFO namenode.FSNamesystem: Determined nameservice ID: cetc
17/07/22 06:02:18 INFO namenode.FSNamesystem: HA Enabled: true
17/07/22 06:02:18 INFO namenode.FSNamesystem: Append Enabled: true
17/07/22 06:02:19 INFO util.GSet: Computing capacity for map INodeMap
17/07/22 06:02:19 INFO util.GSet: VM type       = 64-bit
17/07/22 06:02:19 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
17/07/22 06:02:19 INFO util.GSet: capacity      = 2^20 = 1048576 entries
17/07/22 06:02:19 INFO namenode.FSDirectory: ACLs enabled? false
17/07/22 06:02:19 INFO namenode.FSDirectory: XAttrs enabled? true
17/07/22 06:02:19 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
17/07/22 06:02:19 INFO namenode.NameNode: Caching file names occuring more than 10 times
17/07/22 06:02:19 INFO util.GSet: Computing capacity for map cachedBlocks
17/07/22 06:02:19 INFO util.GSet: VM type       = 64-bit
17/07/22 06:02:19 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
17/07/22 06:02:19 INFO util.GSet: capacity      = 2^18 = 262144 entries
17/07/22 06:02:19 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
17/07/22 06:02:19 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
17/07/22 06:02:19 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
17/07/22 06:02:19 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
17/07/22 06:02:19 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
17/07/22 06:02:19 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
17/07/22 06:02:19 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
17/07/22 06:02:19 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
17/07/22 06:02:19 INFO util.GSet: Computing capacity for map NameNodeRetryCache
17/07/22 06:02:19 INFO util.GSet: VM type       = 64-bit
17/07/22 06:02:19 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
17/07/22 06:02:19 INFO util.GSet: capacity      = 2^15 = 32768 entries
17/07/22 06:02:21 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1179450700-192.168.80.131-1500717741509
17/07/22 06:02:21 INFO common.Storage: Storage directory /hadoop/dfs/name has been successfully formatted.
17/07/22 06:02:22 INFO namenode.FSImageFormatProtobuf: Saving image file /hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
17/07/22 06:02:22 INFO namenode.FSImageFormatProtobuf: Image file /hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 351 bytes saved in 0 seconds.
17/07/22 06:02:22 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/07/22 06:02:22 INFO util.ExitUtil: Exiting with status 0
17/07/22 06:02:22 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node1/192.168.80.131
************************************************************/
[root@node1 ~]#


If the NameNode format output contains
has been successfully formatted.
the format succeeded.
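
As a quick sanity check, the freshly formatted name directory (the path /hadoop/dfs/name appears in the log above) should now contain an initial fsimage:

ls /hadoop/dfs/name/current
# Expect VERSION, seen_txid, and fsimage_0000000000000000000 (plus its .md5 file)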

Then, on node1, run
hadoop-daemon.sh start namenode
to start the NameNode.

[root@node1 ~]# hadoop-daemon.sh start namenode
starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-node1.out
[root@node1 ~]# jps
2528 Jps
2309 JournalNode
2455 NameNode
2142 QuorumPeerMain
[root@node1 ~]#


Note: if you inspect the NameNode log at this point, you may see errors such as "java.net.ConnectException: Connection refused" and "StandbyException", because the HA setup is not yet complete.

9.3.4 Synchronizing the NameNodes

According to the plan, the other NameNode is on node2, so run the NameNode synchronization on node2 (in effect, this copies the NameNode metadata to the local node):

[root@node2 ~]# hdfs namenode -bootstrapStandby
17/07/22 06:07:16 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = node2/192.168.80.132
STARTUP_MSG:   args = [-bootstrapStandby]
STARTUP_MSG:   version = 2.7.3
STARTUP_MSG:   classpath = /opt/hadoop-2.7.3/etc/hadoop:/opt/hadoop-2.7.3/share/hadoop/common/lib/jsr305-3.0.0.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/commons-cli-1.2.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/commons-math3-3.1.1.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/xmlenc-0.52.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/commons-httpclient-3.1.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/commons-logging-1.1.3.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/commons-codec-1.4.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/commons-io-2.4.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/commons-net-3.1.jar:/opt/hadoop-

......

2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar:/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.3-tests.jar:/opt/hadoop-2.7.3/contrib/capacity-scheduler/*.jar
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff; compiled by 'root' on 2016-08-18T01:41Z
STARTUP_MSG:   java = 1.8.0_112
************************************************************/
17/07/22 06:07:16 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
17/07/22 06:07:16 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
=====================================================
About to bootstrap Standby ID nn2 from:
Nameservice ID: cetc
Other Namenode ID: nn1
Other NN's HTTP address: http://node1:50070
Other NN's IPC  address: node1/192.168.80.131:8020
Namespace ID: 876549235
Block pool ID: BP-1179450700-192.168.80.131-1500717741509
Cluster ID: CID-a421f4a5-32bc-4937-b89e-10ee124c71bc
Layout version: -63
isUpgradeFinalized: true
=====================================================
17/07/22 06:07:19 INFO common.Storage: Storage directory /hadoop/dfs/name has been successfully formatted.
17/07/22 06:07:20 INFO namenode.TransferFsImage: Opening connection to http://node1:50070/imagetransfer?getimage=1&txid=0&storageInfo=-63:876549235:0:CID-a421f4a5-32bc-4937-b89e-10ee124c71bc
17/07/22 06:07:20 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
17/07/22 06:07:21 INFO namenode.TransferFsImage: Transfer took 0.01s at 0.00 KB/s
17/07/22 06:07:21 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 351 bytes.
17/07/22 06:07:21 INFO util.ExitUtil: Exiting with status 0
17/07/22 06:07:21 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node2/192.168.80.132
************************************************************/
[root@node2 ~]#
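
Since the bootstrap copies the cluster metadata from node1, the VERSION files of the two NameNodes should now agree. A quick way to compare (path from the log above):

# Run on both node1 and node2; clusterID, namespaceID and blockpoolID must match.
cat /hadoop/dfs/name/current/VERSION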


Start the NameNode:

[root@node2 ~]# hadoop-daemon.sh start namenode
starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-node2.out
[root@node2 ~]# jps
2321 JournalNode
2455 NameNode
2140 QuorumPeerMain
2495 Jps
[root@node2 ~]#


9.3.5 NameNode ZKFC

Initialize ZKFC on one of the NameNodes: hdfs zkfc -formatZK

[root@node1 ~]# hdfs zkfc -formatZK
17/07/22 06:10:21 INFO tools.DFSZKFailoverController: Failover controller configured for NameNode NameNode at node1/192.168.80.131:8020
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:host.name=node1
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_112
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:java.home=/opt/jdk1.8.0_112/jre
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/opt/hadoop-2.7.3/etc/hadoop:/opt/hadoop-2.7.3/share/hadoop/common/lib/jsr305-3.0.0.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/commons-cli-1.2.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/commons-math3-3.1.1.jar:/opt/hadoop-2.7.3/share/hadoop/common/lib/xmlenc-0.52.jar:/opt/hadoop-

.....

2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.3-tests.jar:/opt/hadoop-2.7.3/contrib/capacity-scheduler/*.jar
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/opt/hadoop-2.7.3/lib/native
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:os.version=3.10.0-514.el7.x86_64
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:user.name=root
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Client environment:user.dir=/root
17/07/22 06:10:21 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=node1:2181,node2:2181,node2:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@436813f3
17/07/22 06:10:22 INFO zookeeper.ClientCnxn: Opening socket connection to server node2/192.168.80.132:2181. Will not attempt to authenticate using SASL (unknown error)
17/07/22 06:10:22 INFO zookeeper.ClientCnxn: Socket connection established to node2/192.168.80.132:2181, initiating session
17/07/22 06:10:22 INFO zookeeper.ClientCnxn: Session establishment complete on server node2/192.168.80.132:2181, sessionid = 0x25d69aece0b0000, negotiated timeout = 5000
17/07/22 06:10:22 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/cetc in ZK.
17/07/22 06:10:22 INFO zookeeper.ZooKeeper: Session: 0x25d69aece0b0000 closed
17/07/22 06:10:22 WARN ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x25d69aece0b0000
17/07/22 06:10:22 INFO zookeeper.ClientCnxn: EventThread shut down
[root@node1 ~]#


If the output contains
Successfully created /hadoop-ha/cetc in ZK.
the ZK format succeeded. (The znode is named after the nameservice ID, cetc in this setup.)

9.3.6 Stopping the Running HDFS

[root@node1 ~]# stop-dfs.sh
Stopping namenodes on [node1 node2]
node2: stopping namenode
node1: stopping namenode
node2: no datanode to stop
node3: no datanode to stop
node1: no datanode to stop
Stopping journal nodes [node1 node2 node3]
node2: stopping journalnode
node3: stopping journalnode
node1: stopping journalnode
Stopping ZK Failover Controllers on NN hosts [node1 node2]
node2: no zkfc to stop
node1: no zkfc to stop
[root@node1 ~]#


The stop-dfs.sh output shows the HA shutdown order (an equivalent manual sequence is sketched after the list):

1) stop both NameNodes

2) stop all DataNodes

3) stop all JournalNodes

4) stop both ZKFCs
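
Done by hand with the single-node script, the same order looks like this (a sketch; run each command on the hosts indicated):

hadoop-daemon.sh stop namenode      # on node1 and node2
hadoop-daemon.sh stop datanode      # on node1, node2 and node3
hadoop-daemon.sh stop journalnode   # on node1, node2 and node3
hadoop-daemon.sh stop zkfc          # on node1 and node2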

9.3.7 Starting All of HDFS

[root@node1 ~]# start-dfs.sh
Starting namenodes on [node1 node2]
node2: starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-node2.out
node1: starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-node1.out
node3: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-node3.out
node1: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-node1.out
node2: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-node2.out
Starting journal nodes [node1 node2 node3]
node3: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-node3.out
node1: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-node1.out
node2: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-node2.out
Starting ZK Failover Controllers on NN hosts [node1 node2]
node1: starting zkfc, logging to /opt/hadoop-2.7.3/logs/hadoop-root-zkfc-node1.out
node2: starting zkfc, logging to /opt/hadoop-2.7.3/logs/hadoop-root-zkfc-node2.out
[root@node1 ~]#


node1

[root@node1 ~]# jps
3426 JournalNode
3219 DataNode
3595 DFSZKFailoverController
3116 NameNode
2142 QuorumPeerMain
3662 Jps
[root@node1 ~]#


node2

[root@node2 ~]# jps
2882 JournalNode
2691 NameNode
2760 DataNode
2140 QuorumPeerMain
2972 DFSZKFailoverController
3020 Jps
[root@node2 ~]#


node3

[root@node3 ~]# jps
2562 Jps
2503 JournalNode
2409 DataNode
2138 QuorumPeerMain
[root@node3 ~]#


9.3.8 Web UI

http://192.168.80.131:50070



Note that the page shows "active": the NameNode on node1 is in the active state.



http://192.168.80.132:50070



Note that the page shows "standby": the NameNode on node2 is in the standby (hot backup) state.
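
The same states can be read from the command line with hdfs haadmin; the NameNode IDs nn1 and nn2 are the ones shown in the bootstrap log of section 9.3.4:

hdfs haadmin -getServiceState nn1
# active
hdfs haadmin -getServiceState nn2
# standby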



9.3.9 Testing HA

Kill the NameNode that is in the active state (node1) directly, and test whether the NameNode in the standby state (node2) takes over the HDFS services.

[root@node1 ~]# jps
3426 JournalNode
3219 DataNode
3595 DFSZKFailoverController
3116 NameNode
2142 QuorumPeerMain
3662 Jps
[root@node1 ~]# kill 3116
[root@node1 ~]#
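
If the failover worked, the NameNode on node2 should now report active (same check as in section 9.3.8):

hdfs haadmin -getServiceState nn2
# active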






Restart the NameNode on node1:

[root@node1 ~]# hadoop-daemon.sh start namenode
starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-node1.out
[root@node1 ~]#




Possible problems

The two NameNodes do not fail over automatically, or both stay in the standby state.

Carefully check the hdfs-site.xml and core-site.xml configuration files: a property that looks correct may contain a misspelled word, or may be missing its
<name>
element.
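
One way to spot-check the HA keys exactly as Hadoop parses them is hdfs getconf; the nameservice ID cetc and NameNode IDs nn1/nn2 below are the ones from the logs above. If a property name is misspelled in the XML, the corresponding query returns an error instead of the expected value:

hdfs getconf -confKey dfs.nameservices
hdfs getconf -confKey dfs.ha.namenodes.cetc
hdfs getconf -confKey dfs.namenode.rpc-address.cetc.nn1
hdfs getconf -confKey dfs.namenode.rpc-address.cetc.nn2
hdfs getconf -confKey dfs.ha.automatic-failover.enabled
hdfs getconf -confKey ha.zookeeper.quorum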

9.3.10 Zookeeper

[root@node1 ~]# zkCli.sh
Connecting to localhost:2181
2017-07-22 10:04:57,523 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2017-07-22 10:04:57,537 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=node1
2017-07-22 10:04:57,537 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_112
2017-07-22 10:04:57,540 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2017-07-22 10:04:57,540 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/opt/jdk1.8.0_112/jre
2017-07-22 10:04:57,540 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/opt/zookeeper-3.4.10/bin/../build/classes:/opt/zookeeper-3.4.10/bin/../build/lib/*.jar:/opt/zookeeper-3.4.10/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/zookeeper-3.4.10/bin/../lib/slf4j-api-1.6.1.jar:/opt/zookeeper-3.4.10/bin/../lib/netty-3.10.5.Final.jar:/opt/zookeeper-3.4.10/bin/../lib/log4j-1.2.16.jar:/opt/zookeeper-3.4.10/bin/../lib/jline-0.9.94.jar:/opt/zookeeper-3.4.10/bin/../zookeeper-3.4.10.jar:/opt/zookeeper-3.4.10/bin/../src/java/lib/*.jar:/opt/zookeeper-3.4.10/bin/../conf:.::/opt/jdk1.8.0_112/lib
2017-07-22 10:04:57,540 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2017-07-22 10:04:57,541 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2017-07-22 10:04:57,541 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2017-07-22 10:04:57,541 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2017-07-22 10:04:57,541 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2017-07-22 10:04:57,541 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.10.0-514.el7.x86_64
2017-07-22 10:04:57,541 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2017-07-22 10:04:57,542 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2017-07-22 10:04:57,542 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/root
2017-07-22 10:04:57,543 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@506c589e
Welcome to ZooKeeper!
JLine support is enabled
2017-07-22 10:04:57,647 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-07-22 10:04:57,839 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2017-07-22 10:04:57,870 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15d6a8d3c460002, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, test, hadoop-ha]
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
[cetc]
[zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha/cetc
[ActiveBreadCrumb, ActiveStandbyElectorLock]
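
Of these two znodes, ActiveStandbyElectorLock is the ephemeral lock that the active NameNode's ZKFC holds to win the election, and ActiveBreadCrumb is a persistent record of the most recent active NameNode, used for fencing after an unclean failover.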