
HDFS DataNode goes dead

2018-03-28 19:35
1 Symptom: one DataNode is reported as dead; its log reads as follows (a command for double-checking the dead-node status from the NameNode side follows the excerpt):

GC pool 'PS MarkSweep' had collection(s): count=13 time=24256ms

2018-03-28 17:52:51,624 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-229667812-10.100.208.74-1520931759525:blk_1078749177_5008401 received exception java.io.IOException: Premature EOF from inputStream

2018-03-28 17:52:59,065 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 6944ms

GC pool 'PS MarkSweep' had collection(s): count=15 time=27935ms

2018-03-28 17:54:59,834 WARN org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 14932ms

GC pool 'PS MarkSweep' had collection(s): count=20 time=37701ms

2018-03-28 17:56:10,452 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: tf73:50010:DataXceiver error processing WRITE_BLOCK operation  src: /10.100.208.68:46164 dst: /10.100.208.73:50010

java.io.IOException: Premature EOF from inputStream

        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:202)

        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)

        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)

        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)

        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:503)

        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:903)

        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:805)

        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)

        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)

        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)

        at java.lang.Thread.run(Thread.java:748)

2018-03-28 17:58:03,025 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow flushOrSync took 13437ms (threshold=300ms), isSync:false, flushTotalNanos=13436879837ns

2018-03-28 17:56:42,980 WARN org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 25844ms

GC pool 'PS MarkSweep' had collection(s): count=63 time=118826ms

2018-03-28 17:59:02,118 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3395ms

GC pool 'PS MarkSweep' had collection(s): count=108 time=206400ms

2018-03-28 17:59:07,882 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3312ms

GC pool 'PS MarkSweep' had collection(s): count=4 time=7664ms

2018-03-28 17:59:13,599 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-229667812-10.100.208.74-1520931759525:blk_1078749174_5008400 src: /10.100.208.74:51732 dest: /10.100.208.73:50010

2018-03-28 17:59:21,115 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 6989ms

GC pool 'PS MarkSweep' had collection(s): count=7 time=13229ms

2018-03-28 17:59:40,030 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow flushOrSync took 5667ms (threshold=300ms), isSync:false, flushTotalNanos=5666402795ns

2018-03-28 17:59:41,916 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-229667812-10.100.208.74-1520931759525:blk_1078749180_5008405 src: /10.100.208.74:51746 dest: /10.100.208.73:50010

2018-03-28 17:59:49,420 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3333ms

GC pool 'PS MarkSweep' had collection(s): count=10 time=18909ms

2018-03-28 18:00:01,035 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-229667812-10.100.208.74-1520931759525:blk_1078749179_5008404 src: /10.100.208.68:46194 dest: /10.100.208.73:50010

2018-03-28 18:00:55,777 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-229667812-10.100.208.74-1520931759525:blk_1078749174_5008400 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-229667812-10.100.208.74-1520931759525:blk_1078749174_5008400
already exists in state TEMPORARY and thus cannot be created.

2018-03-28 18:01:37,902 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:13334ms (threshold=300ms)

2018-03-28 18:02:26,014 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: tf73:50010:DataXceiver error processing WRITE_BLOCK operation  src: /10.100.208.74:51732 dst: /10.100.208.73:50010; org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException:
Block BP-229667812-10.100.208.74-1520931759525:blk_1078749174_5008400 already exists in state TEMPORARY and thus cannot be created.

2018-03-28 18:02:26,014 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7111ms

GC pool 'PS MarkSweep' had collection(s): count=80 time=152336ms

2018-03-28 18:03:38,774 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1366ms

GC pool 'PS MarkSweep' had collection(s): count=11 time=21098ms

2018-03-28 18:03:53,946 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:13320ms (threshold=300ms)

2018-03-28 18:04:32,159 WARN org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 13024ms

GC pool 'PS MarkSweep' had collection(s): count=61 time=116740ms

2018-03-28 18:06:50,214 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-229667812-10.100.208.74-1520931759525:blk_1078749180_5008405

java.io.IOException: Premature EOF from inputStream

        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:202)

        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)

        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)

        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)

        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:503)

        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:903)

        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:805)

        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)

        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)

        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)

        at java.lang.Thread.run(Thread.java:748)

2018-03-28 18:07:43,834 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:11756ms (threshold=300ms)

2018-03-28 18:07:43,834 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-229667812-10.100.208.74-1520931759525:blk_1078749182_5008407 src: /10.100.208.74:51760 dest: /10.100.208.73:50010

2018-03-28 18:07:17,288 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 5282ms
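
To double-check which node the NameNode currently considers dead, the standard dfsadmin report can be used (the grep filter is only for readability; the exact wording of the report differs slightly between Hadoop versions):

        # Full report; the affected host should appear under the "Dead datanodes" section.
        hdfs dfsadmin -report

        # Just the live/dead summary lines.
        hdfs dfsadmin -report | grep -E "Live datanodes|Dead datanodes"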

                                                                                                                                                                                                 

2 Cause:

The log points to JVM memory pressure on the DataNode, not to a disk or network fault. 'PS MarkSweep' is the full-GC (old generation) collector of the default parallel collector, and JvmPauseMonitor repeatedly reports pauses from a few seconds up to roughly 25s, with cumulative full-GC time such as count=108 time=206400ms. While the JVM is stopped in full GC, the DataNode can neither keep its DataXceiver streams moving (upstream writers time out and disconnect, which shows up as 'Premature EOF from inputStream', and the measured flushOrSync / 'write data to disk' times are inflated by the pauses) nor heartbeat to the NameNode. Once heartbeats have been missed for long enough (by default about 10.5 minutes, i.e. 2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval), the NameNode marks the node dead. The DataNode process is therefore GC-bound rather than crashed, which usually means its heap is too small for the block/xceiver load it carries (or something is abnormally inflating that load, e.g. a very large number of small blocks).
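To confirm on the affected host that the process really is GC-bound, the JDK's standard jps/jstat tools can be sampled while the problem is happening (a minimal sketch; <datanode-pid> is a placeholder for the pid printed by jps):

        # Find the DataNode JVM's pid (jps ships with the JDK).
        jps | grep DataNode

        # Print GC utilisation every 5 seconds.
        # An old generation (O column) stuck near 100% and fast-growing FGC/FGCT
        # counters match the 'PS MarkSweep' pressure seen in the log above.
        jstat -gcutil <datanode-pid> 5000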

3 Fix

Give the DataNode JVM a larger heap (and optionally a collector with shorter pauses), then restart the DataNode; it re-registers with the NameNode on its own once it can heartbeat again. The right heap size depends on the number of blocks the node carries, which this log excerpt does not show, so treat the figures in the sketch after this section as placeholders. If the block count is abnormally high (typically many small files), reducing the small-file count is the longer-term fix, because extra heap only postpones the next round of full GCs.
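A minimal sketch of the change in hadoop-env.sh, assuming a Hadoop 2.x layout (Hadoop 3 uses HDFS_DATANODE_OPTS instead) and a host with enough free RAM; the 4g heap and the choice of CMS are example values to adapt, not sizes derived from this log:

        # hadoop-env.sh on the affected DataNode host (tf73 / 10.100.208.73)
        # Example values only -- size the heap from the node's block count and free RAM.
        export HADOOP_DATANODE_OPTS="-Xms4g -Xmx4g -XX:+UseConcMarkSweepGC ${HADOOP_DATANODE_OPTS}"

        # Restart just this DataNode so it picks up the new options and re-registers.
        $HADOOP_HOME/sbin/hadoop-daemon.sh stop datanode
        $HADOOP_HOME/sbin/hadoop-daemon.sh start datanode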