HDFS DataNode node goes dead
2018-03-28 19:35
1 Symptom: one DataNode shows as dead; its log is as follows:
GC pool 'PS MarkSweep' had collection(s): count=13 time=24256ms
2018-03-28 17:52:51,624 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-229667812-10.100.208.74-1520931759525:blk_1078749177_5008401 received exception java.io.IOException: Premature EOF from inputStream
2018-03-28 17:52:59,065 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 6944ms
GC pool 'PS MarkSweep' had collection(s): count=15 time=27935ms
2018-03-28 17:54:59,834 WARN org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 14932ms
GC pool 'PS MarkSweep' had collection(s): count=20 time=37701ms
2018-03-28 17:56:10,452 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: tf73:50010:DataXceiver error processing WRITE_BLOCK operation src: /10.100.208.68:46164 dst: /10.100.208.73:50010
java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:202)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:503)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:903)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:805)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
at java.lang.Thread.run(Thread.java:748)
2018-03-28 17:58:03,025 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow flushOrSync took 13437ms (threshold=300ms), isSync:false, flushTotalNanos=13436879837ns
2018-03-28 17:56:42,980 WARN org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 25844ms
GC pool 'PS MarkSweep' had collection(s): count=63 time=118826ms
2018-03-28 17:59:02,118 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3395ms
GC pool 'PS MarkSweep' had collection(s): count=108 time=206400ms
2018-03-28 17:59:07,882 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3312ms
GC pool 'PS MarkSweep' had collection(s): count=4 time=7664ms
2018-03-28 17:59:13,599 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-229667812-10.100.208.74-1520931759525:blk_1078749174_5008400 src: /10.100.208.74:51732 dest: /10.100.208.73:50010
2018-03-28 17:59:21,115 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 6989ms
GC pool 'PS MarkSweep' had collection(s): count=7 time=13229ms
2018-03-28 17:59:40,030 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow flushOrSync took 5667ms (threshold=300ms), isSync:false, flushTotalNanos=5666402795ns
2018-03-28 17:59:41,916 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-229667812-10.100.208.74-1520931759525:blk_1078749180_5008405 src: /10.100.208.74:51746 dest: /10.100.208.73:50010
2018-03-28 17:59:49,420 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3333ms
GC pool 'PS MarkSweep' had collection(s): count=10 time=18909ms
2018-03-28 18:00:01,035 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-229667812-10.100.208.74-1520931759525:blk_1078749179_5008404 src: /10.100.208.68:46194 dest: /10.100.208.73:50010
2018-03-28 18:00:55,777 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-229667812-10.100.208.74-1520931759525:blk_1078749174_5008400 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-229667812-10.100.208.74-1520931759525:blk_1078749174_5008400
already exists in state TEMPORARY and thus cannot be created.
2018-03-28 18:01:37,902 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:13334ms (threshold=300ms)
2018-03-28 18:02:26,014 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: tf73:50010:DataXceiver error processing WRITE_BLOCK operation src: /10.100.208.74:51732 dst: /10.100.208.73:50010; org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException:
Block BP-229667812-10.100.208.74-1520931759525:blk_1078749174_5008400 already exists in state TEMPORARY and thus cannot be created.
2018-03-28 18:02:26,014 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7111ms
GC pool 'PS MarkSweep' had collection(s): count=80 time=152336ms
2018-03-28 18:03:38,774 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1366ms
GC pool 'PS MarkSweep' had collection(s): count=11 time=21098ms
2018-03-28 18:03:53,946 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:13320ms (threshold=300ms)
2018-03-28 18:04:32,159 WARN org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 13024ms
GC pool 'PS MarkSweep' had collection(s): count=61 time=116740ms
2018-03-28 18:06:50,214 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-229667812-10.100.208.74-1520931759525:blk_1078749180_5008405
java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:202)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:503)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:903)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:805)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
at java.lang.Thread.run(Thread.java:748)
2018-03-28 18:07:43,834 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:11756ms (threshold=300ms)
2018-03-28 18:07:43,834 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-229667812-10.100.208.74-1520931759525:blk_1078749182_5008407 src: /10.100.208.74:51760 dest: /10.100.208.73:50010
2018-03-28 18:07:17,288 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 5282ms
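Before restarting anything, it is worth confirming on the live process that the pauses really come from full GC rather than the host (assuming the JDK tools `jps` and `jstat` are on the PATH; a sketch, not part of the original diagnosis):

```shell
# Find the DataNode JVM and sample its GC counters every 5 s, 10 times.
# Rapidly growing FGC/FGCT columns confirm the full-GC churn seen in the log.
pid=$(jps | awk '/DataNode/ {print $1}')
jstat -gcutil "$pid" 5000 10
```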
2 Cause
The JvmPauseMonitor entries show the DataNode JVM repeatedly freezing for tens of seconds in 'PS MarkSweep' full collections (e.g. count=108 time=206400ms). During these stop-the-world pauses the process can neither send heartbeats nor service pipeline writes, so upstream writers see "Premature EOF from inputStream", retried blocks collide with the stalled replica ("already exists in state TEMPORARY and thus cannot be created"), and the NameNode eventually marks the node dead. The most likely root cause is an undersized DataNode heap (or memory pressure on the host) driving continuous full GC.
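For reference, the NameNode declares a DataNode dead only after a fairly long silence; the timeout follows the well-known HDFS formula over two settings. A minimal sketch using the default values:

```python
# Dead-node timeout used by the NameNode:
#   timeout = 2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval
recheck_interval_ms = 300_000   # dfs.namenode.heartbeat.recheck-interval default (5 min)
heartbeat_interval_s = 3        # dfs.heartbeat.interval default

timeout_ms = 2 * recheck_interval_ms + 10 * heartbeat_interval_s * 1000
print(timeout_ms / 60_000)      # → 10.5 (minutes)
```

So the GC pauses in the log did not miss a single heartbeat; the node was only declared dead after heartbeats had been failing for over ten minutes.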
3 Solution
- Raise the DataNode heap via HADOOP_DATANODE_OPTS in hadoop-env.sh so full collections stop dominating.
- Consider switching from the Parallel collector (PS MarkSweep) to a low-pause collector such as CMS or G1.
- Check the host for memory pressure; swapping will also trigger JvmPauseMonitor warnings.
- Restart the DataNode and watch the GC log to confirm the long pauses are gone.
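A minimal hadoop-env.sh sketch covering the first two points; the 4 GB heap size is an assumption and must be tuned to the host's data load (on Java 8-era Hadoop such as this, CMS is the usual low-pause choice):

```shell
# hadoop-env.sh — example only; heap size is an assumption, tune per host.
# Fixed heap plus CMS replaces the Parallel collector's long full GCs;
# GC logging lets you verify the pauses actually went away after restart.
export HADOOP_DATANODE_OPTS="-Xms4g -Xmx4g \
  -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled \
  -XX:+PrintGCDetails -Xloggc:${HADOOP_LOG_DIR}/datanode-gc.log \
  ${HADOOP_DATANODE_OPTS}"
```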