A Collection of Errors Encountered When Using Hadoop

2017-04-26 11:00
Reposted from this blog: http://www.linuxdiyf.com/linux/15750.html

I. A MapReduce program run locally hangs, and the Hadoop log shows this exception: End of File Exception between local host is: "master/192.168.1.102"; destination host is: "master":9000;

1. Set /etc/hostname on the three machines to, respectively:

master

slave1

slave2

Then reboot.

2. Edit /etc/hosts and remove the redundant entries, keeping only:

192.168.2.223 master

192.168.2.222 slave2

192.168.2.224 slave1

Comment out the 127.0.0.1 localhost loopback entry, any other 127.0.0.1 entries, and any other entries that contain master, slave1, or slave2.
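As a rough aid for this step, the following Python sketch scans the text of a hosts file and reports lines that map a loopback address to one of the cluster hostnames. The function name find_loopback_conflicts and the sample text are made up for illustration; the hostnames are the ones used in this article. It only flags loopback lines that name a cluster host, so the plain 127.0.0.1 localhost line still has to be handled by hand.

```python
import ipaddress

# Hostnames used in this article; adjust for your own cluster.
CLUSTER_HOSTS = {"master", "slave1", "slave2"}


def find_loopback_conflicts(hosts_text, cluster_hosts=CLUSTER_HOSTS):
    """Return (line_number, line) pairs where a loopback address
    is mapped to one of the cluster hostnames."""
    conflicts = []
    for number, raw in enumerate(hosts_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # ignore comments
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if is_loopback and cluster_hosts.intersection(names):
            conflicts.append((number, raw.strip()))
    return conflicts


# Hypothetical hosts file content, for illustration only.
SAMPLE_HOSTS = """\
127.0.0.1 localhost
127.0.0.1 slave1
192.168.2.223 master
192.168.2.222 slave2
192.168.2.224 slave1
"""

if __name__ == "__main__":
    for number, line in find_loopback_conflicts(SAMPLE_HOSTS):
        print(f"line {number}: {line}")
```

On a real machine you would pass in the contents of /etc/hosts, for example open("/etc/hosts").read().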

3. Be sure to delete all files under the dfs/data and tmp directories of the Hadoop installation.

Other errors:

2015-11-11 01:26:28,191 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to master/172.17.0.27:9000. Exiting.

java.io.IOException: Incompatible clusterIDs in /usr/local/hadoop-2.6.0/dfs/data: namenode clusterID = CID-8a0b27a3-b53c-4e97-81e4-6937db707fc9; datanode clusterID = CID-b948da8a-e05a-4c45-97ff-7c2446000fb0

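Fix: look up the clusterID in dfs/name/current/VERSION on the NameNode, and change the clusterID on every other node (in dfs/data/current/VERSION) to that value.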
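To confirm the mismatch before editing anything, a small Python sketch like the one below can pull the clusterID out of two VERSION files and compare them. It assumes the files are plain key=value text; the helper names read_cluster_id and cluster_ids_match are invented here, and the sample contents are abbreviated to the one field that matters, using the two IDs from the error above.

```python
def read_cluster_id(version_text):
    """Extract the clusterID value from the text of a current/VERSION file.

    Returns None if no clusterID line is present.
    """
    for line in version_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comment lines and anything that is not key=value
        key, value = line.split("=", 1)
        if key.strip() == "clusterID":
            return value.strip()
    return None


def cluster_ids_match(namenode_version, datanode_version):
    """True only if both texts contain the same clusterID."""
    nn_id = read_cluster_id(namenode_version)
    dn_id = read_cluster_id(datanode_version)
    return nn_id is not None and nn_id == dn_id


# Abbreviated sample contents (other fields omitted), with the IDs
# taken from the error message quoted in this article.
NAMENODE_VERSION = "clusterID=CID-8a0b27a3-b53c-4e97-81e4-6937db707fc9\n"
DATANODE_VERSION = "clusterID=CID-b948da8a-e05a-4c45-97ff-7c2446000fb0\n"

if __name__ == "__main__":
    print("match:", cluster_ids_match(NAMENODE_VERSION, DATANODE_VERSION))
```

In practice you would read dfs/name/current/VERSION on the NameNode and dfs/data/current/VERSION on each DataNode and feed their contents to cluster_ids_match.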

2015-11-11 02:36:18,651 FATAL org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getDatanode: Data node DatanodeRegistration(172.17.0.24, datanodeUuid=6892adc4-0d44-4124-ba2a-927eee507954, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-9cb6ddf6-1e61-4b34-906b-690ede335e8e;nsid=1714433657;c=0)
is attempting to report storage ID 6892adc4-0d44-4124-ba2a-927eee507954. Node 172.17.0.28:50010 is expected to serve this storage.

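Fix:

Delete the files in the tmp folder.

Delete dfs/data.

Then reformat the NameNode (the datanode command has no -format option; an empty DataNode storage directory is initialized on the next start):

bin/hadoop namenode -format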

2015/11/11 03:26:03 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/root/.staging/job_1447211927171_0001

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot delete /user/root/grep-temp-909165866. Name node is in safe mode.

The reported blocks 27 needs additional 8 blocks to reach the threshold 0.9990 of total blocks 35.

The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
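The numbers in this message say how far the NameNode is from leaving safe mode on its own: 27 of 35 blocks have been reported (roughly 77%), while the threshold is 99.9%. As a sketch, the message can be parsed in Python like this; the regular expression is written against the exact wording shown above, and parse_safe_mode_message is a name invented for this example.

```python
import re

# Pattern written against the message text quoted in this article.
SAFE_MODE_PATTERN = re.compile(
    r"The reported blocks (\d+) needs additional (\d+) blocks "
    r"to reach the threshold ([\d.]+) of total blocks (\d+)"
)


def parse_safe_mode_message(message):
    """Return the block counts from a safe mode message, or None
    if the message does not have the expected wording."""
    match = SAFE_MODE_PATTERN.search(message)
    if match is None:
        return None
    reported = int(match.group(1))
    additional = int(match.group(2))
    threshold = float(match.group(3))
    total = int(match.group(4))
    return {
        "reported": reported,
        "additional": additional,
        "threshold": threshold,
        "total": total,
        # Fraction of blocks reported so far; an empty filesystem counts as complete.
        "reported_ratio": reported / total if total else 1.0,
    }


MESSAGE = (
    "The reported blocks 27 needs additional 8 blocks to reach "
    "the threshold 0.9990 of total blocks 35."
)

if __name__ == "__main__":
    info = parse_safe_mode_message(MESSAGE)
    print(f"{info['reported_ratio']:.2%} reported, threshold {info['threshold']:.2%}")
```

If the ratio stays below the threshold for a long time, some blocks are probably missing (for example after DataNode data was deleted), and safe mode will not end by itself.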

Fix:

Add the following to yarn-site.xml:

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>master</value>
</property>

Then force safe mode off:

bin/hadoop dfsadmin -safemode leave

Safe mode can be controlled with dfsadmin -safemode value, where value is one of:

enter - enter safe mode

leave - force the NameNode to leave safe mode

get - report whether safe mode is on

wait - block until safe mode ends
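If you want to script these subcommands, one possible approach is a thin Python wrapper that validates the action and then calls the same command line. This is only a sketch: the helper names safemode_command and run_safemode are made up, and bin/hadoop is assumed to be reachable from the current directory, as in the command above.

```python
import subprocess

# The four values accepted by dfsadmin -safemode, as listed above.
SAFEMODE_ACTIONS = ("enter", "leave", "get", "wait")


def safemode_command(action, hadoop_bin="bin/hadoop"):
    """Build the argument list for 'hadoop dfsadmin -safemode <action>'."""
    if action not in SAFEMODE_ACTIONS:
        raise ValueError(
            f"unknown safemode action {action!r}; expected one of {SAFEMODE_ACTIONS}"
        )
    return [hadoop_bin, "dfsadmin", "-safemode", action]


def run_safemode(action, hadoop_bin="bin/hadoop"):
    """Run the command and return (exit_code, stdout).

    Requires a working Hadoop installation; not executed in this sketch.
    """
    result = subprocess.run(
        safemode_command(action, hadoop_bin),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    # Only print the command; running it needs a live cluster.
    print(" ".join(safemode_command("get")))
```

Checking with get before calling leave is a reasonable habit, since forcing the NameNode out of safe mode while blocks are still missing can hide a real problem.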

2015-11-11 01:56:55,795 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in offerService

java.io.EOFException: End of File Exception between local host is: "slave2/172.17.0.25"; destination host is: "master":9000; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException
Fix: reformat the NameNode.

MapReduce hangs:

Fix:

Remove the redundant entries from /etc/hosts, for example:

127.0.0.1 localhost

127.0.0.1 slave1

Do not use 127.0.0.1 in place of the machine's real address. The corresponding error looks like this:

2013-10-15 09:52:31,351 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: sunliang/192.168.1.232:9000. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

2013-10-15 09:52:32,352 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: sunliang/192.168.1.232:9000. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

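Fix:

Remove the redundant entries from /etc/hosts.

Delete the tmp folder that Hadoop is using, run hadoop namenode -format to reformat, and then restart with start-all.sh.

Another possible cause is a firewall; if one is running, turn it off.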
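Before changing the firewall or the hosts file, it can help to check whether the NameNode port is reachable at all from the machine that is retrying. A minimal Python sketch, assuming the NameNode listens on master:9000 as in the logs above (can_connect is a name invented for this example):

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened
    within the timeout, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Host and port taken from the log lines in this article.
    print("master:9000 reachable:", can_connect("master", 9000))
```

If this returns False on a slave but True on the master itself, the usual suspects are a firewall between the nodes or the NameNode being bound to 127.0.0.1 because of a bad /etc/hosts entry.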
