Fixing an error encountered while setting up a Hadoop cluster

2010-10-27 21:26
This post is reposted from my ChinaUnix blog: http://blog.chinaunix.net/u3/107162/showart_2204785.html

The environment is set up and the cluster starts successfully, as shown below:

maohong@maohong-desktop:~/Software/Development/Hadoop/hadoop-0.20.2$ bin/start-all.sh

starting namenode, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-namenode-maohong-desktop.out

slave1: starting datanode, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-datanode-debian.out

slave2: starting datanode, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-datanode-node2.out

master: starting datanode, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-datanode-maohong-desktop.out

master: starting secondarynamenode, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-secondarynamenode-maohong-desktop.out

starting jobtracker, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-jobtracker-maohong-desktop.out

slave1: starting tasktracker, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-tasktracker-debian.out

slave2: starting tasktracker, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-tasktracker-node2.out

master: starting tasktracker, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-tasktracker-maohong-desktop.out

maohong@maohong-desktop:~/Software/Development/Hadoop/hadoop-0.20.2$ jps

22565 SecondaryNameNode

22646 JobTracker

22342 DataNode

22907 Jps

22115 NameNode

22861 TaskTracker

However, running the wordcount program fails with Error: java.lang.NullPointerException, as shown below:

maohong@maohong-desktop:~/Software/Development/Hadoop/hadoop-0.20.2$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount test-in test-out

10/03/25 19:40:05 INFO input.FileInputFormat: Total input paths to process : 4

10/03/25 19:40:05 INFO mapred.JobClient: Running job: job_201003251936_0001

10/03/25 19:40:06 INFO mapred.JobClient: map 0% reduce 0%

10/03/25 19:40:13 INFO mapred.JobClient: map 50% reduce 0%

10/03/25 19:40:14 INFO mapred.JobClient: map 100% reduce 0%

10/03/25 19:40:21 INFO mapred.JobClient: Task Id : attempt_201003251936_0001_r_000000_0, Status : FAILED

Error: java.lang.NullPointerException

at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

10/03/25 19:40:21 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=attempt_201003251936_0001_r_000000_0&filter=stdout

10/03/25 19:40:21 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=attempt_201003251936_0001_r_000000_0&filter=stderr

10/03/25 19:40:27 INFO mapred.JobClient: Task Id : attempt_201003251936_0001_r_000000_1, Status : FAILED

Error: java.lang.NullPointerException

at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

10/03/25 19:40:27 WARN mapred.JobClient: Error reading task outputnode2.1036dhcp

10/03/25 19:40:27 WARN mapred.JobClient: Error reading task outputnode2.1036dhcp

10/03/25 19:40:36 INFO mapred.JobClient: Task Id : attempt_201003251936_0001_r_000000_2, Status : FAILED

Error: java.lang.NullPointerException

at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

10/03/25 19:40:45 INFO mapred.JobClient: Job complete: job_201003251936_0001

10/03/25 19:40:45 INFO mapred.JobClient: Counters: 12

10/03/25 19:40:45 INFO mapred.JobClient: Job Counters

10/03/25 19:40:45 INFO mapred.JobClient: Launched reduce tasks=4

10/03/25 19:40:45 INFO mapred.JobClient: Launched map tasks=4

10/03/25 19:40:45 INFO mapred.JobClient: Data-local map tasks=4

10/03/25 19:40:45 INFO mapred.JobClient: Failed reduce tasks=1

10/03/25 19:40:45 INFO mapred.JobClient: FileSystemCounters

10/03/25 19:40:45 INFO mapred.JobClient: HDFS_BYTES_READ=8637

10/03/25 19:40:45 INFO mapred.JobClient: FILE_BYTES_WRITTEN=11495

10/03/25 19:40:45 INFO mapred.JobClient: Map-Reduce Framework

10/03/25 19:40:45 INFO mapred.JobClient: Combine output records=900

10/03/25 19:40:45 INFO mapred.JobClient: Map input records=83

10/03/25 19:40:45 INFO mapred.JobClient: Spilled Records=900

10/03/25 19:40:45 INFO mapred.JobClient: Map output bytes=14697

10/03/25 19:40:45 INFO mapred.JobClient: Combine input records=1525

10/03/25 19:40:45 INFO mapred.JobClient: Map output records=1525

maohong@maohong-desktop:~/Software/Development/Hadoop/hadoop-0.20.2$

The JobTracker log file contains the following:

2010-03-25 19:40:09,447 INFO org.apache.hadoop.mapred.JobInProgress: Choosing data-local task task_201003251936_0001_m_000003

2010-03-25 19:40:12,268 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201003251936_0001_m_000000_0' has completed task_201003251936_0001_m_000000 successfully.

2010-03-25 19:40:12,268 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:1 completedMapsInputSize:4275 completedMapsOutputSize:5190

2010-03-25 19:40:12,271 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201003251936_0001_m_000001_0' has completed task_201003251936_0001_m_000001 successfully.

2010-03-25 19:40:12,271 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:2 completedMapsInputSize:5745 completedMapsOutputSize:7302

2010-03-25 19:40:12,288 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201003251936_0001_r_000000_0' to tip task_201003251936_0001_r_000000, for tracker 'tracker_localhost:localhost/127.0.0.1:38831'

2010-03-25 19:40:12,522 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201003251936_0001_m_000002_0' has completed task_201003251936_0001_m_000002 successfully.

2010-03-25 19:40:12,522 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:3 completedMapsInputSize:7215 completedMapsOutputSize:9414

2010-03-25 19:40:12,524 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201003251936_0001_m_000003_0' has completed task_201003251936_0001_m_000003 successfully.

2010-03-25 19:40:12,524 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:4 completedMapsInputSize:8641 completedMapsOutputSize:11367

2010-03-25 19:40:18,300 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201003251936_0001_r_000000_0: Error: java.lang.NullPointerException

at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

2010-03-25 19:40:18,301 INFO org.apache.hadoop.mapred.JobTracker: Adding task (cleanup)'attempt_201003251936_0001_r_000000_0' to tip task_201003251936_0001_r_000000, for tracker 'tracker_localhost:localhost/127.0.0.1:38831'

2010-03-25 19:40:21,307 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_0' from 'tracker_localhost:localhost/127.0.0.1:38831'

2010-03-25 19:40:21,559 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201003251936_0001_r_000000_1' to tip task_201003251936_0001_r_000000, for tracker 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'

2010-03-25 19:40:24,599 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201003251936_0001_r_000000_1: Error: java.lang.NullPointerException

at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

2010-03-25 19:40:24,600 INFO org.apache.hadoop.mapred.JobTracker: Adding task (cleanup)'attempt_201003251936_0001_r_000000_1' to tip task_201003251936_0001_r_000000, for tracker 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'

2010-03-25 19:40:27,607 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_1' from 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'

2010-03-25 19:40:30,201 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201003251936_0001_r_000000_2' to tip task_201003251936_0001_r_000000, for tracker 'tracker_maohong-desktop:localhost/127.0.0.1:60931'

2010-03-25 19:40:33,260 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201003251936_0001_r_000000_2: Error: java.lang.NullPointerException

at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

2010-03-25 19:40:33,261 INFO org.apache.hadoop.mapred.JobTracker: Adding task (cleanup)'attempt_201003251936_0001_r_000000_2' to tip task_201003251936_0001_r_000000, for tracker 'tracker_maohong-desktop:localhost/127.0.0.1:60931'

2010-03-25 19:40:36,266 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201003251936_0001_r_000000_3' to tip task_201003251936_0001_r_000000, for tracker 'tracker_maohong-desktop:localhost/127.0.0.1:60931'

2010-03-25 19:40:36,266 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_2' from 'tracker_maohong-desktop:localhost/127.0.0.1:60931'

2010-03-25 19:40:39,270 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201003251936_0001_r_000000_3: Error: java.lang.NullPointerException

at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

2010-03-25 19:40:39,271 INFO org.apache.hadoop.mapred.JobTracker: Adding task (cleanup)'attempt_201003251936_0001_r_000000_3' to tip task_201003251936_0001_r_000000, for tracker 'tracker_maohong-desktop:localhost/127.0.0.1:60931'

2010-03-25 19:40:42,278 INFO org.apache.hadoop.mapred.TaskInProgress: TaskInProgress task_201003251936_0001_r_000000 has failed 4 times.

2010-03-25 19:40:42,278 INFO org.apache.hadoop.mapred.JobInProgress: Aborting job job_201003251936_0001

2010-03-25 19:40:42,279 INFO org.apache.hadoop.mapred.JobInProgress: Killing job 'job_201003251936_0001'

2010-03-25 19:40:42,279 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201003251936_0001_m_000004_0' to tip task_201003251936_0001_m_000004, for tracker 'tracker_maohong-desktop:localhost/127.0.0.1:60931'

2010-03-25 19:40:42,279 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_3' from 'tracker_maohong-desktop:localhost/127.0.0.1:60931'

2010-03-25 19:40:45,288 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201003251936_0001_m_000004_0' has completed task_201003251936_0001_m_000004 successfully.

2010-03-25 19:40:45,333 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000004_0' from 'tracker_maohong-desktop:localhost/127.0.0.1:60931'

2010-03-25 19:40:45,333 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_2' from 'tracker_maohong-desktop:localhost/127.0.0.1:60931'

2010-03-25 19:40:45,334 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_3' from 'tracker_maohong-desktop:localhost/127.0.0.1:60931'

2010-03-25 19:40:45,334 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000000_0' from 'tracker_localhost:localhost/127.0.0.1:38831'

2010-03-25 19:40:45,334 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000001_0' from 'tracker_localhost:localhost/127.0.0.1:38831'

2010-03-25 19:40:45,334 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000005_0' from 'tracker_localhost:localhost/127.0.0.1:38831'

2010-03-25 19:40:45,335 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_0' from 'tracker_localhost:localhost/127.0.0.1:38831'

2010-03-25 19:40:45,694 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000002_0' from 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'

2010-03-25 19:40:45,694 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000003_0' from 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'

2010-03-25 19:40:45,694 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_1' from 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'

The log from one of the TaskTrackers is shown below; the logs of the other two TaskTrackers contain the same error:

2010-03-25 19:40:30,249 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201003251936_0001_r_000000_2

2010-03-25 19:40:30,249 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201003251936_0001_r_000000_2

2010-03-25 19:40:30,587 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201003251936_0001_r_1711860611

2010-03-25 19:40:30,588 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201003251936_0001_r_1711860611 spawned.

2010-03-25 19:40:31,057 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201003251936_0001_r_1711860611 given task: attempt_201003251936_0001_r_000000_2

2010-03-25 19:40:31,437 FATAL org.apache.hadoop.mapred.TaskTracker: Task: attempt_201003251936_0001_r_000000_2 - Killed : java.lang.NullPointerException

at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)

at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

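The cause of the problem lies in the /etc/hosts file on the master and slave nodes.
The hostnames listed in /etc/hosts must be the machines' actual hostnames (maohong-desktop, debian, and node2, as the log file names above suggest), not master, slave1, and slave2; otherwise they cannot be resolved correctly. That was the crux of the problem!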

After making that change, the problem was solved.
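To make the /etc/hosts requirement concrete, here is a minimal sketch in Python that inspects hosts-file text for two things: whether each machine's real hostname is listed, and whether any non-localhost name is mapped to a loopback address. The function names, the `SAMPLE` contents, and the 192.168.1.x addresses are hypothetical; the post does not show the author's actual /etc/hosts. The loopback check is an extra assumption based on the tracker addresses in the JobTracker log, which all read localhost/127.0.0.1; the post itself only states the hostname requirement.

```python
import ipaddress


def parse_hosts(text):
    """Parse /etc/hosts-style text into a list of (ip, [names]) entries."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no names carries no mapping
        entries.append((fields[0], fields[1:]))
    return entries


def loopback_names(text):
    """Return names, other than localhost/loopback aliases, mapped to a loopback IP."""
    flagged = []
    for ip, names in parse_hosts(text):
        try:
            is_loopback = ipaddress.ip_address(ip).is_loopback
        except ValueError:
            continue  # skip fields that are not plain IP addresses
        if is_loopback:
            flagged.extend(
                n for n in names if "localhost" not in n and "loopback" not in n
            )
    return flagged


def missing_names(text, required):
    """Return the required hostnames that have no entry at all."""
    known = {name for _, names in parse_hosts(text) for name in names}
    return [name for name in required if name not in known]


# Hypothetical hosts file that lists only the aliases (all addresses are made up).
SAMPLE = """\
127.0.0.1     localhost
127.0.1.1     maohong-desktop
192.168.1.10  master
192.168.1.11  slave1
192.168.1.12  slave2
"""

# Real hostnames taken from the log file names in the startup output.
REQUIRED = ["maohong-desktop", "debian", "node2"]

print("mapped to loopback:", loopback_names(SAMPLE))
print("missing entries:", missing_names(SAMPLE, REQUIRED))
```

On a live node, the same functions could be run against `open("/etc/hosts").read()` on the master and on each slave; an empty result from both checks would be consistent with the fix described above.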