Hadoop: The Definitive Guide, ch05 Developing a MapReduce Application
2012-07-07 15:43
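1. Introduction
Developing a MapReduce application follows a specific workflow. First, write the map and reduce functions, ideally with unit tests to confirm that they behave as expected. Then write a driver program to run the job; it can be run from the IDE against a small subset of the dataset to check that it works correctly.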
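The first step of this workflow can be tried without a cluster: if the core map and reduce logic lives in plain methods, ordinary assertions can check it before any driver exists. The following is a minimal sketch, not the book's code. The class name `MaxTemperatureSketch` and the simplified `year,temperature` line format are assumptions made for illustration (the real NCDC records use a fixed-width layout).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a mapper/reducer pair, written as plain methods
// so the logic can be tested without Hadoop on the classpath.
public class MaxTemperatureSketch {

    // "map": turn one line "year,temperature" into a (year, temperature) pair.
    // Returns null for malformed lines so the caller can skip them.
    public static Map.Entry<String, Integer> map(String line) {
        String[] parts = line.split(",");
        if (parts.length != 2) {
            return null;
        }
        try {
            int temp = Integer.parseInt(parts[1].trim());
            return new AbstractMap.SimpleEntry<>(parts[0].trim(), temp);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // "reduce": pick the maximum of all temperatures seen for one year.
    public static int reduce(List<Integer> temps) {
        int max = Integer.MIN_VALUE;
        for (int t : temps) {
            max = Math.max(max, t);
        }
        return max;
    }

    // Tiny local "driver": map every line, group by year, reduce each group.
    public static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (String line : lines) {
            Map.Entry<String, Integer> kv = map(line);
            if (kv == null) {
                continue; // skip malformed input
            }
            groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : groups.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample =
            Arrays.asList("1949,111", "1950,0", "1950,22", "bad line", "1949,78");
        System.out.println(run(sample)); // prints {1949=111, 1950=22}
    }
}
```

Keeping the logic in small static methods like this makes it easy to move into real Mapper and Reducer classes later, once the small-sample run looks right.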
2. GenericOptionsParser, Tool, and ToolRunner
[ate: /local/nomad2/hadoop/tomwhite-hadoop-book-32dae01 ] >> hadoop ConfigurationPrinter -conf conf/hadoop-localhost.xml | grep mapred.job.tracker=
mapred.job.tracker=localhost:8021
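The `-conf` option in the command above is one of the generic options that `GenericOptionsParser` handles for programs launched through `ToolRunner`; `-D property=value` is another. As a rough illustration of the idea only, the plain-Java sketch below separates `-D property=value` pairs from the remaining application arguments. It is not Hadoop's implementation: the class `GenericOptionsSketch` is hypothetical, and it handles only the spaced `-D key=value` form.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified illustration; Hadoop's real parser supports more options.
public class GenericOptionsSketch {
    // Properties collected from "-D key=value" pairs, in command-line order.
    public final Map<String, String> properties = new LinkedHashMap<>();
    // Everything else, left for the application itself to interpret.
    public final List<String> remainingArgs = new ArrayList<>();

    public GenericOptionsSketch(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-D") && i + 1 < args.length) {
                String pair = args[i + 1];
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    properties.put(pair.substring(0, eq), pair.substring(eq + 1));
                    i++; // skip the key=value token that was just consumed
                    continue;
                }
            }
            // Not a well-formed -D option: pass it through untouched.
            remainingArgs.add(args[i]);
        }
    }

    public static void main(String[] args) {
        String[] sample = {
            "-D", "mapred.job.tracker=localhost:8021", "input/ncdc/all", "max-temp"
        };
        GenericOptionsSketch parsed = new GenericOptionsSketch(sample);
        System.out.println(parsed.properties);    // {mapred.job.tracker=localhost:8021}
        System.out.println(parsed.remainingArgs); // [input/ncdc/all, max-temp]
    }
}
```

The point of the split is that the application code only ever sees its own arguments, while configuration overrides are applied before it runs.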
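3. Problem: it keeps complaining that a class cannot be found
Solution:
1. stop-all.sh
2. rm -rf /tmp/hadoop-nomad2/*
3. hadoop namenode -format (note that this erases everything stored in HDFS)
4. start-all.sh
5. jps (confirm that the DataNode process has started)
6. Re-run the program. Note that the jar file here is picked up from HADOOP_CLASSPATH, not from HDFS.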
[ate: /local/nomad2/hadoop/tomwhite-hadoop-book-32dae01 ] >> hadoop jar ch05.jar v3.MaxTemperatureDriver input/ncdc/all max-temp
12/07/03 01:33:40 INFO mapred.FileInputFormat: Total input paths to process : 2
12/07/03 01:33:40 INFO mapred.JobClient: Running job: job_201207030133_0001
12/07/03 01:33:41 INFO mapred.JobClient:  map 0% reduce 0%
12/07/03 01:33:55 INFO mapred.JobClient:  map 100% reduce 0%
12/07/03 01:34:07 INFO mapred.JobClient:  map 100% reduce 100%
12/07/03 01:34:12 INFO mapred.JobClient: Job complete: job_201207030133_0001
12/07/03 01:34:12 INFO mapred.JobClient: Counters: 26
12/07/03 01:34:12 INFO mapred.JobClient:   Job Counters
12/07/03 01:34:12 INFO mapred.JobClient:     Launched reduce tasks=1
12/07/03 01:34:12 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=16347
12/07/03 01:34:12 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/07/03 01:34:12 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/07/03 01:34:12 INFO mapred.JobClient:     Launched map tasks=2
12/07/03 01:34:12 INFO mapred.JobClient:     Data-local map tasks=2
12/07/03 01:34:12 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=10004
12/07/03 01:34:12 INFO mapred.JobClient:   File Input Format Counters
12/07/03 01:34:12 INFO mapred.JobClient:     Bytes Read=147972
12/07/03 01:34:12 INFO mapred.JobClient:   File Output Format Counters
12/07/03 01:34:12 INFO mapred.JobClient:     Bytes Written=18
12/07/03 01:34:12 INFO mapred.JobClient:   FileSystemCounters
12/07/03 01:34:12 INFO mapred.JobClient:     FILE_BYTES_READ=28
12/07/03 01:34:12 INFO mapred.JobClient:     HDFS_BYTES_READ=148184
12/07/03 01:34:12 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=62923
12/07/03 01:34:12 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=18
12/07/03 01:34:12 INFO mapred.JobClient:   Map-Reduce Framework
12/07/03 01:34:12 INFO mapred.JobClient:     Map output materialized bytes=34
12/07/03 01:34:12 INFO mapred.JobClient:     Map input records=13130
12/07/03 01:34:12 INFO mapred.JobClient:     Reduce shuffle bytes=34
12/07/03 01:34:12 INFO mapred.JobClient:     Spilled Records=4
12/07/03 01:34:12 INFO mapred.JobClient:     Map output bytes=118161
12/07/03 01:34:12 INFO mapred.JobClient:     Map input bytes=1777168
12/07/03 01:34:12 INFO mapred.JobClient:     Combine input records=13129
12/07/03 01:34:12 INFO mapred.JobClient:     SPLIT_RAW_BYTES=212
12/07/03 01:34:12 INFO mapred.JobClient:     Reduce input records=2
12/07/03 01:34:12 INFO mapred.JobClient:     Reduce input groups=2
12/07/03 01:34:12 INFO mapred.JobClient:     Combine output records=2
12/07/03 01:34:12 INFO mapred.JobClient:     Reduce output records=2
12/07/03 01:34:12 INFO mapred.JobClient:     Map output records=13129
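The counters in this output are worth reading: `Combine input records=13129` against `Combine output records=2` shows how much the combiner cut down the data passed to the reducer. As a small sketch, counter values could be pulled out of saved client output as below. The class `CounterLogParser` is hypothetical, and the code assumes each log entry sits on its own line in the `... INFO mapred.JobClient: name=value` form seen above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading counters out of saved JobClient output.
public class CounterLogParser {
    // Matches "... INFO mapred.JobClient: <name>=<digits>" at the end of a line.
    private static final Pattern COUNTER =
        Pattern.compile("INFO mapred\\.JobClient:\\s+(.+)=(\\d+)\\s*$");

    // Returns counter name -> value, in the order the counters appear.
    // Lines without a name=value pair (group headings, progress lines) are ignored.
    public static Map<String, Long> parse(String log) {
        Map<String, Long> counters = new LinkedHashMap<>();
        for (String line : log.split("\\r?\\n")) {
            Matcher m = COUNTER.matcher(line);
            if (m.find()) {
                counters.put(m.group(1).trim(), Long.parseLong(m.group(2)));
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        String log = String.join("\n",
            "12/07/03 01:34:12 INFO mapred.JobClient: Map-Reduce Framework",
            "12/07/03 01:34:12 INFO mapred.JobClient: Combine input records=13129",
            "12/07/03 01:34:12 INFO mapred.JobClient: Combine output records=2");
        Map<String, Long> c = parse(log);
        // prints 13129 -> 2
        System.out.println(c.get("Combine input records") + " -> " + c.get("Combine output records"));
    }
}
```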
4. MapReduce web UI
http://server:50030
Job details