Pro Hadoop (9) - MapReduce Job Basics - Running the Job
2010-11-30 01:53
1.1 Running the Job
The ultimate goal of configuring your MapReduce job is to run it. The MapReduceIntro.java sample program demonstrates a simple way to launch a job, as shown in Listing 2-1:

    logger.info("Launching the job.");
    /** Send the job configuration to the framework
     *  and request that the job be run. */
    final RunningJob job = JobClient.runJob(conf);
    logger.info("The job has completed.");
The runJob() method submits the configuration to the framework and returns only after the framework has finished running the job. The returned job object holds the result information.
The RunningJob class provides a number of methods for examining the outcome. Probably the most useful is job.isSuccessful().
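As a sketch of how that check might be used, the fragment below wires runJob() and isSuccessful() together (this uses the old org.apache.hadoop.mapred API that MapReduceIntro.java is built on; the job configuration details and the exit-code convention are illustrative assumptions, not taken from the book):

```java
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class RunAndCheck {
    public static void main(String[] args) throws Exception {
        // Assumed setup: input/output paths, mapper, and reducer
        // would be configured on this JobConf as in MapReduceIntro.java.
        JobConf conf = new JobConf(RunAndCheck.class);

        // runJob() blocks until the framework has finished the job.
        RunningJob job = JobClient.runJob(conf);

        // Examine the outcome; the exit code here is an assumption.
        if (job.isSuccessful()) {
            System.out.println("The job completed successfully.");
        } else {
            System.err.println("The job failed.");
            System.exit(1);
        }
    }
}
```

Note that runJob() also throws an IOException if the job fails, so the isSuccessful() check is a belt-and-suspenders confirmation after a normal return.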
Run MapReduceIntro.java as follows (using the ch2.jar file from the book's downloadable code):
hadoop jar DOWNLOAD_PATH/ch2.jar com.apress.hadoopbook.examples.ch2.MapReduceIntro
The response is as follows:
ch2.MapReduceIntroConfig: Generating 3 input files of random data, each record
is a random number TAB the input file name
ch2.MapReduceIntro: Launching the job.
jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
mapred.JobClient: Use GenericOptionsParser for parsing the arguments.
Applications should implement Tool for the same.
mapred.FileInputFormat: Total input paths to process : 3
mapred.FileInputFormat: Total input paths to process : 3
mapred.FileInputFormat: Total input paths to process : 3
mapred.FileInputFormat: Total input paths to process : 3
mapred.JobClient: Running job: job_local_0001
mapred.MapTask: numReduceTasks: 1
mapred.MapTask: io.sort.mb = 1
mapred.MapTask: data buffer = 796928/996160
mapred.MapTask: record buffer = 2620/3276
mapred.MapTask: Starting flush of map output
mapred.MapTask: bufstart = 0; bufend = 664; bufvoid = 996160
mapred.MapTask: kvstart = 0; kvend = 14; length = 3276
mapred.MapTask: Index: (0, 694, 694)
mapred.MapTask: Finished spill 0
mapred.LocalJobRunner: file:/tmp/MapReduceIntroInput/file-2:0+664
mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_000000_0' to
file:/tmp/MapReduceIntroOutput
mapred.MapTask: numReduceTasks: 1
mapred.MapTask: io.sort.mb = 1
mapred.MapTask: data buffer = 796928/996160
mapred.MapTask: record buffer = 2620/3276
mapred.MapTask: Starting flush of map output
mapred.MapTask: bufstart = 0; bufend = 3418; bufvoid = 996160
mapred.MapTask: kvstart = 0; kvend = 72; length = 3276
mapred.MapTask: Index: (0, 3564, 3564)
mapred.MapTask: Finished spill 0
mapred.LocalJobRunner: file:/tmp/MapReduceIntroInput/file-1:0+3418
mapred.TaskRunner: Task 'attempt_local_0001_m_000001_0' done.
mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_000001_0' to
file:/tmp/MapReduceIntroOutput
mapred.MapTask: numReduceTasks: 1
mapred.MapTask: io.sort.mb = 1
mapred.MapTask: data buffer = 796928/996160
mapred.MapTask: record buffer = 2620/3276
mapred.MapTask: Starting flush of map output
mapred.MapTask: bufstart = 0; bufend = 3986; bufvoid = 996160
mapred.MapTask: kvstart = 0; kvend = 84; length = 3276
mapred.MapTask: Index: (0, 4156, 4156)
mapred.MapTask: Finished spill 0
mapred.LocalJobRunner: file:/tmp/MapReduceIntroInput/file-0:0+3986
mapred.TaskRunner: Task 'attempt_local_0001_m_000002_0' done.
mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_000002_0' to
file:/tmp/MapReduceIntroOutput
mapred.ReduceTask: Initiating final on-disk merge with 3 files
mapred.Merger: Merging 3 sorted segments
mapred.Merger: Down to the last merge-pass, with 3 segments left of total size:
8414 bytes
mapred.LocalJobRunner: reduce > reduce
mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
mapred.TaskRunner: Saved output of task 'attempt_local_0001_r_000000_0' to
file:/tmp/MapReduceIntroOutput
mapred.JobClient: Job complete: job_local_0001
mapred.JobClient: Counters: 11
mapred.JobClient: File Systems
mapred.JobClient: Local bytes read=230060
mapred.JobClient: Local bytes written=319797
mapred.JobClient: Map-Reduce Framework
mapred.JobClient: Reduce input groups=170
mapred.JobClient: Combine output records=0
mapred.JobClient: Map input records=170
mapred.JobClient: Reduce output records=170
mapred.JobClient: Map output bytes=8068
mapred.JobClient: Map input bytes=8068
mapred.JobClient: Combine input records=0
mapred.JobClient: Map output records=170
mapred.JobClient: Reduce input records=170
ch2.MapReduceIntro: The job has completed.
ch2.MapReduceIntro: The job completed successfully.
Congratulations, you have just successfully run a MapReduce job.
The reduce task produces a single output file, /tmp/MapReduceIntroOutput/part-00000, which contains a series of lines, each of the form:
Number TAB file:/tmp/MapReduceIntroInput/file-N
The first thing you will notice is that the numbers are not in numerical order. The code that generated the input produced a random number as the key of each input line, but the sample program told the framework that the keys are of type Text. As a result, the framework sorted the keys as character strings rather than as numbers, which is not the ordering we want.
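The character ordering of Text keys can be reproduced with plain Java string sorting, since strings compare character by character. This is a minimal illustration, not part of the sample program:

```java
import java.util.Arrays;

// Demonstrates why numeric keys stored as Text sort in character order:
// comparing character by character gives "700" < "80" < "9",
// because '7' < '8' < '9' in the first position.
public class TextKeyOrderDemo {
    public static void main(String[] args) {
        String[] textKeys = {"9", "80", "700"};
        Arrays.sort(textKeys); // lexicographic, like Text key comparison
        System.out.println(Arrays.toString(textKeys)); // prints [700, 80, 9]
    }
}
```

To get numeric ordering instead, the job would declare a numeric key type such as LongWritable, whose comparator sorts by value rather than by characters.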