
A Detailed Look at the Hadoop Helper Classes ToolRunner and Configured

2015-06-03 15:17
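When you start learning Hadoop, one of the most frustrating things is not being able to follow how the program you wrote actually gets executed. Let us begin with an example: the test class ToolRunnerTest extends Configured and implements the Tool interface. Walking through the source of the base types it relies on shows that the execution flow is quite simple.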



import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ToolRunnerTest extends Configured implements Tool {

    public int run(String[] arg0) throws Exception {
        // Call getConf() from the base class Configured to obtain the configuration instance
        Configuration conf = getConf();

        // Read property values
        System.out.println("flower is " + conf.get("flower"));
        System.out.println("color id " + conf.get("color"));
        System.out.println("blossom ? " + conf.get("blossom"));
        System.out.println("this is the host default name ="
                + conf.get("fs.default.name"));
        return 0;
    }

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        // Load the current configuration
        Configuration conf = new Configuration();

        // Let ToolRunner.run drive the custom Tool implementation
        ToolRunner.run(conf, new ToolRunnerTest(), args);
    }
}



The base class Configured implements the Configurable interface, whose source is as follows:


public interface Configurable {

    void setConf(Configuration configuration);

    Configuration getConf();
}


Configured must therefore implement both methods of the Configurable interface. Its source is as follows:


public class Configured implements Configurable {

    private Configuration conf;

    public Configured() {
        this(null);
    }

    public Configured(Configuration conf) {
        setConf(conf);
    }

    public void setConf(Configuration conf) {
        this.conf = conf;
    }

    public Configuration getConf() {
        return conf;
    }
}
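
To make the role of Configured concrete, here is a minimal sketch. It uses hypothetical stand-in types (MiniConf, MiniConfigurable, MiniConfigured) rather than Hadoop's real classes, so it runs without Hadoop on the classpath. It mirrors the shape of the source above and shows that an object built with the no-argument constructor returns null from getConf() until setConf is called.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfiguredSketch {

    // Hypothetical stand-in for org.apache.hadoop.conf.Configuration:
    // just a string-to-string property map.
    static class MiniConf {
        private final Map<String, String> props = new HashMap<>();

        void set(String key, String value) {
            props.put(key, value);
        }

        String get(String key) {
            return props.get(key); // null when the key was never set
        }
    }

    // Same two methods as the Configurable interface shown above.
    interface MiniConfigurable {
        void setConf(MiniConf conf);

        MiniConf getConf();
    }

    // Same structure as the Configured class shown above.
    static class MiniConfigured implements MiniConfigurable {
        private MiniConf conf;

        MiniConfigured() {
            this(null); // no configuration yet
        }

        MiniConfigured(MiniConf conf) {
            setConf(conf);
        }

        public void setConf(MiniConf conf) {
            this.conf = conf;
        }

        public MiniConf getConf() {
            return conf;
        }
    }

    public static void main(String[] args) {
        MiniConfigured tool = new MiniConfigured();
        System.out.println(tool.getConf() == null); // prints "true"

        MiniConf conf = new MiniConf();
        conf.set("flower", "rose");
        tool.setConf(conf);
        System.out.println(tool.getConf().get("flower")); // prints "rose"
    }
}
```

In the example at the top of this article, new ToolRunnerTest() is created in exactly this way, so its configuration only becomes available once ToolRunner calls setConf on it.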


The source of Tool is as follows:

public interface Tool extends Configurable {
    int run(String[] args) throws Exception;
}
That is all there is to it.

The source of the ToolRunner class is as follows:


public class ToolRunner {

    public ToolRunner() {
    }

    public static int run(Configuration conf, Tool tool, String[] args)
            throws Exception {
        if (conf == null)
            conf = new Configuration();
        GenericOptionsParser parser = new GenericOptionsParser(conf, args);
        tool.setConf(conf);
        String[] toolArgs = parser.getRemainingArgs();
        return tool.run(toolArgs);
    }

    public static int run(Tool tool, String[] args)
            throws Exception {
        return run(tool.getConf(), tool, args);
    }

    public static void printGenericCommandUsage(PrintStream out) {
        GenericOptionsParser.printGenericCommandUsage(out);
    }

    public static boolean confirmPrompt(String prompt)
            throws IOException {
        do {
            ...
        } while (true);
    }
}


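Analysis: when the program executes ToolRunner.run(conf, new ToolRunnerTest(), args);, control passes to the three-argument run method of ToolRunner. Because the Configuration has already been instantiated, the conf == null branch is skipped. GenericOptionsParser then applies any generic options found in args to conf, tool.setConf(conf) hands the configuration to the tool, and execution reaches tool.run(toolArgs);. Since Tool is an interface that only declares the signature of run, the call dispatches to the run method of the implementing class, ToolRunnerTest, which prints the output. Once you have read the source of these few classes, the execution flow is quite simple.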
Running the ToolRunnerTest example produces the following output:

master:/opt>hadoop jar ToolRunnerTest.jar
flower is null
color id null
blossom ? null
14/09/30 10:13:21 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
this is the host default name =hdfs://master:8020
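
The three custom properties print as null because nothing on the command line or in the configuration files set them. The real GenericOptionsParser accepts generic options such as -D property=value and writes them into the Configuration before the tool runs. The following is a minimal sketch of that flow, assuming hypothetical stand-in types (MiniConf, MiniTool, MiniToolRunner, FlowerTool) in place of Hadoop's classes so that it runs on its own. Its parseGenericOptions helper is a deliberate simplification that only understands -D key=value passed as two tokens; it is not Hadoop's actual parser.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToolRunnerFlowSketch {

    // Hypothetical stand-in for Hadoop's Configuration.
    static class MiniConf {
        private final Map<String, String> props = new HashMap<>();

        void set(String key, String value) {
            props.put(key, value);
        }

        String get(String key) {
            return props.get(key);
        }
    }

    // Stand-in for Tool: the Configurable methods plus run.
    interface MiniTool {
        void setConf(MiniConf conf);

        MiniConf getConf();

        int run(String[] args) throws Exception;
    }

    // Stand-in for ToolRunner, following the same steps as the source above.
    static class MiniToolRunner {
        static int run(MiniConf conf, MiniTool tool, String[] args) throws Exception {
            if (conf == null) {
                conf = new MiniConf();
            }
            // Generic options are applied to conf; the rest is left for the tool.
            String[] toolArgs = parseGenericOptions(conf, args);
            tool.setConf(conf);
            return tool.run(toolArgs);
        }

        // Simplified assumption: only "-D key=value" (two tokens) is recognized.
        static String[] parseGenericOptions(MiniConf conf, String[] args) {
            List<String> remaining = new ArrayList<>();
            for (int i = 0; i < args.length; i++) {
                if ("-D".equals(args[i]) && i + 1 < args.length) {
                    String[] kv = args[++i].split("=", 2);
                    if (kv.length == 2) {
                        conf.set(kv[0], kv[1]);
                    }
                } else {
                    remaining.add(args[i]);
                }
            }
            return remaining.toArray(new String[0]);
        }
    }

    // A tool in the spirit of ToolRunnerTest.
    static class FlowerTool implements MiniTool {
        private MiniConf conf;

        public void setConf(MiniConf conf) {
            this.conf = conf;
        }

        public MiniConf getConf() {
            return conf;
        }

        public int run(String[] args) {
            System.out.println("flower is " + conf.get("flower"));
            System.out.println("remaining args: " + args.length);
            return 0;
        }
    }

    public static void main(String[] args) throws Exception {
        // No options: the property was never set, so null is printed.
        MiniToolRunner.run(new MiniConf(), new FlowerTool(), new String[0]);
        // With -D: the property is in the configuration before run() is called.
        MiniToolRunner.run(new MiniConf(), new FlowerTool(),
                new String[] {"-D", "flower=rose", "input.txt"});
    }
}
```

The first call prints "flower is null" and "remaining args: 0"; the second prints "flower is rose" and "remaining args: 1". With the real classes, the equivalent would be to pass -D flower=rose after the jar name on the hadoop jar command line.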

Source: http://www.it165.net/admin/html/201410/3821.html