A Detailed Look at the Hadoop Helper Classes ToolRunner and Configured
2015-06-03 15:17
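When you first start learning Hadoop, one of the most frustrating things is not understanding how the program you wrote actually executes. Consider the example below: the test class ToolRunnerTest extends Configured and implements the Tool interface. Once we walk through the source of the base types it relies on, the execution flow turns out to be quite simple.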
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class ToolRunnerTest extends Configured implements Tool {
    public int run(String[] arg0) throws Exception {
        // Call getConf() inherited from the base class Configured to obtain the Configuration instance
        Configuration conf = getConf();
        // Look up property values
        System.out.println("flower is " + conf.get("flower"));
        System.out.println("color id " + conf.get("color"));
        System.out.println("blossom ? " + conf.get("blossom"));
        System.out.println("this is the host default name ="
                + conf.get("fs.default.name"));
        return 0;
    }

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        // Load the current configuration
        Configuration conf = new Configuration();
        // Let ToolRunner.run drive our custom Tool implementation
        ToolRunner.run(conf, new ToolRunnerTest(), args);
    }
}
The base class Configured implements the Configurable interface. The source of Configurable is as follows:
public interface Configurable
{
    public abstract void setConf(Configuration configuration);
    public abstract Configuration getConf();
}
Configured therefore has to implement both methods of the Configurable interface. Its source is as follows:
public class Configured
    implements Configurable
{
    public Configured()
    {
        this(null);
    }
    public Configured(Configuration conf)
    {
        setConf(conf);
    }
    public void setConf(Configuration conf)
    {
        this.conf = conf;
    }
    public Configuration getConf()
    {
        return conf;
    }
    private Configuration conf;
}
The source of Tool is as follows:
public interface Tool extends Configurable {
    int run(String [] args) throws Exception;
}
That is all there is to it.
The source of the ToolRunner class is as follows:
public class ToolRunner
{
    public ToolRunner()
    {
    }
    public static int run(Configuration conf, Tool tool, String args[])
        throws Exception
    {
        if(conf == null)
            conf = new Configuration();
        GenericOptionsParser parser = new GenericOptionsParser(conf, args);
        tool.setConf(conf);
        String toolArgs[] = parser.getRemainingArgs();
        return tool.run(toolArgs);
    }
    public static int run(Tool tool, String args[])
        throws Exception
    {
        return run(tool.getConf(), tool, args);
    }
    public static void printGenericCommandUsage(PrintStream out)
    {
        GenericOptionsParser.printGenericCommandUsage(out);
    }
    public static boolean confirmPrompt(String prompt)
        throws IOException
    {
        do
        {
            ...
        } while(true);
    }
}
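Analysis: when the program calls ToolRunner.run(conf, new ToolRunnerTest(), args), control passes to the three-argument run method of ToolRunner. Because a Configuration instance has already been created, the null check is skipped. GenericOptionsParser then parses the generic options in args into conf, tool.setConf(conf) hands that configuration to the tool, and execution reaches tool.run(toolArgs). Since Tool is an interface that only declares run, the call is dispatched to the run method of the implementing class, ToolRunnerTest, which prints the output. After reading the source of these few classes, the execution flow is easy to follow.
Running the example produces the following output: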
master:/opt>hadoop jar ToolRunnerTest.jar
flower is null
color id null
blossom ? null
14/09/30 10:13:21 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
this is the host default name =hdfs://master:8020
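The three null values above appear because nothing defines flower, color, or blossom: they are not in any configuration file and were not passed on the command line. The sketch below is a simplified, self-contained model of the same dispatch pattern, not Hadoop's actual code. It assumes a plain Map in place of Configuration, and the names MiniToolRunner, parseGenericOptions, and FlowerTool are hypothetical. The stand-in parser only mimics the -D property=value generic option.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the ToolRunner pattern. None of these types are
// Hadoop classes; a plain Map plays the role of Configuration.
public class MiniToolRunner {

    interface Configurable {
        void setConf(Map<String, String> conf);
        Map<String, String> getConf();
    }

    interface Tool extends Configurable {
        int run(String[] args) throws Exception;
    }

    // Mirrors Configured: it only stores and returns the configuration.
    static class Configured implements Configurable {
        private Map<String, String> conf;
        public void setConf(Map<String, String> conf) { this.conf = conf; }
        public Map<String, String> getConf() { return conf; }
    }

    // Hypothetical helper standing in for GenericOptionsParser: copies
    // "-D key=value" and "-Dkey=value" pairs into conf and returns the
    // remaining arguments. Pairs without '=' are ignored in this sketch.
    public static String[] parseGenericOptions(Map<String, String> conf, String[] args) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String pair = null;
            if (args[i].equals("-D") && i + 1 < args.length) {
                pair = args[++i];
            } else if (args[i].startsWith("-D") && args[i].length() > 2) {
                pair = args[i].substring(2);
            }
            int eq = (pair == null) ? -1 : pair.indexOf('=');
            if (eq > 0) {
                conf.put(pair.substring(0, eq), pair.substring(eq + 1));
            } else if (pair == null) {
                remaining.add(args[i]);
            }
        }
        return remaining.toArray(new String[0]);
    }

    // Same order of steps as the ToolRunner.run source quoted earlier:
    // default the conf, parse options into it, inject it, then delegate.
    public static int run(Map<String, String> conf, Tool tool, String[] args) throws Exception {
        if (conf == null) {
            conf = new HashMap<>();
        }
        String[] toolArgs = parseGenericOptions(conf, args);
        tool.setConf(conf);
        return tool.run(toolArgs);
    }

    public static class FlowerTool extends Configured implements Tool {
        public int run(String[] args) {
            System.out.println("flower is " + getConf().get("flower"));
            System.out.println("remaining args: " + String.join(" ", args));
            return 0;
        }
    }

    public static void main(String[] args) throws Exception {
        String[] demo = {"-D", "flower=rose", "input.txt"};
        int exitCode = run(new HashMap<>(), new FlowerTool(), demo);
        System.out.println("exit code: " + exitCode);
    }
}
```

Running main prints "flower is rose", then "remaining args: input.txt", then "exit code: 0". With the real classes the same idea should apply: invoking the original example as hadoop jar ToolRunnerTest.jar -D flower=rose lets GenericOptionsParser place the property in the Configuration before run is called, so conf.get("flower") would no longer return null.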
Source: http://www.it165.net/admin/html/201410/3821.html