hadoop-common source code analysis: Configuration

2016-04-21 16:05

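The Configuration class implements the Iterable and Writable interfaces, so its entries can be iterated over and it can be serialized (using Hadoop's own serialization).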

Configuration file format

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>lala</name>
<value>${user.home}/hadoopdata</value>
<final>true</final>
</property>
</configuration>

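Hadoop is configured through XML and supports variable expansion in property values. When get is called, a reference such as ${user.home} is first checked against System.getProperty(); in the example, ${user.home} is replaced with the current user's home directory. This means system properties can be used directly in Hadoop configuration, which improves portability.
When a property is declared final, configuration added later cannot override the value that was loaded first.
Because the file is parsed with Java's DOM parser, XML inclusion (XInclude) is also supported, so a configuration file can use

<xi:include href="" />

to split settings across files by category.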

Code analysis

The private nested class Resource

private static class Resource {
// Private nested class that pairs a resource object with its name
private final Object resource;
private final String name;
...
}

The nested classes DeprecatedKeyInfo, DeprecationDelta, and DeprecationContext

private static class DeprecatedKeyInfo {
private final String[] newKeys;
private final String customMessage;
private final AtomicBoolean accessed = new AtomicBoolean(false);
private final String getWarningMessage(String key) {
...
}
}

public static class DeprecationDelta {
private final String key;
private final String[] newKeys;
private final String customMessage;
}

private static class DeprecationContext {
// Maps each old (deprecated) key to its new keys
private final Map<String, DeprecatedKeyInfo> deprecatedKeyMap;
// Maps each new key back to its old key, for reverse lookup
private final Map<String, String> reverseDeprecatedKeyMap;
DeprecationContext(DeprecationContext other, DeprecationDelta[] deltas) {
...
this.deprecatedKeyMap = UnmodifiableMap.decorate(newDeprecatedKeyMap);
this.reverseDeprecatedKeyMap =UnmodifiableMap.decorate(newReverseDeprecatedKeyMap);
}
}

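DeprecatedKeyInfo holds the new keys and a message; if customMessage is null, getWarningMessage generates a default message.
DeprecationDelta holds a deprecated key and the new keys suggested in its place.
DeprecationContext bundles the deprecated keys together with their recommended replacement keys and messages.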

private static AtomicReference<DeprecationContext> deprecationContext =
new AtomicReference<DeprecationContext>(
new DeprecationContext(null, defaultDeprecations));

This is a global DeprecationContext held in an AtomicReference, so it is updated atomically, and it is initialized with the default deprecated keys.

The static addDeprecations method

Notably, this method is lock-free yet still keeps the data safe. Here is the code:

public static void addDeprecations(DeprecationDelta[] deltas) {
DeprecationContext prev, next;
do {
prev = deprecationContext.get();
next = new DeprecationContext(prev, deltas);
} while (!deprecationContext.compareAndSet(prev, next));
}

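compareAndSet replaces the current value with next only if the current value is the same object as prev (==). If another thread has swapped in a different context in the meantime, the call returns false and the loop retries against the latest context.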
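The retry loop above can be tried in isolation. The following is a minimal sketch of the same copy-on-write pattern, not Hadoop's code: KeySnapshot and LockFreeRegistry are hypothetical names, and the snapshot keeps only a simple old-key to new-key map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for DeprecationContext: an immutable snapshot of
// old-key -> new-key mappings. All names here are illustrative only.
final class KeySnapshot {
    final Map<String, String> oldToNew;

    KeySnapshot(KeySnapshot prev, Map<String, String> delta) {
        Map<String, String> merged = new HashMap<>();
        if (prev != null) {
            merged.putAll(prev.oldToNew);
        }
        merged.putAll(delta);
        // Expose a read-only view so a published snapshot never changes.
        this.oldToNew = Collections.unmodifiableMap(merged);
    }
}

final class LockFreeRegistry {
    private static final AtomicReference<KeySnapshot> CURRENT =
        new AtomicReference<>(
            new KeySnapshot(null, Collections.<String, String>emptyMap()));

    // Build a new snapshot from the one we read, then publish it only if
    // nobody else replaced the reference in the meantime; otherwise retry.
    static void add(Map<String, String> delta) {
        KeySnapshot prev;
        KeySnapshot next;
        do {
            prev = CURRENT.get();
            next = new KeySnapshot(prev, delta);
        } while (!CURRENT.compareAndSet(prev, next));
    }

    static String lookup(String oldKey) {
        return CURRENT.get().oldToNew.get(oldKey);
    }

    static int size() {
        return CURRENT.get().oldToNew.size();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 100; j++) {
                    String suffix = id + "." + j;
                    add(Collections.singletonMap("old." + suffix, "new." + suffix));
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        // Every add either wins its CAS or retries, so no update is lost.
        System.out.println("registered keys: " + size());
    }
}
```

Readers never block in this design: they call get() and work with whatever immutable snapshot is current. The cost is that each writer rebuilds the whole map, which is reasonable when writes are rare.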

setDeprecatedProperties

Reading the source, setDeprecatedProperties exists to update overlay and properties so that looking up a key reflects the latest state. Consider this example:

configuration.addDeprecation("xx", new String[]{"xx1","xx2","xx3"});
//configuration.setDeprecatedProperties();
System.out.println(configuration.get("xx"));

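With the call to configuration.setDeprecatedProperties commented out, get returned null in my test. So when iterating over keys that have been deprecated, call setDeprecatedProperties first so that the deprecated keys remain usable.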

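handleDeprecation

It first checks whether the key is deprecated and, if so, takes the suggested new keys. It then updates overlay and properties and returns the array of keys that should be used.

Likewise, handleDeprecation performs a refresh; it is used in asXmlDocument.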

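The static {} initializer block

Reading the code gives the following points:

If hadoop-site.xml is found on the classpath, a warning is logged and the file is not added to defaultResources.

Two core configuration files are loaded by default: core-default.xml and core-site.xml.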

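addResourceObject and related methods

Whichever addResource overload is used, it ends up calling addResourceObject(Resource resource). That method first adds the resource to the List of resources, then calls reloadConfiguration, which triggers a reload of properties and clears the record of keys marked final.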
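The add-then-invalidate flow can be illustrated with a small stand-alone class. This is a sketch under assumptions, not Hadoop's implementation: LazyReloadConfig is a hypothetical name, each resource is modeled as a plain Map, and the rebuild is done lazily on the next read.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: resources are kept in order, and adding one simply
// drops the cached view so that the next read rebuilds it.
final class LazyReloadConfig {
    private final List<Map<String, String>> resources = new ArrayList<>();
    private Map<String, String> cache; // null means "stale, rebuild on next read"
    private int rebuilds = 0;

    synchronized void addResource(Map<String, String> resource) {
        resources.add(resource);
        cache = null; // invalidate instead of merging eagerly
    }

    synchronized String get(String key) {
        if (cache == null) {
            cache = new HashMap<>();
            for (Map<String, String> resource : resources) {
                cache.putAll(resource); // later resources override earlier ones
            }
            rebuilds++;
        }
        return cache.get(key);
    }

    synchronized int rebuildCount() {
        return rebuilds;
    }

    public static void main(String[] args) {
        LazyReloadConfig conf = new LazyReloadConfig();
        Map<String, String> defaults = new HashMap<>();
        defaults.put("demo.dir", "/tmp/default");
        Map<String, String> site = new HashMap<>();
        site.put("demo.dir", "/data/site");

        conf.addResource(defaults);
        conf.addResource(site);
        System.out.println(conf.get("demo.dir"));
        System.out.println("rebuilds: " + conf.rebuildCount());
    }
}
```

Invalidating rather than merging on every add keeps addResource cheap and means several resources added in a row cost only one rebuild.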

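findSubVariable and substituteVars

Before hadoop-2.7 there was only the substituteVars method, which used Java's built-in regular expressions to extract the name inside ${user.home} (that is, user.home).

From hadoop-2.7 on, the name is extracted by a hand-written matcher, findSubVariable, to improve performance. P.S. Presumably Java's regular expressions were too expensive here, so matching by hand reduces the overhead.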

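// Finds the range of the text between "${" and "}" in the string; see the source for details
private static int[] findSubVariable(String eval) {
...
}
// 1. Replace the extracted key: if System.getProperty() has a value for it, substitute that
// 2. Otherwise look it up in properties: substitute if found, else leave the text as is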
private String substituteVars(String expr) {
...
}
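The lookup order described in the comments above can be sketched without regular expressions. This is a simplified illustration, not the Hadoop source: VarExpander, findVar, and expand are hypothetical names, nested variables are not handled, and substituted values are not scanned again.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "${name}" expansion with a hand-written scanner.
final class VarExpander {
    private final Map<String, String> props = new HashMap<>();

    void set(String key, String value) {
        props.put(key, value);
    }

    // Returns {start, end} of the first variable name inside "${...}",
    // or {-1, -1} if the string contains no complete variable reference.
    static int[] findVar(String s) {
        int open = s.indexOf("${");
        if (open < 0) {
            return new int[] {-1, -1};
        }
        int close = s.indexOf('}', open + 2);
        if (close < 0) {
            return new int[] {-1, -1};
        }
        return new int[] {open + 2, close};
    }

    String expand(String expr) {
        StringBuilder out = new StringBuilder();
        String rest = expr;
        while (true) {
            int[] range = findVar(rest);
            if (range[0] < 0) {
                return out.append(rest).toString();
            }
            String name = rest.substring(range[0], range[1]);
            // 1. System properties take precedence (an empty name is skipped,
            //    because System.getProperty rejects empty keys).
            String value = name.isEmpty() ? null : System.getProperty(name);
            // 2. Otherwise fall back to our own properties.
            if (value == null) {
                value = props.get(name);
            }
            out.append(rest, 0, range[0] - 2);
            // 3. Unresolved references are kept exactly as written.
            out.append(value != null ? value : "${" + name + "}");
            rest = rest.substring(range[1] + 1);
        }
    }

    public static void main(String[] args) {
        VarExpander expander = new VarExpander();
        expander.set("demo.base.dir", "/data");
        System.out.println(expander.expand("${user.home}/hadoopdata"));
        System.out.println(expander.expand("${demo.base.dir}/logs"));
        System.out.println(expander.expand("${demo.unknown}/x"));
    }
}
```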

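The set method

// Sets a key-value pair programmatically. source may be null; if the caller gives no source, "programatically" is recorded as the source.
// If the key is deprecated, the new keys are resolved first and their source is set to "because old key is deprecated".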
public void set(String name, String value, String source) {
...
}

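loadResource

This method parses the XML using a DOM parser. From the code, the XML must follow the format shown above, and DOM parsing also supports including other XML files.

Compared with earlier versions, a property in the XML configuration file can now contain a source tag, which appears to declare where the value came from: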

if ("source".equals(field.getTagName()) && field.hasChildNodes())
source.add(StringInterner.weakIntern(
((Text)field.getFirstChild()).getData()));
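To make the parsing step concrete, here is a minimal sketch that reads the property format shown earlier with the JDK's DOM parser and honors final. It is an illustration under assumptions, not the loadResource implementation: MiniConfigLoader is a hypothetical name, and XInclude and the source tag are left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: loads <property> elements and refuses to override
// any key that an earlier load marked as final.
final class MiniConfigLoader {
    final Map<String, String> props = new HashMap<>();
    final Set<String> finalKeys = new HashSet<>();

    void load(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringComments(true);
            doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse configuration", e);
        }
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (!(children.item(i) instanceof Element)) {
                continue; // skip whitespace text nodes
            }
            Element prop = (Element) children.item(i);
            if (!"property".equals(prop.getTagName())) {
                continue;
            }
            String name = null;
            String value = null;
            boolean isFinal = false;
            NodeList fields = prop.getChildNodes();
            for (int j = 0; j < fields.getLength(); j++) {
                if (!(fields.item(j) instanceof Element)) {
                    continue;
                }
                Element field = (Element) fields.item(j);
                String text = field.getTextContent().trim();
                switch (field.getTagName()) {
                    case "name": name = text; break;
                    case "value": value = text; break;
                    case "final": isFinal = "true".equals(text); break;
                    default: break; // other tags are ignored in this sketch
                }
            }
            if (name == null || value == null || finalKeys.contains(name)) {
                continue; // incomplete property, or locked by an earlier final
            }
            props.put(name, value);
            if (isFinal) {
                finalKeys.add(name);
            }
        }
    }

    public static void main(String[] args) {
        MiniConfigLoader loader = new MiniConfigLoader();
        loader.load("<configuration><property><name>lala</name>"
            + "<value>first</value><final>true</final></property></configuration>");
        loader.load("<configuration><property><name>lala</name>"
            + "<value>second</value></property></configuration>");
        System.out.println(loader.props.get("lala"));
    }
}
```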

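Hadoop provides further methods, such as asXmlDocument, which I won't go through one by one. I'm not sure how a source-analysis article should be written; I think it is best to read the source yourself first and come back here when something is unclear. The class runs to more than 3,000 lines, which makes it awkward to read, so please point out anything I have misunderstood and we can discuss it.

Summary:

1. When configuring Hadoop, ${user.home} can stand in for the current user's home directory, which is a neat trick.

2. Version 2.7 added parsing of the source tag in the XML configuration, which can store information about a value's origin. I will look into its exact purpose in a later analysis.

3. The design is clever and performance-minded throughout. I still need to work through the rest of the source and will extend this article once that is done.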

Copyright notice: this is an original article by the author and may not be reproduced without the author's permission.
