
Resolving a distcp problem on Hadoop 2.4.0

2014-05-29 16:32
hadoop distcp hftp://nn.xxx.xx.com:50070/user/nlp/warehouse/t_m_user_key_action /user/nlp/warehouse/dw1

This fails with:

Caused by: java.io.IOException: Check-sum mismatch between hftp://xxx:50070/foo/yyy.yy and hdfs://dst:8020/foo/xxx.xx

Quote:

— Distcp using MRv2 (YARN) from a CDH3 cluster to a CDH4 cluster may fail with CRC mismatch errors

Running distcp on a CDH4 YARN cluster with a CDH3 hftp source will fail if the CRC checksum type being used is the CDH4 default (CRC32C). This is because the default checksum type was changed in CDH4 from the CDH3 default of CRC32.

Bug: HADOOP-8060

Severity: Medium

Anticipated Resolution: To be fixed in an upcoming release

Workaround: You can work around this issue by changing the CRC checksum type on the CDH4 cluster to the CDH3 default, CRC32. To do this set dfs.checksum.type to CRC32 in hdfs-site.xml.

Add the following to hdfs-site.xml:

<property>
  <name>dfs.checksum.type</name>
  <value>CRC32</value>
</property>
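If editing hdfs-site.xml on the whole cluster is not desirable, the same property can probably be passed for a single copy job with the generic -D option. The sketch below is only an illustration under assumptions: it assumes the job's writers honour a dfs.checksum.type value supplied on the command line, and distcp_crc32 and DRY_RUN are hypothetical names that are not part of Hadoop.

```shell
#!/bin/sh
# Sketch: set the checksum type for one distcp job instead of cluster-wide.
# Assumption: dfs.checksum.type passed via -D is applied when the job writes files.
# distcp_crc32 and DRY_RUN are hypothetical names used only for this example.
distcp_crc32() {
  src="$1"   # source URI, e.g. an hftp:// path on the old cluster
  dst="$2"   # destination path on the cluster running the job
  # When DRY_RUN is set to a non-empty value, print the command instead of running it.
  ${DRY_RUN:+echo} hadoop distcp -Ddfs.checksum.type=CRC32 "$src" "$dst"
}
```

Running it with DRY_RUN=1 first, for example DRY_RUN=1 distcp_crc32 hftp://nn.xxx.xx.com:50070/user/nlp/warehouse/t_m_user_key_action /user/nlp/warehouse/dw1, prints the command that would be executed so it can be reviewed before the real copy.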

Note: the cluster on which the command is run must have hosts entries for every node of the other cluster.
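A quick way to check the hosts requirement before starting the copy is to test whether each remote host name resolves. This is a minimal sketch, assuming a Linux machine where getent is available. check_hosts is a hypothetical helper, and the host names passed to it have to be supplied by you.

```shell
#!/bin/sh
# Sketch: report host names that do not resolve on this machine.
# check_hosts is a hypothetical helper; pass the other cluster's host names as arguments.
check_hosts() {
  missing=0
  for h in "$@"; do
    # getent consults /etc/hosts and any other configured name sources.
    if ! getent hosts "$h" >/dev/null 2>&1; then
      echo "unresolved: $h"
      missing=1
    fi
  done
  # Exit status is 0 only when every name resolved.
  return "$missing"
}
```

Any name printed as unresolved needs an entry in /etc/hosts (or DNS) on the nodes that run the copy.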