
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hado

2014-12-02 09:00
While merging data today, I found that the map tasks succeeded but the reduce tasks kept failing.

Brief description of the setup: each day's collected data is merged into a summary table, partitioned by day.

As shown below, the map phase completed successfully every time, but the job failed in the final reduce stage. Tracking down the cause:

Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException):
The maximum path component name limit of query_date=

hive> from log_169_searchd_pro_20141122 insert into table searchd_pro1 PARTITION (query_date)

    > select to_date(query_date),real_time,wall_time,match_mode,filters_count,sort_mode,total_matches,offset,index_name,query;

Total jobs = 1

Launching Job 1 out of 1

Number of reduce tasks not specified. Estimated from input data size: 1

In order to change the average load for a reducer (in bytes):

  set hive.exec.reducers.bytes.per.reducer=<number>

In order to limit the maximum number of reducers:

  set hive.exec.reducers.max=<number>

In order to set a constant number of reducers:

  set mapreduce.job.reduces=<number>

Starting Job = job_1417056252041_0024, Tracking URL = http://master:8088/proxy/application_1417056252041_0024/
Kill Command = /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop job  -kill job_1417056252041_0024

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1

2014-12-02 08:22:48,257 Stage-1 map = 0%,  reduce = 0%

2014-12-02 08:23:10,167 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 17.47 sec

2014-12-02 08:23:24,801 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 22.34 sec

2014-12-02 08:23:27,935 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 17.47 sec

2014-12-02 08:23:39,476 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 22.16 sec

2014-12-02 08:23:43,664 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 17.47 sec

2014-12-02 08:23:55,185 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 22.22 sec

2014-12-02 08:23:58,325 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 17.47 sec

2014-12-02 08:24:09,757 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 22.25 sec

2014-12-02 08:24:12,887 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 17.47 sec

MapReduce Total cumulative CPU time: 17 seconds 470 msec

Ended Job = job_1417056252041_0024 with errors

Error during job, obtaining debugging information...

Examining task ID: task_1417056252041_0024_m_000000 (and more) from job job_1417056252041_0024

Task with the most failures(4): 

-----

Task ID:

  task_1417056252041_0024_r_000000

URL:

  http://master:8088/taskdetails.jsp?jobid=job_1417056252041_0024&tipid=task_1417056252041_0024_r_000000
-----

Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException):
The maximum path component name limit of query_date=%2212 位 太阳能 计算器 , 8 位 礼品 计算器 , 语音 计算器 12 位 , 8 位 太阳能 计算器 , 12 位 白色 计算器 , 8 位 数字 计算器 , 8 位 硅胶 计算器 , 8 位 翻盖 计算器 , 8 位 塑胶 计算器 , 12 位 台式 计算器 , 8 位 台式 计算器 , 8 位数 显 计算器%22%2F24 in directory /tmp/hive-hdfs/hive_2014-12-02_08-22-34_112_762235043055488695-1/_task_tmp.-ext-10000
is exceeded: limit=255 length=329

        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxComponentLength(FSDirectory.java:1915)

        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1989)

        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1759)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4149)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2625)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2509)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2397)

        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:550)

        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:108)

        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:388)

        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)

        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)

        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)

        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)

        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)

        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
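The check that fails here can be reproduced off-cluster. Below is a minimal Python sketch, assuming a simplified version of Hive's partition-path escaping (the real logic lives in Hive's `FileUtils.escapePathName`; the 255-character default comes from the HDFS setting `dfs.namenode.fs-limits.max-component-length`):

```python
# Approximate the check the NameNode applies to each path component.
# Hive builds a directory name "query_date=<escaped value>" for every
# dynamic partition; the NameNode rejects any single component longer
# than dfs.namenode.fs-limits.max-component-length (255 here).
MAX_COMPONENT_LENGTH = 255

def escape_path_name(value: str) -> str:
    """Rough stand-in (an approximation, not Hive's actual code) for
    FileUtils.escapePathName: percent-encode path-unsafe characters."""
    unsafe = set('"#%\'*/:=?\\{}[]^')
    return ''.join('%{:02X}'.format(ord(c)) if c in unsafe else c
                   for c in value)

def partition_component(column: str, value: str) -> str:
    """Build the directory-name component for one dynamic partition."""
    return '{}={}'.format(column, escape_path_name(value))

# A sane partition value fits easily:
print(partition_component('query_date', '2014-11-22'))

# A whole search-query string (as in the failing job) does not:
bad = partition_component('query_date', '太阳能计算器' * 50)
print('length={} limit={}'.format(len(bad), MAX_COMPONENT_LENGTH))
```

The first print gives `query_date=2014-11-22`; the second reports a length well over the 255 limit, matching the `limit=255 length=329` in the diagnostic above.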

Troubleshooting: from the output above, a path component clearly exceeds the maximum length during the merge. To narrow it down, let's insert just 2 records:

insert into table searchd_pro1 PARTITION (query_date) 

select to_date(query_date),real_time,wall_time,match_mode,filters_count,sort_mode,total_matches,offset,index_name,query from log_170_searchd_pro_20141130 limit 2;

MapReduce Jobs Launched: 

Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 4.75 sec   HDFS Read: 65802 HDFS Write: 261 SUCCESS

Stage-Stage-2: Map: 1  Reduce: 1   Cumulative CPU: 3.45 sec   HDFS Read: 621 HDFS Write: 324 SUCCESS

Total MapReduce CPU Time Spent: 8 seconds 200 msec

OK

Time taken: 71.38 seconds

That succeeded. But inspecting the data reveals a problem:

hive> select *

    >  from searchd_pro1;

OK

NULL    0.05    0.051   NULL    0       NULL    1       (0,4000)        product_distri  "遂宁" "仔猪" "行情"

NULL    0.055   0.055   NULL    0       NULL    0       (0,4000)        product_new_distri      wb 200 f "充电器" "包邮"

Time taken: 0.178 seconds, Fetched: 2 row(s)

There are NULLs, and the data is clearly misaligned: during the merge, the partition value is taken from the last field of the SELECT, which here is the free-text `query` column with a huge number of very long values (obvious skew). That is why the job failed.
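Hive's dynamic-partition INSERT matches SELECT expressions to target columns purely by position: first the table's regular columns in declared order, then the partition column(s) last. A small Python sketch of that positional matching (the column names mirror the tables above; `map_select` is an illustrative helper, not a Hive API):

```python
# Hive matches SELECT expressions to target columns by position only:
# the table's regular columns in declared order, then the dynamic
# partition column(s) last.
target_cols = ['real_time', 'wall_time', 'match_mode', 'filters_count',
               'sort_mode', 'total_matches', 'offset', 'index_name', 'query']
partition_cols = ['query_date']

def map_select(select_exprs):
    """Show which target column each SELECT expression lands in."""
    return dict(zip(target_cols + partition_cols, select_exprs))

# The failing query put to_date(query_date) FIRST, so every column is
# shifted by one and the free-text `query` field becomes the partition value:
wrong = map_select(['to_date(query_date)', 'real_time', 'wall_time',
                    'match_mode', 'filters_count', 'sort_mode',
                    'total_matches', 'offset', 'index_name', 'query'])
print(wrong['query_date'])   # the long search string ends up here

# With the partition expression last, every column lines up:
right = map_select(['real_time', 'wall_time', 'match_mode', 'filters_count',
                    'sort_mode', 'total_matches', 'offset', 'index_name',
                    'query', 'to_date(query_date)'])
print(right['query_date'])
```

This is exactly why the result rows above had NULLs (`to_date` applied to a numeric field) and why the raw query text became the partition path.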

Solution: put the partition column last in the SELECT so that every column maps to its correct position.

hive> insert into table searchd_pro PARTITION (query_date)

    > select real_time,wall_time,match_mode,filters_count,sort_mode,total_matches,offset,index_name,query,to_date(query_date) from log_170_searchd_pro_20141130;

Another approach: adding enough reducers can also make the job "succeed", but the data would still be wrong, so the root cause was never a shortage of reducers.
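For reference, the 255-character component limit seen in the log comes from the NameNode setting `dfs.namenode.fs-limits.max-component-length` in hdfs-site.xml (my understanding is that 0 disabled the check in some Hadoop 2.x releases). Raising it would only hide the symptom; the real fix remains correct column ordering:

```xml
<!-- hdfs-site.xml: per-component path name length limit.
     Shown here with the value this cluster enforces (per the log);
     raising it only papers over a misordered SELECT. -->
<property>
  <name>dfs.namenode.fs-limits.max-component-length</name>
  <value>255</value>
</property>
```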

Done.