hive表导入数据:FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
2020-07-13 05:54
651 查看
hive程序报FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
欢迎使用Markdown编辑器
在进行拉链表练习的时候,练习代码如下:
数据
2019-08-21订单数据表(order) 1,2019-08-18,2019-08-18,创建 2,2019-08-18,2019-08-18,创建 3,2019-08-19,2019-08-21,支付 4,2019-08-19,2019-08-21,完成 5,2019-08-19,2019-08-20,支付 6,2019-08-20,2019-08-20,创建 7,2019-08-20,2019-08-21,支付 8,2019-08-21,2019-08-21,创建 2019-08-22订单数据表(order) *1,2019-08-18,2019-08-22,支付 *2,2019-08-18,2019-08-22,完成 3,2019-08-19,2019-08-21,支付 4,2019-08-19,2019-08-21,完成 5,2019-08-19,2019-08-20,支付 *6,2019-08-20,2019-08-22,支付 7,2019-08-20,2019-08-21,支付 *8,2019-08-21,2019-08-22,支付 *9,2019-08-22,2019-08-22,创建 *10,2019-08-22,2019-08-22,支付 2019-08-23订单数据表 1,2019-08-18,2019-08-23,完成 2,2019-08-18,2019-08-22,完成 3,2019-08-19,2019-08-23,完成 4,2019-08-19,2019-08-21,完成 5,2019-08-19,2019-08-23,完成 6,2019-08-20,2019-08-22,支付 7,2019-08-20,2019-08-21,支付 8,2019-08-21,2019-08-23,完成 9,2019-08-22,2019-08-22,创建 10,2019-08-22,2019-08-22,支付 11,2019-08-23,2019-08-23,创建 12,2019-08-23,2019-08-23,创建 13,2019-08-23,2019-08-23,支付
建表
CREATE TABLE orders_20190821 ( orderid INT, createtime STRING, modifiedtime STRING, status STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' stored AS textfile; CREATE TABLE orders_20190822 ( orderid INT, createtime STRING, modifiedtime STRING, status STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' stored AS textfile; CREATE TABLE orders_20190823 ( orderid INT, createtime STRING, modifiedtime STRING, status STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' stored AS textfile; 数据导入: load data local inpath '/root/hive/order_2019-08-21.txt' into table orders_20190821; load data local inpath '/root/hive/order_2019-08-22.txt' into table orders_20190822; load data local inpath '/root/hive/order_2019-08-23.txt' into table orders_20190823;
ODS层设计:
在数据仓库的ODS层,有一张订单的增量数据表,按天分区,存放每天的增量数据:
CREATE TABLE ods_orders_inc ( orderid INT, createtime STRING, modifiedtime STRING, status STRING )PARTITIONED BY (day STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' stored AS textfile;
DW层设计:
在数据仓库的DW层,有一张订单的历史数据拉链表,存放订单的历史状态数据:
CREATE TABLE dw_orders_his ( orderid INT, createtime STRING, modifiedtime STRING, status STRING, dw_start_date STRING, dw_end_date STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' stored AS textfile;
第一步,抽取全量数据到ODS:
INSERT overwrite TABLE ods_orders_inc PARTITION (day = '2019-08-20') SELECT orderid,createtime,modifiedtime,status FROM orders_20190821 WHERE createtime <= '2019-08-20';
第二步,从ODS刷新到DW:
INSERT overwrite TABLE dw_orders_his SELECT orderid,createtime,modifiedtime,status, createtime AS dw_start_date, '9999-12-31' AS dw_end_date FROM ods_orders_inc WHERE day = '2019-08-20';
增量数据导入
从2019-08-22开始,需要每天正常刷新前一天(2019-08-21)的增量数据到历史表。
第一步,通过增量抽取,将2019-08-21的数据抽取到ODS:
INSERT overwrite TABLE ods_orders_inc PARTITION (day = '2019-08-21') SELECT orderid,createtime,modifiedtime,status FROM orders_20190821 WHERE modifiedtime = '2019-08-21' or createtime = '2019-08-21';
第二步,通过DW历史数据(数据日期为2019-08-20),和ODS增量数据(2019-08-21),刷新历史表:
先把数据放到一张临时表中:临时存储,防止出错有地方追溯。
DROP TABLE IF EXISTS dw_orders_his_tmp; CREATE TABLE dw_orders_his_tmp AS SELECT orderid, createtime, modifiedtime, status, dw_start_date, dw_end_date FROM ( SELECT a.orderid, a.createtime, a.modifiedtime, a.status, a.dw_start_date, CASE WHEN b.orderid IS NOT NULL AND a.dw_end_date > '2019-08-21' THEN '2019-08-20' ELSE a.dw_end_date END AS dw_end_date FROM dw_orders_his a left outer join (SELECT * FROM ods_orders_inc WHERE day = '2019-08-21') b ON (a.orderid = b.orderid) UNION ALL SELECT orderid, createtime, modifiedtime, status, modifiedtime AS dw_start_date, '9999-12-31' AS dw_end_date FROM ods_orders_inc WHERE day = '2019-08-21' ) x ORDER BY orderid,dw_start_date;
最后overwrite到dw历史表:
INSERT overwrite TABLE dw_orders_his SELECT * FROM dw_orders_his_tmp;
解决办法
修改yarn-site.xml中的yarn.nodemanager.resource.memory-mb值大小(2G->4G)
原先设置
<property> <name>yarn.nodemanager.resource.memory-mb</name> <value>2048</value> </property>
修改后设置
<property> <name>yarn.nodemanager.resource.memory-mb</name> <value>4096</value> </property>
hive 1.x版本 不支持 not in +子查询(SubQuery)
借鉴:https://www.jianshu.com/p/c6543b65cb77
深入了解:https://blog.csdn.net/qq_26442553/article/details/80143559
相关文章推荐
- FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
- Hive报错:FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
- FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
- FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
- FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
- 解决hiveserver2报错:java.io.IOException: Job status not available - Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
- FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
- hive报FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
- FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
- CDH5 中使用HIVE的问题总结 return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
- hive-05-Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
- Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
- FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
- FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
- Hive 新建分区异常处理:FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
- hive在操作表时(如insert into table)关于return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask 错误
- hive创建表失败:FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
- FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException
- Hive code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask问题修复
- 一次hive reduce oom 处理:Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTas