您的位置:首页 > 其它

hive 同一个表不同分区的数据求和

2017-05-12 16:53 246 查看
业务表如下:

hive> desc sc_kekaoxing;
OK
id                      string                  可靠小表id
car_id                  string                  car表id
month                   string                  月份
name                    string                  可靠名称父级
count                   string                  可靠数量
dt                      string
dp                      string

# Partition Information
# col_name              data_type               comment

dt                      string
dp                      string
Time taken: 0.128 seconds, Fetched: 13 row(s)


hive> select * from sc_kekaoxing limit 5;
OK
1       10      201612  车身外观        5       201612  qiche
1       11      201612  车身外观        6       201612  qiche
1       12      201612  车身外观        7       201612  qiche
1       13      201612  车身外观        2       201612  qiche
1       14      201612  车身外观        17      201612  qiche
Time taken: 0.386 seconds, Fetched: 5 row(s)


业务需求:不同dt分区下的字段count的值需要累加:

INSERT OVERWRITE local directory '/tmp/t01'
row format delimited fields terminated by '\t'
select
if(aa.id is null,bb.id,aa.id),
if(aa.car_id is null,bb.car_id,aa.car_id),
'201702',
if(aa.name is null,bb.name,aa.name),
case
when aa.count is not null and bb.count is not null
then aa.count+bb.count
when aa.count is null and bb.count is not null
then bb.count
when aa.count is not null and bb.count is null
then aa.count
else 0 end
from
(select * from sc_kekaoxing where dt='201612') aa
full outer join
(select * from sc_kekaoxing where dt='201702') bb
on aa.id=bb.id and aa.car_id=bb.car_id and aa.name=bb.name;
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签:  hive 数据