Two ways to get the top k in Hive
2015-06-10 19:20
1. Using rank() over()
select *
from (
    select event_id,
           event_name,
           channel,
           pv,
           uv,
           rank() over (partition by channel order by pv desc, uv desc) as rank
    from (
        select event_id,
               channel,
               event_name,
               sum(pv) pv,
               sum(uv) uv
        from tablename
        where hp_cal_dt = '2015-06-09'
        group by event_id,
                 channel,
                 event_name
    ) a
) a
where rank < 4
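One thing to keep in mind with rank(): tied rows share the same rank, so a filter like rank < 4 can return more than three rows per channel. The sketch below uses made-up inline rows (the demo CTE, its values, and the rk alias are all assumptions for illustration, not the author's data) and assumes a Hive version that supports CTEs and SELECT without FROM.

```sql
-- Hypothetical sample rows; e3 and e4 tie on both pv and uv.
with demo as (
    select 'web' as channel, 'e1' as event_id, 100 as pv, 10 as uv
    union all
    select 'web' as channel, 'e2' as event_id, 90 as pv, 9 as uv
    union all
    select 'web' as channel, 'e3' as event_id, 80 as pv, 8 as uv
    union all
    select 'web' as channel, 'e4' as event_id, 80 as pv, 8 as uv
    union all
    select 'web' as channel, 'e5' as event_id, 70 as pv, 7 as uv
)
select channel,
       event_id,
       pv,
       uv,
       rk
from (
    select channel,
           event_id,
           pv,
           uv,
           -- Tied rows get the same rank, and the next rank is skipped.
           rank() over (partition by channel order by pv desc, uv desc) as rk
    from demo
) ranked
where rk < 4
```

Here e3 and e4 both receive rank 3, so the filter keeps four rows (e1 through e4) rather than three. If exactly k rows per group are needed, row_number() is the safer choice.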
2. Using row_number()
select event_id,
       event_name,
       channel,
       pv,
       uv
from (
    select event_id,
           event_name,
           channel,
           pv,
           uv
    from (
        select event_id,
               channel,
               event_name,
               sum(pv) pv,
               sum(uv) uv
        from tablename
        where hp_cal_dt = '2015-06-09'
        group by event_id,
                 channel,
                 event_name
    ) a
    distribute by channel
    sort by channel, pv desc, uv desc
) a
where row_number(channel) < 4
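The query above calls row_number(channel) as an ordinary function on rows that were already distributed and sorted, which presumably relies on a custom UDF registered in the author's environment (the post does not say). On Hive versions with built-in windowing functions (0.11 and later), the same top 3 per channel can likely be written with row_number() over (...) instead. A minimal sketch under that assumption, where the rn, agg, and ranked aliases are illustrative names:

```sql
-- Assumes Hive 0.11+ windowing functions; table and columns are from the post.
select event_id,
       event_name,
       channel,
       pv,
       uv
from (
    select event_id,
           event_name,
           channel,
           pv,
           uv,
           -- Number rows 1, 2, 3, ... within each channel, highest pv first.
           row_number() over (partition by channel order by pv desc, uv desc) as rn
    from (
        select event_id,
               channel,
               event_name,
               sum(pv) pv,
               sum(uv) uv
        from tablename
        where hp_cal_dt = '2015-06-09'
        group by event_id,
                 channel,
                 event_name
    ) agg
) ranked
where rn < 4
```

With this form no distribute by or sort by is needed, because the over clause does the partitioning and ordering. Unlike rank(), row_number() never repeats a number, so each channel yields at most three rows; which of several tied rows is kept is not deterministic unless the order by is made unique.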