
rank, dense_rank, row_number, ntile: comparison and use cases

2017-12-26 17:09
1. Prepare the data

1,a,10
2,a,12
3,b,13
4,b,12
5,a,14
6,a,15
7,a,13
8,b,11
9,a,16
10,b,17
11,a,14


2. Create the table

create table t_ntile(id int,name string,sal int)
row format delimited
fields terminated by ',';


3. Load the data

load data local inpath '/root/txt/t_ntile.txt' into table t_ntile;


4. Using and comparing rank, dense_rank, and row_number with over

-- Aliases avoid reusing the function names rank and row_number.
select id, name, sal
, rank() over (partition by name order by sal desc) as rk
, dense_rank() over (partition by name order by sal desc) as dense_rk
, row_number() over (partition by name order by sal desc) as row_num
from t_ntile;
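
To make the difference between the three functions concrete, here is a minimal Python sketch that mimics the query on the sample rows from step 1. It is not Hive code, only an illustration of the semantics: `rank` leaves gaps after ties, `dense_rank` does not, and `row_number` is always consecutive. The helper name `rank_rows` is hypothetical, and breaking ties by `id` is an assumption made so the sketch is deterministic (Hive does not guarantee an order among rows with equal `sal`).

```python
from itertools import groupby

# Sample rows from step 1: (id, name, sal)
ROWS = [
    (1, "a", 10), (2, "a", 12), (3, "b", 13), (4, "b", 12),
    (5, "a", 14), (6, "a", 15), (7, "a", 13), (8, "b", 11),
    (9, "a", 16), (10, "b", 17), (11, "a", 14),
]


def rank_rows(rows):
    """Return (id, name, sal, rank, dense_rank, row_number) tuples,
    partitioned by name and ordered by sal descending."""
    out = []
    # Sort by partition key, then sal descending. Sorting by id last is an
    # assumption that only makes the tie order deterministic in this sketch.
    ordered = sorted(rows, key=lambda r: (r[1], -r[2], r[0]))
    for _, part in groupby(ordered, key=lambda r: r[1]):
        rank = dense = 0
        prev_sal = None
        for row_number, (id_, name, sal) in enumerate(part, start=1):
            if sal != prev_sal:
                rank = row_number  # rank jumps to the row position after ties
                dense += 1         # dense_rank advances by exactly one
                prev_sal = sal
            out.append((id_, name, sal, rank, dense, row_number))
    return out


RANKED = rank_rows(ROWS)
for row in RANKED:
    print(row)
```

In partition `a` the two rows with `sal = 14` share rank 3; the next row gets rank 5 but dense_rank 4, while row_number simply counts 1 through 7.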




5. Practical use cases: Top N, the top 50% of rows, and the top third

-- Keep the rows whose rank falls within the top half of their partition.
select * from (
select id, name, sal
, rank() over (partition by name order by sal desc) as rk
, dense_rank() over (partition by name order by sal desc) as dense_rk
, row_number() over (partition by name order by sal desc) as row_num
, count(*) over (partition by name) * 0.5 as half_count
from t_ntile
) t where t.rk <= t.half_count;
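
The filter above compares each row's rank with half of its partition's row count. Because tied rows share a rank, this can keep slightly more than half of a partition; filtering on row_number instead keeps at most half. The Python sketch below illustrates that difference on the sample rows. It is not Hive code; `top_half` and its `use_row_number` flag are hypothetical names, and ties are broken by `id` only to make the sketch deterministic.

```python
from collections import defaultdict

# Sample rows from step 1: (id, name, sal)
ROWS = [
    (1, "a", 10), (2, "a", 12), (3, "b", 13), (4, "b", 12),
    (5, "a", 14), (6, "a", 15), (7, "a", 13), (8, "b", 11),
    (9, "a", 16), (10, "b", 17), (11, "a", 14),
]


def top_half(rows, use_row_number=False):
    """Keep rows whose position is <= half the partition size.
    Position is rank() by default, row_number() if use_row_number is True."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[1]].append(row)
    kept = []
    for name in sorted(parts):
        # Order by sal descending; id as tie-breaker is an assumption.
        part = sorted(parts[name], key=lambda r: (-r[2], r[0]))
        limit = len(part) * 0.5  # count(*) over (partition by name) * 0.5
        for pos, row in enumerate(part, start=1):
            if use_row_number:
                position = pos
            else:
                # rank() = 1 + number of rows with a strictly higher sal
                position = 1 + sum(1 for r in part if r[2] > row[2])
            if position <= limit:
                kept.append(row)
    return kept


BY_RANK = top_half(ROWS)
BY_ROW_NUMBER = top_half(ROWS, use_row_number=True)
print([r[0] for r in BY_RANK])
print([r[0] for r in BY_ROW_NUMBER])
```

For partition `a` (7 rows, limit 3.5) the rank-based filter keeps four rows because ids 5 and 11 tie at rank 3, while the row_number-based filter keeps three. Which behavior is preferable depends on whether tied rows should be kept together.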




6. ntile, a bucketing function that is also well suited to selecting the top 50%

-- sal is selected once; a repeated column name in the subquery makes select * ambiguous.
select * from (
select id, name, sal
, ntile(2) over (partition by name order by sal desc) as n2
, ntile(3) over (partition by name order by sal desc) as n3
from t_ntile
) t where t.n2 = 1;
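
As a rough illustration of how ntile distributes rows, the Python sketch below splits each ordered partition into n buckets of nearly equal size. It assumes the usual SQL behavior that, when the row count does not divide evenly, the leading buckets each take one extra row. This is a sketch of the semantics, not Hive code; the helper names `ntile` and `with_ntiles` are hypothetical, and ties are broken by `id` only for determinism.

```python
from collections import defaultdict

# Sample rows from step 1: (id, name, sal)
ROWS = [
    (1, "a", 10), (2, "a", 12), (3, "b", 13), (4, "b", 12),
    (5, "a", 14), (6, "a", 15), (7, "a", 13), (8, "b", 11),
    (9, "a", 16), (10, "b", 17), (11, "a", 14),
]


def ntile(n, size):
    """1-based bucket numbers for `size` ordered rows split into n buckets.
    Assumption: the first size % n buckets receive one extra row."""
    base, extra = divmod(size, n)
    buckets = []
    for b in range(1, n + 1):
        buckets.extend([b] * (base + (1 if b <= extra else 0)))
    return buckets


def with_ntiles(rows):
    """Return (id, name, sal, n2, n3), partitioned by name, sal descending."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[1]].append(row)
    out = []
    for name in sorted(parts):
        part = sorted(parts[name], key=lambda r: (-r[2], r[0]))
        n2 = ntile(2, len(part))
        n3 = ntile(3, len(part))
        for (id_, _, sal), b2, b3 in zip(part, n2, n3):
            out.append((id_, name, sal, b2, b3))
    return out


TILED = with_ntiles(ROWS)
TOP_HALF = [r for r in TILED if r[3] == 1]  # same filter as n2 = 1
for row in TOP_HALF:
    print(row)
```

With 7 rows in partition `a`, `ntile(2)` yields buckets of 4 and 3 rows and `ntile(3)` yields 3, 2, and 2, so `n2 = 1` selects the top half and `n3 = 1` the top third without computing a row count separately.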


Tags: hive