您的位置:首页 > 其它

Hive-函数大全1

2018-12-26 19:30 204 查看

1,小数的精确度round
注意:精度在hive中是个很大的问题,类似这样的操作最好通过round指定精度
hive> select round(8.4 % 4 , 2) from lxw_dual;
hive>round(sum(score)/sum(question_total_score)100,2) accuracy_rate
2,用不到的groupby的字段的显示collect_set(pp.co10)[0]
select collect_set(pp.co10)[0] as class_ids,pp.examinee_num1,collect_set(pp.co20)[0] as examinee_nums,sum(pp.dt1) jd,sum(pp.dt) as xz,sum(pp.dt2) bc from (select tt.co1 as co10,tt.examinee_num as examinee_num1,tt.co2 as co20,case tt.names
when “1” then tt.sm0.3
when “4” then tt.sm0.5
when “2” then tt.sm0.2
else “0”
end as dt,case tt.names
when “5” then tt.sm
else “0”
end as dt1,case tt.names
when “6” then tt.sm
else “0”
end as dt2
from (select collect_set(calss_name)[0] co1,examinee_num,collect_set(examinee_name)[0] co2,d.name as names,sum(score)/sum(part_question_mark) sm from subject_part c join (select a.question_id,b.name as name from question a left join question_type b on a.question_type_id=b.question_type_id) d on c.question_id=d.question_id group by examinee_num,d.name limit 100) as tt)as pp group by pp.examinee_num1
3,动态分区
hive建立动态分区表:
create external table
qilap_db.t_single_choice_record_detail(
exam_id int
comment “考试id”,
exam_start_time string
comment “考试开始时间”,
class_name string
comment “班级名”,
stu_id string
comment “学号”,
stu_name string
comment “学生姓名”,
question_id int
comment “题目id”,
is_objective int
comment “题目是否主客观”,
question_total_score int
comment “题目分数”,
score int
comment “题目得分”
)
comment “单选题答题记录详情表,记录了学生单选题的答题详情,以学科,阶段,难度分区”
partitioned by(
subject string,
stage string,
question_diffculty string
)
row format delimited fields terminated by “\001”
location “/qilap_db/each_question_type_answer_record/t_single_choice_record_detail”;
–动态导入数据
insert into table
qilap_db.t_single_choice_record_detail
partition(
subject,
stage,
question_diffculty
)
select
exam_id,
exam_start_time,
class_name,
stu_id,
stu_name,
question_id,
is_objective,
question_total_score,
score,
subject,
stage,
question_diffculty
from
qilap_db.t_question_detail
where
question_type=“单选题”;
4,开窗函数
row_number() over (order by cast(error_rate as double) desc) as rnk
(row_number() over (partition by tstgf.exam_id,tstgf.class_name order by tstgf.sum_score_progress desc)) as rnk
5,判断得分为空赋值0
nvl(score,0)
6,求成绩波动的函数
(object_score - LAG(object_score,1,object_score) over (partition by stu_id,stu_name order by exam_id asc)) as object_score_progress,
7,方差(stu_wave.sql)
(variance(cast(object_score as double)) over (partition by stu_id,stu_name order by exam_id asc rows between 1 preceding and current row)) 当前行往前推一行(两次成绩)求成绩波动
8,截取字符串
hive> select substr(‘abcde’,3) fromlxw_dual;
cde
substr(tqer.error_rate-tdqaer.error_rate,1,if(tqer.error_rate-tdqaer.error_rate>0,5,6)) difference_value,

内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: