日萌社
人工智能AI:Keras PyTorch MXNet TensorFlow PaddlePaddle 深度学习实战(不定时更新)
row_number()排序函数 统计每个部门薪资最高的员工信息(同一个部门的员工按照薪资进行降序排序)第一种写法:row_number() over(partition by 一个或多个分组列 order by 一个或多个排序列 asc/desc) as 别名 //如果不写asc/desc的话,默认为asc 第二种写法:row_number() over(distribute by 一个或多个分组列 sort by 一个或多个排序列 asc/desc) as 别名在使用 row_number() over()函数时候,over()里头的分组以及排序的执行晚于 where 、group by、 order by 的执行。select *,row_number() over(distribute by deptid sort by salary desc) rn from employee;//1.distribute by deptid sort by salary desc:按照部门deptid进行分组,每个分组内按照薪资即salary进行降序排序,即同一个部门的员工按照薪资进行降序排序//2.分组条件:distribute by deptid 排序条件:sort by salary desc//3.rn:为别名,代表每个分组中每行数据的所在序号ID,可用于根据rn序号ID直接获取出每个分组中的第一条数据,作用大。统计结果empid deptid sex salary rn1 10 female 5500.0 12 10 male 4500.0 24 20 male 4800.0 13 20 female 1900.0 27 40 male 44500.0 16 40 female 14500.0 25 40 female 6500.0 39 50 male 7500.0 18 50 male 6500.0 25.直接取出rn的编号为1的记录,就是每个部门薪资最高的员工信息(获取出每个分组中薪资最高的员工信息,where条件为rn=1)select * from (select *,row_number() over(distribute by deptid sort by salary desc) rn from employee) t where t.rn=1;//1.distribute by deptid sort by salary desc:按照部门deptid进行分组,每个分组内按照薪资即salary进行降序排序,即同一个部门的员工按照薪资进行降序排序//2.分组条件:distribute by deptid 排序条件:sort by salary desc//3.rn:为别名,代表每个分组中序号ID。// t.rn=1:表示取每个分组中序号ID为1的数据统计结果empid deptid sex salary rn1 10 female 5500.0 14 20 male 4800.0 17 40 male 44500.0 19 50 male 7500.0 1
============================================================================
Oracle从8.1.6开始提供分析函数,分析函数用于计算基于组的某种聚合值,它和聚合函数的不同之处是:对于每个组返回多行,而聚合函数对于每个组只返回一行。
开窗函数指定了分析函数工作的数据窗口大小,这个数据窗口大小可能会随着行的变化而变化,举例如下:1:over后的写法: over(order by salary) 按照salary排序进行累计,order by是个默认的开窗函数over(partition by deptno)按照部门分区over(partition by deptno order by salary)2:开窗的窗口范围:over(order by salary range between 5 preceding and 5 following):窗口范围为当前行数据幅度减5加5后的范围内的。举例:--sum(s)over(order by s range between 2 preceding and 2 following) 表示加2或2的范围内的求和select name,class,s, sum(s)over(order by s range between 2 preceding and 2 following) mm from t2adf 3 45 45 --45加2减2即43到47,但是s在这个范围内只有45asdf 3 55 55cfe 2 74 743dd 3 78 158 --78在76到80范围内有78,80,求和得158fda 1 80 158gds 2 92 92ffd 1 95 190dss 1 95 190ddd 3 99 198gf 3 99 198over(order by salary rows between 5 preceding and 5 following):窗口范围为当前行前后各移动5行。举例:--sum(s)over(order by s rows between 2 preceding and 2 following)表示在上下两行之间的范围内select name,class,s, sum(s)over(order by s rows between 2 preceding and 2 following) mm from t2adf 3 45 174 (45+55+74=174)asdf 3 55 252 (45+55+74+78=252)cfe 2 74 332 (74+55+45+78+80=332)3dd 3 78 379 (78+74+55+80+92=379)fda 1 80 419gds 2 92 440ffd 1 95 461dss 1 95 480ddd 3 99 388gf 3 99 293over(order by salary range between unbounded preceding and unbounded following)或者over(order by salary rows between unbounded preceding and unbounded following):窗口不做限制3、与over函数结合的几个函数介绍row_number()over()、rank()over()和dense_rank()over()函数的使用下面以班级成绩表t2来说明其应用t2表信息如下:cfe 2 74dss 1 95ffd 1 95fda 1 80gds 2 92gf 3 99ddd 3 99adf 3 45asdf 3 553dd 3 78select * from ( select name,class,s,rank()over(partition by class order by s desc) mm from t2) where mm=1;得到的结果是:dss 1 95 1ffd 1 95 1gds 2 92 1gf 3 99 1ddd 3 99 1 1.在求第一名成绩的时候,不能用row_number(),因为如果同班有两个并列第一,row_number()只返回一个结果;select * from ( select name,class,s,row_number()over(partition by class order by s desc) mm from t2) where mm=1;1 95 1 --95有两名但是只显示一个2 92 13 99 1 --99有两名但也只显示一个2.rank()和dense_rank()可以将所有的都查找出来:如上可以看到采用rank可以将并列第一名的都查找出来;rank()和dense_rank()区别:--rank()是跳跃排序,有两个第二名时接下来就是第四名;select name,class,s,rank()over(partition by class order by s desc) mm from t2dss 1 95 1ffd 1 95 1fda 1 80 3 --直接就跳到了第三gds 2 92 1cfe 2 74 2gf 3 99 1ddd 3 99 13dd 3 78 3asdf 3 55 4adf 3 45 5--dense_rank() 是连续排序,有两个第二名时仍然跟着第三名select name,class,s,dense_rank()over(partition by class order by s desc) mm from t2dss 1 95 1ffd 1 95 1fda 1 80 2 --连续排序(仍为2)gds 2 92 1cfe 2 74 2gf 3 99 1ddd 3 99 13dd 3 78 2asdf 3 55 3adf 3 45 4--sum()over()的使用select name,class,s, sum(s)over(partition by class order by s desc) mm from t2 --根据班级进行分数求和dss 1 95 190 --由于两个95都是第一名,所以累加时是两个第一名的相加ffd 1 95 190 fda 1 80 270 --第一名加上第二名的gds 2 92 92cfe 2 74 166gf 3 99 198ddd 3 99 1983dd 3 78 276asdf 3 55 331adf 3 45 376
--找出这三条电路每条电路的第一条记录类型和最后一条记录类型
SELECT opr_id,res_type,first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) low,last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type rows BETWEEN unbounded preceding AND unbounded following) highFROM rm_circuit_routeWHERE opr_id IN ('000100190000000000021311','000100190000000000021355','000100190000000000021339')ORDER BY opr_id;
注:rows BETWEEN unbounded preceding AND unbounded following 的使用--取last_value时不使用rows BETWEEN unbounded preceding AND unbounded following的结果SELECT opr_id,res_type,first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) low,last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) highFROM rm_circuit_routeWHERE opr_id IN ('000100190000000000021311','000100190000000000021355','000100190000000000021339')ORDER BY opr_id;如下图可以看到,如果不使用rows BETWEEN unbounded preceding AND unbounded following,取出的last_value由于与res_type进行进行排列,因此取出的电路的最后一行记录的类型就不是按照电路的范围提取了,而是以res_type为范围进行提取了。
在first_value和last_value中ignore nulls的使用
数据如下:
--lag() over()函数用法(取出前n行数据)lag(expresstion,<offset>,<default>)with a as (select 1 id,'a' name from dualunionselect 2 id,'b' name from dualunionselect 3 id,'c' name from dualunionselect 4 id,'d' name from dualunionselect 5 id,'e' name from dual) select id,name,lag(id,1,'')over(order by name) from a;--lead() over()函数用法(取出后N行数据)lead(expresstion,<offset>,<default>)with a as (select 1 id,'a' name from dualunionselect 2 id,'b' name from dualunionselect 3 id,'c' name from dualunionselect 4 id,'d' name from dualunionselect 5 id,'e' name from dual) select id,name,lead(id,1,'')over(order by name) from a;--ratio_to_report(a)函数用法 Ratio_to_report() 括号中就是分子,over() 括号中就是分母with a as (select 1 a from dualunion allselect 1 a from dualunion allselect 1 a from dualunion allselect 2 a from dualunion all select 3 a from dualunion allselect 4 a from dualunion allselect 4 a from dualunion allselect 5 a from dual)select a, ratio_to_report(a)over(partition by a) b from a order by a; with a as (select 1 a from dualunion allselect 1 a from dualunion allselect 1 a from dualunion allselect 2 a from dualunion all select 3 a from dualunion allselect 4 a from dualunion allselect 4 a from dualunion allselect 5 a from dual)select a, ratio_to_report(a)over() b from a --分母缺省就是整个占比order by a; with a as (select 1 a from dualunion allselect 1 a from dualunion allselect 1 a from dualunion allselect 2 a from dualunion all select 3 a from dualunion allselect 4 a from dualunion allselect 4 a from dualunion allselect 5 a from dual)select a, ratio_to_report(a)over() b from agroup by a order by a;--分组后的占比