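GBase 8a supports the cume_dist function, which computes the proportion of rows whose value is less than or equal to, or greater than or equal to, the current row's value, depending on the direction of the order by clause.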
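Syntax
cume_dist() over([partition by <col>] order by <col> [desc])
Description
partition by controls whether the rows are split into windows. If it is omitted, all rows are processed together as a single group.
order by sets the direction of the comparison: ASC (the default) means less than or equal to, DESC means greater than or equal to.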
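The rule above can be sketched outside the database. The following Python snippet is only an illustration of the arithmetic, not GBase 8a's implementation; the function name `cume_dist` and its `descending` parameter are hypothetical names chosen for this sketch. It divides the number of rows at or before the current value (in the chosen order) by the total row count.

```python
def cume_dist(values, descending=False):
    """Illustrative sketch: return (value, cume_dist) pairs in window order."""
    n = len(values)
    ordered = sorted(values, reverse=descending)
    result = []
    for v in ordered:
        if descending:
            # DESC: count rows whose value is greater than or equal to v
            covered = sum(1 for x in values if x >= v)
        else:
            # ASC (default): count rows whose value is less than or equal to v
            covered = sum(1 for x in values if x <= v)
        result.append((v, covered / n))
    return result


# Same five values as table t2 in the examples
asc = cume_dist([1, 2, 3, 4, 5])
desc = cume_dist([1, 2, 3, 4, 5], descending=True)
print(asc)
print(desc)
```

Because the count includes the current row itself, the smallest possible result is 1/n and the last row in the ordering always gets 1.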
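Examples
Less than or equal to
The following example uses the default ascending order. The first row means that the proportion of rows whose value is less than or equal to 1 is 20%.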
gbase> select id,cume_dist()over(order by id) cr from t2;
+------+-----+
| id   | cr  |
+------+-----+
|    1 | 0.2 |
|    2 | 0.4 |
|    3 | 0.6 |
|    4 | 0.8 |
|    5 |   1 |
+------+-----+
5 rows in set (Elapsed: 00:00:00.01)
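Greater than or equal to
Here the order is desc. The first row means that the proportion of rows whose value is greater than or equal to 5 is 20%.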
gbase> select id,cume_dist()over(order by id desc) cr from t2;
+------+-----+
| id   | cr  |
+------+-----+
|    5 | 0.2 |
|    4 | 0.4 |
|    3 | 0.6 |
|    2 | 0.8 |
|    1 |   1 |
+------+-----+
5 rows in set (Elapsed: 00:00:00.02)
With partition by
The proportion is computed separately within each partition.
gbase> select * from t4;
+------+------+
| id   | type |
+------+------+
|    1 | A    |
|    2 | A    |
|    3 | A    |
|    1 | B    |
|    2 | B    |
|    3 | B    |
|    4 | B    |
+------+------+
7 rows in set (Elapsed: 00:00:00.00)
gbase> select type,id,cume_dist()over(partition by type order by id) cr from t4;
+------+------+-------------------+
| type | id   | cr                |
+------+------+-------------------+
| A    |    1 | 0.333333333333333 |
| A    |    2 | 0.666666666666667 |
| A    |    3 |                 1 |
| B    |    1 |              0.25 |
| B    |    2 |               0.5 |
| B    |    3 |              0.75 |
| B    |    4 |                 1 |
+------+------+-------------------+
7 rows in set (Elapsed: 00:00:00.07)
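The per-partition behaviour can be reproduced with a small Python sketch. This is an illustration under the assumption that each partition is simply treated as its own independent row set; the function name `cume_dist_by_partition` is hypothetical, and the data mirrors table t4 above.

```python
from collections import defaultdict


def cume_dist_by_partition(rows):
    """Illustrative sketch: rows are (id, type) pairs; partition by type, order by id."""
    groups = defaultdict(list)
    for id_, type_ in rows:
        groups[type_].append(id_)

    out = []
    for type_ in sorted(groups):
        ids = sorted(groups[type_])
        n = len(ids)  # the denominator is the partition size, not the table size
        for id_ in ids:
            covered = sum(1 for x in ids if x <= id_)
            out.append((type_, id_, covered / n))
    return out


t4 = [(1, "A"), (2, "A"), (3, "A"), (1, "B"), (2, "B"), (3, "B"), (4, "B")]
result = cume_dist_by_partition(t4)
for row in result:
    print(row)
```

Partition A has 3 rows, so its step is 1/3. Partition B has 4 rows, so its step is 1/4.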
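Comparison with percent_rank
percent_rank computes the relative position of a row and includes the starting point 0, whereas cume_dist counts the rows equal to the current value, so it never returns 0. If the data has only one row, percent_rank is the starting point 0, while cume_dist is 1 (100%).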
gbase> select id,cume_dist()over(order by id) cr,percent_rank()over(order by id) pr from t2;
+------+-----+------+
| id   | cr  | pr   |
+------+-----+------+
|    1 | 0.2 |    0 |
|    2 | 0.4 | 0.25 |
|    3 | 0.6 |  0.5 |
|    4 | 0.8 | 0.75 |
|    5 |   1 |    1 |
+------+-----+------+
5 rows in set (Elapsed: 00:00:00.01)
A single row of data
gbase> select id,cume_dist()over(order by id) cr,percent_rank()over(order by id) pr from t5;
+------+----+----+
| id   | cr | pr |
+------+----+----+
|    1 |  1 |  0 |
+------+----+----+
1 row in set (Elapsed: 00:00:00.02)
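The difference between the two functions can be summarised in a short Python sketch. It assumes the usual SQL definitions: cume_dist is (rows less than or equal to the current value) / n, and percent_rank is (rank - 1) / (n - 1), defined as 0 when there is only one row. The function name `cume_dist_and_percent_rank` is hypothetical, and the code is not taken from GBase 8a.

```python
def cume_dist_and_percent_rank(values):
    """Illustrative sketch: return (value, cume_dist, percent_rank) in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    out = []
    for v in ordered:
        le = sum(1 for x in ordered if x <= v)       # rows at or before v
        rank = sum(1 for x in ordered if x < v) + 1  # 1-based rank, ties share a rank
        cr = le / n
        pr = 0.0 if n == 1 else (rank - 1) / (n - 1)
        out.append((v, cr, pr))
    return out


five_rows = cume_dist_and_percent_rank([1, 2, 3, 4, 5])
one_row = cume_dist_and_percent_rank([1])
print(five_rows)
print(one_row)
```

With five rows, cume_dist advances in steps of 1/5 starting at 0.2, while percent_rank advances in steps of 1/4 starting at 0. With a single row, the two results are 1 and 0 respectively, matching the query outputs above.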