南大通用GBase 8a集群中数、中位数的实现

GBase 8a集群中数的实现。

对于9.5.3.28新版本,可以通过如下功能实现

GBase 8a集群百分位函数percentile_cont和percentile_disc介绍

原始数据, id3是分组。
gbase> select * from testled;
+------+------+------+
| id | id2 | id3 |
+------+------+------+
| 1 | 111 | 1 |
| 2 | 110 | 0 |
| 3 | 109 | 0 |
| 4 | 108 | 1 |
| 5 | 107 | 0 |
| 6 | 106 | 0 |
| 7 | 105 | 1 |
| 8 | 104 | 0 |
+------+------+------+
8 rows in set (Elapsed: 00:00:00.00)

给每一列,每一行做排序后的行号。
gbase> select
-> id,
-> row_number()over(partition by id3 order by id) num_id,
-> id2,
-> row_number() over(partition by id3 order by id2) num_id2,
-> id3
-> from testled;
+------+--------+------+---------+------+
| id | num_id | id2 | num_id2 | id3 |
+------+--------+------+---------+------+
| 8 | 5 | 104 | 1 | 0 |
| 6 | 4 | 106 | 2 | 0 |
| 5 | 3 | 107 | 3 | 0 |
| 3 | 2 | 109 | 4 | 0 |
| 2 | 1 | 110 | 5 | 0 |
| 7 | 3 | 105 | 1 | 1 |
| 4 | 2 | 108 | 2 | 1 |
| 1 | 1 | 111 | 3 | 1 |
+------+--------+------+---------+------+
8 rows in set (Elapsed: 00:00:00.03)

计算每一个列的一半行号
gbase> select *,
-> ceil((max(num_id)over(partition by id3) +1)/2) half_num_id,
-> ceil((max(num_id2)over(partition by id3)+1)/2) half_num_id2
-> from (
-> select
-> id,
-> row_number()over(partition by id3 order by id) num_id,
-> id2,
-> row_number() over(partition by id3 order by id2) num_id2,
-> id3
-> from testled
-> )t ;
+------+--------+------+---------+------+-------------+--------------+
| id | num_id | id2 | num_id2 | id3 | half_num_id | half_num_id2 |
+------+--------+------+---------+------+-------------+--------------+
| 6 | 4 | 106 | 2 | 0 | 3 | 3 |
| 5 | 3 | 107 | 3 | 0 | 3 | 3 |
| 3 | 2 | 109 | 4 | 0 | 3 | 3 |
| 2 | 1 | 110 | 5 | 0 | 3 | 3 |
| 8 | 5 | 104 | 1 | 0 | 3 | 3 |
| 4 | 2 | 108 | 2 | 1 | 2 | 2 |
| 1 | 1 | 111 | 3 | 1 | 2 | 2 |
| 7 | 3 | 105 | 1 | 1 | 2 | 2 |
+------+--------+------+---------+------+-------------+--------------+
8 rows in set (Elapsed: 00:00:00.31)

完整计算。每个列的中数

gbase>  select
    ->    max(case when num_id=half_num_id then id else null end) over(partition by id3) id,
    ->    max(case when num_id2=half_num_id2 then id2 else null end) over (partition by id3) id2,
    ->    id3
    ->  from (
    ->    select *,
    ->      ceil((max(num_id)over(partition by id3) +1)/2) half_num_id, 
    ->      ceil((max(num_id2)over(partition by id3)+1)/2) half_num_id2 
    ->    from (
    ->      select
    ->        id,
    ->        row_number()over(partition by id3 order by id) num_id,
    ->        id2,
    ->        row_number() over(partition by id3 order by id2) num_id2,
    ->        id3
    ->      from testled
    ->    )t
    -> )tt where num_id=half_num_id or num_id2=half_num_id2;
+------+------+------+
| id   | id2  | id3  |
+------+------+------+
|    5 |  107 |    0 |
|    4 |  108 |    1 |
+------+------+------+
2 rows in set (Elapsed: 00:00:00.03)

两个列的

select 
   max(case when num_id=half_num_id then id else null end) over(partition by id3) id, 
   max(case when num_id2=half_num_id2 then id2 else null end) over (partition by id3) id2, 
   id3
 from (
   select *,
     ceil((max(num_id)over(partition by id3) +1)/2) half_num_id, 
     ceil((max(num_id2)over(partition by id3)+1)/2) half_num_id2 
   from (
     select 
       id,
       row_number()over(partition by id3 order by id) num_id,
       id2,
       row_number() over(partition by id3 order by id2) num_id2,
       id3     
     from testled
   )t 
)tt where num_id=half_num_id or num_id2=half_num_id2;