数据库SQL面试题,查询出每门课都大于80 分的学生姓名

本文根据网上常见的SQL面试题,查询出每门课都大于80 分的学生姓名,给出分析和在GBase 8a数据库集群上的执行结果。

题目

已知学生成绩如下:

gbase> select * from exam_1;
+--------+--------+-------+
| name   | class  | score |
+--------+--------+-------+
| 张三   | 语文   |    81 |
| 张三   | 数学   |    75 |
| 李四   | 语文   |    76 |
| 李四   | 数学   |    90 |
| 王五   | 语文   |    81 |
| 王五   | 数学   |   100 |
| 王五   | 英语   |    90 |
+--------+--------+-------+
7 rows in set (Elapsed: 00:00:00.00)

查询出每门课都大于80 分的学生姓名

分析

1、题目中,课程数量不是相同的,所以不能提前假设只有3门成绩,解答方案必须能适应任意课程数量。

2、题目只要求学生姓名,所以课程列几乎就是多余的,算干扰因素了。

3、大于80分,指最低分是>80的。

答案样例

通过group

以name作为维度,最小分数min(score)做聚合运算,然后选择那些最小分数大于80分的

gbase> select name from exam_1 group by name having(min(score)>80);
+--------+
| name   |
+--------+
| 王五   |
+--------+
1 row in set (Elapsed: 00:00:00.03)

执行计划如下

gbase> explain select name from exam_1 group by name having(min(score)>80);
+----+----------------+-----------+-------------+-------------------------------+
| ID | MOTION         | OPERATION | TABLE       | CONDITION                     |
+----+----------------+-----------+-------------+-------------------------------+
| 01 | [RESULT]       |  Step     | <00>        |                               |
|    |                |  GROUP    |             | GROUP BY name                 |
|    |                |  HAVING   |             | HAVING (MIN(MIN(score)) > 80) |
| 00 | [REDIST(name)] |  Table    | exam_1[DIS] |                               |
|    |                |  GROUP    |             | GROUP BY name                 |
+----+----------------+-----------+-------------+-------------------------------+
5 rows in set (Elapsed: 00:00:00.01)

通过not exists

将那些存在分数<=80的name 排除在外。

gbase> select distinct name from exam_1 a where not exists (select 1 from exam_1 b where score<=80 and a.name=b.name);
+--------+
| name   |
+--------+
| 王五   |
+--------+
1 row in set (Elapsed: 00:00:00.08)

执行计划如下

gbase> explain select distinct name from exam_1 a where not exists (select 1 from exam_1 b where score<=80 and a.name=b.name);
+----+----------------+------------+-------------+--------------------------+
| ID | MOTION         | OPERATION  | TABLE       | CONDITION                |
+----+----------------+------------+-------------+--------------------------+
| 02 | [RESULT]       |  Step      | <00>        |                          |
|    |                |  WHERE     |             | NOT EXISTS ([SubQuery1]) |
|    |                |  AGG       |             |                          |
| 01 | [REDIST(name)] |  SubQuery1 | placeholder |                          |
|    |                |   SCAN     | b[DIS]      | (score{S} <= 80)         |
| 00 | [REDIST(name)] |  Table     | a[DIS]      |                          |
+----+----------------+------------+-------------+--------------------------+
6 rows in set (Elapsed: 00:00:00.00)

总结

2个方案,从执行计划看,方案1的group性能预计更优一些。方案2存在join的情况。