GBase 8a在做比较时,如果数据类型不一致,则会根据一些规则进行自动转换。本文通过一些实际样例来探讨这些自动转换的规则是什么,本文关注点放在字符串和数字,其它类型的单独文章讨论。
如下分三种情况进行探讨,连个常量比较、两个字段比较,以及最常见的一个字段一个常量比较。
目录导航
环境
2节点集群,一个包含int和varchar字段的表,以及几行数据。 集群版本9.5.2.44
集群
[gbase@gbase_rh7_001 ~]$ gcadmin
CLUSTER STATE: ACTIVE
VIRTUAL CLUSTER MODE: NORMAL
=============================================================
| GBASE COORDINATOR CLUSTER INFORMATION |
=============================================================
| NodeName | IpAddress | gcware | gcluster | DataState |
-------------------------------------------------------------
| coordinator1 | 10.0.2.101 | OPEN | OPEN | 0 |
-------------------------------------------------------------
| coordinator2 | 10.0.2.115 | OPEN | OPEN | 0 |
-------------------------------------------------------------
=========================================================================================================
| GBASE DATA CLUSTER INFORMATION |
=========================================================================================================
| NodeName | IpAddress | DistributionId | gnode | syncserver | DataState |
---------------------------------------------------------------------------------------------------------
| node1 | 10.0.2.101 | 1 | OPEN | OPEN | 0 |
---------------------------------------------------------------------------------------------------------
| node2 | 10.0.2.115 | 1 | OPEN | OPEN | 0 |
---------------------------------------------------------------------------------------------------------
表和数据
gbase> desc b1;
+-------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| id | int(11) | YES | MUL | NULL | |
| name | varchar(100) | YES | | NULL | |
+-------+--------------+------+-----+---------+-------+
2 rows in set (Elapsed: 00:00:00.00)
gbase> select * from b1;
+------+-------+
| id | name |
+------+-------+
| 2 | 2222 |
| 4 | 44444 |
| 1 | 111 |
| 3 | 3333 |
| 99 | First |
+------+-------+
5 rows in set (Elapsed: 00:00:00.02)
2个常量比较
测试结果:类型转化成优先级高的类型后,再进行比较。
2个字符串比较
完全按照ASCII的先后顺序进行比较。但同时注意到,对于字母是默认不区分大小写的,也就是'A'=='a'。如果需要区分,要加上binary关键字。
gbase> select count(*) from b1 where '3'>'111';
+----------+
| count(*) |
+----------+
| 5 |
+----------+
1 row in set (Elapsed: 00:00:00.02)
gbase> select count(*) from b1 where 'A'> 'a';
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (Elapsed: 00:00:00.04)
gbase> select count(*) from b1 where 'A'>= 'a';
+----------+
| count(*) |
+----------+
| 5 |
+----------+
1 row in set (Elapsed: 00:00:00.02)
gbase> select count(*) from b1 where 'A'='a';
+----------+
| count(*) |
+----------+
| 5 |
+----------+
1 row in set (Elapsed: 00:00:00.03)
带binary的区分大小写的字符串比较
gbase> select count(*) from b1 where binary 'A' >= binary 'a';
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (Elapsed: 00:00:00.07)
gbase> select count(*) from b1 where binary 'A' >= 'a';
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (Elapsed: 00:00:00.01)
gbase> select count(*) from b1 where 'A' >= 'a';
+----------+
| count(*) |
+----------+
| 5 |
+----------+
1 row in set (Elapsed: 00:00:00.01)
gbase> select count(*) from b1 where 'A' >= binary 'a';
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (Elapsed: 00:00:00.02)
gbase>
2个数字比较
这个不测试了,按数字大小给结果就对了。
1个数字和字符串比较
2个常量比较,要看优先级了。常见类型里,
datetime>date>time>decimal>bigint>int>timestamp>varchar>char
如下的数字和字符串常量比较,会都转成优先级更高的数字,然后比较。
gbase> select count(*) from b1 where 3>'111';
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (Elapsed: 00:00:00.01)
gbase> select count(*) from b1 where '3'>111;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (Elapsed: 00:00:00.02)
2个字段比较
测试结果:数据先转成左侧字段的类型,然后比较。
如下最典型的就是数字11和字符串'9'. 如果以数字为基准,11大于9, 如果以字符串为基准,'9'大于'11'
重新生成了数据
gbase> truncate table t;
Query OK, 8 rows affected (Elapsed: 00:00:00.46)
gbase> insert into t values(1,9),(11,9),(9,1),(9,11);
Query OK, 4 rows affected (Elapsed: 00:00:00.18)
Records: 4 Duplicates: 0 Warnings: 0
gbase> select * from t;
+------+------+
| id | name |
+------+------+
| 1 | 9 |
| 11 | 9 |
| 9 | 1 |
| 9 | 11 |
+------+------+
4 rows in set (Elapsed: 00:00:00.02)
gbase>
gbase> select * from t where id>name;
+------+------+
| id | name |
+------+------+
| 11 | 9 |
| 9 | 1 |
+------+------+
2 rows in set (Elapsed: 00:00:00.01)
gbase> select * from t where name>id;
+------+------+
| id | name |
+------+------+
| 1 | 9 |
| 11 | 9 |
+------+------+
2 rows in set (Elapsed: 00:00:00.02)
1个常量一个字段比较
最常见的场景,字段等于某个数值。如下2个字段分别是int和varchar。
测试结果,常量和字段比较,都是转换成字段的类型后再比较。
测试表
gbase> create table t(id int, name varchar(10));
Query OK, 0 rows affected (Elapsed: 00:00:00.51)
gbase> insert into t values(1,1),(100,100);
Query OK, 2 rows affected (Elapsed: 00:00:00.25)
Records: 2 Duplicates: 0 Warnings: 0
gbase> select * from t;
+------+------+
| id | name |
+------+------+
| 1 | 1 |
| 100 | 100 |
+------+------+
2 rows in set (Elapsed: 00:00:00.05)
数字字段对比常量
如下测试结果看,无论常量是数字还是字符串,无论是比较符的左面还是右面,最终都是以字段的数字类型为准。
gbase> select * from t where id>9;
+------+------+
| id | name |
+------+------+
| 100 | 100 |
+------+------+
1 row in set (Elapsed: 00:00:00.07)
gbase> select * from t where id>'9';
+------+------+
| id | name |
+------+------+
| 100 | 100 |
+------+------+
1 row in set (Elapsed: 00:00:00.02)
gbase> select * from t where 9<id;
+------+------+
| id | name |
+------+------+
| 100 | 100 |
+------+------+
1 row in set (Elapsed: 00:00:00.02)
gbase> select * from t where '9'<id;
+------+------+
| id | name |
+------+------+
| 100 | 100 |
+------+------+
1 row in set (Elapsed: 00:00:00.02)
字符串字段对比常量
gbase> select * from t where name>9;
Empty set (Elapsed: 00:00:00.01)
gbase> select * from t where name>'9';
Empty set (Elapsed: 00:00:00.05)
gbase> select * from t where 9<name;
Empty set (Elapsed: 00:00:00.03)
gbase> select * from t where '9'<name;
Empty set (Elapsed: 00:00:00.02)
总结
- 两个常量比较时,以优先级高的类型为准转换
- 字段和常量比较时,以字段类型为基准转换
- 两个字段比较时,以左侧字段类型为基准转换