南大通用GBase 8a导出和加载使用多个不可见字符做分隔符

GBase 8a数据库加载时,可以指定多个字符作为字段分隔符和行分隔符,可以指定最多1个字符作为包围符。这些字符可以通过\xXX的方式指定不可见字符。

不可见字符格式

支持2种方式,比如对应16进制的FF ,可以用 ’\xFF‘ 或者 x'FF', 如果有多个,连续单独写在一起即可,比如 ’\xff\xfe\xfd' 或者 x'fffefd',第二种写法更简单些,但第一种看上去更容易理解。

其中\在数据库属于转义字符,如果你需要把\存入数据库的某个字段内,需要用\\

gbase> desc t3;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| value | varchar(100) | YES  |     | NULL    |       |
+-------+--------------+------+-----+---------+-------+
1 row in set (Elapsed: 00:00:00.00)

gbase> insert into t3 values ('\\xff');
Query OK, 1 row affected (Elapsed: 00:00:00.14)

gbase> select * from t3;
+-------+
| value |
+-------+
| \xff  |
+-------+
1 row in set (Elapsed: 00:00:00.00)

导出数据

如下是导出1个3列表里的2列,用\xff作为字段分隔符。通过xxd命令,可以看到该文件确实是16进制的FF作为了分隔符。

后面是多个分隔符的例子。

gbase> desc t1;
+-------+----------+------+-----+---------+-------+
| Field | Type     | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| id    | int(11)  | YES  | MUL | NULL    |       |
| value | int(11)  | YES  |     | NULL    |       |
| birth | datetime | YES  |     | NULL    |       |
+-------+----------+------+-----+---------+-------+
3 rows in set (Elapsed: 00:00:00.00)

gbase> select * from t1;
+------+--------+---------------------+
| id   | value  | birth               |
+------+--------+---------------------+
|    1 |    234 | NULL                |
|    2 |    567 | NULL                |
|    3 |    888 | NULL                |
|    4 |    999 | NULL                |
|    5 | 555555 | 2020-01-03 00:00:00 |
+------+--------+---------------------+
5 rows in set (Elapsed: 00:00:00.05)

gbase> select id,value from t1 into outfile '/home/gbase/t1.txt' fields terminated by '\xff'  ESCAPED BY '';
Query OK, 5 rows affected, 1 warning (Elapsed: 00:00:00.03)

gbase> system xxd /home/gbase/t1.txt/t1.txt
0000000: 31ff 3233 340a 32ff 3536 370a 33ff 3838  1.234.2.567.3.88
0000010: 380a 34ff 3939 390a 35ff 3535 3535 3535  8.4.999.5.555555
0000020: 0a   

gbase> select id,value from t1 into outfile '/home/gbase/t1.txt' fields terminated by '\xff\xfe\xfd'  ESCAPED BY '' writemode by overwrites;
Query OK, 5 rows affected, 1 warning (Elapsed: 00:00:00.02)

gbase>
gbase> system xxd /home/gbase/t1.txt/t1.txt
0000000: 31ff fefd 3233 340a 32ff fefd 3536 370a  1...234.2...567.
0000010: 33ff fefd 3838 380a 34ff fefd 3939 390a  3...888.4...999.
0000020: 35ff fefd 3535 3535 3535 0a              5...555555.
gbase>
gbase>
gbase> select id,value from t1 into outfile '/home/gbase/t1.txt' fields terminated by x'fffefd'  ESCAPED BY '' writemode by overwrites;
Query OK, 5 rows affected, 1 warning (Elapsed: 00:00:00.04)

gbase> system xxd /home/gbase/t1.txt/t1.txt
0000000: 31ff fefd 3233 340a 32ff fefd 3536 370a  1...234.2...567.
0000010: 33ff fefd 3838 380a 34ff fefd 3939 390a  3...888.4...999.
0000020: 35ff fefd 3535 3535 3535 0a              5...555555.
gbase>

导入数据

同样格式,加载到一个其它表里面。我们分别用2种格式,都正确加载了。

gbase> desc t2;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id    | int(11) | YES  |     | NULL    |       |
| value | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set (Elapsed: 00:00:00.00)

gbase> truncate table t2;
Query OK, 2 rows affected (Elapsed: 00:00:00.26)

gbase> load data infile 'sftp://gbase:gbase1234@10.0.2.106/home/gbase/t1.txt/t1.txt' into table t2 fields terminated by '\xff';
Query OK, 5 rows affected (Elapsed: 00:00:11.96)
Task 1835193 finished, Loaded 5 records, Skipped 0 records

gbase> select * from t2;
+------+--------+
| id   | value  |
+------+--------+
|    2 |    567 |
|    4 |    999 |
|    1 |    234 |
|    3 |    888 |
|    5 | 555555 |
+------+--------+
5 rows in set (Elapsed: 00:00:00.00)


gbase> load data infile 'sftp://gbase:gbase1234@10.0.2.106/home/gbase/t1.txt/t1.txt' into table t2 fields terminated by x'ff';
Query OK, 5 rows affected (Elapsed: 00:00:06.35)
Task 1835196 finished, Loaded 5 records, Skipped 0 records

gbase> select * from t2;
+------+--------+
| id   | value  |
+------+--------+
|    2 |    567 |
|    4 |    999 |
|    2 |    567 |
|    4 |    999 |
|    1 |    234 |
|    3 |    888 |
|    5 | 555555 |
|    1 |    234 |
|    3 |    888 |
|    5 | 555555 |
+------+--------+
10 rows in set (Elapsed: 00:00:00.01)

gbase> load data infile 'sftp://gbase:gbase1234@10.0.2.106/home/gbase/t1.txt/t1.txt' into table t2 fields terminated by x'fffefd';
Query OK, 5 rows affected (Elapsed: 00:00:01.31)
Task 1835197 finished, Loaded 5 records, Skipped 0 records

gbase> select * from t2;
+------+--------+
| id   | value  |
+------+--------+
|    2 |    567 |
|    4 |    999 |
|    2 |    567 |
|    4 |    999 |
|    2 |    567 |
|    4 |    999 |
|    1 |    234 |
|    3 |    888 |
|    5 | 555555 |
|    1 |    234 |
|    3 |    888 |
|    5 | 555555 |
|    1 |    234 |
|    3 |    888 |
|    5 | 555555 |
+------+--------+
15 rows in set (Elapsed: 00:00:00.01)

gbase> load data infile 'sftp://gbase:gbase1234@10.0.2.106/home/gbase/t1.txt/t1.txt' into table t2 fields terminated by '\xff\xfe\xfd';
Query OK, 5 rows affected (Elapsed: 00:00:07.16)
Task 1835198 finished, Loaded 5 records, Skipped 0 records

gbase> select * from t2;
+------+--------+
| id   | value  |
+------+--------+
|    1 |    234 |
|    3 |    888 |
|    5 | 555555 |
|    1 |    234 |
|    3 |    888 |
|    5 | 555555 |
|    1 |    234 |
|    3 |    888 |
|    5 | 555555 |
|    1 |    234 |
|    3 |    888 |
|    5 | 555555 |
|    2 |    567 |
|    4 |    999 |
|    2 |    567 |
|    4 |    999 |
|    2 |    567 |
|    4 |    999 |
|    2 |    567 |
|    4 |    999 |
+------+--------+
20 rows in set (Elapsed: 00:00:00.02)