南大通用GBase 8a导出定长格式数据文件的方法

定长格式,每个字段占用相同的字节数,无需再考虑分隔符,可以简化加载处理逻辑,但一般会出现空间浪费。当数据内容复杂,不好设置合适的分隔符时,定长可能是唯一的方案。本文介绍GBase 8a导出定长数据的方法。

参考

GBase 8a加载LOAD定长格式的数据方法

语法

select XX into outfile 'XXX' fields length(XX,XX,XX....)
select XX into outfile 'XXX' fields FIELDS TERMINATED BY '' ENCLOSED BY '' ESCAPED BY ''
select XX into outfile 'XXX' fields FIELDS TERMINATED BY '' ESCAPED BY ''

注意,如果已经使用了第一种length方法,不能再指定方法2和3的字段,否则会报语法错误。

gbase> SELECT * FROM t1 INTO OUTFILE '/home/gbase/t1.zxq' FIELDS length '10,20' TERMINATED  BY '' ENCLOSED BY '' ESCAPED BY '' writemode by overwrites;
ERROR 1755 (HY000): Fields/columns length conllided with fields/columns enclosed by.

方法一,指定length

每个导出的字段都要指定,对于数字,不能小于默认长度。此时忽略考虑列定义指定的数字长度参数。

  • tinyint长度为4
  • smallint 长度为6
  • int长度为11
  • bigint长度为21
  • decimal 带小数点精度的长度不能低于定义的长度+2(小数点和负号),如果没有小数点精度,则+1
gbase> SELECT * FROM t1 INTO OUTFILE '/home/gbase/t1.zxq' FIELDS length '10,20'  writemode by overwrites;
ERROR 1149 (42000): Invalid length field, need length 11 for field 'id'.

gbase> select id,f1 from tsi into outfile '/home/gbase/t.out' fields length '1,5';
ERROR 1149 (42000): Invalid length field, need length 6 for field 'id'.
gbase>
gbase> create table tti(id tinyint,f1 varchar(10));
Query OK, 0 rows affected (Elapsed: 00:00:00.14)

gbase> select id,f1 from tti into outfile '/home/gbase/t.out' fields length '1,5';
ERROR 1149 (42000): Invalid length field, need length 4 for field 'id'.

gbase> select id,f1 from ttt into outfile '/home/gbase/t.out' fields length '6,5';
ERROR 1149 (42000): Invalid length field, need length 21 for field 'id'.

gbase> create table tdecimal(id decimal(32,5),f1 varchar(10));
Query OK, 0 rows affected (Elapsed: 00:00:00.13)

gbase> select id,f1 from tdecimal into outfile '/home/gbase/t.out' fields length '1,5';
ERROR 1149 (42000): Invalid length field, need length 34 for field 'id'.

gbase> create table tdecimal2(id decimal(32,0),f1 varchar(10));
Query OK, 0 rows affected (Elapsed: 00:00:00.16)

gbase> select id,f1 from tdecimal2 into outfile '/home/gbase/t.out' fields length '1,5';
ERROR 1149 (42000): Invalid length field, need length 33 for field 'id'.

注意:长度为字节长度。数据库内字段定义varchar是字符长度,要根据字符集,按比例计算。比如UTF8要按1个字符3个字节,utf8mb4要按照1个字符4个字节。

比如utf8字符集的 varchar(10)实际的字节长度是30。如果配置的长度小于实际数值的字节长度,则会出现截断。

gbase> SELECT * FROM t1 INTO OUTFILE '/home/gbase/t1.zxq' FIELDS length '11,2'  writemode by overwrites;
Query OK, 1 row affected (Elapsed: 00:00:00.03)

正常的
[root@gbase_rh7_001 t1.zxq]# cat t1.zxq
1           From vc1


被截断的
[root@gbase_rh7_001 t1.zxq]# cat t1.zxq
1          Fr
[root@gbase_rh7_001 t1.zxq]#

方法2和3,设置字段分隔符,转义符为空

这3个参数,只要任何一个设置不为空,则不是定长导出。只是字段分隔符和转义符设置是空为判断标记。

gbase> SELECT * FROM t1 INTO OUTFILE '/home/gbase/t1.zxq' FIELDS TERMINATED  BY '' ENCLOSED BY '' ESCAPED BY '' writemode by overwrites;
Query OK, 1 row affected (Elapsed: 00:00:00.03)

gbase> SELECT * FROM t1 INTO OUTFILE '/home/gbase/t1.zxq' FIELDS TERMINATED  BY ''  ESCAPED BY '' writemode by overwrites;
Query OK, 1 row affected (Elapsed: 00:00:00.02)