GBase 8a 数据库集群通过load方式将数据加载入库到一张表里,当数据某一列的值,无法转换为表对应字段类型时,比如带字母的字符串转化为数字,则会报Validate error这类错误。原因可能是列的顺序匹配错误,或原始数据采集问题。
目录导航
测试样例
一个3列的表
gbase> create table t_load5(name varchar(100),id int, birth date);
Query OK, 0 rows affected (Elapsed: 00:00:00.10)
样例数据
制造了可能引起Validate error错误的异常数据,第二列是包含字母的字符串,而表定义为数字int类型
[gbase@localhost ~]$ cat /home/gbase/tload.txt
3,Third,2020-01-02
4,Fouth,2020-14-16
9999999999,MAX,2020-12-13
执行加载
有3行错误数据,注意本次加载任务的编号task ID是 1572912
gbase> load data infile 'sftp://gbase:gbase1234@10.0.2.107/home/gbase/tload.txt' into table t_load4 fields terminated by ',';
Query OK, 0 rows affected (Elapsed: 00:00:00.63)
Task 1572912 finished, Loaded 0 records, Skipped 3 records
查看表数据
无数据入库
gbase> select * from t_load4;
Empty set (Elapsed: 00:00:00.00)
查看错误日志和数据
此处为了方便,直接用SQL方式读取。可以看到报错信息 Validate error。注意用SQL方式,可以从结果里直接看到错误数据。如果从文件查看,是分成2个文件保存的。
gbase> show load logs limit 3;
+---------+---------+---------+-------------------+--------------------------------------------------------+-------------+------------+------------+-----------------+----------------------------+
| task_id | db_name | tb_name | err_data_ip | file_name | file_offset | record_len | err_column | err_reason | err_data |
+---------+---------+---------+-------------------+--------------------------------------------------------+-------------+------------+------------+-----------------+----------------------------+
| 1572914 | testdb | t_load5 | ::ffff:10.0.2.107 | sftp://gbase:*********@10.0.2.107/home/gbase/tload.txt | 0 | 19 | 2 | Validate error
| 3,Third,2020-01-02
|
| 1572914 | testdb | t_load5 | ::ffff:10.0.2.107 | sftp://gbase:*********@10.0.2.107/home/gbase/tload.txt | 19 | 19 | 2 | Validate error
| 4,Fouth,2020-14-16
|
| 1572914 | testdb | t_load5 | ::ffff:10.0.2.107 | sftp://gbase:*********@10.0.2.107/home/gbase/tload.txt | 38 | 26 | 2 | Validate error
| 9999999999,MAX,2020-12-13
|
+---------+---------+---------+-------------------+--------------------------------------------------------+-------------+------------+------------+-----------------+----------------------------+
3 rows in set (Elapsed: 00:00:00.00)
通过文件查看错误日志和数据,其中1572914是加载完成后,返回的taskid.可以看到其中的 Validate error 报错信息。
[root@localhost ~]# cd /opt/gcluster/log/gcluster/loader_logs/1572914
[root@localhost 1572914]# ll
total 12
-rw------- 1 gbase gbase 245 Sep 4 04:46 1572914_loader_result.log
-rw------- 1 gbase gbase 64 Sep 4 04:46 1572914_testdb_t_load5_n1_::ffff:10.0.2.107_20200904044616.err
-rw------- 1 gbase gbase 308 Sep 4 04:46 1572914_testdb_t_load5_n1_::ffff:10.0.2.107_20200904044616.trc
[root@localhost 1572914]# cat 1572914_loader_result.log
1572914|testdb|t_load5|root|::ffff:10.0.2.107|::ffff:10.0.2.107|2020-09-04 04:46:16|2020-09-04 04:46:16|0|64|64|0|3|0|SUCCESS|load data infile 'sftp://gbase:*********@10.0.2.107/home/gbase/tload.txt' into table t_load5 fields terminated by ','|
[root@localhost 1572914]# cat 1572914_testdb_t_load5_n1_\:\:ffff\:10.0.2.107_20200904044616.trc
file_name | file_offset | record_len | column | reason
sftp://gbase:*********@10.0.2.107/home/gbase/tload.txt|0|19|2|Validate error
sftp://gbase:*********@10.0.2.107/home/gbase/tload.txt|19|19|2|Validate error
sftp://gbase:*********@10.0.2.107/home/gbase/tload.txt|38|26|2|Validate error
[root@localhost 1572914]# cat 1572914_testdb_t_load5_n1_\:\:ffff\:10.0.2.107_20200904044616.err
3,Third,2020-01-02
4,Fouth,2020-14-16
9999999999,MAX,2020-12-13
[root@localhost 1572914]#
解决方案
修复错误数据,或者调整数据库表字段类型
其它加载错误,请参考