GBase 8a implements its MPP architecture by spreading data across multiple servers. Each data shard relies on replicas for high availability, with at most 2 duplicates allowed. If an accident makes recovery definitely impossible, for example RAID card failures on several servers at once, or outright destruction of the servers, and every primary and duplicate copy of a shard is lost, then even though the data on the surviving servers is intact, the cluster as a whole refuses queries because part of the shard data is missing (completeness check). This article presents a procedure for the case where the user accepts the loss of the unrecoverable data, wants the remaining data to stay queryable, wants new servers to replace the failed nodes, and wants new tables to resume normal service.
This article applies to V9.5 only.
Warning: this is a non-standard procedure. If the data volume is small, it is better to export the data from gnode, rebuild the cluster, and load the data back in; at the very least, randomly distributed tables will end up with a more even data distribution.
Overview
Shard data that is certain to be lost
No matter how many replicas there are, once the primary and every duplicate of a shard are lost, that portion of the data is gone for good. More replicas bring higher safety, but require more hardware.
Goals
Before the failed nodes are repaired, the surviving nodes continue to serve queries; DML and DDL are unavailable.
After the failed nodes are restored, tables provide full functionality again.
Solution outline
Serving queries with part of the data missing
GBase 8a provides the following parameter, which lets queries continue even when both the primary and the duplicate shard services are CLOSE/OFFLINE. It is advisable to update the cluster configuration file at the same time, so the setting survives a restart.
set global gcluster_allow_sg_lost=1;
With this set, the cluster is query-only: results are missing the data of the damaged shards, and no DML or DDL operation is possible until the failed nodes are restored.
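A value set with set global does not survive a restart of the coordinator service, which is why the configuration file should be updated as well. Below is a minimal sketch of an idempotent edit; both the ensure_setting helper and the config file path are assumptions for a default install, not part of the product:

```shell
# ensure_setting FILE KEY VALUE: append KEY=VALUE unless KEY is already set.
ensure_setting() {
    grep -q "^$2" "$1" || printf '%s=%s\n' "$2" "$3" >> "$1"
}

# Run on every coordinator node. The path assumes a default install
# under /opt/gbase; adjust it to your environment.
# ensure_setting /opt/gbase/gcluster/config/gbase_8a_gcluster.cnf \
#     gcluster_allow_sg_lost 1
```

The helper is safe to run repeatedly; it only appends the line when the key is not already present.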
Failed-node replacement
After the server is repaired, the cluster programs and data must be copied onto the new node.
Replacing programs and metadata
Copy everything the cluster needs to run from another healthy node, including metadata but excluding user data (which is usually large):
gcware: copy in full from another node.
gcluster: copy in full from another node.
gnode: copy everything except the user tables under userdata/gbase/. For the user table data, see the "Replacing user data" section below.
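The copy itself can be scripted. Here is a sketch that prints the commands for review before running them, assuming an install prefix of /opt/gbase, passwordless SSH as user gbase, and rsync available on both nodes; plan_copy and its arguments are illustrative, not a product tool:

```shell
# plan_copy HEALTHY_IP USER_DB...
# Prints the rsync commands to run on the replacement node. The listed
# user databases are excluded; their shards are rebuilt later in the
# "Replacing user data" step. Review the output, then pipe it to sh.
plan_copy() {
    good=$1; shift
    prefix=/opt/gbase                       # assumed install prefix
    for d in gcware gcluster; do            # copied in full
        echo "rsync -a gbase@$good:$prefix/$d/ $prefix/$d/"
    done
    ex=""                                   # skip each user database
    for db in "$@"; do ex="$ex --exclude=userdata/gbase/$db"; done
    echo "rsync -a$ex gbase@$good:$prefix/gnode/ $prefix/gnode/"
}

plan_copy 10.0.2.101 testdb
```

Printing the plan first makes it easy to double-check the exclude list before anything is copied.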
Replacing user data
For shards whose primary and duplicate are both gone, manually create empty shard tables on the data nodes.
For shards that still have a surviving copy, set the sync flag on the coordinator; the shard table is then recreated and its data restored automatically.
Environment setup
Build a cluster with 1 coordinator and 3 data nodes, then stop the services on 2 of the data nodes and delete their user data files (the program files are left in place, to save copying them back).
Cluster
[gbase@gbase_rh7_001 gcinstall_9.5.2.43.5]$ gcadmin
CLUSTER STATE: ACTIVE
VIRTUAL CLUSTER MODE: NORMAL
=============================================================
| GBASE COORDINATOR CLUSTER INFORMATION |
=============================================================
| NodeName | IpAddress | gcware | gcluster | DataState |
-------------------------------------------------------------
| coordinator1 | 10.0.2.101 | OPEN | OPEN | 0 |
-------------------------------------------------------------
=========================================================================================================
| GBASE DATA CLUSTER INFORMATION |
=========================================================================================================
| NodeName | IpAddress | DistributionId | gnode | syncserver | DataState |
---------------------------------------------------------------------------------------------------------
| node1 | 10.0.2.101 | 1 | OPEN | OPEN | 0 |
---------------------------------------------------------------------------------------------------------
| node2 | 10.0.2.102 | 1 | OPEN | OPEN | 0 |
---------------------------------------------------------------------------------------------------------
| node3 | 10.0.2.115 | 1 | OPEN | OPEN | 0 |
---------------------------------------------------------------------------------------------------------
Data
[gbase@gbase_rh7_001 gcinstall_9.5.2.43.5]$ gccli
GBase client 9.5.2.43.5f8fd4b2. Copyright (c) 2004-2022, GBase. All Rights Reserved.
gbase> initnodedatamap;
Query OK, 0 rows affected (Elapsed: 00:00:00.47)
gbase> create database testdb;
Query OK, 1 row affected (Elapsed: 00:00:00.01)
gbase> use testdb;
Query OK, 0 rows affected (Elapsed: 00:00:00.00)
gbase> create table t1(id int) distributed by ('id');
Query OK, 0 rows affected (Elapsed: 00:00:00.14)
gbase> insert into t1 values(1),(10),(100),(1000);
Query OK, 4 rows affected (Elapsed: 00:00:00.17)
Records: 4 Duplicates: 0 Warnings: 0
gbase> select * from t1;
+------+
| id |
+------+
| 10 |
| 100 |
| 1 |
| 1000 |
+------+
4 rows in set (Elapsed: 00:00:00.01)
gbase> exit
Bye
Simulating loss of both the primary and the duplicate
Stop 2 nodes
[gbase@gbase_rh7_001 gcinstall_9.5.2.43.5]$ ssh 10.0.2.102 "gcluster_services all stop"
gbase@10.0.2.102's password:
Stopping GCMonit success!
Stopping gbase : [ OK ]
Stopping syncserver : [ OK ]
[gbase@gbase_rh7_001 gcinstall_9.5.2.43.5]$ ssh 10.0.2.115 "gcluster_services all stop"
gbase@10.0.2.115's password:
Stopping GCMonit success!
Stopping gbase : [ OK ]
Stopping syncserver : [ OK ]
[gbase@gbase_rh7_001 gcinstall_9.5.2.43.5]$ gcadmin
CLUSTER STATE: ACTIVE
VIRTUAL CLUSTER MODE: NORMAL
=============================================================
| GBASE COORDINATOR CLUSTER INFORMATION |
=============================================================
| NodeName | IpAddress | gcware | gcluster | DataState |
-------------------------------------------------------------
| coordinator1 | 10.0.2.101 | OPEN | OPEN | 0 |
-------------------------------------------------------------
=========================================================================================================
| GBASE DATA CLUSTER INFORMATION |
=========================================================================================================
| NodeName | IpAddress | DistributionId | gnode | syncserver | DataState |
---------------------------------------------------------------------------------------------------------
| node1 | 10.0.2.101 | 1 | OPEN | OPEN | 0 |
---------------------------------------------------------------------------------------------------------
| node2 | 10.0.2.102 | 1 | CLOSE | CLOSE | 0 |
---------------------------------------------------------------------------------------------------------
| node3 | 10.0.2.115 | 1 | CLOSE | CLOSE | 0 |
---------------------------------------------------------------------------------------------------------
Delete the shard data files of the primary and the duplicate
Delete the metadata and data of table t1 on 102 and 115 to simulate an unrecoverable failure. The steps on 115 are identical and are not shown.
Clean up the data files on node 102
[gbase@localhost ~]$ cd /opt/gbase/gnode/userdata/gbase/
[gbase@localhost gbase]$ ll
total 12
-rw------- 1 gbase gbase 19 May 19 15:59 express.seq
drwx------ 2 gbase gbase 4096 May 19 15:59 gbase
drwx------ 4 gbase gbase 44 May 19 15:59 gclusterdb
drwx------ 4 gbase gbase 44 May 19 15:59 gctmpdb
drwx------ 4 gbase gbase 44 May 19 16:01 testdb
[gbase@localhost gbase]$ cd testdb
[gbase@localhost testdb]$ ll
total 0
drwx------ 4 gbase gbase 88 May 19 16:02 metadata
drwx------ 4 gbase gbase 32 May 19 16:02 sys_tablespace
[gbase@localhost testdb]$ cd metadata/
[gbase@localhost metadata]$ ll
total 28
-rw------- 1 gbase gbase 61 May 19 16:01 db.opt
-rw------- 1 gbase gbase 8558 May 19 16:02 t1_n1.frm
drwx------ 2 gbase gbase 65 May 19 16:02 t1_n1.GED
-rw------- 1 gbase gbase 8558 May 19 16:02 t1_n2.frm
drwx------ 2 gbase gbase 38 May 19 16:02 t1_n2.GED
[gbase@localhost metadata]$ mkdir bak
[gbase@localhost metadata]$ mv t1* bak/
[gbase@localhost metadata]$ ll
total 4
drwxrwxr-x 4 gbase gbase 74 May 19 16:06 bak
-rw------- 1 gbase gbase 61 May 19 16:01 db.opt
[gbase@localhost metadata]$ cd ..
[gbase@localhost testdb]$ cd sys_tablespace/
[gbase@localhost sys_tablespace]$ ll
total 0
drwx------ 2 gbase gbase 26 May 19 16:02 t1_n1
drwx------ 2 gbase gbase 6 May 19 16:02 t1_n2
[gbase@localhost sys_tablespace]$ mkdir bak
[gbase@localhost sys_tablespace]$ mv t1* bak/
[gbase@localhost sys_tablespace]$ ll
total 0
drwxrwxr-x 4 gbase gbase 32 May 19 16:07 bak
[gbase@localhost sys_tablespace]$
The query fails
[gbase@gbase_rh7_001 gcinstall_9.5.2.43.5]$ gccli testdb
GBase client 9.5.2.43.5f8fd4b2. Copyright (c) 2004-2022, GBase. All Rights Reserved.
gbase> select * from t1;
ERROR 1708 (HY000): (GBA-02EX-0004) Failed to get metadata:
DETAIL: check nodes, no valid node for suffix: n2,
please execute 'show datacopymap database.table_name;' to see the detail.
Using the parameter to serve queries with part of the data missing
The example is not ideal: shard n2 happened to receive no rows at all. That does not change the point, though; without the parameter, the query fails as shown above.
Note the warning message: there is no available n2 shard.
gbase> show variables like '%sg%';
+------------------------+-------+
| Variable_name | Value |
+------------------------+-------+
| gcluster_allow_sg_lost | 0 |
+------------------------+-------+
1 row in set (Elapsed: 00:00:00.00)
gbase> set global gcluster_allow_sg_lost=1;
Query OK, 0 rows affected (Elapsed: 00:00:00.01)
gbase> show variables like '%sg%';
+------------------------+-------+
| Variable_name | Value |
+------------------------+-------+
| gcluster_allow_sg_lost | 1 |
+------------------------+-------+
1 row in set (Elapsed: 00:00:00.00)
gbase> select * from t1;
+------+
| id |
+------+
| 10 |
| 100 |
| 1 |
| 1000 |
+------+
4 rows in set, 1 warning (Elapsed: 00:00:00.00)
gbase> show warnings;
+-------+------+------------------------------------+
| Level | Code | Message |
+-------+------+------------------------------------+
| Note | 1702 | No valid nodes for table part 'n2' |
+-------+------+------------------------------------+
1 row in set (Elapsed: 00:00:00.00)
DML and DDL fail
gbase> insert into t1 values(20);
ERROR 1705 (HY000): gcluster DML error: segment id 2 has no valid node.
Restoring programs and metadata
Copy the whole directory tree over from a healthy node, except the user databases under gnode's userdata/gbase/.
This step is omitted here; it is just a series of scp -r operations.
Start the services
Node 102 is started first.
[gbase@localhost root]$ gcluster_services all start
Starting gbase : [ OK ]
Starting syncserver : [ OK ]
Starting GCMonit success!
[gbase@localhost root]$
The query now reports that the table does not exist
Once the node's service is running, the gcluster_allow_sg_lost parameter no longer takes effect for it, so the user data must be restored as soon as possible.
[gbase@gbase_rh7_001 gcinstall_9.5.2.43.5]$ gcadmin
CLUSTER STATE: ACTIVE
VIRTUAL CLUSTER MODE: NORMAL
=============================================================
| GBASE COORDINATOR CLUSTER INFORMATION |
=============================================================
| NodeName | IpAddress | gcware | gcluster | DataState |
-------------------------------------------------------------
| coordinator1 | 10.0.2.101 | OPEN | OPEN | 0 |
-------------------------------------------------------------
=========================================================================================================
| GBASE DATA CLUSTER INFORMATION |
=========================================================================================================
| NodeName | IpAddress | DistributionId | gnode | syncserver | DataState |
---------------------------------------------------------------------------------------------------------
| node1 | 10.0.2.101 | 1 | OPEN | OPEN | 0 |
---------------------------------------------------------------------------------------------------------
| node2 | 10.0.2.102 | 1 | OPEN | OPEN | 0 |
---------------------------------------------------------------------------------------------------------
| node3 | 10.0.2.115 | 1 | CLOSE | CLOSE | 0 |
---------------------------------------------------------------------------------------------------------
[gbase@gbase_rh7_001 gcinstall_9.5.2.43.5]$ gccli testdb
GBase client 9.5.2.43.5f8fd4b2. Copyright (c) 2004-2022, GBase. All Rights Reserved.
gbase> select * from t1;
ERROR 1708 (HY000): [10.0.2.102:5050](GBA-02AD-0005)Failed to query in gnode:
DETAIL: Table 'testdb.t1_n2' doesn't exist
SQL: SELECT /*10.0.2.101_30_1_2022-05-19_16:09:18*/ /*+ TID('55') */ `vcname000001.testdb.t1`.`id` AS `id` FROM `testdb`.`t1_n2` `vcname000001.testdb.t1`
gbase> show variables like '%sg%';
+------------------------+-------+
| Variable_name | Value |
+------------------------+-------+
| gcluster_allow_sg_lost | 1 |
+------------------------+-------+
1 row in set (Elapsed: 00:00:00.01)
Restoring user data
Identify the tables whose data is lost
Create empty shard tables on the data nodes.
Create the table in gnode on 102
[gbase@localhost root]$ gncli
GBase client 9.5.2.43.5f8fd4b2. Copyright (c) 2004-2022, GBase. All Rights Reserved.
gbase> use testdb;
Query OK, 0 rows affected (Elapsed: 00:00:00.00)
gbase> show tables;
Empty set (Elapsed: 00:00:00.00)
gbase> create table t1_n2(id int);
Query OK, 0 rows affected (Elapsed: 00:00:00.03)
Queries work again
Query on 101
gbase> select * from t1;
+------+
| id |
+------+
| 1 |
| 1000 |
| 10 |
| 100 |
+------+
4 rows in set (Elapsed: 00:00:00.01)
Enable automatic sync for recoverable shards
For details, see https://www.gbase8.cn/5799
gbase> show nodes;
+------------+------------+-------+--------------+----------------+---------+-----------+
| Id | ip | name | primary part | duplicate part | status | datastate |
+------------+------------+-------+--------------+----------------+---------+-----------+
| 1694629898 | 10.0.2.101 | node1 | n1 | n3 | online | 0 |
| 1711407114 | 10.0.2.102 | node2 | n2 | n1 | online | 0 |
| 1929510922 | 10.0.2.115 | node3 | n3 | n2 | offline | 0 |
+------------+------------+-------+--------------+----------------+---------+-----------+
3 rows in set (Elapsed: 00:00:00.00)
gbase> show vcs;
+---------+--------------+---------+
| id | name | default |
+---------+--------------+---------+
| vc00001 | vcname000001 | Y |
+---------+--------------+---------+
1 row in set (Elapsed: 00:00:00.00)
gbase> set self gcluster_node_status_list='vc00001.testdb.t1:n1:1711407114:2:2';
Query OK, 0 rows affected (Elapsed: 00:00:00.02)
Check the recovered table data
Then query on 102; the data shows up after a short while.
gbase> select * from testdb.t1_n1;
+------+
| id |
+------+
| 10 |
| 100 |
+------+
2 rows in set (Elapsed: 00:00:00.00)
Handle the remaining nodes the same way
Depending on each shard's situation, either create an empty shard table or enable automatic sync.
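For this walkthrough that means, on node3 (10.0.2.115) once its service is back up: shard n2 has lost both of its copies, so its shard table is recreated empty in gnode, while shard n3 still has a surviving duplicate on node1 and can be resynced. A sketch mirroring the commands used earlier; the node id 1929510922 is node3's id from the show nodes output, and the trailing :2:2 simply follows the earlier n1 example:

```sql
-- In gncli on node3: recreate the unrecoverable n2 shard as an empty table.
use testdb;
create table t1_n2(id int);

-- On the coordinator: resync node3's n3 shard from the surviving
-- duplicate (same pattern as the n1 resync above).
set self gcluster_node_status_list='vc00001.testdb.t1:n3:1929510922:2:2';
```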