GBase 8a在做扩容缩容等操作时,会将需要重分布的任务,放到gclusterdb.rebalancing_status元数据表里,本文介绍该进度表的使用方法。
目录导航
表结构
字段 | 类型 | 说明 |
---|---|---|
index_name | varchar(129) | 任务索引,包含了库名和表名,用小数点分割 |
db_name | varchar(64) | 库名 |
table_name | varchar(64) | 表名 |
tmptable | varchar(129) | 数据迁移用到的临时表名字,一般以rebalanc开头 |
start_time | datetime | 开始时间,RUNNING状态有效 |
end_time | datetime | 完成时间,FINISHED状态有效 |
status | varchar(32) | 任务状态。STARTING、RUNNING、COMPLETED、PAUSED、CANCELED |
percentage | int | 完成百分比,无百分号。RUNNING状态有效。默认0 |
priority | int | 优先级。越小优先级越高,默认5。 |
host | varchar(60) | 任务所在IP |
distribution_id | bigint | 目标distribution_id |
gbase> desc rebalancing_status;
+-----------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+---------+-------+
| index_name | varchar(129) | YES | | NULL | |
| db_name | varchar(64) | YES | | NULL | |
| table_name | varchar(64) | YES | | NULL | |
| tmptable | varchar(129) | YES | | NULL | |
| start_time | datetime | YES | | NULL | |
| end_time | datetime | YES | | NULL | |
| status | varchar(32) | YES | | NULL | |
| percentage | int(11) | YES | | NULL | |
| priority | int(11) | YES | | NULL | |
| host | varchar(60) | YES | | NULL | |
| distribution_id | bigint(8) | YES | | NULL | |
+-----------------+--------------+------+-----+---------+-------+
11 rows in set (Elapsed: 00:00:00.00)
重分布任务状态
- STARTING 准备状态,排队中
- RUNNING 正在重分布中
- COMPLETED 已经完成了重分布
- PAUSED 任务暂停
- CANCELED 任务取消
状态转换规则如下所示
- 表处于 STARTING 状态时,Coordinator 后台线程开始执行表的 rebalance 操作,表状态转换成 RUNNING。
- 表处于 PAUSED 状态时,对这个表执行 continue rebalance 操作,表状态转换成RUNNING。
- 表处于 RUNNING 状态时,coordinator 后台线程执行表的 rebalance 操作失败,表状态转换成 STARTING。
- 表处于 RUNNING 状态时,coordinator 后台线程完成了表的 rebalance 操作,表状态转换成 COMPLETED。
查询总体重分布进度
根据STATUS统计每个状态的数量。如下例子,等待中的15个,正在重分布的3个,完成了3个。
gbase> select status,count(*) from gclusterdb.rebalancing_status group by status;
+-----------+----------+
| status | count(*) |
+-----------+----------+
| STARTING | 15 |
| RUNNING | 3 |
| COMPLETED | 3 |
+-----------+----------+
3 rows in set (Elapsed: 00:00:00.61)
当所有的表都处于COMPLETED状态,则重分布全部完成。
gbase> select status,count(*) from gclusterdb.rebalancing_status group by status;
+-----------+----------+
| status | count(*) |
+-----------+----------+
| COMPLETED | 21 |
+-----------+----------+
1 row in set (Elapsed: 00:00:00.19)
查看正在重分布的任务进度
查看状态为RUNNING的进度。如下是4个正在运行的任务。
gbase> select * from rebalancing_status where status='RUNNING';
+-----------------------------------+------------+----------------------------+----------+----------------------------+------------+-----------+------------+----------+----------+-----------------+
| index_name | db_name | table_name | tmptable | start_time | end_time | status | percentage | priority | host | distribution_id |
+-----------------------------------+------------+----------------------------+----------+----------------------------+------------+-----------+------------+----------+----------+-----------------+
| testdb.t1 | testdb | t1 | | 2018-11-07 09:05:35.216000 | | RUNNING | 30 | 5 | 2001::61 | 2 |
| testdb.aggr_http_ci_hour_20180904 | testdb | aggr_http_ci_hour_20180904 | | 2018-11-07 09:05:35.276000 | | RUNNING | 10 | 5 | 2001::61 | 2 |
| testdb.dim_ci | testdb | dim_ci | | 2018-11-07 09:05:35.333000 | | RUNNING | 50 | 5 | 2001::61 | 2 |
| gclusterdb.audit_log_express | gclusterdb | audit_log_express | | 2018-11-07 09:05:35.251000 | | RUNNING | 20 | 5 | 2001::61 | 2 |
+-----------------------------------+------------+----------------------------+----------+----------------------------+------------+-----------+------------+----------+----------+-----------------+
4 rows in set (Elapsed: 00:00:00.02)
需要注意,默认重分布时,每个节点,每次处理的数据是通过参数(gcluster_rebalancing_step)配置的,在V8里默认是2000万行,在V95里是1亿行。
再次提醒,是每个节点。比如100一行,10个节点平均,则每个节点处理10亿行,需要处理10轮。对应的percentage在处理时会变动。如果节点处理的数据少于参数值,则一次性处理,percentage从0直接到99或者100。
暂定重分布
pause rebalance table 库名.表名
以rebalancing_status里面的index_name为准。
gbase> select * from rebalancing_status;
+---------------------------+---------+--------------------+----------+----------------------------+----------+----------+------------+----------+------+-----------------+
| index_name | db_name | table_name | tmptable | start_time | end_time | status | percentage | priority | host | distribution_id |
+---------------------------+---------+--------------------+----------+----------------------------+----------+----------+------------+----------+------+-----------------+
| testdb.t1 | testdb | t1 | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t2 | testdb | t2 | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_procedure | testdb | t_procedure | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_procedure_testdb | testdb | t_procedure_testdb | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_enc | testdb | t_enc | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.ta | testdb | ta | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
+---------------------------+---------+--------------------+----------+----------------------------+----------+----------+------------+----------+------+-----------------+
6 rows in set (Elapsed: 00:00:00.01)
gbase> pause rebalance table testdb.t1;
Query OK, 1 row affected (Elapsed: 00:00:00.08)
gbase> select * from rebalancing_status;
+---------------------------+---------+--------------------+----------+----------------------------+----------+----------+------------+----------+------+-----------------+
| index_name | db_name | table_name | tmptable | start_time | end_time | status | percentage | priority | host | distribution_id |
+---------------------------+---------+--------------------+----------+----------------------------+----------+----------+------------+----------+------+-----------------+
| testdb.t1 | testdb | t1 | NULL | 2022-06-01 09:49:42.633000 | NULL | PAUSED | 0 | 5 | NULL | 4 |
| testdb.t2 | testdb | t2 | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_procedure | testdb | t_procedure | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_procedure_testdb | testdb | t_procedure_testdb | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_enc | testdb | t_enc | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.ta | testdb | ta | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
+---------------------------+---------+--------------------+----------+----------------------------+----------+----------+------------+----------+------+-----------------+
6 rows in set (Elapsed: 00:00:00.14)
继续重分布
continue rebalance table 库名.表名
以rebalancing_status里面的index_name为准。
gbase> select * from rebalancing_status;
+---------------------------+---------+--------------------+----------+----------------------------+----------+----------+------------+----------+------+-----------------+
| index_name | db_name | table_name | tmptable | start_time | end_time | status | percentage | priority | host | distribution_id |
+---------------------------+---------+--------------------+----------+----------------------------+----------+----------+------------+----------+------+-----------------+
| testdb.t1 | testdb | t1 | NULL | 2022-06-01 09:49:42.633000 | NULL | PAUSED | 0 | 5 | NULL | 4 |
| testdb.t2 | testdb | t2 | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_procedure | testdb | t_procedure | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_procedure_testdb | testdb | t_procedure_testdb | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_enc | testdb | t_enc | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.ta | testdb | ta | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
+---------------------------+---------+--------------------+----------+----------------------------+----------+----------+------------+----------+------+-----------------+
6 rows in set (Elapsed: 00:00:00.14)
gbase>
gbase>
gbase>
gbase> continue rebalance table testdb.t1;
Query OK, 1 row affected (Elapsed: 00:00:00.08)
gbase> select * from rebalancing_status;
+---------------------------+---------+--------------------+----------+----------------------------+----------------------------+-----------+------------+----------+------------+-----------------+
| index_name | db_name | table_name | tmptable | start_time | end_time | status | percentage | priority | host | distribution_id |
+---------------------------+---------+--------------------+----------+----------------------------+----------------------------+-----------+------------+----------+------------+-----------------+
| testdb.t1 | testdb | t1 | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t2 | testdb | t2 | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_procedure | testdb | t_procedure | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_procedure_testdb | testdb | t_procedure_testdb | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.t_enc | testdb | t_enc | NULL | 2022-06-01 09:49:42.633000 | NULL | STARTING | 0 | 5 | NULL | 4 |
| testdb.ta | testdb | ta | NULL | 2022-06-01 09:56:27.395000 | 2022-06-01 09:56:27.395000 | COMPLETED | 0 | 5 | 10.0.2.101 | 4 |
+---------------------------+---------+--------------------+----------+----------------------------+----------------------------+-----------+------------+----------+------------+-----------------+
6 rows in set (Elapsed: 00:00:00.01)
取消重分布
cancel rebalance table 库名.表名
以rebalancing_status里面的index_name为准。
不保证一定能成功。
总结
通过统计已经FINISHED状态的表数量,计算出整体进度。通过查询RUNNING状态的数据,查看当前表的进度。
由于不同的表,数据量不同,耗时不同,所以通过进度来评估预计耗时,并没有那么精确,只能作为参考。用户可以根据业务数据量情况,划分不同的优先级,以此来评估每个不同优先级的完成进度,相对更准确些。