How to identify the leader node of the gcware management cluster in GBase 8a

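In GBase 8a, the gcware service cluster lets the GCluster instances on each node share information (cluster topology, node status, node resource status, and so on). For multi-replica data operations, it also supplies the set of operable nodes and controls the data consistency state of each node. A gcware cluster consists of multiple nodes, usually an odd number; one of them is elected leader and the rest are followers. The leader node holds the most recent data. This article describes how to identify the leader node.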

gcware log

By default, this log is saved under the installation directory at gcware/log/gcware.log.

Leader marker

When a node becomes leader, the following message appears in its log, where XXXXX is a number (the term).

switch to leader with term:XXXXX

When troubleshooting, search for "switch to" instead. That also shows the candidate->leader and candidate->follower transitions during an election.

grep "switch to" /opt/gbase/*/gcware/log/gcware.log  | tail
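To narrow the output to the leader marker alone, something like the following sketch may help. The function name latest_leader_term is hypothetical, and the pattern assumes the log line format shown in this article; adjust it if your gcware version logs differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GBase): print the timestamp and term of the
# most recent "switch to leader" entry in one gcware log file.
# Assumes lines look like:
#   Feb 07 17:29:50.320778 INFO  [GCWARE] node 1728184330 switch to leader with term:123
latest_leader_term() {
    local log_file="$1"
    # Keep only leader transitions, take the last one, then extract
    # the timestamp (everything before " INFO ") and the term number.
    grep "switch to leader with term:" "$log_file" | tail -n 1 |
        sed -E 's/^(.*) INFO .* switch to leader with term:([0-9]+).*$/\1 term=\2/'
}
```

Calling latest_leader_term on a node's gcware.log prints one line of the form "timestamp term=N", or nothing if that log contains no leader marker.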

How to identify the current leader

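On every gcware node, find the [most recent] leader marker. The node whose marker has the latest date and the largest number is the leader. Compare dates with care, because the node clocks may not be synchronized. For example, here is the output from two nodes: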

Node 103

[gbase@gbase_rh7_003 ~]$ grep "switch to" /opt/gbase/*/gcware/log/gcware.log  | tail
Feb 07 17:29:44.546192 INFO  [GCWARE] node 1728184330 switch to candidate
Feb 07 17:29:47.887201 INFO  [GCWARE] node 1728184330 switch to candidate
Feb 07 17:29:50.295710 INFO  [GCWARE] node 1728184330 switch to candidate
Feb 07 17:29:50.320778 INFO  [GCWARE] node 1728184330 switch to leader with term:123
Feb 09 08:41:20.388174 INFO  [GCWARE] node 1728184330 switch to follower with term:124 leader:0
Feb 09 08:41:22.456324 INFO  [GCWARE] node 1728184330 switch to follower with term:125 leader:0
Feb 09 08:41:24.643133 INFO  [GCWARE] node 1728184330 switch to follower with term:126 leader:0
Feb 10 09:56:37.339436 INFO  [GCWARE] node 1728184330 switch to follower with term:126 leader:0
Feb 10 09:56:39.503562 INFO  [GCWARE] node 1728184330 switch to candidate
Feb 10 09:56:41.279513 INFO  [GCWARE] node 1728184330 switch to follower with term:205 leader:0
[gbase@gbase_rh7_003 ~]$

Node 104

[gbase@gbase_rh7_004 log]$ grep "switch to " /opt/gbase/*/gcware/log/gcware.log | tail
Feb 09 09:14:37.059239 INFO  [GCWARE] node 1744961546 switch to candidate
Feb 09 09:14:40.076193 INFO  [GCWARE] node 1744961546 switch to candidate
Feb 10 09:56:21.190265 INFO  [GCWARE] node 1744961546 switch to follower with term:199 leader:0
Feb 10 09:56:24.842071 INFO  [GCWARE] node 1744961546 switch to candidate
Feb 10 09:56:27.875309 INFO  [GCWARE] node 1744961546 switch to candidate
Feb 10 09:56:30.875920 INFO  [GCWARE] node 1744961546 switch to candidate
Feb 10 09:56:33.957789 INFO  [GCWARE] node 1744961546 switch to candidate
Feb 10 09:56:37.263772 INFO  [GCWARE] node 1744961546 switch to candidate
Feb 10 09:56:41.272030 INFO  [GCWARE] node 1744961546 switch to candidate
Feb 10 09:56:41.519532 INFO  [GCWARE] node 1744961546 switch to leader with term:205
[gbase@gbase_rh7_004 log]$

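On node 103, the most recent leader marker is from Feb 07, with term 123. On node 104, the most recent leader marker is from Feb 10, with term 205.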

An election took place on Feb 10 at 09:56:41: node 104 became the leader and node 103 became a follower.
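Once the latest leader term has been collected from each node, the comparison can be sketched as below. The function name pick_highest_term and the "label term" input format are assumptions made for illustration. The sketch compares terms only, so the timestamps still need to be checked by hand as described above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GBase): read "label term" pairs on stdin,
# one per line, and print the label that carries the largest term.
pick_highest_term() {
    # Sort numerically on the second field, keep the last (largest) row,
    # and print only its label.
    sort -k2,2n | tail -n 1 | awk '{print $1}'
}

# Values taken from the two nodes in this article.
printf '%s\n' "103 123" "104 205" | pick_highest_term   # prints 104
```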

Summary

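This procedure is rarely needed, because the cluster coordinates itself automatically. You only need to determine which usable gcware node holds the latest data when the cluster has suffered an abnormal event such as a power failure, the gcware files on some nodes are corrupted, and those files must be repaired by hand.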