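南大通用 gcdw cloud data warehouse and FoundationDB: managing a database cluster (start/stop, configuration files, scaling out and in, replacement, upgrade, and uninstall)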

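This article focuses on managing an existing FoundationDB cluster: starting and stopping the service, scaling out, scaling in, upgrading, and uninstalling the database.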

Environment

The examples in this article use Red Hat 7.9 and a three-node FoundationDB cluster with IPs 10.0.2.81 to 10.0.2.83, running version v6.3.13.

[root@k8s-81 ~]# fdbcli
Using cluster file `/etc/foundationdb/fdb.cluster'.

The database is available.

Welcome to the fdbcli. For help, type `help'.
fdb> status

Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
  Redundancy mode        - double
  Storage engine         - memory-2
  Coordinators           - 3
  Usable Regions         - 1

Cluster:
  FoundationDB processes - 3
  Zones                  - 3
  Machines               - 3
  Memory availability    - 1.4 GB per process on machine with least available
                           >>>>> (WARNING: 4.0 GB recommended) <<<<<
  Fault Tolerance        - 1 machines
  Server time            - 03/03/23 08:06:20

Data:
  Replication health     - Healthy
  Moving data            - 0.000 GB
  Sum of key-value sizes - 1 MB
  Disk space used        - 328 MB

Operating space:
  Storage server         - 1.0 GB free on most full server
  Log server             - 13.4 GB free on most full server

Workload:
  Read rate              - 12 Hz
  Write rate             - 0 Hz
  Transactions started   - 5 Hz
  Transactions committed - 0 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Client time: 03/03/23 08:06:19

fdb>
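The human-readable status output above consists of section headers followed by indented "name - value" lines, so it can be turned into a dictionary for quick checks in a script. The following is a minimal sketch under that assumption; parse_status is a hypothetical helper, not part of FoundationDB, and the sample text is an excerpt of the output above.

```python
def parse_status(text):
    """Parse fdbcli `status` text into {section: {field: value}}."""
    result = {}
    section = None
    for line in text.splitlines():
        if not line.strip():
            continue
        # A section header is unindented and ends with a colon.
        if not line.startswith(" ") and line.rstrip().endswith(":"):
            section = line.strip()[:-1]
            result[section] = {}
        # A field is indented and separates name and value with " - ".
        elif section is not None and line.startswith(" ") and " - " in line:
            name, value = line.split(" - ", 1)
            result[section][name.strip()] = value.strip()
    return result


SAMPLE = """\
Configuration:
  Redundancy mode        - double
  Storage engine         - memory-2
  Coordinators           - 3

Cluster:
  FoundationDB processes - 3
  Fault Tolerance        - 1 machines

Data:
  Replication health     - Healthy
"""

status = parse_status(SAMPLE)
print(status["Configuration"]["Redundancy mode"])
print(status["Data"]["Replication health"])
```

Lines that do not fit either shape, such as the memory warning or the "Client time" line, are simply ignored.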

The cluster file is

[root@k8s-81 ~]# cat /etc/foundationdb/fdb.cluster
QdBWBJdf:lufGQ2kwQ9QURHhZKdG2B154wRF7066s@10.0.2.81:4500,10.0.2.82:4500,10.0.2.83:4500
[root@k8s-81 ~]#

The fdb configuration file is the default created at installation, unmodified

[root@k8s-81 ~]# cat /etc/foundationdb/foundationdb.conf
## foundationdb.conf
##
## Configuration file for FoundationDB server processes
## Full documentation is available at
## https://apple.github.io/foundationdb/configuration.html#the-configuration-file

[fdbmonitor]
user = foundationdb
group = foundationdb

[general]
restart_delay = 60
## by default, restart_backoff = restart_delay_reset_interval = restart_delay
# initial_restart_delay = 0
# restart_backoff = 60
# restart_delay_reset_interval = 60
cluster_file = /etc/foundationdb/fdb.cluster
# delete_envvars =
# kill_on_configuration_change = true

## Default parameters for individual fdbserver processes
[fdbserver]
command = /usr/sbin/fdbserver
public_address = auto:$ID
listen_address = public
datadir = /var/lib/foundationdb/data/$ID
logdir = /var/log/foundationdb
# logsize = 10MiB
# maxlogssize = 100MiB
# machine_id =
# datacenter_id =
# class =
# memory = 8GiB
# storage_memory = 1GiB
# cache_memory = 2GiB
# metrics_cluster =
# metrics_prefix =

## An individual fdbserver process with id 4500
## Parameters set here override defaults from the [fdbserver] section
[fdbserver.4500]

[backup_agent]
command = /usr/lib/foundationdb/backup_agent/backup_agent
logdir = /var/log/foundationdb

[backup_agent.1]
[root@k8s-81 ~]#

Reference

https://apple.github.io/foundationdb/administration.html

Starting and stopping the database service

The commands are as follows. Run them on each node separately.

systemctl start foundationdb
systemctl stop foundationdb
systemctl status foundationdb

Checking the service status

The Active field shows the current state of the service; running means it has been started.

[root@k8s-81 ~]# systemctl status foundationdb
● foundationdb.service - FoundationDB Key-Value Store
   Loaded: loaded (/usr/lib/systemd/system/foundationdb.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2023-03-03 08:03:13 CST; 1min 34s ago
  Process: 957 ExecStart=/usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonitor.pid --daemonize (code=exited, status=0/SUCCESS)
 Main PID: 960 (fdbmonitor)
    Tasks: 8
   Memory: 51.1M
   CGroup: /system.slice/foundationdb.service
           ├─ 960 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonito...
           ├─ 966 /usr/sbin/fdbserver --cluster_file /etc/foundationdb/fdb.cluster --datadir /var/lib/foundationdb/data/4500 --...
           └─3137 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonito...

Mar 03 08:03:12 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/back...gent.1
Mar 03 08:03:13 k8s-81 systemd[1]: Started FoundationDB Key-Value Store.
Mar 03 08:03:12 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundation...gent.1
Mar 03 08:03:12 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Process 967 exited 0, restarting in ...econds
Mar 03 08:03:19 k8s-81 fdbmonitor[960]: LogGroup="default" Process="fdbserver.4500": Warning: FDBD has not joined the clu...conds.
Mar 03 08:03:19 k8s-81 fdbmonitor[960]: LogGroup="default" Process="fdbserver.4500":   Check configuration and availabili...fdbcli
Mar 03 08:03:22 k8s-81 fdbmonitor[960]: LogGroup="default" Process="fdbserver.4500": FDBD joined cluster.
Mar 03 08:04:09 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/back...gent.1
Mar 03 08:04:09 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundation...gent.1
Mar 03 08:04:09 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Process 968 exited 0, restarting in ...econds
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-81 ~]#

Stopping the database

An Active value of inactive means the service is stopped.

[root@k8s-81 ~]# systemctl stop foundationdb
[root@k8s-81 ~]# systemctl status foundationdb
● foundationdb.service - FoundationDB Key-Value Store
   Loaded: loaded (/usr/lib/systemd/system/foundationdb.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Fri 2023-03-03 08:09:24 CST; 5s ago
  Process: 957 ExecStart=/usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonitor.pid --daemonize (code=exited, status=0/SUCCESS)
 Main PID: 960 (code=exited, status=0/SUCCESS)

Mar 03 08:07:18 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Process 5983 exited 0, restarting in...econds
Mar 03 08:08:12 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/back...gent.1
Mar 03 08:08:12 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundation...gent.1
Mar 03 08:08:12 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Process 6251 exited 0, restarting in...econds
Mar 03 08:09:17 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/back...gent.1
Mar 03 08:09:17 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundation...gent.1
Mar 03 08:09:17 k8s-81 fdbmonitor[960]: LogGroup="default" Process="backup_agent.1": Process 6503 exited 0, restarting in...econds
Mar 03 08:09:24 k8s-81 systemd[1]: Stopping FoundationDB Key-Value Store...
Mar 03 08:09:24 k8s-81 fdbmonitor[960]: LogGroup="default" Process="fdbmonitor": Received signal 15 (Terminated), shutting down
Mar 03 08:09:24 k8s-81 systemd[1]: Stopped FoundationDB Key-Value Store.
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-81 ~]#
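When checking many nodes, reading the Active line by eye gets tedious. The sketch below extracts the state from the text of systemctl status with a regular expression; active_state is a hypothetical helper, and the two sample lines are copied from the transcripts above.

```python
import re

# Matches e.g. "   Active: active (running) since ..."
ACTIVE_RE = re.compile(r"^\s*Active:\s+(\S+)\s+\((\S+?)\)", re.MULTILINE)


def active_state(status_text):
    """Return (state, sub_state) from `systemctl status` text, or None."""
    match = ACTIVE_RE.search(status_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


running = "   Active: active (running) since Fri 2023-03-03 08:03:13 CST; 1min 34s ago"
stopped = "   Active: inactive (dead) since Fri 2023-03-03 08:09:24 CST; 5s ago"

print(active_state(running))
print(active_state(stopped))
```

For scripts that run on the node itself, systemctl is-active foundationdb prints just the state and may be simpler than parsing the full status text.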

Starting the database

[root@k8s-81 ~]# systemctl start foundationdb
[root@k8s-81 ~]# systemctl status foundationdb
● foundationdb.service - FoundationDB Key-Value Store
   Loaded: loaded (/usr/lib/systemd/system/foundationdb.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2023-03-03 08:10:47 CST; 2s ago
  Process: 7219 ExecStart=/usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonitor.pid --daemonize (code=exited, status=0/SUCCESS)
 Main PID: 7220 (fdbmonitor)
    Tasks: 9
   Memory: 24.0M
   CGroup: /system.slice/foundationdb.service
           ├─7220 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonito...
           ├─7222 /usr/sbin/fdbserver --cluster_file /etc/foundationdb/fdb.cluster --datadir /var/lib/foundationdb/data/4500 --...
           └─7224 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonito...

Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="fdbmonitor": Starting fdbserver.4500
Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="fdbserver.4500": Launching /usr/sbin/fdbserver (7222...r.4500
Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/bac...gent.1
Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundatio...gent.1
Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Process 7221 exited 0, restarting i...econds
Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/bac...gent.1
Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundatio...gent.1
Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Process 7223 exited 0, restarting i...econds
Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="fdbserver.4500": FDBD joined cluster.
Mar 03 08:10:47 k8s-81 systemd[1]: Started FoundationDB Key-Value Store.
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-81 ~]#

Start on boot

Disabling start on boot

systemctl disable foundationdb

After running disable, check the status: in the Loaded line, disabled means start on boot has been turned off.

[root@k8s-81 ~]# systemctl disable foundationdb
Removed symlink /etc/systemd/system/multi-user.target.wants/foundationdb.service.
[root@k8s-81 ~]# systemctl status foundationdb
● foundationdb.service - FoundationDB Key-Value Store
   Loaded: loaded (/usr/lib/systemd/system/foundationdb.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2023-03-03 08:10:47 CST; 2min 34s ago
 Main PID: 7220 (fdbmonitor)
   CGroup: /system.slice/foundationdb.service
           ├─7220 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonito...
           ├─7222 /usr/sbin/fdbserver --cluster_file /etc/foundationdb/fdb.cluster --datadir /var/lib/foundationdb/data/4500 --...
           └─7761 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonito...

Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundatio...gent.1
Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Process 7223 exited 0, restarting i...econds
Mar 03 08:10:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="fdbserver.4500": FDBD joined cluster.
Mar 03 08:10:47 k8s-81 systemd[1]: Started FoundationDB Key-Value Store.
Mar 03 08:11:42 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/bac...gent.1
Mar 03 08:11:42 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundatio...gent.1
Mar 03 08:11:42 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Process 7224 exited 0, restarting i...econds
Mar 03 08:12:45 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/bac...gent.1
Mar 03 08:12:45 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundatio...gent.1
Mar 03 08:12:45 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Process 7477 exited 0, restarting i...econds
Hint: Some lines were ellipsized, use -l to show in full.

Enabling start on boot

systemctl enable foundationdb

Check the status: in the Loaded line, enabled means start on boot has been turned on.

[root@k8s-81 ~]# systemctl enable foundationdb
Created symlink from /etc/systemd/system/multi-user.target.wants/foundationdb.service to /usr/lib/systemd/system/foundationdb.service.
[root@k8s-81 ~]# systemctl status foundationdb
● foundationdb.service - FoundationDB Key-Value Store
   Loaded: loaded (/usr/lib/systemd/system/foundationdb.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2023-03-03 08:10:47 CST; 4min 57s ago
 Main PID: 7220 (fdbmonitor)
   CGroup: /system.slice/foundationdb.service
           ├─7220 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonito...
           ├─7222 /usr/sbin/fdbserver --cluster_file /etc/foundationdb/fdb.cluster --datadir /var/lib/foundationdb/data/4500 --...
           └─8332 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonito...

Mar 03 08:11:42 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Process 7224 exited 0, restarting i...econds
Mar 03 08:12:45 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/bac...gent.1
Mar 03 08:12:45 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundatio...gent.1
Mar 03 08:12:45 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Process 7477 exited 0, restarting i...econds
Mar 03 08:13:50 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/bac...gent.1
Mar 03 08:13:50 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundatio...gent.1
Mar 03 08:13:50 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Process 7761 exited 0, restarting i...econds
Mar 03 08:14:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/bac...gent.1
Mar 03 08:14:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Unable to launch /usr/lib/foundatio...gent.1
Mar 03 08:14:47 k8s-81 fdbmonitor[7220]: LogGroup="default" Process="backup_agent.1": Process 8079 exited 0, restarting i...econds
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-81 ~]#

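How start, stop, and restart are implemented

fdbmonitor supervises fdbserver and backup_agent

Both start on boot and restarts are handled by the fdbmonitor process. It launches the fdbserver and backup_agent processes, and if either one disappears for any reason, fdbmonitor restarts it.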

[root@k8s-81 ~]# cat  /usr/lib/systemd/system/foundationdb.service
[Unit]
Description=FoundationDB Key-Value Store
After=syslog.target network-online.target
Wants=network-online.target

[Service]
Type=forking
PIDFile=/var/run/fdbmonitor.pid
ExecStart=/usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonitor.pid --daemonize
KillMode=process

[Install]
WantedBy=multi-user.target
[root@k8s-81 ~]#

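systemd supervises foundationdb.service (fdbmonitor)

If the fdbmonitor process itself disappears for some reason, such as being killed on OOM, the operating system is responsible for starting it again. By default systemd restarts fdbmonitor within 60 seconds (the unit file shown above does not set Restart= itself, so confirm this behavior on your version). You can also create your own override file to change the restart parameters.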

/etc/systemd/system/foundationdb.service.d/override.conf

Add the following parameter

[Service]
RestartSec=20s

If you do not want systemd to restart fdbmonitor automatically, use the following parameter

[Service]
Restart=no
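If the override has to be rolled out to several nodes, generating its text from a script keeps the nodes consistent. The following is a small sketch; render_override is a hypothetical helper that only builds the text of the drop-in file and does not write it anywhere.

```python
def render_override(restart_sec=None, restart=None):
    """Build the text of a systemd drop-in containing only the given settings."""
    lines = ["[Service]"]
    if restart_sec is not None:
        lines.append(f"RestartSec={restart_sec}s")
    if restart is not None:
        lines.append(f"Restart={restart}")
    return "\n".join(lines) + "\n"


# The two variants described above.
print(render_override(restart_sec=20), end="")
print(render_override(restart="no"), end="")
```

After saving the text to /etc/systemd/system/foundationdb.service.d/override.conf, run systemctl daemon-reload so that systemd picks up the change.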

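Cluster file

Both FoundationDB servers and clients use the contents of fdb.cluster to connect to the cluster. Every process in a cluster connects using the same contents. The file is created automatically when the cluster is installed and is updated automatically when the coordinators change. For the fdbcli client, you can copy the default file into the current directory.

Editing this file by hand is not recommended; it is rewritten automatically when the cluster is scaled out or in. Copies made elsewhere may need to be updated manually.

Default cluster file location

/etc/foundationdb/fdb.cluster

The file belongs to the root user; other users can make their own copy of it.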

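Specifying a cluster file

fdbcli

Use the -C option (capital C) to choose which cluster file to use. The file name is up to you; for example, it can include an IP or an account name.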

[root@k8s-81 opt]# fdbcli -C /root/fdb.cluster
Using cluster file `/root/fdb.cluster'.

The database is available.

Welcome to the fdbcli. For help, type `help'.
fdb> exit
[root@k8s-81 opt]# mv /root/fdb.cluster  /root/fdb.cluster.root
[root@k8s-81 opt]# fdbcli -C /root/fdb.cluster.root
Using cluster file `/root/fdb.cluster.root'.

The database is available.

Welcome to the fdbcli. For help, type `help'.
fdb> exit
[root@k8s-81 opt]#

API

https://apple.github.io/foundationdb/api-reference.html

Python API

The cluster file can be specified when calling open.

fdb.open(cluster_file=None, event_model=None)

Go API

func OpenDatabase(clusterFile string) (Database, error)

Java API

https://apple.github.io/foundationdb/javadoc/com/apple/foundationdb/FDB.html

com.apple.foundationdb.FDB
Database open(java.lang.String clusterFilePath)

import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.tuple.Tuple;

public class Example {
  public static void main(String[] args) {
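    // API version 630 matches the 6.3 cluster used in this article;
    // the value must not exceed what the installed client library supports.
    FDB fdb = FDB.selectAPIVersion(630);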

    try(Database db = fdb.open("/root/fdb.cluster_10.0.2.83")) {
      // Run an operation on the database
      db.run(tr -> {
        tr.set(Tuple.from("hello").pack(), Tuple.from("world").pack());
        return null;
      });

      // Get the value of 'hello' from the database
      String hello = db.run(tr -> {
        byte[] result = tr.get(Tuple.from("hello").pack()).join();
        return Tuple.fromBytes(result).getString(0);
      });

      System.out.println("Hello " + hello);
    }
  }
}

Cluster file for the FoundationDB and backup_agent processes

This is set by the cluster_file parameter in the [general] section of /etc/foundationdb/foundationdb.conf. The full file is listed in the Environment section above; the relevant lines are:

[general]
cluster_file = /etc/foundationdb/fdb.cluster

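Specifying it with the FDB_CLUSTER_FILE environment variable

This makes it easy for an application to connect to different databases without changing code; passing a different environment variable is enough.

Cluster file lookup order

  • The -C command-line option, or the argument passed to open
  • The FDB_CLUSTER_FILE environment variable
  • fdb.cluster in the current directory
  • The default location

Note that the process must have permission to access the cluster file.

Also, if FDB_CLUSTER_FILE is set to an invalid value (empty, or a file that does not exist), an error is reported instead of falling through to the remaining locations.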
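The lookup order above, including the rule that an invalid FDB_CLUSTER_FILE is an error rather than a fallthrough, can be sketched as follows. This only illustrates the rules as described in this article and is not how the client library is implemented; resolve_cluster_file and its parameters are hypothetical names.

```python
import os
import tempfile

DEFAULT_CLUSTER_FILE = "/etc/foundationdb/fdb.cluster"


def resolve_cluster_file(explicit=None, environ=None, cwd=None,
                         default=DEFAULT_CLUSTER_FILE):
    """Pick a cluster file using the lookup order described above."""
    environ = os.environ if environ is None else environ
    cwd = os.getcwd() if cwd is None else cwd

    # 1. An explicit path (-C, or the argument to open) always wins.
    if explicit:
        return explicit

    # 2. FDB_CLUSTER_FILE, if set, must be valid; there is no fallback.
    if "FDB_CLUSTER_FILE" in environ:
        path = environ["FDB_CLUSTER_FILE"]
        if not path or not os.path.isfile(path):
            raise FileNotFoundError(f"invalid FDB_CLUSTER_FILE: {path!r}")
        return path

    # 3. fdb.cluster in the current directory.
    local = os.path.join(cwd, "fdb.cluster")
    if os.path.isfile(local):
        return local

    # 4. The default location.
    return default


with tempfile.TemporaryDirectory() as tmp:
    local_copy = os.path.join(tmp, "fdb.cluster")
    open(local_copy, "w").close()
    # No explicit path and no environment variable: the local copy is used.
    print(resolve_cluster_file(environ={}, cwd=tmp) == local_copy)
```

Passing environ and cwd as arguments keeps the function easy to try out without touching the real environment.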

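Permissions required on the cluster file

For FoundationDB server processes, the cluster file and its parent directory must be readable and writable, because the file on every node is rewritten automatically when the coordinators are updated. Insufficient permissions can make it impossible to connect to the service.

Clients and API users can connect with read permission alone, although a read-only copy will then not be updated automatically when the coordinators change.

Cluster file format

Below is the cluster file of the 3-node cluster in its default location. It is identical on every node.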

[root@k8s-81 ~]# cat /etc/foundationdb/fdb.cluster
QdBWBJdf:lufGQ2kwQ9QURHhZKdG2B154wRF7066s@10.0.2.81:4500,10.0.2.82:4500,10.0.2.83:4500
[root@k8s-81 ~]#

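The format is as follows. Multiple coordinators are separated by commas, and each one is given as IP and PORT. The identifier and the addresses are separated by @. The description and the ID are separated by a colon (:).

description:ID@IP:PORT,IP:PORT,...
  • description: a logical description of the cluster, made of letters, digits, and underscores
  • ID: any value made of alphanumeric characters (a-z, A-Z, 0-9). A random 8-character identifier works, such as the output of mktemp -u XXXXXXXX. When the coordinators change, the ID is changed automatically on all nodes.
  • IP:PORT: the host and port

The description:ID pair is the unique identifier of the cluster. Different clusters must therefore use different identifiers, otherwise data corruption may result.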
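To make the format concrete, here is a minimal parsing sketch. parse_cluster_string is a hypothetical helper, not a FoundationDB API; the character classes follow the description above, and skipping lines that start with # is an assumption based on the auto-generated header that appears in the cluster file later in this article.

```python
import re

# description:ID@address,address,...
CLUSTER_RE = re.compile(r"^([A-Za-z0-9_]+):([A-Za-z0-9]+)@(.+)$")


def parse_cluster_string(text):
    """Split a cluster file's content into description, ID, and coordinators."""
    # Ignore blank lines and '#' comment lines (assumed to be header comments).
    lines = [ln.strip() for ln in text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]
    if len(lines) != 1:
        raise ValueError("expected exactly one connection string line")
    match = CLUSTER_RE.match(lines[0])
    if match is None:
        raise ValueError("not in description:ID@IP:PORT,... form")
    description, cluster_id, addresses = match.groups()
    return description, cluster_id, addresses.split(",")


sample = ("QdBWBJdf:lufGQ2kwQ9QURHhZKdG2B154wRF7066s"
          "@10.0.2.81:4500,10.0.2.82:4500,10.0.2.83:4500")
description, cluster_id, coordinators = parse_cluster_string(sample)
print(description)
print(cluster_id)
print(coordinators)
```

A check like this can catch a truncated or hand-edited file before it is copied to another machine.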

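In limited testing, fdbcli could no longer connect once this content had been modified, and apart from restoring the file from a backup, it is not yet clear where the string could be recovered from.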

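Getting cluster file information from a client

Provided that one of your clients is already connected to the cluster, the cluster file information can be retrieved as follows.

Getting the cluster file location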

fdb> get \xFF\xFF/cluster_file_path
`\xff\xff/cluster_file_path' is `/etc/foundationdb/fdb.cluster'

Getting the cluster file contents

fdb> get \xFF\xFF/connection_string
`\xff\xff/connection_string' is `QdBWBJdf:lufGQ2kwQ9QURHhZKdG2B154wRF7066s@10.0.2.81:4500,10.0.2.82:4500,10.0.2.83:4500'

IPv6 support

IPv6 addresses and ports are written in the following format

[IP]:PORT

The IP is enclosed in square brackets. For example

[::1]:4800 or [abcd::dead:beef]:4500

IPv4 and IPv6 addresses can be mixed

description:ID@127.0.0.1:4500,[::1]:4500,...
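Because an IPv6 address itself contains colons, code that handles coordinator addresses has to treat the bracketed form separately. A minimal sketch follows; split_address is a hypothetical helper that handles only the two forms shown above and uses Python's standard ipaddress module to validate the host.

```python
import ipaddress


def split_address(address):
    """Split 'IP:PORT' or '[IPv6]:PORT' into (ip_address object, port)."""
    if address.startswith("["):
        # Bracketed IPv6 form: everything up to "]:" is the host.
        host, sep, port = address[1:].partition("]:")
        if not sep:
            raise ValueError(f"malformed IPv6 address: {address!r}")
    else:
        # IPv4 form: the last colon separates host and port.
        host, sep, port = address.rpartition(":")
        if not sep:
            raise ValueError(f"missing port: {address!r}")
    ip = ipaddress.ip_address(host)  # raises ValueError for an invalid IP
    return ip, int(port)


for addr in "127.0.0.1:4500,[::1]:4500".split(","):
    ip, port = split_address(addr)
    print(ip.version, ip, port)
```

The returned object exposes a version attribute (4 or 6), which is convenient when a mixed list has to be sorted or filtered.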

To change the IP, see the FoundationDB configuration documentation; the public_address parameter can be set to an IPv6 address

[fdbserver]
command = /usr/sbin/fdbserver
#public_address = auto:$ID
public_address = [2001::81]:4500

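Scaling out: adding machines to the cluster

Install the service on the new machine

Install the rpm packages that FoundationDB needs, and configure the service as required.

Copy fdb.cluster

Copy /etc/foundationdb/fdb.cluster from an existing machine over the one on the new machine.

Restart the fdb service on the new machine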

systemctl restart foundationdb
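A new machine only joins the cluster if its fdb.cluster carries the same connection string as the existing nodes, so it can be worth comparing the two files before restarting. The sketch below compares them while ignoring # comment lines, on the assumption that such lines (like the auto-generated header shown later in this article) are not significant; both function names are hypothetical.

```python
def connection_string(cluster_file_text):
    """Return the single non-comment, non-blank line of a cluster file."""
    lines = [ln.strip() for ln in cluster_file_text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]
    if len(lines) != 1:
        raise ValueError("expected exactly one connection string line")
    return lines[0]


def same_cluster(text_a, text_b):
    """True if both cluster file texts carry the same connection string."""
    return connection_string(text_a) == connection_string(text_b)


existing = "QdBWBJdf:lufGQ2kwQ9QURHhZKdG2B154wRF7066s@10.0.2.81:4500,10.0.2.82:4500,10.0.2.83:4500\n"
copied = "# DO NOT EDIT!\n" + existing
stale = "abcd1234:efgh5678@127.0.0.1:4500\n"

print(same_cluster(existing, copied))
print(same_cluster(existing, stale))
```

In practice the two texts would come from reading the file on an existing node and the file on the new node.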

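Scaling in: removing machines from the cluster

Confirm the redundancy mode

After the machine is removed, the redundancy mode (for example double or triple) must still be able to keep the cluster available; do not let a removal leave the cluster unavailable.

For example, in triple mode, if the cluster will have fewer than 5 machines after scaling in, lower the redundancy mode first to reduce the number of replicas.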
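This check can be written down as a small pre-flight function. The thresholds here are assumptions for illustration: the value 5 for triple is the rule stated above, while the values for single and double are simply the number of copies each mode keeps. scale_in_ok is a hypothetical helper, not an fdbcli feature.

```python
# Minimum machines to keep per redundancy mode (illustrative assumptions):
# triple follows this article's rule; single/double use the replica count.
MIN_MACHINES = {"single": 1, "double": 2, "triple": 5}


def scale_in_ok(mode, machines_now, machines_removed):
    """Return True if the cluster keeps enough machines for its mode."""
    remaining = machines_now - machines_removed
    return remaining >= MIN_MACHINES[mode]


# The cluster in this article: double mode, 3 machines, removing 1.
print(scale_in_ok("double", 3, 1))
# Triple mode with 5 machines, removing 1: lower the redundancy mode first.
print(scale_in_ok("triple", 5, 1))
```

When the function returns False, change the redundancy mode before excluding any machine.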

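Adjust the coordinators

If the machine being removed is a coordinator, first change the coordinator set with the coordinators command. The example below removes node 83 from the three coordinators.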

fdb> coordinators
Cluster description: QdBWBJdf
Cluster coordinators (3): 10.0.2.81:4500,10.0.2.82:4500,10.0.2.83:4500
Type `help coordinators' to learn how to change this information.
fdb>
fdb> coordinators 10.0.2.81:4500 10.0.2.82:4500
Coordination state changed
fdb> coordinators
Cluster description: QdBWBJdf
Cluster coordinators (2): 10.0.2.81:4500,10.0.2.82:4500
Type `help coordinators' to learn how to change this information.
fdb>

Check the cluster file: it has been updated on every node, including node 83

[root@k8s-83 ~]# cat /etc/foundationdb/fdb.cluster
# DO NOT EDIT!
# This file is auto-generated, it is not to be edited by hand
QdBWBJdf:C8R2S3zhBQsZMQL8hcWNMlDZ40xUpQcu@10.0.2.81:4500,10.0.2.82:4500
[root@k8s-83 ~]#

Run exclude to exclude the node

The argument to exclude is the node's IP and port. If only an IP is given, every process on that IP is excluded.

fdb> exclude 10.0.2.83:4500
Waiting for state to be removed from all excluded servers. This may take a while.
(Interrupting this wait with CTRL+C will not cancel the data movement.)
  10.0.2.83:4500  ---- Successfully excluded. It is now safe to remove this process from the cluster.
fdb>

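Exclusion has to move data and can take a long time. Pressing CTRL+C does not cancel the data movement; it continues in the background. Running the command again returns to the screen that waits for completion. To cancel an exclude, run the include command.

View the current exclusion list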

fdb> exclude
There are currently 1 servers or processes being excluded from the database:
  10.0.2.83:4500
To find out whether it is safe to remove one or more of these
servers from the cluster, type `exclude <addresses>'.
To return one of these servers to the cluster, type `include <addresses>'.
fdb>

Stop the service on the removed node

systemctl stop foundationdb
systemctl disable foundationdb

Uninstall the packages

yum remove foundationdb-server
yum remove foundationdb-clients
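The fdbcli part of the scale-in procedure (adjust the coordinators if needed, then exclude) can be scripted. The sketch below only builds the command lines and does not run them; it assumes fdbcli's --exec option for running a single command non-interactively, and scale_in_commands is a hypothetical helper.

```python
def scale_in_commands(coordinators, remove):
    """Build fdbcli invocations to drop `remove` (IP:PORT) from the cluster."""
    commands = []
    # If the node is a coordinator, shrink the coordinator set first.
    if remove in coordinators:
        remaining = [c for c in coordinators if c != remove]
        commands.append(
            ["fdbcli", "--exec", "coordinators " + " ".join(remaining)])
    # Then exclude the process so its data is moved elsewhere.
    commands.append(["fdbcli", "--exec", "exclude " + remove])
    return commands


current = ["10.0.2.81:4500", "10.0.2.82:4500", "10.0.2.83:4500"]
for cmd in scale_in_commands(current, "10.0.2.83:4500"):
    print(cmd)
```

Each list can be passed to subprocess.run(cmd, check=True) on a node that can reach the cluster. Stopping the service and removing the packages still has to happen on the removed node itself, after the exclude has reported that removal is safe.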

Run include to bring nodes back

Use the all argument to re-include every excluded address, or re-include only some of them by giving an IP or IP:PORT

Usage: include all|<ADDRESS...>
fdb> include all
fdb> exclude
There are currently no servers excluded from the database.
To learn how to exclude a server, type `help exclude'.
fdb>

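Replacing or migrating machines

In principle this is a combination of scaling out and scaling in: first add the new machine, then remove the old one. The scale-in step moves data and can take a long time.

Uninstall

Just remove the rpm packages, both server and clients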

rpm -e foundationdb-clients foundationdb-server

Upgrade

Upgrade the rpm packages on the cluster; every node must be upgraded.

 rpm -Uvh foundationdb-clients-7.2.3-1.el7.x86_64.rpm foundationdb-server-7.2.3-1.el7.x86_64.rpm