GBase 8a在执行时。为避免并发冲突保证一致性,会持有一些锁来保证自己需要的资源在执行期间不会出现问题。锁在SQL执行完毕后会自动释放掉。在某些特殊场景下,特别是一些老版本集群,出现需要强行释放掉锁的需求,比如SQL长时间无法结束,而该SQL持有的锁又导致其它的SQL无法正常运行,同时环境又不能重启节点服务时,可以考虑本文的方法强行释放SQL持有的锁。
对于DDL,DML类的SQL,一般要配合failover清理:GBase 8a查看和清理故障恢复状态Failover的方法
目录导航
参考
GBase8a MPP Cluster查看集群锁 showlock
GBase 8a查看和清理故障恢复状态Failover的方法
构造锁的测试场景
如下通过lock table语句构造了一个持有锁的SQL场景,连接节点为201。
集群版本:8.6.2.43-R33.129391
[root@rh6-1 ~]# gcadmin
CLUSTER STATE: ACTIVE
CLUSTER MODE: NORMAL
=================================================================
| GBASE COORDINATOR CLUSTER INFORMATION |
=================================================================
| NodeName | IpAddress |gcware |gcluster |DataState |
-----------------------------------------------------------------
| coordinator1 | 10.0.2.201 | OPEN | OPEN | 0 |
-----------------------------------------------------------------
=============================================================
| GBASE DATA CLUSTER INFORMATION |
=============================================================
|NodeName | IpAddress |gnode |syncserver |DataState |
-------------------------------------------------------------
| node1 | 10.0.2.201 | OPEN | OPEN | 0 |
-------------------------------------------------------------
| node2 | 10.0.2.202 | OPEN | OPEN | 0 |
-------------------------------------------------------------
gbase> lock table t1 write;
Query OK, 0 rows affected (Elapsed: 00:00:00.01)
gbase> show processlist;
+-----+-----------------+-----------+--------+---------+-------+-----------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+-----------------+-----------+--------+---------+-------+-----------------------------+------------------+
| 1 | event_scheduler | localhost | NULL | Daemon | 42262 | Waiting for next activation | NULL |
| 258 | root | localhost | testdb | Query | 0 | NULL | show processlist |
+-----+-----------------+-----------+--------+---------+-------+-----------------------------+------------------+
2 rows in set (Elapsed: 00:00:00.00)
查看SQL持有的锁
通过SQL查看锁
可以通过show detail processlist查看SQL当前持有的锁,需要的锁等信息。详情请参考
GBase8a 显示集群正在跑的SQL进程show [full | detail] processlist
可以看到ID=258的连接持有的3个锁,用分号分割:testdb;testdb.t1;testdb.t1580D5F90-B287-4199-B057-E6FBD44B5BFA
gbase> show detail processlist;
+-----+-------+-----------------+-----------+--------+---------+-------+-----------------------------+----------------------------------------------------------------+------+-------------------------+
| Id | Tid | User | Host | db | Command | Time | State | Lock | Wait | Info |
+-----+-------+-----------------+-----------+--------+---------+-------+-----------------------------+----------------------------------------------------------------+------+-------------------------+
| 1 | 18648 | event_scheduler | localhost | NULL | Daemon | 42525 | Waiting for next activation | NULL | NULL | NULL |
| 259 | 9985 | root | localhost | testdb | Query | 0 | NULL | testdb;testdb.t1;testdb.t1580D5F90-B287-4199-B057-E6FBD44B5BFA | NULL | show detail processlist |
+-----+-------+-----------------+-----------+--------+---------+-------+-----------------------------+----------------------------------------------------------------+------+-------------------------+
2 rows in set (Elapsed: 00:00:00.00)
通过gcadmin查看锁
通过命令行的gcadmin showlock查看锁,详情请参考
GBase8a MPP Cluster查看集群锁 showlock
[root@rh6-1 ~]# gcadmin showlock
+===================================================================================================================+
| GCLUSTER LOCK |
+===================================================================================================================+
+---------------------------------------------+----------+-------------------------------+--------------+------+----+
| Lock name | owner | content | create time |orphan|type|
+---------------------------------------------+----------+-------------------------------+--------------+------+----+
| testdb |10.0.2.201|LOCK_Lock_Tables: 259(LWP:9985)|20220223090419|FALSE | S |
+---------------------------------------------+----------+-------------------------------+--------------+------+----+
|testdb.t1580d5f90-b287-4199-b057-e6fbd44b5bfa|10.0.2.201|LOCK_Lock_Tables: 259(LWP:9985)|20220223090419|FALSE | E |
+---------------------------------------------+----------+-------------------------------+--------------+------+----+
| testdb.t1 |10.0.2.201|LOCK_Lock_Tables: 259(LWP:9985)|20220223090419|FALSE | S |
+---------------------------------------------+----------+-------------------------------+--------------+------+----+
Total : 3
[root@rh6-1 ~]#
通过python查看锁
通过python接口,从gcware.getlocks拿到锁。前面gcadmin showlock的信息就是从这里获得然后格式化输出的。请关注其中的
name:锁名字
lockid:锁ID,这个在showlock里没有显示出来。
[gbase@rh6-1 ubas]$ python
Python 2.6.6 (r266:84292, Oct 12 2012, 14:23:48)
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gcware
>>> help('gcware')
>>> gcware.getlocks()
({'create time': '20220223090419', 'name': 'testdb', 'lockid': 1612718352306798592, 'orphan': 'FALSE', 'content': 'LOCK_Lock_Tables: 259(LWP:9985)', 'owner': '10.0.2.201', 'type': 'SHARE'}, {'create time': '20220223090419', 'name': 'testdb.t1580d5f90-b287-4199-b057-e6fbd44b5bfa', 'lockid': 8587815309107265537, 'orphan': 'FALSE', 'content': 'LOCK_Lock_Tables: 259(LWP:9985)', 'owner': '10.0.2.201', 'type': 'EXCLUSIVE'}, {'create time': '20220223090419', 'name': 'testdb.t1', 'lockid': 7728385150517510147, 'orphan': 'FALSE', 'content': 'LOCK_Lock_Tables: 259(LWP:9985)', 'owner': '10.0.2.201', 'type': 'SHARE'})
>>>
根据锁名字,从里面找到对应的lockid, 用于清理锁的参数。先找工具格式化一下,方便观看
{
'create time': '20220223090419',
'name': 'testdb',
'lockid': 1612718352306798592,
'orphan': 'FALSE',
'content': 'LOCK_Lock_Tables: 259(LWP:9985)',
'owner': '10.0.2.201',
'type': 'SHARE'
}, {
'create time': '20220223090419',
'name': 'testdb.t1580d5f90-b287-4199-b057-e6fbd44b5bfa',
'lockid': 8587815309107265537,
'orphan': 'FALSE',
'content': 'LOCK_Lock_Tables: 259(LWP:9985)',
'owner': '10.0.2.201',
'type': 'EXCLUSIVE'
}, {
'create time': '20220223090419',
'name': 'testdb.t1',
'lockid': 7728385150517510147,
'orphan': 'FALSE',
'content': 'LOCK_Lock_Tables: 259(LWP:9985)',
'owner': '10.0.2.201',
'type': 'SHARE'
}
通过python清理锁的方法
我们尝试释放一个排它锁
'create time': '20220223090419',
'name': 'testdb.t1580d5f90-b287-4199-b057-e6fbd44b5bfa',
'lockid': 8587815309107265537,
'orphan': 'FALSE',
'content': 'LOCK_Lock_Tables: 259(LWP:9985)',
'owner': '10.0.2.201',
'type': 'EXCLUSIVE'
接口
gcware.unlocktableforce(name,lockid)
执行效果
>>> gcware.unlocktableforce('testdb.t1580d5f90-b287-4199-b057-e6fbd44b5bfa',8587815309107265537);
0
>>>
查看锁确实少了一个
[root@rh6-1 ~]# gcadmin showlock
+===============================================================================+
| GCLUSTER LOCK |
+===============================================================================+
+---------+----------+-------------------------------+--------------+------+----+
|Lock name| owner | content | create time |orphan|type|
+---------+----------+-------------------------------+--------------+------+----+
| testdb |10.0.2.201|LOCK_Lock_Tables: 259(LWP:9985)|20220223090419|FALSE | S |
+---------+----------+-------------------------------+--------------+------+----+
|testdb.t1|10.0.2.201|LOCK_Lock_Tables: 259(LWP:9985)|20220223090419|FALSE | S |
+---------+----------+-------------------------------+--------------+------+----+
Total : 2
再开一个连接,尝试拿到锁。发现可以正常拿到Lock table write, 但show detail processlist的持有锁信息并没有刷新,怀疑....
[gbase@rh6-1 ubas]$ gccli testdb
GBase client 8.6.2.43-R33.129391. Copyright (c) 2004-2022, GBase. All Rights Reserved.
gbase> lock table t1 write;
Query OK, 0 rows affected (Elapsed: 00:00:00.01)
gbase> show detail processlist;
+-----+-------+-----------------+-----------+--------+---------+-------+-----------------------------+----------------------------------------------------------------+------+-------------------------+
| Id | Tid | User | Host | db | Command | Time | State | Lock | Wait | Info |
+-----+-------+-----------------+-----------+--------+---------+-------+-----------------------------+----------------------------------------------------------------+------+-------------------------+
| 1 | 18648 | event_scheduler | localhost | NULL | Daemon | 46724 | Waiting for next activation | NULL | NULL | NULL |
| 259 | 9985 | root | localhost | testdb | Sleep | 74 | | testdb;testdb.t1;testdb.t1580D5F90-B287-4199-B057-E6FBD44B5BFA | NULL | NULL |
| 267 | 10695 | root | localhost | testdb | Query | 0 | NULL | testdb;testdb.t1;testdb.t1580D5F90-B287-4199-B057-E6FBD44B5BFA | NULL | show detail processlist |
+-----+-------+-----------------+-----------+--------+---------+-------+-----------------------------+----------------------------------------------------------------+------+-------------------------+
3 rows in set (Elapsed: 00:00:00.00)
如果锁没有释放,会一直卡在这里
gbase> lock table t1 write;
^CQuery aborted by Ctrl+C
ERROR 1100 (HY000): Table 'testdb.t1' was not locked with LOCK TABLES
gbase>
查看实际的锁情况,已经更新
[root@rh6-1 ~]# gcadmin showlock
+====================================================================================================================+
| GCLUSTER LOCK |
+====================================================================================================================+
+---------------------------------------------+----------+--------------------------------+--------------+------+----+
| Lock name | owner | content | create time |orphan|type|
+---------------------------------------------+----------+--------------------------------+--------------+------+----+
| testdb |10.0.2.201|LOCK_Lock_Tables: 259(LWP:9985) |20220223090419|FALSE | S |
+---------------------------------------------+----------+--------------------------------+--------------+------+----+
| testdb |10.0.2.201|LOCK_Lock_Tables: 267(LWP:10695)|20220223101414|FALSE | S |
+---------------------------------------------+----------+--------------------------------+--------------+------+----+
|testdb.t1580d5f90-b287-4199-b057-e6fbd44b5bfa|10.0.2.201|LOCK_Lock_Tables: 267(LWP:10695)|20220223101414|FALSE | E |
+---------------------------------------------+----------+--------------------------------+--------------+------+----+
| testdb.t1 |10.0.2.201|LOCK_Lock_Tables: 259(LWP:9985) |20220223090419|FALSE | S |
+---------------------------------------------+----------+--------------------------------+--------------+------+----+
| testdb.t1 |10.0.2.201|LOCK_Lock_Tables: 267(LWP:10695)|20220223101414|FALSE | S |
+---------------------------------------------+----------+--------------------------------+--------------+------+----+
Total : 5
总结
SQL持有的锁是为了一致性,强行清理锁可能导致不可预期的结果,比如数据错误,所以只有在没有其它方法解决,且能通过其它手段确保数据一致性的情况下(比如手工检查和同步数据),才能强行清理锁。
请慎重。