Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
None
-
None
-
MaxScale server karma197:
OS: CentOS 7
Version: built from 2.4 branch, MariaDB MaxScale 2.4.1 started (Commit: 658aae6b6c77774c34d0f52c03f63edc2d44969e)
Clustrix nodes:
OS: CentOS 7
Version: clustrix-fredonyer-16045
-
MXS-SPRINT-88, MXS-SPRINT-89
Description
[root@karma108 ~]# clx s
|
Cluster Name: cld8d568c4e1f3e0db
|
Cluster Version: clustrix-fredonyer-16045
|
Cluster Status: OK
|
Cluster Size: 4 nodes - 16 CPUs per Node
|
Current Node: karma108 - nid 1
|
|
nid | Hostname | Status | IP Address | Zone | TPS | Used | Total
|
----+-----------+--------+--------------+------+-----+---------------+--------
|
1 | karma108 | OK | 10.2.13.91 | 1 | 0 | 1.8G (0.24%) | 762.9G
|
2 | karma180 | OK | 10.2.15.180 | 2 | 0 | 1.8G (0.24%) | 762.9G
|
3 | karma123 | OK | 10.2.15.89 | 3 | 0 | 1.8G (0.24%) | 762.9G
|
4 | karma065 | OK | 10.2.14.119 | 0 | 0 | 9.3M (0.00%) | 762.9G
|
----+-----------+--------+--------------+------+-----+---------------+--------
|
0 | 5.5G (0.18%) | 3.0T
|
Conf file snip:
|
[RCR]
|
type=service
|
router=readwritesplit
|
user=maxscale
|
password=maxscale_pw
|
cluster=Clustrix
|
transaction_replay=true
|
slave_selection_criteria=LEAST_GLOBAL_CONNECTIONS
|
delayed_retry_timeout=360s
|
transaction_replay_attempts=500
|
|
MaxScale run as:
|
[root@karma197 etc]# maxscale -d -f clustrix_karma108_trxreplay.cnf --user=root
|
|
[root@karma197 log]# maxctrl list servers
|
┌─────────────────────┬─────────────┬──────┬─────────────┬─────────────────┬──────┐
|
│ Server │ Address │ Port │ Connections │ State │ GTID │
|
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
|
│ @@Clustrix:node-4 │ 10.2.14.119 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
|
│ @@Clustrix:node-3 │ 10.2.15.89 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
|
│ @@Clustrix:node-2 │ 10.2.15.180 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
|
│ @@Clustrix:node-1 │ 10.2.13.91 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
|
│ Bootstrap1-karma108 │ 10.2.13.91 │ 3306 │ 0 │ Master, Running │ │
|
└─────────────────────┴─────────────┴──────┴─────────────┴─────────────────┴──────┘
|
Connected to MaxScale from client machine and run following:
[root@jones ~]# mysql -h karma197 -u maxscale -pmaxscale_pw -P 6008
|
mysql> set session autocommit=false;
|
Query OK, 0 rows affected (0.00 sec)
|
|
mysql> begin;
|
Query OK, 0 rows affected (0.00 sec)
|
|
mysql> use test;
|
Reading table information for completion of table and column names
|
You can turn off this feature to get a quicker startup with -A
|
|
Database changed
|
mysql> insert into t2 values (9);
|
Query OK, 1 row affected (0.03 sec)
|
Restart clustrix here by running:
[root@karma108 ~]# clx dbrestart
Success: stop_job krobix run on all nodes.
Success: start_job krobix run on all nodes.
Before the cluster forms, run following.
mysql> insert into t2 values (9);
|
Expected:
The trx should be replayed including the last statement that was attempted while the cluster was forming, once all nodes come up and the cluster forms. Last statement should complete successfully.
Actual:
Transaction does not get retried. Last statement execution never completed, but just got stuck. Waited multiple minutes. Same statement completed instantly before Clustrix service restart.
Relevant logs:
2019-08-07 22:33:05 info : > Autocommit: [disabled], trx is [open], cmd: (0x03) COM_QUERY, plen: 10, type: QUERY_TYPE_BEGIN_TRX, stmt: begin
|
2019-08-07 22:33:05 info : [readwritesplit] Connected to '@@Clustrix:node-1'
|
2019-08-07 22:33:05 info : [readwritesplit] Queuing query until '@@Clustrix:node-1' completes session command
|
2019-08-07 22:33:05 info : (1) Connected to '@@Clustrix:node-1' with thread id 3073
|
2019-08-07 22:33:05 info : (1) [readwritesplit] Reply complete, last reply from @@Clustrix:node-1
|
2019-08-07 22:33:05 info : (1) [readwritesplit] 1 session commands left on '@@Clustrix:node-1'
|
2019-08-07 22:33:05 info : (1) [readwritesplit] Reply complete, last reply from @@Clustrix:node-1
|
2019-08-07 22:33:05 info : (1) [readwritesplit] >>> Routing stored queries
|
2019-08-07 22:33:05 info : (1) [readwritesplit] New query received while transaction replay is active: insert into t2 values (9)
|
2019-08-07 22:33:05 info : (1) [readwritesplit] <<< Stored queries routed
|
2019-08-07 22:33:07 notice : Server changed state: Bootstrap1-karma108[10.2.13.91:3306]: master_up. [Down] -> [Master, Running]
|
Full logs and config file attached.
Clustrix log for service shutwon and cluster formation times:
2019-08-07 22:32:36.343713 UTC nid 1 karma108.colo.sproutsys.com clxnode: INFO dbcore/shutdown.c:61 shutdown_signal(): Shutting down node pfec37993505e4
|
d91
|
2019-08-07 22:32:36.365713 UTC nid 1 karma108.colo.sproutsys.com clxnode: INFO dbcore/coordinate_dbstart.c:193 write_coordinate_dbstart_file_done(): Dbs
|
tart coordination info ('4') written to /etc/clustrix/expected_nodes
|
2019-08-07 22:32:48.073414 UTC karma108.colo.sproutsys.com clxnode: INFO test/clxnode.c:410 clxnode_post_args(): Starting Clustrix, clustrix-fredonyer-1
|
6045, pid=25572, build=release
|
...
|
2019-08-07 22:33:04.143217 UTC nid 1 karma108.colo.sproutsys.com clxnode: ALERT NEW_GROUP INFO Node 1 has new group 2cfffe: { 1-4 } for cluster_id d8d568c4e1f3e0db
|