Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Not a Bug
-
None
-
None
-
MaxScale server karma197:
OS: CentOS 7
Version: built from 2.4 branch, MariaDB MaxScale 2.4.1 started (Commit: 658aae6b6c77774c34d0f52c03f63edc2d44969e)
Clustrix nodes:
OS: CentOS 7
Version: clustrix-fredonyer-16045
-
MXS-SPRINT-88
Description
[root@karma108 ~]# clx s
|
Cluster Name: cld8d568c4e1f3e0db
|
Cluster Version: clustrix-fredonyer-16045
|
Cluster Status: OK
|
Cluster Size: 4 nodes - 16 CPUs per Node
|
Current Node: karma108 - nid 1
|
|
nid | Hostname | Status | IP Address | Zone | TPS | Used | Total
|
----+-----------+--------+--------------+------+-----+---------------+--------
|
1 | karma108 | OK | 10.2.13.91 | 1 | 0 | 1.8G (0.24%) | 762.9G
|
2 | karma180 | OK | 10.2.15.180 | 2 | 0 | 1.8G (0.24%) | 762.9G
|
3 | karma123 | OK | 10.2.15.89 | 3 | 0 | 1.8G (0.24%) | 762.9G
|
4 | karma065 | OK | 10.2.14.119 | 0 | 0 | 9.3M (0.00%) | 762.9G
|
----+-----------+--------+--------------+------+-----+---------------+--------
|
0 | 5.5G (0.18%) | 3.0T
|
MaxScale run as:
|
[root@karma197 etc]# maxscale -d -f clustrix_karma108_trxreplay.cnf --user=root
|
|
[root@karma197 log]# maxctrl list servers
|
┌─────────────────────┬─────────────┬──────┬─────────────┬─────────────────┬──────┐
|
│ Server │ Address │ Port │ Connections │ State │ GTID │
|
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
|
│ @@Clustrix:node-4 │ 10.2.14.119 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
|
│ @@Clustrix:node-3 │ 10.2.15.89 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
|
│ @@Clustrix:node-2 │ 10.2.15.180 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
|
│ @@Clustrix:node-1 │ 10.2.13.91 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
|
│ Bootstrap1-karma108 │ 10.2.13.91 │ 3306 │ 0 │ Master, Running │ │
|
└─────────────────────┴─────────────┴──────┴─────────────┴─────────────────┴──────┘
|
Now, let's softfail node 4.
MySQL [system]> SELECT * FROM system.nodeinfo ORDER BY nodeid;
|
+--------+----------------------------+----------------------------+----------------------------+-----------------------------+------------+-------------------+-------------+-----------------+------------+-------------+---------+--------------+----------------+-------------------+-------+------------+----------+------+
|
| nodeid | started | ntptime | up_since | hostname | iface_name | iface_mac_addr | iface_ip | iface_connected | mysql_port | be_ip | be_port | control_port | healthmon_port | pnid | cores | first_pass | be_iface | zone |
|
+--------+----------------------------+----------------------------+----------------------------+-----------------------------+------------+-------------------+-------------+-----------------+------------+-------------+---------+--------------+----------------+-------------------+-------+------------+----------+------+
|
| 1 | 2019-08-07 22:32:47.808417 | 2019-08-08 00:08:38.300635 | 2019-08-05 05:48:39.300635 | karma108.colo.sproutsys.com | enp5s0f0 | 00:25:90:20:f5:a2 | 10.2.13.91 | 1 | 3306 | 10.2.13.91 | 24378 | 2048 | 3581 | pfec37993505e4d91 | 16 | 0 | enp5s0f0 | 1 |
|
| 2 | 2019-08-07 22:32:47.571999 | 2019-08-08 00:08:38.299715 | 2019-08-05 05:48:16.299715 | karma180.colo.sproutsys.com | enp5s0f0 | 00:25:90:20:f4:ce | 10.2.15.180 | 1 | 3306 | 10.2.15.180 | 24378 | 2048 | 3581 | pcb504659bf22ce02 | 16 | 0 | enp5s0f0 | 2 |
|
| 3 | 2019-08-07 22:32:47.519503 | 2019-08-08 00:08:38.305634 | 2019-08-05 05:47:56.305634 | karma123.colo.sproutsys.com | enp5s0f0 | 00:25:90:78:f7:3c | 10.2.15.89 | 1 | 3306 | 10.2.15.89 | 24378 | 2048 | 3581 | pe268896eb1da201a | 16 | 0 | enp5s0f0 | 3 |
|
| 4 | 2019-08-07 22:33:01.935622 | 2019-08-08 00:08:38.297207 | 2019-07-29 00:37:02.297207 | karma065.colo.sproutsys.com | enp5s0f0 | 00:25:90:7b:5c:f4 | 10.2.14.119 | 1 | 3306 | 10.2.14.119 | 24378 | 2048 | 3581 | p67082d6315829f66 | 16 | 0 | enp5s0f0 | 0 |
|
+--------+----------------------------+----------------------------+----------------------------+-----------------------------+------------+-------------------+-------------+-----------------+------------+-------------+---------+--------------+----------------+-------------------+-------+------------+----------+------+
|
4 rows in set (0.00 sec)
|
|
MySQL [system]> ALTER CLUSTER SOFTFAIL 4;
|
Query OK, 0 rows affected (0.04 sec)
|
|
Immediately run `maxctrl list servers` and it shows that node is marked as `Drained` instead of `Draining`.
[root@karma197 log]# maxctrl list servers;date
|
┌─────────────────────┬─────────────┬──────┬─────────────┬──────────────────────────┬──────┐
|
│ Server │ Address │ Port │ Connections │ State │ GTID │
|
├─────────────────────┼─────────────┼──────┼─────────────┼──────────────────────────┼──────┤
|
│ @@Clustrix:node-4 │ 10.2.14.119 │ 3306 │ 0 │ Drained, Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼──────────────────────────┼──────┤
|
│ @@Clustrix:node-2 │ 10.2.15.180 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼──────────────────────────┼──────┤
|
│ @@Clustrix:node-3 │ 10.2.15.89 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼──────────────────────────┼──────┤
|
│ @@Clustrix:node-1 │ 10.2.13.91 │ 3306 │ 0 │ Master, Running │ │
|
├─────────────────────┼─────────────┼──────┼─────────────┼──────────────────────────┼──────┤
|
│ Bootstrap1-karma108 │ 10.2.13.91 │ 3306 │ 0 │ Master, Running │ │
|
└─────────────────────┴─────────────┴──────┴─────────────┴──────────────────────────┴──────┘
|
Thu Aug 8 00:11:50 UTC 2019
|
Whereas Clustrix is not yet done softfailing the node.
MySQL [system]> SELECT count(*) FROM system.softfailing_containers; select now();
|
+----------+
|
| count(*) |
|
+----------+
|
| 2 |
|
+----------+
|
1 row in set (0.03 sec)
|
|
+---------------------+
|
| now() |
|
+---------------------+
|
| 2019-08-08 00:12:01 |
|
+---------------------+
|
1 row in set (0.00 sec)
|
|
MySQL [system]> SELECT * FROM system.softfailed_nodes WHERE nodeid NOT IN (SELECT DISTINCT nodeid FROM system.softfailing_containers);
|
Empty set (0.04 sec)
|
|
This shows that the node is not ready to be removed as Clustrix rebalancer is still moving data from the node.
MaxScale log also says:
2019-08-08 00:11:43 notice : [clustrixmon] Clustrix: Node 4 (10.2.14.119) has been SOFTFAILed. Turning ON 'Being Drained'.
|
2019-08-08 00:11:50 info : Accept authentication from 'admin', using password. Request: /v1/servers
|
2019-08-08 00:11:50 info : Accept authentication from 'admin', using password. Request: /v1/monitors/Clustrix
|
2019-08-08 00:12:07 info : Accept authentication from 'admin', using password. Request: /v1/servers
|
2019-08-08 00:12:07 info : Accept authentication from 'admin', using password. Request: /v1/monitors/Clustrix
|
2019-08-08 00:12:19 info : Accept authentication from 'admin', using password. Request: /v1/servers
|
2019-08-08 00:12:19 info : Accept authentication from 'admin', using password. Request: /v1/monitors/Clustrix
|
2019-08-08 00:13:12 info : Accept authentication from 'admin', using password. Request: /v1/servers
|
2019-08-08 00:13:12 info : Accept authentication from 'admin', using password. Request: /v1/monitors/Clustrix
|
2019-08-08 00:14:21 info : Accept authentication from 'admin', using password. Request: /v1/servers
|
2019-08-08 00:14:21 info : Accept authentication from 'admin', using password. Request: /v1/monitors/Clustrix
|
It's only at 2019-08-08 00:13:14.502252 UTC, that node should be marked as `Drained`
2019-08-08 00:13:14.502252 UTC nid 1 karma108.colo.sproutsys.com clxnode: INFO dbcore/softfail.ct:26 softfail_node_ready_signal(): softfailing nodes are ready to be removed: 4
|
Cofig file and logs attached.