Uploaded image for project: 'MariaDB MaxScale'
  1. MariaDB MaxScale
  2. MXS-3472

Transaction Replay: transactions not replayed after Xpand group change

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.5.11
    • Component/s: readwritesplit
    • Labels:
      None
    • Environment:
      MariaDB MaxScale 2.5.9 (Commit: f4a081d82c5b00a33a71f7ba209501ee9ab223d4) and
      xpand Glassbutte 5.3.15
      Both using CentOS 7

      Description

      1) What works: Transaction replay works fine with XPAND direct as long as there is NO activity on outstanding transactions during the group change. A transaction that begins before group change is initiated, stays idle during it, will be able to continue happily after group change is completed. This includes the situation when the transaction in question sits on a session connected on a departing node - MaxScale moves the session to the surviving node successfully.

      It has been tested in 2 scenarios
      a) very fast group change. Monitor does not have a chance to detect the node departure, but splitter gets a rollback. Splitter reacts properly, and completes session transfer and replay after group change finishes.
      b) long running group change. Monitor detects group change, spins around waiting for it to complete, and abandons the node, yet keeps monitoring the situation. Rollback is not registering in the traces, it seems. As long as the transaction in question does not attempt any activity prior to group change completion, all is fine. Once the GC completes, the transaction can proceed, and there will be a successful session transfer (if needed) and a replay.

      2) what does not work: Should the transaction in progress attempt to execute anything while group change is in progress, monitor loses track of its nodes. It attempts to engage a node long gone from the cluster - a node that had been taken down and subsequently restored under a different number in previous tests.

      Below is a relevant fragment of the trace. Scenario:

      • There are three nodes – 3, 15, and 22 up.
      • We hang a session – and it goes to node 15. We create a transaction on it.
      • We kill node 22 (10.2.13.175).

      Group Change is detected all right. At 23:01:48 we attempt an insert, and transaction migration from node 15 begins. Except – it goes to a wrong node (node-20, a one killed in a previous experiment). What exactly happens is not obvious from the traces. Node 20 used to be on 10.2.13.175 where 22 was a second ago. But it is not really obvious from the trace what IP MaxScale is attempting to use, and with which credentials - but it fails hard enough for MaxScale to abandon all hope and communicate "session closed" to the client.

      Comment: Splitter attempts session transfer and replay the moment insert is detected, even though monitor is still aware that group change is in progress. Not only that - there appears to be a gallant but ultimately unsuccessful attempt to find a Xpand bootstrap node from one of the nodes used during the previous run of MaxScale.. Could it be better to just wait until Group Change is over?

      The fact of the matter is that no matter how hard one can try, there will not be a suitable node to do a replay on while this group change is going on. Of course we can keep trying, circling the cluster over and over - just that if there is a good number of outstanding transactions all in this same state, it may get very hairy very fast... Is there a way to coordinate with the splitter and have him pause all until group change is in fact over?

      Trace fragment.

      2021-04-02 23:01:45   info   : [xpandmon] Deleted Xpand node 3 from bookkeeping.
      2021-04-02 23:01:45   info   : [xpandmon] Deleted Xpand node 15 from bookkeeping.
      2021-04-02 23:01:45   info   : [xpandmon] Deleted Xpand node 22 from bookkeeping.
      2021-04-02 23:01:47   error  : [xpandmon] Xpand: Could not execute 'SELECT status FROM system.membership WHERE nid = gtmnid()' on 10.2.12.248 via TCP/IP: [16388] Group change during GTM operation: group change in progress, try restarting transaction
      2021-04-02 23:01:47   error  : [xpandmon] Xpand: Could either not ping or create connection to 10.2.12.248:3306: [16388] Group change during GTM operation: group change in progress, try restarting transaction
      2021-04-02 23:01:47   error  : [xpandmon] Xpand: Could either not ping or create connection to 10.2.15.59:3306: [16388] Group change during GTM operation: group change in progress, try restarting transaction
      2021-04-02 23:01:47   error  : [xpandmon] Xpand: Could either not ping or create connection to 10.2.13.175:3306: Can't connect to MySQL server on '10.2.13.175' (115)
      2021-04-02 23:01:47   error  : [xpandmon] Xpand: Could either not ping or create connection to karma114:3306: [16388] Group change during GTM operation: group change in progress, try restarting transaction
      2021-04-02 23:01:47   notice : [xpandmon] Attempting to find a Xpand bootstrap node from one of the nodes used during the previous run of MaxScale.
      2021-04-02 23:01:47   error  : [xpandmon] Xpand: Could not connect to any server or no server that could be connected to was part of the quorum.
      2021-04-02 23:01:48   info   : (7) (Splitter-Service) > Autocommit: [disabled], trx is [open], cmd: (0x03) COM_QUERY, plen: 30, type: QUERY_TYPE_WRITE, stmt: insert into t1 values (3)
      2021-04-02 23:01:48   info   : (7) [readwritesplit] (Splitter-Service) Starting transaction migration to '@@Xpand:node-20'
      2021-04-02 23:01:48   info   : (7) [readwritesplit] (Splitter-Service) Starting transaction replay 1
      2021-04-02 23:01:48   info   : (7) [readwritesplit] (Splitter-Service) Replaying: begin
      2021-04-02 23:01:49   info   : > Autocommit: [disabled], trx is [open], cmd: (0x03) COM_QUERY, plen: 10, type: QUERY_TYPE_BEGIN_TRX, stmt: begin
      2021-04-02 23:01:49   info   : [readwritesplit] Replacing old master '@@Xpand:node-15' with new master '@@Xpand:node-20'
      2021-04-02 23:01:49   info   : Execute sescmd #1 on '@@Xpand:node-20': [COM_INIT_DB]
      2021-04-02 23:01:49   info   : [readwritesplit] Connected to '@@Xpand:node-20'
      2021-04-02 23:01:49   info   : [readwritesplit] Queuing query until '@@Xpand:node-20' completes session command
      2021-04-02 23:01:49   info   : (7) Connected to '@@Xpand:node-20' with thread id 1025
      2021-04-02 23:01:49   error  : (7) Invalid authentication message from backend '@@Xpand:node-20'. Error code: 1045, Msg : #HY000: [39936] Access denied: for user 'maxscale'@'' (using password: YES)
      2021-04-02 23:01:49   info   : (7) [readwritesplit] (Splitter-Service) Master '@@Xpand:node-20' failed: #HY000: Lost connection to backend server: connection closed by peer (@@Xpand:node-20)
      2021-04-02 23:01:49   error  : (7) [readwritesplit] (Splitter-Service) Lost connection to the master server, closing session. Lost connection to master server while waiting for a result. Connection has been idle for 0 seconds. Error caused by: #HY000: Lost connection to backend server: connection closed by peer (@@Xpand:node-20). Last close reason: <none>. Last error:
      2021-04-02 23:01:49   info   : Stopped Splitter-Service client session [7]
      

      Original Description
      Start trx.
      Do a group change on Xpand.
      While the group is yet to be formed try to insert another row.
      After the group is formed try another insert.
      Expected: Now, the whole trx should be retried.
      Actual: trx replay fails.

      mysql> set session autocommit=false;
      Query OK, 0 rows affected (0.01 sec)
      mysql> begin;
      Query OK, 0 rows affected (0.00 sec)
      mysql> insert into t1 values (1);
      Query OK, 1 row affected (0.01 sec)
      -- here, run /opt/clustrix/bin/clx dbrestart on one of the xpand nodes and before the cluster is formed, execute this:
      mysql> insert into t1 values (2);
      ERROR 1927 (HY000): Lost connection to backend server: network error (@@Xpand:node-1: 111, Connection refused)
      mysql> insert into t1 values (3);
      ERROR 2006 (HY000): MySQL server has gone away
      No connection. Trying to reconnect...
      Connection id:    2
      Current database: db1
      Query OK, 1 row affected (6.35 sec)
      mysql> commit;
      Query OK, 0 rows affected (0.00 sec)
      mysql> select * from t1;
      +------+
      | i    |
      +------+
      |    3 |
      +------+
      1 row in set (0.00 sec)
      mysql>
      

      Config file:

      [maxscale]
      log_info=1
      threads=auto
      logdir=/data/clustrix/log
      [Bootstrap1-karma114]
      type=server
      address=karma114
      port=3306
      protocol=mariadbbackend
      [Xpand]
      type=monitor
      module=xpandmon
      servers=Bootstrap1-karma114
      user=maxscale
      password=maxscale_pw
      #cluster_monitor_interval=10s
      [Splitter-Service]
      type=service
      router=readwritesplit
      user=maxscale
      password=maxscale_pw
      cluster=Xpand
      transaction_replay=true
      slave_selection_criteria=LEAST_GLOBAL_CONNECTIONS
      delayed_retry_timeout=360s
      transaction_replay_attempts=500
      transaction_replay_retry_on_deadlock=true
      [Splitter-Listener]
      type=listener
      service=Splitter-Service
      protocol=MariaDBClient
      address=0.0.0.0
      port=3306
      

      Logs:

      2021-03-29 23:55:44   info   : (1) (Splitter-Service) > Autocommit: [disabled], trx is [open], cmd: (0x03) COM_QUERY, plen: 10, type: QUERY_TYPE_BEGIN_TRX, stmt: begin
      2021-03-29 23:55:44   info   : (1) [readwritesplit] (Splitter-Service) Route query to master: @@Xpand:node-2 <
      2021-03-29 23:55:44   info   : (1) [readwritesplit] (Splitter-Service) Transaction starting on '@@Xpand:node-2'
      2021-03-29 23:55:44   info   : (1) [readwritesplit] (Splitter-Service) Adding to trx: begin
      2021-03-29 23:55:44   info   : (1) [readwritesplit] (Splitter-Service) Reply complete, last reply from @@Xpand:node-2
      2021-03-29 23:55:51   info   : (1) (Splitter-Service) > Autocommit: [disabled], trx is [open], cmd: (0x03) COM_QUERY, plen: 30, type: QUERY_TYPE_WRITE, stmt: insert into t1 values (1)
      2021-03-29 23:55:51   info   : (1) [readwritesplit] (Splitter-Service) Route query to master: @@Xpand:node-2 <
      2021-03-29 23:55:51   info   : (1) [readwritesplit] (Splitter-Service) Adding to trx: insert into t1 values (1)
      2021-03-29 23:55:51   info   : (1) [readwritesplit] (Splitter-Service) Reply complete, last reply from @@Xpand:node-2
      2021-03-29 23:56:18   info   : (1) [readwritesplit] (Splitter-Service) Master '@@Xpand:node-2' failed: #HY000: Lost connection to backend server: connection closed by peer (@@Xpand:node-2)
      2021-03-29 23:56:18   info   : (1) [readwritesplit] (Splitter-Service) Starting transaction replay 1
      2021-03-29 23:56:18   info   : (1) [readwritesplit] (Splitter-Service) Replaying: begin
      2021-03-29 23:56:19   info   : > Autocommit: [disabled], trx is [open], cmd: (0x03) COM_QUERY, plen: 10, type: QUERY_TYPE_BEGIN_TRX, stmt: begin
      2021-03-29 23:56:19   info   : [readwritesplit] Replacing old master '@@Xpand:node-2' with new master '@@Xpand:node-1'
      2021-03-29 23:56:19   info   : Execute sescmd #1 on '@@Xpand:node-1': [COM_INIT_DB]
      2021-03-29 23:56:19   info   : [readwritesplit] Connected to '@@Xpand:node-1'
      2021-03-29 23:56:19   info   : [readwritesplit] Queuing query until '@@Xpand:node-1' completes session command
      2021-03-29 23:56:19   info   : (1) [readwritesplit] (Splitter-Service) Master '@@Xpand:node-1' failed: #HY000: Lost connection to backend server: network error (@@Xpand:node-1: 111, Connection refused)
      2021-03-29 23:56:19   error  : (1) [readwritesplit] (Splitter-Service) Lost connection to the master server, closing session. Lost connection to master server while waiting for a result. Connection has been idle for 0 seconds. Error caused by: #HY000: Lost connection to backend server: network error (@@Xpand:node-1: 111, Connection refused). Last close reason: <none>. Last error:
      2021-03-29 23:56:19   info   : Stopped Splitter-Service client session [1]
      

      [root@karma076 ~]# maxctrl list servers
      ┌─────────────────────┬─────────────┬──────┬─────────────┬─────────────────┬──────┐
      │ Server              │ Address     │ Port │ Connections │ State           │ GTID │
      ├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
      │ @@Xpand:node-3      │ 10.2.12.248 │ 3306 │ 0           │ Master, Running │      │
      ├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
      │ @@Xpand:node-1      │ 10.2.13.175 │ 3306 │ 0           │ Master, Running │      │
      ├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
      │ @@Xpand:node-2      │ 10.2.15.59  │ 3306 │ 1           │ Master, Running │      │
      ├─────────────────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
      │ Bootstrap1-karma114 │ karma114    │ 3306 │ 0           │ Down            │      │
      └─────────────────────┴─────────────┴──────┴─────────────┴─────────────────┴──────┘
      

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              markus makela markus makela
              Reporter:
              rahul.joshi@mariadb.com Rahul Joshi
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: