GTID gets out of sync between Galera cluster nodes: the restarted node executes 2 transactions under the same GTID!



    • 10.6.4
    • 10.6
    • Galera
    • Ubuntu 20.04
      10.6.4-MariaDB-1:10.6.4+maria~focal-log - mariadb.org binary distribution


      I have a 3-node Galera cluster configured as described in: https://mariadb.com/kb/en/using-mariadb-gtids-with-mariadb-galera-cluster/
      It ran fine for 1 month. Then, on the 29th of October, I noticed that 2 nodes had a GTID that was 1 higher than that of the 3rd node. I started to investigate the issue. It seems that MariaDB on the node named node2 restarted itself because the server ran out of RAM. After the restart, the first new transaction was executed and written to the binary log with the same GTID as the last transaction before the restart, so node2 executed 2 transactions (the last before the restart and the first after it) under the same GTID!
      I am attaching combined screenshots showing the difference between the node2 and node1 binary logs. The green lines mark where everything is still fine; the red lines mark what went wrong.
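
For anyone who wants to check a binary log for the same symptom without comparing screenshots, here is a minimal sketch that scans a text dump of the binary log and reports any GTID that starts more than one transaction. It assumes the dump contains event header lines beginning with `#` that include a marker of the form `GTID <domain>-<server_id>-<seqno>`; the function name `find_duplicate_gtids` and the sample lines (including their GTID values) are hypothetical and are not taken from the attached logs.

```python
import re
from collections import Counter

# Assumed header marker: "GTID <domain>-<server_id>-<seqno>" on lines starting with '#'.
GTID_RE = re.compile(r"\bGTID (\d+)-(\d+)-(\d+)\b")


def find_duplicate_gtids(lines):
    """Return the GTIDs that appear on more than one event header line."""
    counts = Counter()
    for line in lines:
        # Skip statement text; only header lines are inspected.
        if not line.startswith("#"):
            continue
        match = GTID_RE.search(line)
        if match:
            counts["-".join(match.groups())] += 1
    return sorted(gtid for gtid, n in counts.items() if n > 1)


# Hypothetical sample: the same GTID is logged before and after a restart.
sample = [
    "#211029  3:10:01 server id 2  end_log_pos 385 \tGTID 100-2-5000 trans",
    "INSERT INTO t VALUES (1)",
    "#211029  3:15:42 server id 2  end_log_pos 385 \tGTID 100-2-5000 trans",
    "#211029  3:15:50 server id 2  end_log_pos 712 \tGTID 100-2-5001 trans",
]
print(find_duplicate_gtids(sample))
```

An empty result would mean no GTID was reused within the scanned lines; it says nothing about logs that were not included in the dump.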

      I am also attaching the config file and the error log files from node1 and node2. I hope this helps to find the cause.
      gtid_domain_id is different on each server: 1 on node1, 2 on node2 and 3 on node3, as recommended in the MariaDB docs linked above.

      This situation also causes a problem for my replica server. The replica (slave) is currently attached to node2. All the nodes (node1, node2 and node3) have binary logging enabled. Before the GTID problem appeared, I was able to switch the replica to any cluster node and it synced fine. Now that the GTIDs differ, this is no longer possible.

      No data loss has been detected; only the GTID numbering is messed up. This also affects the replica server: right now it cannot be attached to any cluster node other than node2.
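
The off-by-one drift between nodes can also be checked programmatically. The sketch below assumes the value of `gtid_binlog_pos` has already been collected from each node as a string of comma-separated `<domain>-<server_id>-<seqno>` entries, and it lists the domains in which the nodes disagree on the sequence number. The helper names `parse_gtid_pos` and `seqno_gaps` and the sample positions are hypothetical, chosen only to mirror the situation described above.

```python
def parse_gtid_pos(pos):
    """Parse '1-1-100,2-2-200' into {domain_id: (server_id, seqno)}."""
    result = {}
    for item in pos.split(","):
        item = item.strip()
        if not item:
            continue
        domain, server_id, seqno = (int(part) for part in item.split("-"))
        result[domain] = (server_id, seqno)
    return result


def seqno_gaps(positions):
    """Map each domain where nodes disagree to {node_name: seqno or None}."""
    parsed = {node: parse_gtid_pos(pos) for node, pos in positions.items()}
    domains = set().union(*(p.keys() for p in parsed.values()))
    gaps = {}
    for domain in sorted(domains):
        # A node that has no entry for the domain is reported as None.
        seqnos = {node: p.get(domain, (None, None))[1] for node, p in parsed.items()}
        if len(set(seqnos.values())) > 1:
            gaps[domain] = seqnos
    return gaps


# Hypothetical positions: node2 is one sequence number behind the others.
positions = {
    "node1": "100-1-5001",
    "node2": "100-2-5000",
    "node3": "100-1-5001",
}
print(seqno_gaps(positions))
```

An empty dictionary would indicate that all nodes report the same sequence number in every domain, which is the state in which the replica could be switched freely between nodes.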


        mariadb.override.cnf
          2 kB
        node1.mariadb.err.log
          10 kB
        node2.mariadb.err.log
          17 kB
        ONE-GTID-2-queries.png
          329 kB
        out-of-sync-all.PNG
          28 kB
        Servers.PNG
          34 kB



