  1. MariaDB Server
  2. MDEV-26945

GTID gets out of sync between Galera cluster nodes by executing 2 transactions under the same GTID on the restarted node!

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 10.6.4
    • 10.6
    • Galera
    • Ubuntu 20.04
      10.6.4-MariaDB-1:10.6.4+maria~focal-log - mariadb.org binary distribution

    Description

      I have a 3-node Galera cluster configured as described in https://mariadb.com/kb/en/using-mariadb-gtids-with-mariadb-galera-cluster/
      It ran fine for 1 month. Then, on the 29th of October, I noticed that 2 nodes had a GTID one higher (+1) than the 3rd node, and I started to investigate. It seems that MariaDB on the node named node2 restarted on its own because the server ran out of RAM. After the restart, the first new transaction was executed and written to the binary log with the same GTID as the last transaction before the restart - so node2 executed 2 transactions (the last before the restart and the first after it) under the same GTID!
      I am attaching combined screenshots that show the difference between the node2 and node1 binary logs - the green lines mark where everything is still fine, and the red ones mark what went wrong.
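
      The duplicate described above could also be looked for in text form rather than in screenshots. The following is a minimal sketch, assuming the binary log has been dumped to text with `mysqlbinlog` and that each event-group header is a `#` comment line containing `GTID <domain>-<server>-<seq>`. The function name `find_duplicate_gtids` and all sample values are hypothetical, not taken from the attached logs.

```python
import re
from collections import defaultdict

# Assumed header shape in MariaDB `mysqlbinlog` text output, for example:
#   "#211029 10:15:01 server id 2  end_log_pos 400 CRC32 0x...  GTID 9-2-100 trans"
GTID_RE = re.compile(r"\bGTID (\d+)-(\d+)-(\d+)\b")


def find_duplicate_gtids(lines):
    """Return {gtid: [line numbers]} for every GTID that appears on
    more than one event-header line of the dumped binary log."""
    seen = defaultdict(list)
    for lineno, line in enumerate(lines, start=1):
        if not line.startswith("#"):
            continue  # skip statement text; only header comments are inspected
        match = GTID_RE.search(line)
        if match:
            seen["-".join(match.groups())].append(lineno)
    return {gtid: nums for gtid, nums in seen.items() if len(nums) > 1}


if __name__ == "__main__":
    # Hypothetical excerpt: the same GTID before and after a restart.
    sample = [
        "#211029 10:15:01 server id 2  end_log_pos 400 CRC32 0x00000000  GTID 9-2-100 trans",
        "BEGIN",
        "#211029 10:20:00 server id 2  end_log_pos 800 CRC32 0x00000000  GTID 9-2-100 trans",
        "#211029 10:20:05 server id 2  end_log_pos 1200 CRC32 0x00000000  GTID 9-2-101 trans",
    ]
    print(find_duplicate_gtids(sample))
```

      Running it over the dumped logs of node2 and node1 separately would show whether the repeated GTID exists only on the restarted node.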

      I am also attaching the config file and the error logs from node1 and node2. I hope this helps to find the cause.
      gtid_domain_id is different on each server: 1 on node1, 2 on node2 and 3 on node3, as recommended in the MariaDB docs linked above.

      This situation also causes a problem for my replica server. The replica (slave) is currently attached to node2. All the nodes - node1, node2 and node3 - have binary logs enabled. Before the GTID problem appeared, I could switch the replica to any cluster node and it kept syncing fine. Now that the GTIDs differ, this is no longer possible.
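
      Whether the replica can still be moved between nodes depends on the nodes agreeing on their GTID positions. As a rough sketch, the value of `@@gtid_binlog_pos` collected from each node (for example with `SELECT @@gtid_binlog_pos;`) can be compared per domain. The helper names and the sample positions below are hypothetical, and the comparison is only meaningful when no writes are in flight while the values are read.

```python
def parse_gtid_pos(pos):
    """Parse a MariaDB GTID position such as '1-1-100,9-2-57'
    into {domain_id: (server_id, seq_no)}."""
    result = {}
    for part in pos.split(","):
        part = part.strip()
        if not part:
            continue
        domain, server, seq = (int(x) for x in part.split("-"))
        result[domain] = (server, seq)
    return result


def diverging_domains(positions):
    """positions maps node name -> GTID position string.
    Return {domain_id: {node: seq_no}} for domains whose sequence
    number is not identical on all nodes (None if a node lacks the domain)."""
    parsed = {node: parse_gtid_pos(p) for node, p in positions.items()}
    domains = set().union(*(p.keys() for p in parsed.values()))
    out = {}
    for domain in sorted(domains):
        seqs = {node: p.get(domain, (None, None))[1] for node, p in parsed.items()}
        if len(set(seqs.values())) > 1:
            out[domain] = seqs
    return out


if __name__ == "__main__":
    # Hypothetical values: two nodes are one ahead of the restarted node.
    print(diverging_domains({
        "node1": "9-1-101",
        "node2": "9-2-100",
        "node3": "9-1-101",
    }))
```

      An empty result would mean the nodes agree in every domain; any entry shows which domain is off and by how much on each node.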

      No data loss has been detected; only the GTID numbering is messed up, which also affects the replica server - right now it cannot be attached to any cluster node other than node2.

      Attachments

        1. mariadb.override.cnf (2 kB)
        2. ONE-GTID-2-queries.png (329 kB)
        3. Servers.PNG (34 kB)
        4. node1.mariadb.err.log (10 kB)
        5. node2.mariadb.err.log (17 kB)
        6. out-of-sync-all.PNG (28 kB)

          People

            seppo Seppo Jaakola
            normunds.puzo@gmail.com Normunds Puzo