Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14153

Implicitly dropped temporary tables cause GTID drift in Galera cluster

    XMLWordPrintable

    Details

      Description

      I have a Galera cluster running with the usual required configuration as well as the following:

      wsrep_gtid_mode = 1
      log_slave_updates = 1

      If I bootstrap the cluster with (rsync) SST, I can see that the initial GTID state is copied over from the donor, which works as expected. In general, the GTID state will stay synced across all masters as they execute queries, but it is possible to cause drift in normal use cases. For example, if we execute the following:

      echo 'CREATE TEMPORARY TABLE `t5` (`c1` INT, PRIMARY KEY(`c1`)) ENGINE=InnoDB;' | mysql -u root test
      echo "SHOW VARIABLES LIKE 'gtid_binlog_pos';" | mysql -u root

      On the master that executes the statement, you will see the GTID increment once for every execution, this is because a "DROP TEMPORARY TABLE IF EXISTS `t5`" will end up in the binary log when the connection is closed. However, that binary log entry will only exist on that master, no other masters in the Galera cluster will see it, and so the GTID values will drift by one every time this happens.

      One very interesting observation is that if we explicitly drop the temporary table, then nothing ends up in the binlog. Indeed, you can run this all day and the GTID value won't budge:

      echo 'CREATE TEMPORARY TABLE `t5` (`c1` INT, PRIMARY KEY(`c1`)) ENGINE=InnoDB; DROP TEMPORARY TABLE `t5`;' | mysql -u root test
      echo "SHOW VARIABLES LIKE 'gtid_binlog_pos';" | mysql -u root

      I'm not sure if it's correct behaviour for the aforementioned query to magically appear in the binlog on an implicit table drop, but it's definitely not correct behaviour for the query to only exist in the original master's binlog and not have it replicated to any other Galera masters (as this causes GTID drift).

      This GTID drift doesn't break the Galera cluster on it's own, but it makes it impractical to have a traditional replication slave replicating from the Galera cluster, as the slave's replication will break when trying to switch to another master.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                jplindst Jan Lindström
                Reporter:
                neiltembo Neil Skrypuch
              • Votes:
                3 Vote for this issue
                Watchers:
                7 Start watching this issue

                Dates

                • Created:
                  Updated: