[MDEV-14153] Implicitly dropped temporary tables cause GTID drift in Galera cluster Created: 2017-10-26 Updated: 2021-12-14 Resolved: 2019-12-12 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data Definition - Temporary, Galera |
| Affects Version/s: | 10.2.9, 10.2.12 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Neil Skrypuch | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Duplicate | Votes: | 5 |
| Labels: | None | ||
| Environment: |
CentOS 7.4 |
||
| Issue Links: |
|
||||||||
| Description |
|
I have a Galera cluster running with the usual required configuration as well as the following: wsrep_gtid_mode = 1 If I bootstrap the cluster with (rsync) SST, I can see that the initial GTID state is copied over from the donor, which works as expected. In general, the GTID state will stay synced across all masters as they execute queries, but it is possible to cause drift in normal use cases. For example, if we execute the following: echo 'CREATE TEMPORARY TABLE `t5` (`c1` INT, PRIMARY KEY(`c1`)) ENGINE=InnoDB;' | mysql -u root test On the master that executes the statement, you will see the GTID increment once for every execution, this is because a "DROP TEMPORARY TABLE IF EXISTS `t5`" will end up in the binary log when the connection is closed. However, that binary log entry will only exist on that master, no other masters in the Galera cluster will see it, and so the GTID values will drift by one every time this happens. One very interesting observation is that if we explicitly drop the temporary table, then nothing ends up in the binlog. Indeed, you can run this all day and the GTID value won't budge: echo 'CREATE TEMPORARY TABLE `t5` (`c1` INT, PRIMARY KEY(`c1`)) ENGINE=InnoDB; DROP TEMPORARY TABLE `t5`;' | mysql -u root test I'm not sure if it's correct behaviour for the aforementioned query to magically appear in the binlog on an implicit table drop, but it's definitely not correct behaviour for the query to only exist in the original master's binlog and not have it replicated to any other Galera masters (as this causes GTID drift). This GTID drift doesn't break the Galera cluster on it's own, but it makes it impractical to have a traditional replication slave replicating from the Galera cluster, as the slave's replication will break when trying to switch to another master. |
| Comments |
| Comment by Neil Skrypuch [ 2018-01-30 ] |
|
I was hoping to see this fixed now that #10715 is resolved with 10.2.12, but after testing I can confirm that this is still an issue in 10.2.12 as originally described. |
| Comment by Arjen Lentz [ 2018-07-06 ] |
|
I can confirm this behaviour occurring in 10.2.12, and have a binlog examples which I'll post separately in a private comment. The GTID is the same across all the replicants, but drifts relative to the committing node. The objective of GTID is to be global, and consistent. In this case it's neither, and faulty. |
| Comment by Arjen Lentz [ 2018-07-06 ] |
|
neiltembo If the GTID were passed through the cluster, it can't drift. I conclude that GTID is not actually passed through the cluster, but updated "magically" somewhere in wsrep space, and that's wrong and very dangerous. |
| Comment by Jan Lindström (Inactive) [ 2019-12-12 ] |
|
WIll be fixed on |