Description
The wsrep XID is checkpointed in the InnoDB rollback segment during transaction commit, and this checkpointing is supposed to happen in strict GTID sequence order.
While troubleshooting MDEV-23851 under highly conflicting multi-master workloads, it was observed that the Xid checkpointing order can be violated in two scenarios:
- if MariaDB is configured with binlogging enabled but with log_slave_updates = OFF, Xid checkpoint ordering violations happen fairly frequently
- write sets that failed certification can perform Xid checkpointing too early on receiving nodes
These Xid checkpointing failures do not cause the issue in MDEV-23851, but they make troubleshooting MDEV-23851 harder by hiding the underlying issue.
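To make the ordering requirement concrete, here is a minimal sketch of how an out-of-order checkpoint could be detected, given the sequence numbers in the order they were checkpointed. It is an illustration only and is not taken from the server code: the function name `find_order_violations` and the plain list of integers are hypothetical, and "strict order" is assumed here to mean that each checkpointed sequence number is greater than every one before it.

```python
from typing import List, Optional, Tuple


def find_order_violations(checkpointed_seqnos: List[int]) -> List[Tuple[int, int, int]]:
    """Return (position, seqno, highest_seen) for every out-of-order checkpoint.

    Hypothetical helper: assumes each checkpoint must carry a sequence
    number strictly greater than all sequence numbers checkpointed before it.
    """
    violations: List[Tuple[int, int, int]] = []
    highest: Optional[int] = None
    for position, seqno in enumerate(checkpointed_seqnos):
        if highest is not None and seqno <= highest:
            # This checkpoint arrived after a higher seqno was already recorded.
            violations.append((position, seqno, highest))
        else:
            highest = seqno
    return violations


if __name__ == "__main__":
    # seqno 3 is checkpointed before seqno 2, so seqno 2 is reported.
    print(find_order_violations([1, 3, 2, 4]))
```

In this model, a write set that checkpoints too early shows up as a higher sequence number appearing before a lower one, and the later, lower one is what gets flagged.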
If I understand correctly, the net effect of this change is that the log_slave_updates option is now always enabled. On a MariaDB 10.3.28 3-node Galera cluster with binary logging enabled, I now see all writes to all nodes appearing in the binary logs on all nodes, which is a change in behaviour. Unfortunately, as the change came from a merge from the 10.2 branch, it has taken me a while to track down. Since upgrading from 10.3.9 to 10.3.28 I have seen an impact on disk I/O and disk space on my cluster nodes. I am concerned about the extra demands on Galera cluster nodes. Is there any way to restore the previous behaviour?