If a slave crashes (for an unrelated reason) while processing an XA PREPARE, after the event has fully committed in the binlog and InnoDB but before gtid_slave_pos is updated, attempts to restart the slave SQL thread fail with errors such as an out-of-order GTID attempt (if GTID strict mode is enabled) or XID already exists (otherwise). The following comment in Xid_apply_log_event::do_apply_event() documents this behavior.
/*
...
XA_PREPARE_LOG_EVENT also updates the gtid table *but* the update gets
committed as separate "autocommit" transaction.
*/
I think logic should be added to detect the possibility that a crash happened before the separate transaction completed and, if so, to automatically update the GTID slave state on restart, because gtid_binlog_pos will already have been updated.
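The restart-time check suggested above can be sketched roughly as follows. This is only an illustration, not server code: `Gtid`, `GtidState`, and `slave_pos_repair` are hypothetical names, and the sketch assumes the caller can already tell that the newest binlogged event group in the domain is a replicated XA PREPARE that InnoDB still holds as prepared.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical, simplified GTID (domain-server-sequence); not the server's own type.
struct Gtid {
  uint32_t domain_id;
  uint32_t server_id;
  uint64_t seq_no;
};

// Last GTID per replication domain; stands in for gtid_binlog_pos or gtid_slave_pos.
using GtidState = std::map<uint32_t, Gtid>;

// Returns the GTID that the slave state should be advanced to for `domain`,
// or std::nullopt when no repair is needed.
// ASSUMPTION: the caller has already established (for example by scanning the
// last binlog file) that the newest event group in this domain is a replicated
// XA PREPARE that InnoDB still holds in the prepared state.
std::optional<Gtid> slave_pos_repair(const GtidState &binlog_pos,
                                     const GtidState &slave_pos,
                                     uint32_t domain,
                                     bool last_group_is_prepared_xa) {
  if (!last_group_is_prepared_xa) return std::nullopt;
  auto b = binlog_pos.find(domain);
  if (b == binlog_pos.end()) return std::nullopt;  // nothing binlogged in this domain
  auto s = slave_pos.find(domain);
  // The slave state is already at or past the binlog state: no gap to close.
  if (s != slave_pos.end() && s->second.seq_no >= b->second.seq_no)
    return std::nullopt;
  return b->second;  // the slave state lags behind the binlog: advance it
}
```

The boolean guard matters because gtid_binlog_pos also advances for transactions that originate locally, so a gap between the two states is not by itself evidence of this crash scenario.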
Issue Links
causes
MDEV-34526: Mariadb crashed and replication got broken after MariaDB services came up (Closed)
relates to
MDEV-742: LP:803649 - Xa recovery failed on client disconnection (Closed)
MDEV-31038: Parallel Replication Breaks if XA PREPARE Fails Updating Slave GTID State (Closed)
MDEV-21469: Implement crash-safe logging of the user XA (Stalled)
MDEV-30165: X-lock on supremum for prepared transaction for RR
MDEV-21469 relates to this one. The current one rightfully claims gtid_slave_pos update should be a part of the replicated prepared XA.
Andrei Elkin
I think bugs such as this are a clear indication that the design has not been thought through for the replication of user XA PREPARE.
It's such a central design of GTID that the mysql.gtid_slave_pos table is updated in the same transaction as the transaction it belongs to. The user XA PREPARE needs to respect this part of the design, not break it.
Let's do it differently. We can binlog and send to the slave the XA PREPARE, but don't apply the events on the slave.
Then in the normal case, when XA COMMIT happens on the master, the events are applied on the slave as a normal transaction.
This bug and a lot of other bugs will then simply go away.
And then if the master crashes, implement suitable recovery code for the slave to recover the XA PREPAREd transactions when it is promoted as the master. This code will then be separate and not affect the logic of normal replication.
I think this is a much cleaner design and should have some chance of working, at least.
Kristian Nielsen
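The deferred-apply idea in the comment above can be modelled in a few lines. This is a toy sketch under stated assumptions, not MariaDB code: `DeferredXaApplier` and its members are hypothetical, events are plain strings, and the "atomic" step is only simulated. The thread does not say how the XA PREPARE's own GTID would be recorded, so the sketch leaves that out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model of a slave that receives XA PREPARE event groups but applies
// them only when the matching XA COMMIT arrives. All names are hypothetical.
struct DeferredXaApplier {
  std::map<std::string, std::vector<std::string>> pending;  // xid -> buffered events
  std::vector<std::string> applied;  // stands in for the storage engine
  uint64_t slave_pos_seq_no = 0;     // stands in for mysql.gtid_slave_pos

  // XA PREPARE: keep the events, apply nothing.
  void on_xa_prepare(const std::string &xid, std::vector<std::string> events) {
    pending[xid] = std::move(events);
  }

  // XA COMMIT: apply the buffered events and advance the position together,
  // as a normal transaction would. Returns false if the xid is unknown.
  bool on_xa_commit(const std::string &xid, uint64_t commit_seq_no) {
    auto it = pending.find(xid);
    if (it == pending.end()) return false;
    applied.insert(applied.end(), it->second.begin(), it->second.end());
    slave_pos_seq_no = commit_seq_no;
    pending.erase(it);
    return true;
  }

  // XA ROLLBACK: discard the buffered events.
  void on_xa_rollback(const std::string &xid) { pending.erase(xid); }
};
```

In this model the slave never holds a prepared transaction during normal replication, which is why the separate autocommit update of the GTID table is not needed.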
knielsen, well, bnestere, whose analysis of course was cool, was not aware of MDEV-21777 when reporting. In my comment I should have referred to it (not just to the related MDEV-21469) and closed this one as its duplicate.
The plan has been to process GTID-insert as
> a separate transaction to be two-phase-committed with the replicated one.
That is, XA_prepare_log_event::do_apply_event would execute a 2PC-like sequence of gtid_insert.prepare(xid), XA.prepare(xid), insert.commit(xid). How to recover when InnoDB reports zero, one, or two XIDs is proposed in here (I now believe this can be done better, say by narrowing the `formatID` domain by 1-2 bits that would be used for recovery purposes).
This sane idea
> We can binlog and send to the slave the XA PREPARE, but don't apply the events on the slave.
seemed feasible but was not chosen because of the apparent extra latency (proportional to the XAP size) and, not least, because of recovery itself. The slave can certainly recover it, provided the XA PREPARE is held recoverably. I hope you'd agree that the trouble of implementing what seems to be a transactional write by the slave IO thread (which acks in semisync to the master, which eventually okays the client on the XAP's completion) is no smaller than that of 21777.
Andrei Elkin
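One plausible reading of the recovery cases for the gtid_insert.prepare(xid), XA.prepare(xid), insert.commit(xid) sequence is sketched below. It is not the proposal from the linked issue: the action names, the `decide` function, and the reserved `formatID` bit are all hypothetical, and the sketch assumes the three steps become durable in the order listed.

```cpp
#include <cassert>
#include <cstdint>

// What recovery would do after a crash, under the assumptions above.
enum class RecoveryAction {
  ReapplyEvent,                  // nothing prepared: the event left no trace
  RollbackGtidInsertAndReapply,  // only the GTID insert was prepared
  CommitGtidInsert,              // both prepared: finish the sequence
  NothingToDo                    // GTID insert already committed, XA stays prepared
};

// Inputs: which of the two transactions the engine reports as prepared.
RecoveryAction decide(bool gtid_insert_prepared, bool user_xa_prepared) {
  if (gtid_insert_prepared && user_xa_prepared)
    return RecoveryAction::CommitGtidInsert;
  if (gtid_insert_prepared)
    return RecoveryAction::RollbackGtidInsertAndReapply;
  if (user_xa_prepared)
    return RecoveryAction::NothingToDo;
  return RecoveryAction::ReapplyEvent;
}

// HYPOTHETICAL tagging: reserve one formatID bit so that recovery can tell
// the internal GTID-insert transaction apart from the user XA transaction.
constexpr uint32_t kGtidInsertBit = 1u << 31;
constexpr uint32_t tag_gtid_insert(uint32_t format_id) {
  return format_id | kGtidInsertBit;
}
constexpr bool is_gtid_insert(uint32_t format_id) {
  return (format_id & kGtidInsertBit) != 0;
}
```

The tag is what makes the "one XID" case decidable: a single prepared XID means opposite things depending on which of the two transactions it belongs to.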
It should be trivial to ensure that XA prepare is replicated recoverably, by using the existing binlog crash recovery mechanism.
Require the slave to enable --log-bin and --log-slave-updates. When XA PREPARE is replicated on the slave, it is binlogged together with the mysql.gtid_slave_pos update in the normal way, but the xid_count (i.e. unlog()) is postponed until XA COMMIT is received. This way, the BINLOG CHECKPOINT event will be postponed, and the binlog will be scanned during crash recovery, at which time the XA PREPAREd transaction can be recovered.
Maybe this can even be used to optionally omit the query/row events from the XA COMMIT to reduce binlog size, since these can be read from the binlog at XA COMMIT time.
Kristian Nielsen
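The effect of postponing unlog() can be illustrated with a much simplified model of checkpoint tracking. `CheckpointTracker` is a hypothetical class, not the server's binlog code: it only counts, per binlog file, the transactions that have been logged but not yet unlogged, and reports the oldest file that crash recovery would still have to scan.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Simplified model: recovery must start at the oldest binlog file that
// still has a transaction which was logged but not yet unlogged.
class CheckpointTracker {
  std::map<uint32_t, int> pending_;  // binlog file number -> outstanding xids
  uint32_t current_file_ = 1;

 public:
  // Switch to a new binlog file.
  void rotate() { ++current_file_; }

  // Called when a transaction is written to the binlog; returns the file
  // it was written to, which the caller later passes to unlog().
  uint32_t log_xid() {
    ++pending_[current_file_];
    return current_file_;
  }

  // Called when the transaction no longer needs the binlog for recovery.
  // For a replicated XA PREPARE this would be delayed until XA COMMIT.
  void unlog(uint32_t file) {
    auto it = pending_.find(file);
    if (it != pending_.end() && --it->second == 0) pending_.erase(it);
  }

  // Oldest file that crash recovery still has to scan.
  uint32_t recovery_start_file() const {
    return pending_.empty() ? current_file_ : pending_.begin()->first;
  }
};
```

For example, if an XA PREPARE is logged in file 1, the binlog rotates, and an ordinary transaction in file 2 is logged and unlogged, the recovery start stays at file 1 until the XA PREPARE itself is unlogged.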
knielsen, I agree this would be a viable solution, perhaps a preferable one to cover cases where adding hints to slave execution context (like suggested in MDEV-32020) may not help.
Andrei Elkin
People
Andrei Elkin
Brandon Nesterenko