[MDEV-32830] refactor XA binlogging for better integration with BGC/replication/recovery Created: 2023-11-17  Updated: 2024-01-19

Status: In Review
Project: MariaDB Server
Component/s: Replication
Fix Version/s: 11.5

Type: Task Priority: Critical
Reporter: Andrei Elkin Assignee: Sergei Golubchik
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File bug1.txt     Text File bug10.txt     Text File bug2.txt     Text File bug3.txt     Text File bug4.txt     Text File bug5.txt     Text File bug6.txt     Text File bug7.txt     Text File bug8.txt     Text File bug9.txt    
Issue Links:
Blocks
blocks MDEV-31949 slow parallel replication of user xa In Review
blocks MDEV-32896 Unstable XA + binglog tests, with pos... Stalled
is blocked by MDEV-32852 Assertion `is_prepared_xa(thd) || ((t... Closed
is blocked by MDEV-32857 Assertion `thd->lex->sql_command == S... Closed
Problem/Incident
causes MDEV-32852 Assertion `is_prepared_xa(thd) || ((t... Closed
causes MDEV-32857 Assertion `thd->lex->sql_command == S... Closed

 Description   

The task is to improve on MDEV-742 design of the XA binlogging.
When binlog is ON, execute XA-{prepare, commit, rollback} via
MYSQL_BIN_LOG::run_ordered_commit, so that at the end of binlog_{prepare, commit, rollback} the operation over the xa transaction is completed as far as (parallel) replication/recovery is concerned. Specifically its XID is available for next use, incl for
xa-complete by another slave parallel commit/rollback or xa-start of a new same xid transaction.



 Comments   
Comment by Andrei Elkin [ 2023-11-17 ]

Howdy Kristian!

I implemented a big part of your review note coverage which improves upon
XA binlogging, which is mostly the master side.
The commit - bb-10.6-andrei - is 10.6 based to help with eventual cherrypicking.

While you're looking into it I am merging the former patch's slave code on the top of this commit.

Comment by Andrei Elkin [ 2023-11-17 ]

serg, it'd be very helpful it you run through the commit - bb-10.6-andrei - as well. You'll find then - in how binlog_commit is made first to run hton - similarity to MDEV-21117.
I am trying to publish it on the commit list, can't make that yet (my mailer setup got obsolete).

Comment by Andrei Elkin [ 2023-11-17 ]

Roel, howdy!

Could you please run general and xa loads to validate the refactoring in bb-10.6-andrei. This is a base of the 2nd part of MDEV-31949 which is going to be pure slave changes. The base changes the bin-logging behavior for XA, but not only.
I suggest to check specifically non-transactional engines like Aria, MyISAM, Sequence with binlog filters and XA load. Normal Innodb transaction load is not supposed to produce any difference.
It makes sense for 24 hour long tests too.

Comment by Roel Van de Paar [ 2023-11-20 ]

Elkin Hi! I started testing.
Does marko need to review this patch also? It looks like there were some changes in xa.cc that may affect InnoDB XA.

Comment by Andrei Elkin [ 2023-11-20 ]

roel, Marko is welcome to check but, the refactoring commit does not really dwell into Innodb. It changes, when log-bin = ON, though for the XA its local connection commit style from the pair of innobase_commit_ordered(), innobase_commit() to innobase_commit_by_xid() of the "external" (connection) commit branch.

But I should have shortly for Marko a solely Innodb optimization commit that blocks `fsyncdata` for XA PREPARE in the log-bin = ON case.

Comment by Andrei Elkin [ 2023-11-20 ]

knielsen,roel, serg, I had to correct lately found mistakes in new xa branches of MYSQL_BIN_LOG::run_ordered_commit(). The original commit's message is extended and the fixes are made as the 2nd commit: 88b0738b7c6...803e496ad0f HEAD -> bb-10.6-andrei.

Comment by Roel Van de Paar [ 2023-11-21 ]

Created MDEV-32852 Assertion `is_prepared_xa(thd) || ((thd->lex->sql_command == SQLCOM_XA_ROLLBACK && thd->transaction->xid_state.get_state_code() == XA_IDLE) || thd->lex->xa_opt == XA_ONE_PHASE)' failed in TC_LOG::run_commit_ordered

Comment by Roel Van de Paar [ 2023-11-21 ]

From what I have seen so far in bugs, we will definitely want an InnoDB review also.

Comment by Roel Van de Paar [ 2023-11-22 ]

Created MDEV-32857 Assertion `thd->lex->sql_command == SQLCOM_XA_COMMIT || thd->lex->sql_command == SQLCOM_XA_ROLLBACK' failed in TC_LOG::run_commit_ordered

Comment by Andrei Elkin [ 2023-11-22 ]

Howdy Sergei.

I am inviting to check this patch which is a base for MDEV-31949.
Kristian already browsed, Roel has found assert hitting cases. Those are covered with separate commits.
The whole work is in bb-10.6-andrei.
It's based on 10.6 for easier eventual cherry-picking.

This patch is planned to be followed with two more:

  • the slave side optimization that is mostly borrowed from a previously reviewed (by KN) branch, and
  • optimization in Innodb for XA-PREPARE when binlog is ON:
    as of current, Innodb continues to call fsyncdata per each prepared user xa, but this time innobase_xa_prepare() is run via `MYSQL_BIN_LOG::run_commit_ordered()` which can be (very) costly (no benchmarking to confirm that yet though).

I'll invite you to check the latter one too later.

Comment by Roel Van de Paar [ 2023-11-27 ]

Commit 0d728f33b6ae806a48d995738718e2c34ee94f1e: MDEV-32852
Commit 960e246b29d7842ecbf94b3dcf864738b4c9edf5: MDEV-32857
Commit 2cf0b46fd282477f21f1b9910d117d8db940a505: Updates based on Kristian's review

Comment by Roel Van de Paar [ 2023-11-28 ]

Created MDEV-32896 Unstable XA + binglog tests, with possible MDEV-32830 caused issues, which blocks MTR testing of this ticket.

Comment by Roel Van de Paar [ 2023-12-07 ]

Elkin FYI, here are some additional bugs I saw during the run (one UniqueID per line):

1: SIGSEGV|TC_LOG::run_prepare_ordered|MYSQL_BIN_LOG::queue_for_group_commit|MYSQL_BIN_LOG::write_transaction_to_binlog_events|MYSQL_BIN_LOG::write_transaction_to_binlog
2: SIGSEGV|ha_partition::register_query_cache_dependant_tables|Query_cache::register_tables_from_list|Query_cache::register_all_tables|Query_cache::store_query
3: SIGSEGV|prepare_or_error|ha_commit_trans|trans_commit_implicit|mysql_execute_command
4: ! is_set() || m_can_overwrite_status|SIGABRT|Backtrace stopped: Cannot access memory at address|
5: !field->prefix_len || field->fixed_len == field->prefix_len|SIGABRT|btr_node_ptr_max_size|btr_cur_t::search_leaf|row_ins_sec_index_entry_low|row_ins_sec_index_entry  
6: is_started()|SIGABRT|Ha_trx_info::ht|TC_LOG::run_prepare_ordered|MYSQL_BIN_LOG::queue_for_group_commit|MYSQL_BIN_LOG::write_transaction_to_binlog_events
7: !((col->prtype ^ prtype) & ~(16384U|32768U))|SIGABRT|innobase_rename_or_enlarge_column_try|innobase_rename_or_enlarge_columns_try|commit_try_norebuild|ha_innobase::commit_inplace_alter_table 
8: !index->is_spatial()|SIGABRT|btr_estimate_number_of_different_key_vals|dict_stats_update_transient_for_index|dict_stats_update_for_index|alter_stats_norebuild
9: is_prepared_xa(thd) || ((thd->lex->sql_command == SQLCOM_XA_ROLLBACK && thd->transaction->xid_state.get_state_code() == XA_IDLE) || thd->lex->xa_opt == XA_ONE_PHASE)|SIGABRT|
10: thd->lex->sql_command == SQLCOM_XA_COMMIT || thd->lex->sql_command == SQLCOM_XA_ROLLBACK|SIGABRT|TC_LOG::run_commit_ordered|MYSQL_BIN_LOG::trx_group_commit_leader|

If there is interest in any of these I can attempt to create a testcase.
NTS: 649729->630833/213313/816264

Comment by Andrei Elkin [ 2023-12-07 ]

roel, it'd be good to see an original stack. Could you please paste one if still available. I am interested in the query the most at this point. If it won't be seen through the stack then it must be present in the general query log, so also please provide one.

Comment by Roel Van de Paar [ 2023-12-07 ]

Elkin I have numbered the 10 UniqueID's above, and attached the stacks with failing queries and patch revision details as bug1.txt to bug10.txt. Note that 9 and 10 are from a previous iteration of the patch. If any of these are of interest, I can attempt to create a testcase for them.

I also did a bit further checking and #5 and #7 appear in a non-patched version as well.

Comment by Andrei Elkin [ 2023-12-08 ]

> and attached the stacks with failing queries
Great!

Comment by Andrei Elkin [ 2023-12-08 ]

Some important update is reported to validate this MDEV's role. The ver1 of MDEV-31949 is finally proven to provide performance.

Comment by Andrei Elkin [ 2023-12-11 ]

Roel, now that I am about to publish the final branch for review, these 1-10 needs checking on my branch.
But first I need to reproduce them on bb-10.6-MDEV-32830-qa, and that's enough to fix them if anything on
my current branch which is just slightly different.

So I need now methods how to reproduce.
Not for all I believe as I am not sure whether the cases 2,5,7,8 have anything to do with my branch.
Their load need to be tried on the vanilla 10.6 before to bundle with the rest, or better as I am used to do some
Jira search
that gives away the 8th at once in the MDEV-24558 summary.
I did not exercise search for the rest, but of course it needs to be done before the 10.6 would be tried.

To the rest,
the case #3 - an ALTER stack - is rather doubtful that is due to my branch, but there may a core file still exists.
Please provide access to it and its env/machine then, or at least gdb thr all bt from it, after that I might be able to say more, whether it indeed the vanilla 10.6 client too.
The case #4 does not feature any good bug report,
though it claimed there was a core file. Please share with me - as gdb thr all bt to start with - if one is there still.

The 1 and 6 looks to be the same `rollback to savepoint a`, ideally I need an mtr test, bu any other way to reproduce would do for me,
incl rr.
As a general note, along with the stacks all other piece of evidence are necessary to provide. In our case that would be
the var-dir. I beg to attach those next time.
Now consider share that alone in this `rollback to savepoint` when reproducing turns too tedious.
The same applies to the case #9 - at least the vardir is needed.
Please attend to 1+6 and 9 first so I'd move on with their analysis.

The case #10 is most probably a variant of MDEV-32455/MDEV-32347.
I am looking into that route.

Comment by Andrei Elkin [ 2023-12-11 ]

> The case #10 is most probably a variant of MDEV-32455

Confirmed that.
I once more adapted my branch, locally yet, to not produce any crash anymore.

Comment by Andrei Elkin [ 2023-12-19 ]

Howdy Sergei!

This task is ready for review in the form of a commit sum-lines with 'MDEV-32830 I. refactor..' in
bb-10.6-MDEV-31949 branch.
A couple of followup commits fix something that was found at QA-testing (by Roel), I found
appropriate not not merge them into the main, at least for the review being.
Parts II, III I am distributing for review by Kristian and Marko.

Looking forward for you notes.

Generated at Thu Feb 08 10:34:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.