MDEV-35271 (MariaDB Server)

XA behavior changed, assertion fails in Ha_trx_info::is_trx_read_write

Details

    Description

      In the bb-11.6-MDEV-32887-vector branch, the behavior of XA PREPARE has changed for transactions that involve engines without XA support. The new behavior causes replication failures or assertion failures. Vector search is also affected, but it is not needed to demonstrate the problem.

      --source include/have_innodb.inc
      --source include/have_binlog_format_mixed.inc
      --source include/master-slave.inc
       
      CREATE TABLE t1 (a INT) ENGINE=Aria;
      CREATE TABLE t2 (a INT) ENGINE=InnoDB;
      XA BEGIN 'x';
      INSERT INTO t2 VALUES (1);
      INSERT INTO t1 VALUES (1);
      XA END 'x';
      XA PREPARE 'x';
       
      --sync_slave_with_master
       
      # Cleanup
      --connection master
      XA ROLLBACK 'x';
      DROP TABLE t1, t2;
      --source include/rpl_end.inc
      

      On the baseline and earlier versions, the test case above passes, with XA PREPARE issuing a warning:

      11.6 a5b80531fbbd85050dfaf578db5c18f33ab3f066

      XA PREPARE 'x';
      Warnings:
      Warning	1030	Got error 131 "Command not supported by the engine" from storage engine Aria
      

      and the binlog looks like this:

      master-bin.000001	679	Query	1	0	use `test`; INSERT INTO t1 VALUES (1)
      master-bin.000001	771	Query	1	844	COMMIT
      master-bin.000001	844	Gtid	1	893	XA START X'78',X'',1 GTID 0-1-4
      master-bin.000001	893	Query	1	0	use `test`; INSERT INTO t2 VALUES (1)
      master-bin.000001	985	Query	1	0	XA END X'78',X'',1
      master-bin.000001	1070	XA_prepare	1	1107	XA PREPARE X'78',X'',1
      

      On the vector branch, XA PREPARE produces an error:

      4d4ac4ed2866a7e154223f8ddab79b51bb2b7666

      mysqltest: At line 11: query 'XA PREPARE 'x'' failed: ER_GET_ERRNO (1030): Got error 131 "Command not supported by the engine" from storage engine Aria
      

      If we allow the test to continue, on a non-debug build it ends with a replication failure:

      Last_SQL_Error	Error 'XAER_RMFAIL: The command cannot be executed when global transaction is in the  ACTIVE state' on query. Default database: 'test'. Query: 'ROLLBACK'
      

      with the binlog looking this way:

      master-bin.000001	637	Gtid	1	679	BEGIN GTID 0-1-3
      master-bin.000001	679	Query	1	0	use `test`; INSERT INTO t1 VALUES (1)
      master-bin.000001	771	Query	1	844	COMMIT
      master-bin.000001	844	Gtid	1	893	XA START X'78',X'',1 GTID 0-1-4
      master-bin.000001	893	Query	1	0	use `test`; INSERT INTO t2 VALUES (1)
      master-bin.000001	985	Query	1	1060	ROLLBACK
      

      A debug build fails on an assertion upon XA PREPARE:

      4d4ac4ed2866a7e154223f8ddab79b51bb2b7666

      mariadbd: /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/handler.h:2009: bool Ha_trx_info::is_trx_read_write() const: Assertion `is_started()' failed.
      241027 22:32:29 [ERROR] mysqld got signal 6 ;
       
      #9  0x00007fefeb053e32 in __GI___assert_fail (assertion=0x560ef5e75c98 "is_started()", file=0x560ef5e75ca8 "/data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/handler.h", line=2009, function=0x560ef5e75cf0 "bool Ha_trx_info::is_trx_read_write() const") at ./assert/assert.c:101
      #10 0x0000560ef4f2ca69 in Ha_trx_info::is_trx_read_write (this=0x7fefa4003ab8) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/handler.h:2009
      #11 0x0000560ef534e387 in ha_count_rw_2pc (thd=0x7fefa4000dc8, all=true) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/handler.cc:1628
      #12 0x0000560ef54fd372 in Gtid_log_event::Gtid_log_event (this=0x7fefdca2bc10, thd_arg=0x7fefa4000dc8, seq_no_arg=4, domain_id_arg=0, standalone=false, flags_arg=8, is_transactional=true, commit_id_arg=0, has_xid=false, ro_1pc=false) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/log_event_server.cc:2904
      #13 0x0000560ef54ca5a1 in MYSQL_BIN_LOG::write_gtid_event (this=0x560ef6d13060 <mysql_bin_log>, thd=0x7fefa4000dc8, standalone=false, is_transactional=true, commit_id=0, has_xid=false, is_ro_1pc=false) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/log.cc:6916
      #14 0x0000560ef54d19a9 in MYSQL_BIN_LOG::write_transaction_or_stmt (this=0x560ef6d13060 <mysql_bin_log>, entry=0x7fefdca2c280, commit_id=0) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/log.cc:9219
      #15 0x0000560ef54d05f9 in MYSQL_BIN_LOG::trx_group_commit_leader (this=0x560ef6d13060 <mysql_bin_log>, leader=0x7fefdca2c280) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/log.cc:8941
      #16 0x0000560ef54cfc27 in MYSQL_BIN_LOG::write_transaction_to_binlog_events (this=0x560ef6d13060 <mysql_bin_log>, entry=0x7fefdca2c280) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/log.cc:8733
      #17 0x0000560ef54cec3b in MYSQL_BIN_LOG::write_transaction_to_binlog (this=0x560ef6d13060 <mysql_bin_log>, thd=0x7fefa4000dc8, cache_mngr=0x7fefa401f848, end_ev=0x7fefdca2c450, all=true, using_stmt_cache=false, using_trx_cache=true, is_ro_1pc=false) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/log.cc:8330
      #18 0x0000560ef54ba253 in binlog_flush_cache (thd=0x7fefa4000dc8, cache_mngr=0x7fefa401f848, end_ev=0x7fefdca2c450, all=true, using_stmt=false, using_trx=true, is_ro_1pc=false) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/log.cc:1748
      #19 0x0000560ef54bab97 in binlog_rollback_flush_trx_cache (thd=0x7fefa4000dc8, all=true, cache_mngr=0x7fefa401f848) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/log.cc:1878
      #20 0x0000560ef54bc401 in binlog_rollback (thd=0x7fefa4000dc8, all=true) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/log.cc:2316
      #21 0x0000560ef5350912 in ha_rollback_trans (thd=0x7fefa4000dc8, all=true) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/handler.cc:2345
      #22 0x0000560ef534e196 in ha_prepare (thd=0x7fefa4000dc8) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/handler.cc:1559
      #23 0x0000560ef527d28d in trans_xa_prepare (thd=0x7fefa4000dc8) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/xa.cc:551
      #24 0x0000560ef4f1fda8 in mysql_execute_command (thd=0x7fefa4000dc8, is_called_from_prepared_stmt=false) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/sql_parse.cc:5707
      #25 0x0000560ef4f2685a in mysql_parse (thd=0x7fefa4000dc8, rawbuf=0x7fefa4016ac0 "XA PREPARE 'x'", length=14, parser_state=0x7fefdca2d2f0) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/sql_parse.cc:7873
      #26 0x0000560ef4f12cb6 in dispatch_command (command=COM_QUERY, thd=0x7fefa4000dc8, packet=0x7fefa400bf59 "XA PREPARE 'x'", packet_length=14, blocking=true) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/sql_parse.cc:1892
      #27 0x0000560ef4f1160f in do_command (thd=0x7fefa4000dc8, blocking=true) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/sql_parse.cc:1405
      #28 0x0000560ef511a3fd in do_handle_one_connection (connect=0x560ef8ef7328, put_in_cache=true) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/sql_connect.cc:1448
      #29 0x0000560ef511a172 in handle_one_connection (arg=0x560ef8ef7328) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/sql/sql_connect.cc:1350
      #30 0x0000560ef5690eec in pfs_spawn_thread (arg=0x560ef8ef7408) at /data/bld/preview-11.7-bb-11.6-MDEV-32887-vector-debug/storage/perfschema/pfs.cc:2198
      #31 0x00007fefeb0a8044 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
      #32 0x00007fefeb12861c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
      

          Activity

            serg Sergei Golubchik added a comment

            The error was intentional; it was introduced by this commit:

            commit 5ff9a6e1450
            Author: Sergei Golubchik <serg@mariadb.org>
            Date:   Wed Oct 2 20:33:42 2024 +0200
             
                MDEV-35061 XA PREPARE "not supported by the engine" from storage engine mhnsw, memory leak
                
                on XA PREPARE the server cannot simply skip XA-incapable transactional
                engines, because:
                * on disconnect they're neither committed or rolled back (which is a bug)
                * if the server will automatically rollback them on disconnect,
                  the data will get out of sync
                
                the only option to avoid inconsistent data is to disallow
                explicit XA PREPARE if the transaction includes XA-incapable engines.

            elenst Elena Stepanova added a comment

            Maybe it should then just write a generated "XA ROLLBACK" instead of "ROLLBACK" in the binary log?
            At least for the provided test case, it should do the trick. The update for the non-transactional t1 is applied on the master and is written outside XA in the binary log, so it will be applied on the slave. The update for the transactional t2 is rolled back implicitly on the master, so with "XA ROLLBACK" in the binary log it should be rolled back on the slave too. By the end of the exercise, both servers will have a record in t1, no record in t2, and no existing XA transaction 'x'.
            Of course, I don't know whether it could break something else.
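
The reasoning in this comment can be walked through with a toy model. The Python sketch below is an illustration only: `ToyReplica`, `replay`, and the event tuples are hypothetical names, not MariaDB code. It assumes that a replica accepts a generated XA ROLLBACK for an open XA transaction (a real server may additionally need an XA END first), and it mirrors the XAER_RMFAIL failure from the report when a plain ROLLBACK arrives inside an XA transaction. The event list follows the binlog shown for the vector branch.

```python
class ToyReplica:
    """Toy stand-in for a replica applying binlog events; not MariaDB code."""

    def __init__(self):
        self.tables = {"t1": [], "t2": []}
        self.open_trx = None   # None, "local", or "xa"
        self.pending = []      # (table, value) rows buffered by the open transaction
        self.error = None

    def apply(self, event):
        kind = event[0]
        if kind == "BEGIN":
            self.open_trx = "local"
        elif kind == "XA START":
            self.open_trx = "xa"
        elif kind == "INSERT":
            if self.open_trx is None:
                self.tables[event[1]].append(event[2])
            else:
                self.pending.append((event[1], event[2]))
        elif kind == "COMMIT":
            for table, value in self.pending:
                self.tables[table].append(value)
            self.pending, self.open_trx = [], None
        elif kind == "ROLLBACK":
            if self.open_trx == "xa":
                # Mirrors the Last_SQL_Error quoted in the report.
                self.error = "XAER_RMFAIL"
                return False
            self.pending, self.open_trx = [], None
        elif kind == "XA ROLLBACK":
            # Assumption: the generated XA ROLLBACK is accepted as-is.
            self.pending, self.open_trx = [], None
        return True


def replay(events):
    """Apply events in order and stop at the first failure."""
    replica = ToyReplica()
    for event in events:
        if not replica.apply(event):
            break
    return replica


# Events shared by both scenarios, taken from the vector-branch binlog.
prefix = [
    ("BEGIN",), ("INSERT", "t1", 1), ("COMMIT",),
    ("XA START",), ("INSERT", "t2", 1),
]
observed = replay(prefix + [("ROLLBACK",)])      # what the branch writes
proposed = replay(prefix + [("XA ROLLBACK",)])   # the suggestion above
```

In this model, `observed` stops with an error while XA transaction 'x' is still open, whereas `proposed` ends with one row in t1, no rows in t2, and no open XA transaction, which matches the outcome described in the comment. Whether the real server behaves this way is exactly the open question raised there.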
            serg Sergei Golubchik added a comment (edited)

            I'll redo MDEV-35061 to avoid changing existing behavior.

            Aria will work as before. XA over vector indexes still isn't supported and breaks replication:

            --source include/have_innodb.inc
            --source include/have_binlog_format_mixed.inc
            --source include/master-slave.inc
             
            create table t2 (a int, b vector(1) not null, vector index (b)) engine=innodb;
            xa begin 'x';
            insert into t2 values (1, 0x31313131);
            xa end 'x';
            --error ER_GET_ERRNO
            xa prepare 'x';
             
            --sync_slave_with_master
            --connection master
            xa rollback 'x';
            drop table t2;
            --source include/rpl_end.inc
            

            causes

            SHOW SLAVE STATUS;
            ...
            Last_IO_Errno   0
            Last_IO_Error
            Last_SQL_Errno  1030
            Last_SQL_Error  Got error 138 "Unsupported extension used for table" from storage engine mhnsw
            


            People

              serg Sergei Golubchik
              elenst Elena Stepanova