Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29819

Shutdown unexpectedly executes XA ROLLBACK

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.3(EOL), 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11
    • 10.5, 10.6, 10.11
    • Replication, Server

    Description

      When an user prepared XA does not update data it expectedly does not survive
      the server restart.
      However in case of crash it gets recovered. Here is a test:

      --source include/have_binlog_format_mixed.inc
      --source include/have_innodb.inc
       
      # MDEV-TODO unneccessary recover of read-only transaction
      CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;
      INSERT INTO t1 VALUES (1);
       
      call mtr.add_suppression("Found 1 prepared XA transactions");
       
      XA START '1';
        --error ER_DUP_ENTRY
        INSERT INTO t1 VALUES (2),(1);
        --echo Despite '1' has done no work ...
        SELECT * FROM t1 WHERE a = 2;
      XA END '1';
      XA PREPARE '1';
      XA RECOVER;
       
      --echo server crash
      --let $shutdown_timeout=0
      --source include/restart_mysqld.inc
       
      --connection default
      --enable_reconnect
      --echo ... it is recovered:
      XA RECOVER;
       
      # Cleanup
      XA ROLLBACK '1';
      DROP TABLE t1;
      

      The recoverable behavior is more than unnecessary as it's harmful for replication because of
      inconsistent prepared status. There are reasons to claim that the "read-only" prepared user XA should not be recovered.

      Attachments

        Issue Links

          Activity

            For an empty XA transaction, innobase_xa_prepare() will not be called. That is fine, but the following does not look correct to me:

            --source include/have_binlog_format_mixed.inc
            XA START '1';
            XA END '1';
            XA PREPARE '1';
            XA RECOVER;
            --source include/restart_mysqld.inc
            XA RECOVER;
            --error ER_XAER_NOTA
            XA COMMIT '2';
            XA COMMIT '1';
            

            I got the following result:

            10.6 db330c87d6d253a1e2e93a747327040d9205c5fe

            mysqltest: At line 10: query 'XA COMMIT '1'' failed: ER_XAER_NOTA (1397): XAER_NOTA: Unknown XID
            

            The XA RECOVER listed the transaction before the server was shut down and restarted but not after it.

            marko Marko Mäkelä added a comment - For an empty XA transaction, innobase_xa_prepare() will not be called. That is fine, but the following does not look correct to me: --source include/have_binlog_format_mixed.inc XA START '1' ; XA END '1' ; XA PREPARE '1' ; XA RECOVER; --source include/restart_mysqld.inc XA RECOVER; --error ER_XAER_NOTA XA COMMIT '2' ; XA COMMIT '1' ; I got the following result: 10.6 db330c87d6d253a1e2e93a747327040d9205c5fe mysqltest: At line 10: query 'XA COMMIT '1'' failed: ER_XAER_NOTA (1397): XAER_NOTA: Unknown XID The XA RECOVER listed the transaction before the server was shut down and restarted but not after it.

            I created a variation of my test that kills the server to work around the bug that causes the unexpected rollback:

            --source include/have_innodb.inc
            CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;
            INSERT INTO t1 VALUES (1);
            connect con1,localhost,root;
            XA START '1';
            INSERT INTO t1 VALUES(3);
            --error ER_DUP_ENTRY
            INSERT INTO t1 VALUES(2),(1);
            XA END '1';
            XA PREPARE '1';
            XA RECOVER;
            connection default;
            --let $shutdown_timeout=0
            --source include/restart_mysqld.inc
            disconnect con1;
            SET innodb_lock_wait_timeout=1;
            --error ER_LOCK_WAIT_TIMEOUT
            INSERT INTO t1 VALUES(3);
            --error ER_LOCK_WAIT_TIMEOUT
            DROP TABLE t1;
            XA RECOVER;
            XA ROLLBACK '1';
            INSERT INTO t1 VALUES(3);
            DROP TABLE t1;
            

            2022-10-20 17:04:06 0 [Warning] Found 1 prepared XA transactions
            

            If I remove the insert of the row (3), then the XA transaction will be persisted with no undo log records and no InnoDB locks recovered:

            --source include/have_innodb.inc
            CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;
            INSERT INTO t1 VALUES (1);
            connect con1,localhost,root;
            XA START '1';
            --error ER_DUP_ENTRY
            INSERT INTO t1 VALUES(2),(1);
            XA END '1';
            XA PREPARE '1';
            XA RECOVER;
            connection default;
            --let $shutdown_timeout=0
            --source include/restart_mysqld.inc
            disconnect con1;
            DROP TABLE t1;
            XA RECOVER;
            XA ROLLBACK '1';
            

            2022-10-20 17:06:39 0 [Note] InnoDB: Starting recovery for XA transactions...
            2022-10-20 17:06:39 0 [Note] InnoDB: Transaction 24 in prepared state after recovery
            2022-10-20 17:06:39 0 [Note] InnoDB: Transaction contains changes to 0 rows
            2022-10-20 17:06:39 0 [Note] InnoDB: 1 transactions in prepared state after recovery
            2022-10-20 17:06:39 0 [Note] Found 1 prepared transaction(s) in InnoDB
            2022-10-20 17:06:39 0 [Warning] Found 1 prepared XA transactions
            

            These tests demonstrate that the XA PREPARE is durably written inside InnoDB (it will trigger a synchronous redo log write) and that InnoDB will preserve transactions in XA PREPARE state until it is requested to commit or roll back them.

            marko Marko Mäkelä added a comment - I created a variation of my test that kills the server to work around the bug that causes the unexpected rollback: --source include/have_innodb.inc CREATE TABLE t1 (a INT PRIMARY KEY ) ENGINE=InnoDB; INSERT INTO t1 VALUES (1); connect con1,localhost,root; XA START '1' ; INSERT INTO t1 VALUES (3); --error ER_DUP_ENTRY INSERT INTO t1 VALUES (2),(1); XA END '1' ; XA PREPARE '1' ; XA RECOVER; connection default ; --let $shutdown_timeout=0 --source include/restart_mysqld.inc disconnect con1; SET innodb_lock_wait_timeout=1; --error ER_LOCK_WAIT_TIMEOUT INSERT INTO t1 VALUES (3); --error ER_LOCK_WAIT_TIMEOUT DROP TABLE t1; XA RECOVER; XA ROLLBACK '1' ; INSERT INTO t1 VALUES (3); DROP TABLE t1; 2022-10-20 17:04:06 0 [Warning] Found 1 prepared XA transactions If I remove the insert of the row (3), then the XA transaction will be persisted with no undo log records and no InnoDB locks recovered: --source include/have_innodb.inc CREATE TABLE t1 (a INT PRIMARY KEY ) ENGINE=InnoDB; INSERT INTO t1 VALUES (1); connect con1,localhost,root; XA START '1' ; --error ER_DUP_ENTRY INSERT INTO t1 VALUES (2),(1); XA END '1' ; XA PREPARE '1' ; XA RECOVER; connection default ; --let $shutdown_timeout=0 --source include/restart_mysqld.inc disconnect con1; DROP TABLE t1; XA RECOVER; XA ROLLBACK '1' ; 2022-10-20 17:06:39 0 [Note] InnoDB: Starting recovery for XA transactions... 2022-10-20 17:06:39 0 [Note] InnoDB: Transaction 24 in prepared state after recovery 2022-10-20 17:06:39 0 [Note] InnoDB: Transaction contains changes to 0 rows 2022-10-20 17:06:39 0 [Note] InnoDB: 1 transactions in prepared state after recovery 2022-10-20 17:06:39 0 [Note] Found 1 prepared transaction(s) in InnoDB 2022-10-20 17:06:39 0 [Warning] Found 1 prepared XA transactions These tests demonstrate that the XA PREPARE is durably written inside InnoDB (it will trigger a synchronous redo log write) and that InnoDB will preserve transactions in XA PREPARE state until it is requested to commit or roll back them.
            Elkin Andrei Elkin added a comment - - edited

            marko, there's no question to the INSERT(3) case, it just demonstrates the read-write case is crash-recoverable.
            The 2nd one is borrowed from the description with all my conclusion on incompatibilities. A glaring one to my taste is
            The following case of the explicit rollback

                 xa start '1';
                  savepoint sp;
                  ---error 0
                  insert into t1 values (2);
                 rollback to sp;
                  xa end '1';
                  xa prepare '1'; 
            

            which does not recover after the crash. does not recover. Ideally they should behave equally I think.
            But the major issue in this ticket has been identified by us in a slack conversation.
            It is that the server should optimize the read-only branches of XA to not invoke their prepare(). After all that's what we effectively always do for pure empty XA (to defacto implement the XA specs' read-only optimization).

            To the pure empty XA case with no Engine involved its prepared status is, as you speak 'optimized' and that's actually just to follow XA specs, specifically p.44:

            xa_prepare() may return XA_RDONLY

            The optimization decision was necessary for MDEV-742.

            Elkin Andrei Elkin added a comment - - edited marko , there's no question to the INSERT(3) case, it just demonstrates the read-write case is crash-recoverable. The 2nd one is borrowed from the description with all my conclusion on incompatibilities. A glaring one to my taste is The following case of the explicit rollback xa start '1' ; savepoint sp; ---error 0 insert into t1 values (2); rollback to sp; xa end '1' ; xa prepare '1' ; which does not recover after the crash. does not recover. Ideally they should behave equally I think. But the major issue in this ticket has been identified by us in a slack conversation. It is that the server should optimize the read-only branches of XA to not invoke their prepare() . After all that's what we effectively always do for pure empty XA (to defacto implement the XA specs' read-only optimization). To the pure empty XA case with no Engine involved its prepared status is, as you speak 'optimized' and that's actually just to follow XA specs , specifically p.44: xa_prepare() may return XA_RDONLY The optimization decision was necessary for MDEV-742 .

            Elkin, for the following test, innobase_xa_prepare() is not being invoked at all:

            --source include/have_innodb.inc
             
            create table t1 (a int primary key) engine=innodb;
            xa start '1';
            savepoint sp;
            insert into t1 values (2);
            rollback to sp;
            xa end '1';
            xa prepare '1';
            xa recover;
             
            --source include/restart_mysqld.inc
            xa recover;
            #--error ER_XAER_NOTA
            xa commit '1';
             
            drop table t1;
            

            Instead, a full InnoDB rollback is wrongly being invoked:

            10.6 79dc3989fdc3bdcfc4eb5e17ac87fa501039ebba

            #0  trx_t::commit (this=0x7f569f933b80) at /mariadb/10.6/storage/innobase/trx/trx0trx.cc:1482
            #1  0x0000555ed506004e in trx_t::rollback_finish (this=this@entry=0x7f569f933b80) at /mariadb/10.6/storage/innobase/trx/trx0roll.cc:66
            #2  0x0000555ed505fa5c in trx_t::rollback_low (this=this@entry=0x7f569f933b80, savept=savept@entry=0x0) at /mariadb/10.6/storage/innobase/trx/trx0roll.cc:133
            #3  0x0000555ed505d453 in trx_rollback_for_mysql (trx=trx@entry=0x7f569f933b80) at /mariadb/10.6/storage/innobase/include/dyn0buf.h:192
            #4  0x0000555ed4d3cb99 in innobase_rollback (hton=<optimized out>, thd=0x7f56740020d8, rollback_trx=true) at /mariadb/10.6/storage/innobase/handler/ha_innodb.cc:4697
            #5  0x0000555ed476b0e6 in ha_rollback_to_savepoint (thd=thd@entry=0x7f56740020d8, sv=sv@entry=0x7f5674033f40) at /mariadb/10.6/sql/handler.cc:2877
            #6  0x0000555ed4b87236 in trans_rollback_to_savepoint (thd=thd@entry=0x7f56740020d8, name={str = 0x7f567402df78 "sp", length = <synthetic pointer>}) at /mariadb/10.6/sql/transaction.cc:670
            #7  0x0000555ed4a26062 in mysql_execute_command (thd=thd@entry=0x7f56740020d8, is_called_from_prepared_stmt=false) at /mariadb/10.6/sql/sql_parse.cc:5694
            #8  0x0000555ed4a1fee0 in mysql_parse (thd=thd@entry=0x7f56740020d8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7f568d7ac3e0) at /mariadb/10.6/sql/sql_parse.cc:8016
            #9  0x0000555ed4a1e4c7 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f56740020d8, packet=packet@entry=0x7f5674024539 "rollback to sp", packet_length=packet_length@entry=14, 
                blocking=true) at /mariadb/10.6/sql/sql_parse.cc:1896
            

            The SQL layer is wrongly invoking a handlerton::rollback instead of handlerton::savepoint_rollback or handlerton::savepoint_rollback_can_release_mdl.

            Please try to post complete test cases and try present proof (such as call stack traces) for claims like "automatic statement rollback in Innodb". InnoDB typically does what the SQL layer tells it to do.

            marko Marko Mäkelä added a comment - Elkin , for the following test, innobase_xa_prepare() is not being invoked at all: --source include/have_innodb.inc   create table t1 (a int primary key ) engine=innodb; xa start '1' ; savepoint sp; insert into t1 values (2); rollback to sp; xa end '1' ; xa prepare '1' ; xa recover;   --source include/restart_mysqld.inc xa recover; # --error ER_XAER_NOTA xa commit '1' ;   drop table t1; Instead, a full InnoDB rollback is wrongly being invoked: 10.6 79dc3989fdc3bdcfc4eb5e17ac87fa501039ebba #0 trx_t::commit (this=0x7f569f933b80) at /mariadb/10.6/storage/innobase/trx/trx0trx.cc:1482 #1 0x0000555ed506004e in trx_t::rollback_finish (this=this@entry=0x7f569f933b80) at /mariadb/10.6/storage/innobase/trx/trx0roll.cc:66 #2 0x0000555ed505fa5c in trx_t::rollback_low (this=this@entry=0x7f569f933b80, savept=savept@entry=0x0) at /mariadb/10.6/storage/innobase/trx/trx0roll.cc:133 #3 0x0000555ed505d453 in trx_rollback_for_mysql (trx=trx@entry=0x7f569f933b80) at /mariadb/10.6/storage/innobase/include/dyn0buf.h:192 #4 0x0000555ed4d3cb99 in innobase_rollback (hton=<optimized out>, thd=0x7f56740020d8, rollback_trx=true) at /mariadb/10.6/storage/innobase/handler/ha_innodb.cc:4697 #5 0x0000555ed476b0e6 in ha_rollback_to_savepoint (thd=thd@entry=0x7f56740020d8, sv=sv@entry=0x7f5674033f40) at /mariadb/10.6/sql/handler.cc:2877 #6 0x0000555ed4b87236 in trans_rollback_to_savepoint (thd=thd@entry=0x7f56740020d8, name={str = 0x7f567402df78 "sp", length = <synthetic pointer>}) at /mariadb/10.6/sql/transaction.cc:670 #7 0x0000555ed4a26062 in mysql_execute_command (thd=thd@entry=0x7f56740020d8, is_called_from_prepared_stmt=false) at /mariadb/10.6/sql/sql_parse.cc:5694 #8 0x0000555ed4a1fee0 in mysql_parse (thd=thd@entry=0x7f56740020d8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7f568d7ac3e0) at /mariadb/10.6/sql/sql_parse.cc:8016 #9 0x0000555ed4a1e4c7 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f56740020d8, packet=packet@entry=0x7f5674024539 "rollback to sp", packet_length=packet_length@entry=14, blocking=true) at /mariadb/10.6/sql/sql_parse.cc:1896 The SQL layer is wrongly invoking a handlerton::rollback instead of handlerton::savepoint_rollback or handlerton::savepoint_rollback_can_release_mdl . Please try to post complete test cases and try present proof (such as call stack traces) for claims like "automatic statement rollback in Innodb". InnoDB typically does what the SQL layer tells it to do.
            Elkin Andrei Elkin added a comment -

            marko, thanks for following up! To

            > Please try to post complete test cases and try present proof (such as call stack traces) for claims like "automatic statement rollback in Innodb"

            I had removed (with explanation) any mentioning of 'automatic statement rollback in Innodb' (which also included 'partial' adjective and which was the key ).

            After that this ticket contains two items to our attention:

            • The recovery/replication related one - the read-only optimization should be completed cleanly;
            • the errored statement rollback and explicit rollback to sp should be made compatible (might be also on the server layer to do so
              as your latest comment says).
            Elkin Andrei Elkin added a comment - marko , thanks for following up! To > Please try to post complete test cases and try present proof (such as call stack traces) for claims like "automatic statement rollback in Innodb" I had removed (with explanation) any mentioning of 'automatic statement rollback in Innodb' (which also included 'partial' adjective and which was the key ). After that this ticket contains two items to our attention: The recovery/replication related one - the read-only optimization should be completed cleanly; the errored statement rollback and explicit rollback to sp should be made compatible (might be also on the server layer to do so as your latest comment says).

            People

              Elkin Andrei Elkin
              Elkin Andrei Elkin
              Votes:
              1 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.