Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27342

MariaDB 10.6 failed to do crash recovery, potential regression in commit 6c39eaeb

Details

    Description

      Issue description:

      MariaDB is designed to be crash-safe and recent versions of MariaDB (tested with 10.5.12) are perfectly capable of starting and recovering if the /var/lib/mysql data directory is based on a filesystem level snapshot and there is a transaction in progress. However in 10.6.5 this is no longer the case and mariadbd fails to start and recover when the data directory is from a filesystem snapshot. Based on my investigation is seems to be a regression introduced in 6c39eaeb1.

      MariaDB 10.6.5 failed to recover the database from a data directory snapshot which was taken while following transaction is running:

      create database testdb;
      use testdb;
      CREATE TABLE `t1` (`col1` int(11) NOT NULL,`col2` int(11) DEFAULT NULL,PRIMARY KEY (`col1`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
      DELIMITER $$
      CREATE PROCEDURE generateredo(count int)
      BEGIN
      DECLARE counter INT DEFAULT 1;
      WHILE counter <= count DO
      insert into t1 values(counter,counter);
      update t1 set col2=counter+1 where col1=counter;
      delete from t1 where col1=counter;
      SET counter = counter + 1;
      END WHILE;
      END$$
      DELIMITER ;
       
      call testdb.generateredo(10000000);
      

      The mysql error log when recovering:

      2021-12-20 18:55:50 0 [Note] RocksDB: 2 column families found
      2021-12-20 18:55:50 0 [Note] RocksDB: Column Families at start:
      2021-12-20 18:55:50 0 [Note]   cf=default
      2021-12-20 18:55:50 0 [Note]     write_buffer_size=67108864
      2021-12-20 18:55:50 0 [Note]     target_file_size_base=67108864
      2021-12-20 18:55:50 0 [Note]   cf=__system__
      2021-12-20 18:55:50 0 [Note]     write_buffer_size=67108864
      2021-12-20 18:55:50 0 [Note]     target_file_size_base=67108864
      2021-12-20 18:55:50 0 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
      2021-12-20 18:55:50 0 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
      2021-12-20 18:55:50 0 [Note] MyRocks storage engine plugin has been successfully initialized.
      2021-12-20 18:55:50 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
      2021-12-20 18:55:50 0 [Note] InnoDB: Number of pools: 1
      2021-12-20 18:55:50 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
      2021-12-20 18:55:50 0 [Note] InnoDB: Using Linux native AIO
      2021-12-20 18:55:50 0 [Note] InnoDB: Initializing buffer pool, total size = 49392123904, chunk size = 134217728
      2021-12-20 18:55:50 0 [Note] InnoDB: Completed initialization of buffer pool
      2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 was in the XA prepared state.
      2021-12-20 18:55:50 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 0 row operations to undo
      2021-12-20 18:55:50 0 [Note] InnoDB: Trx id counter is 210266
      2021-12-20 18:55:50 0 [Note] InnoDB: 128 rollback segments are active.
      2021-12-20 18:55:50 0 [Note] InnoDB: Creating shared tablespace for temporary tables
      2021-12-20 18:55:50 0 [Note] InnoDB: Setting file '/rdsdbdata/db/innodb/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
      2021-12-20 18:55:50 0 [Note] InnoDB: File '/rdsdbdata/db/innodb/ibtmp1' size is now 12 MB.
      2021-12-20 18:55:50 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
      2021-12-20 18:55:50 0 [Note] InnoDB: Rollback of non-prepared transactions completed
      2021-12-20 18:55:50 0 [Note] InnoDB: 10.6.5 started; log sequence number 25054305; transaction id 210267
      2021-12-20 18:55:50 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
      2021-12-20 18:55:50 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
      2021-12-20 18:55:50 0 [Note] Starting table crash recovery...
      2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
      2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
      2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
      2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
      2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
      2021-12-20 18:55:50 0 [Note] Crash table recovery finished.
      2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
      2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
      2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
      2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
      2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
      2021-12-20 18:55:50 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
      2021-12-20 18:55:50 0 [ERROR] Aborting
      

      The issue is regarded as a regression because the same data files are able to recover with MariaDB 10.5.12 binaries.

      Some investigation:

      Based on the implementation of function init_server_components(), during the recovery, ha_recover() function is called twice

      • Stack trace of the first call: https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5292

        sql/handler.cc:2706(ha_recover(st_hash*, st_mem_root*))[0x5599477cf057]
        sql/log.cc:11169(MYSQL_BIN_LOG::recover(st_log_info*, char const*, st_io_cache*, Format_description_log_event*, bool))[0x5599478a80dd]
        sql/log.cc:11502(MYSQL_BIN_LOG::do_binlog_recovery(char const*, bool))[0x5599478a8931]
        sql/log.cc:10151(MYSQL_BIN_LOG::open(char const*))[0x5599478aceda]
        sql/mysqld.cc:5289(init_server_components())[0x5599475098ce]
        sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55994750ed55]
        ??:0(__libc_start_main)[0x14f07bff20ba]
        

      • Stack trace of the second call: https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5298

        sql/handler.cc:2710(ha_recover(st_hash*, st_mem_root*))[0x55e57611247b]
        sql/mysqld.cc:5295(init_server_components())[0x55e575e4c8df]
        sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55e575e51d55]
        ??:0(__libc_start_main)[0x1542b650e0ba]
        ??:0(_start)[0x55e575e4764a]
        

      However, in the commit the ha_recover() function call was moved from the end of function TC_LOG_BINLOG::recover to the beginning.

      Tried a quick fix

      I tried a simple fix of moving the ha_recover() back to the end of recover() then the issue was fixed:

      diff --git a/sql/log.cc b/sql/log.cc
      index 41eed80ef66..ddda2e5d330 100644
      --- a/sql/log.cc
      +++ b/sql/log.cc
      @@ -11166,8 +11166,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
         fdle->flags&= ~LOG_EVENT_BINLOG_IN_USE_F; // abort on the first error
       
         /* finds xids when root is not NULL */
      -  if (do_xa && ha_recover(&xids, &mem_root))
      -    goto err1;
      +  //if (do_xa && ha_recover(&xids, &mem_root))
      +  //  goto err1;
       
         /*
           Scan the binlog for XIDs that need to be committed if still in the
      @@ -11371,6 +11374,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
       
         if (do_xa)
         {
      +    if (ha_recover(&xids))
      +      goto err2;
           if (binlog_checkpoint_found)
           {
       #ifndef HAVE_REPLICATION
      

      Error log before and after this fix
      Before the fix, the error was in the second call of ha_recover():

      2021-12-21 20:07:45 0 [Note] InnoDB: 10.6.5 started; log sequence number 15159225; transaction id 126557
      2021-12-21 20:07:45 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
      2021-12-21 20:07:45 0 [Note] Plugin 'FEEDBACK' is disabled.
      2021-12-21 20:07:45 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
      2021-12-21 20:07:45 0 [Note] Starting table crash recovery...
      2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
      2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
      2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
      2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
      2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
      2021-12-21 20:07:45 0 [Note] Crash table recovery finished.
      2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
      2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
      2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
      2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
      2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
      2021-12-21 20:07:45 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
      2021-12-21 20:07:45 0 [ERROR] Aborting

      After the fix, ha_recover() was still called twice(verified by adding some private log) but no error/note logs of the second call:

      2021-12-21  1:00:20 0 [Note] InnoDB: 10.6.5 started; log sequence number 15158841; transaction id 126557
      2021-12-21  1:00:20 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
      2021-12-21  1:00:20 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
      2021-12-21  1:00:20 0 [Note] Starting table crash recovery...
      2021-12-21  1:00:20 0 [Note] InnoDB: Starting recovery for XA transactions...
      2021-12-21  1:00:20 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
      2021-12-21  1:00:20 0 [Note] InnoDB: Transaction contains changes to 1 rows
      2021-12-21  1:00:20 0 [Note] InnoDB: 1 transactions in prepared state after recovery
      2021-12-21  1:00:20 0 [Note] Found 1 prepared transaction(s) in InnoDB
      2021-12-21  1:00:20 0 [Note] Crash table recovery finished.
      2021-12-21  1:00:20 0 [Note] Server socket created on IP: '0.0.0.0'.
      2021-12-21  1:00:20 0 [Note] Server socket created on IP: '::'.
      2021-12-21  1:00:20 0 [Note] InnoDB: Buffer pool(s) load completed at 211221  1:00:20
      2021-12-21  1:00:20 0 [Note] /rdsdbbin/mysql/bin/mysqld: ready for connections.
      

      Could you please investigate whether we should update the function or other conditions need to be considered?

      Attachments

        Activity

          wenhug Hugo Wen created issue -
          wenhug Hugo Wen made changes -
          Field Original Value New Value
          Priority Major [ 3 ] Minor [ 4 ]
          wenhug Hugo Wen made changes -
          Affects Version/s 10.6 [ 24028 ]
          wenhug Hugo Wen made changes -
          Summary MariaDB 10.6 failed to do crash recovery due to changes in https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981 MariaDB 10.6 failed to do crash recovery, potential regression in commit 6c39eaeb
          wenhug Hugo Wen made changes -
          Description
          h3. Issue description:

          *MariaDB 10.6.5 fails to recover from crash, regression in commit [https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981]*

          *Failed to recover the database from an EBS data directory snapshot which was taken while following transaction is running:*
          {code:java}
          create database testdb;
          use testdb;
          CREATE TABLE `t1` (`col1` int(11) NOT NULL,`col2` int(11) DEFAULT NULL,PRIMARY KEY (`col1`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
          DELIMITER $$
          CREATE PROCEDURE generateredo(count int)
          BEGIN
          DECLARE counter INT DEFAULT 1;
          WHILE counter <= count DO
          insert into t1 values(counter,counter);
          update t1 set col2=counter+1 where col1=counter;
          delete from t1 where col1=counter;
          SET counter = counter + 1;
          END WHILE;
          END$$
          DELIMITER ;

          call testdb.generateredo(10000000);
          {code}
          The mysql error log when recovering:
          {code:java}
          2021-12-20 18:55:50 0 [Note] RocksDB: 2 column families found
          2021-12-20 18:55:50 0 [Note] RocksDB: Column Families at start:
          2021-12-20 18:55:50 0 [Note] cf=default
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] cf=__system__
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
          2021-12-20 18:55:50 0 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
          2021-12-20 18:55:50 0 [Note] MyRocks storage engine plugin has been successfully initialized.
          2021-12-20 18:55:50 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
          2021-12-20 18:55:50 0 [Note] InnoDB: Number of pools: 1
          2021-12-20 18:55:50 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
          2021-12-20 18:55:50 0 [Note] InnoDB: Using Linux native AIO
          2021-12-20 18:55:50 0 [Note] InnoDB: Initializing buffer pool, total size = 49392123904, chunk size = 134217728
          2021-12-20 18:55:50 0 [Note] InnoDB: Completed initialization of buffer pool
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 was in the XA prepared state.
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 0 row operations to undo
          2021-12-20 18:55:50 0 [Note] InnoDB: Trx id counter is 210266
          2021-12-20 18:55:50 0 [Note] InnoDB: 128 rollback segments are active.
          2021-12-20 18:55:50 0 [Note] InnoDB: Creating shared tablespace for temporary tables
          2021-12-20 18:55:50 0 [Note] InnoDB: Setting file '/rdsdbdata/db/innodb/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
          2021-12-20 18:55:50 0 [Note] InnoDB: File '/rdsdbdata/db/innodb/ibtmp1' size is now 12 MB.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
          2021-12-20 18:55:50 0 [Note] InnoDB: Rollback of non-prepared transactions completed
          2021-12-20 18:55:50 0 [Note] InnoDB: 10.6.5 started; log sequence number 25054305; transaction id 210267
          2021-12-20 18:55:50 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-20 18:55:50 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-20 18:55:50 0 [Note] Starting table crash recovery...
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [Note] Crash table recovery finished.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-20 18:55:50 0 [ERROR] Aborting
          {code}
           

          *The issue is regarded as a regression because the same data files are able to recover with MariaDB 10.5.12 binaries.*
          h3. Some investigation:

          Based on the implementation of function {+}{*}init_server_components{*}(){+}, during the recovery, [+ha_recover()+|https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2703] function is called *twice*
           * Stack trace of the {*}first call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5292]
          {code:java}
          sql/handler.cc:2706(ha_recover(st_hash*, st_mem_root*))[0x5599477cf057]
          sql/log.cc:11169(MYSQL_BIN_LOG::recover(st_log_info*, char const*, st_io_cache*, Format_description_log_event*, bool))[0x5599478a80dd]
          sql/log.cc:11502(MYSQL_BIN_LOG::do_binlog_recovery(char const*, bool))[0x5599478a8931]
          sql/log.cc:10151(MYSQL_BIN_LOG::open(char const*))[0x5599478aceda]
          sql/mysqld.cc:5289(init_server_components())[0x5599475098ce]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55994750ed55]
          ??:0(__libc_start_main)[0x14f07bff20ba]
          {code}

           * Stack trace of the {*}second call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5298]
          {code:java}
          sql/handler.cc:2710(ha_recover(st_hash*, st_mem_root*))[0x55e57611247b]
          sql/mysqld.cc:5295(init_server_components())[0x55e575e4c8df]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55e575e51d55]
          ??:0(__libc_start_main)[0x1542b650e0ba]
          ??:0(_start)[0x55e575e4764a]
          {code}

          *However, in [the commit|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981] the +ha_recover()+ function call was moved from the [end of function|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8L10686-L10687] +TC_LOG_BINLOG::recover+ to [the beginning|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8R11174-R11177].*
           * Before the commit, in the second call of {+}ha_recover(){+}, no actual recover was done because +trx_sys.rw_trx_hash.iterate_no_dups()+ iterate doesn't have any items - [https://github.com/MariaDB/server/blob/10.6/storage/innobase/trx/trx0trx.cc#L1980] .
           * However after this commit, the second call of +ha_recover()+ also tries to do some recovery which eventually caused the error of:
           [https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2746-L2756]
          {code:java}
          [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          {code}

          h3. Tried a quick fix

          *I tried a simple fix of moving the +ha_recover()+ back to the end of +recover()+ then the issue was fixed:*
          {code:java}
          diff --git a/sql/log.cc b/sql/log.cc
          index 41eed80ef66..ddda2e5d330 100644
          --- a/sql/log.cc
          +++ b/sql/log.cc
          @@ -11166,8 +11166,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
             fdle->flags&= ~LOG_EVENT_BINLOG_IN_USE_F; // abort on the first error
           
             /* finds xids when root is not NULL */
          - if (do_xa && ha_recover(&xids, &mem_root))
          - goto err1;
          + //if (do_xa && ha_recover(&xids, &mem_root))
          + // goto err1;
           
             /*
               Scan the binlog for XIDs that need to be committed if still in the
          @@ -11371,6 +11374,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
           
             if (do_xa)
             {
          + if (ha_recover(&xids))
          + goto err2;
               if (binlog_checkpoint_found)
               {
           #ifndef HAVE_REPLICATION
          {code}
          *Error log before and after this fix*
           Before the fix, the error was in the second call of ha_recover():
          {code:java}
          2021-12-21 20:07:45 0 [Note] InnoDB: 10.6.5 started; log sequence number 15159225; transaction id 126557
          2021-12-21 20:07:45 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 20:07:45 0 [Note] Plugin 'FEEDBACK' is disabled.
          2021-12-21 20:07:45 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 20:07:45 0 [Note] Starting table crash recovery...
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [Note] Crash table recovery finished.
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-21 20:07:45 0 [ERROR] Aborting{code}
          After the fix, ha_recover() was still called twice(verified by adding some private log) but no error/note logs of the second call:
          {code:java}
          2021-12-21 1:00:20 0 [Note] InnoDB: 10.6.5 started; log sequence number 15158841; transaction id 126557
          2021-12-21 1:00:20 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 1:00:20 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 1:00:20 0 [Note] Starting table crash recovery...
          2021-12-21 1:00:20 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 1:00:20 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 1:00:20 0 [Note] Crash table recovery finished.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '0.0.0.0'.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '::'.
          2021-12-21 1:00:20 0 [Note] InnoDB: Buffer pool(s) load completed at 211221 1:00:20
          2021-12-21 1:00:20 0 [Note] /rdsdbbin/mysql/bin/mysqld: ready for connections.
          {code}

          Could you please investigate whether we should update the function or other conditions need to be considered?
          h3. Issue description:

          *MariaDB 10.6.5 fails to recover from crash, regression in commit [https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981]*

          *Failed to recover the database from a data directory snapshot which was taken while following transaction is running:*
          {code:java}
          create database testdb;
          use testdb;
          CREATE TABLE `t1` (`col1` int(11) NOT NULL,`col2` int(11) DEFAULT NULL,PRIMARY KEY (`col1`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
          DELIMITER $$
          CREATE PROCEDURE generateredo(count int)
          BEGIN
          DECLARE counter INT DEFAULT 1;
          WHILE counter <= count DO
          insert into t1 values(counter,counter);
          update t1 set col2=counter+1 where col1=counter;
          delete from t1 where col1=counter;
          SET counter = counter + 1;
          END WHILE;
          END$$
          DELIMITER ;

          call testdb.generateredo(10000000);
          {code}
          The mysql error log when recovering:
          {code:java}
          2021-12-20 18:55:50 0 [Note] RocksDB: 2 column families found
          2021-12-20 18:55:50 0 [Note] RocksDB: Column Families at start:
          2021-12-20 18:55:50 0 [Note] cf=default
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] cf=__system__
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
          2021-12-20 18:55:50 0 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
          2021-12-20 18:55:50 0 [Note] MyRocks storage engine plugin has been successfully initialized.
          2021-12-20 18:55:50 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
          2021-12-20 18:55:50 0 [Note] InnoDB: Number of pools: 1
          2021-12-20 18:55:50 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
          2021-12-20 18:55:50 0 [Note] InnoDB: Using Linux native AIO
          2021-12-20 18:55:50 0 [Note] InnoDB: Initializing buffer pool, total size = 49392123904, chunk size = 134217728
          2021-12-20 18:55:50 0 [Note] InnoDB: Completed initialization of buffer pool
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 was in the XA prepared state.
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 0 row operations to undo
          2021-12-20 18:55:50 0 [Note] InnoDB: Trx id counter is 210266
          2021-12-20 18:55:50 0 [Note] InnoDB: 128 rollback segments are active.
          2021-12-20 18:55:50 0 [Note] InnoDB: Creating shared tablespace for temporary tables
          2021-12-20 18:55:50 0 [Note] InnoDB: Setting file '/rdsdbdata/db/innodb/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
          2021-12-20 18:55:50 0 [Note] InnoDB: File '/rdsdbdata/db/innodb/ibtmp1' size is now 12 MB.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
          2021-12-20 18:55:50 0 [Note] InnoDB: Rollback of non-prepared transactions completed
          2021-12-20 18:55:50 0 [Note] InnoDB: 10.6.5 started; log sequence number 25054305; transaction id 210267
          2021-12-20 18:55:50 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-20 18:55:50 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-20 18:55:50 0 [Note] Starting table crash recovery...
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [Note] Crash table recovery finished.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-20 18:55:50 0 [ERROR] Aborting
          {code}
           

          *The issue is regarded as a regression because the same data files are able to recover with MariaDB 10.5.12 binaries.*
          h3. Some investigation:

          Based on the implementation of function {+}{*}init_server_components{*}(){+}, during the recovery, [+ha_recover()+|https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2703] function is called *twice*
           * Stack trace of the {*}first call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5292]
          {code:java}
          sql/handler.cc:2706(ha_recover(st_hash*, st_mem_root*))[0x5599477cf057]
          sql/log.cc:11169(MYSQL_BIN_LOG::recover(st_log_info*, char const*, st_io_cache*, Format_description_log_event*, bool))[0x5599478a80dd]
          sql/log.cc:11502(MYSQL_BIN_LOG::do_binlog_recovery(char const*, bool))[0x5599478a8931]
          sql/log.cc:10151(MYSQL_BIN_LOG::open(char const*))[0x5599478aceda]
          sql/mysqld.cc:5289(init_server_components())[0x5599475098ce]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55994750ed55]
          ??:0(__libc_start_main)[0x14f07bff20ba]
          {code}

           * Stack trace of the {*}second call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5298]
          {code:java}
          sql/handler.cc:2710(ha_recover(st_hash*, st_mem_root*))[0x55e57611247b]
          sql/mysqld.cc:5295(init_server_components())[0x55e575e4c8df]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55e575e51d55]
          ??:0(__libc_start_main)[0x1542b650e0ba]
          ??:0(_start)[0x55e575e4764a]
          {code}

          *However, in [the commit|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981] the +ha_recover()+ function call was moved from the [end of function|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8L10686-L10687] +TC_LOG_BINLOG::recover+ to [the beginning|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8R11174-R11177].*
           * Before the commit, in the second call of {+}ha_recover(){+}, no actual recover was done because +trx_sys.rw_trx_hash.iterate_no_dups()+ iterate doesn't have any items - [https://github.com/MariaDB/server/blob/10.6/storage/innobase/trx/trx0trx.cc#L1980] .
           * However after this commit, the second call of +ha_recover()+ also tries to do some recovery which eventually caused the error of:
           [https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2746-L2756]
          {code:java}
          [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          {code}

          h3. Tried a quick fix

          *I tried a simple fix of moving the +ha_recover()+ back to the end of +recover()+ then the issue was fixed:*
          {code:java}
          diff --git a/sql/log.cc b/sql/log.cc
          index 41eed80ef66..ddda2e5d330 100644
          --- a/sql/log.cc
          +++ b/sql/log.cc
          @@ -11166,8 +11166,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
             fdle->flags&= ~LOG_EVENT_BINLOG_IN_USE_F; // abort on the first error
           
             /* finds xids when root is not NULL */
          - if (do_xa && ha_recover(&xids, &mem_root))
          - goto err1;
          + //if (do_xa && ha_recover(&xids, &mem_root))
          + // goto err1;
           
             /*
               Scan the binlog for XIDs that need to be committed if still in the
          @@ -11371,6 +11374,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
           
             if (do_xa)
             {
          + if (ha_recover(&xids))
          + goto err2;
               if (binlog_checkpoint_found)
               {
           #ifndef HAVE_REPLICATION
          {code}
          *Error log before and after this fix*
           Before the fix, the error was in the second call of ha_recover():
          {code:java}
          2021-12-21 20:07:45 0 [Note] InnoDB: 10.6.5 started; log sequence number 15159225; transaction id 126557
          2021-12-21 20:07:45 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 20:07:45 0 [Note] Plugin 'FEEDBACK' is disabled.
          2021-12-21 20:07:45 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 20:07:45 0 [Note] Starting table crash recovery...
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [Note] Crash table recovery finished.
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-21 20:07:45 0 [ERROR] Aborting{code}
          After the fix, ha_recover() was still called twice(verified by adding some private log) but no error/note logs of the second call:
          {code:java}
          2021-12-21 1:00:20 0 [Note] InnoDB: 10.6.5 started; log sequence number 15158841; transaction id 126557
          2021-12-21 1:00:20 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 1:00:20 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 1:00:20 0 [Note] Starting table crash recovery...
          2021-12-21 1:00:20 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 1:00:20 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 1:00:20 0 [Note] Crash table recovery finished.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '0.0.0.0'.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '::'.
          2021-12-21 1:00:20 0 [Note] InnoDB: Buffer pool(s) load completed at 211221 1:00:20
          2021-12-21 1:00:20 0 [Note] /rdsdbbin/mysql/bin/mysqld: ready for connections.
          {code}

          Could you please investigate whether we should update the function or other conditions need to be considered?
          wenhug Hugo Wen made changes -
          Description h3. Issue description:

          *MariaDB 10.6.5 fails to recover from crash, regression in commit [https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981]*

          *Failed to recover the database from a data directory snapshot which was taken while following transaction is running:*
          {code:java}
          create database testdb;
          use testdb;
          CREATE TABLE `t1` (`col1` int(11) NOT NULL,`col2` int(11) DEFAULT NULL,PRIMARY KEY (`col1`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
          DELIMITER $$
          CREATE PROCEDURE generateredo(count int)
          BEGIN
          DECLARE counter INT DEFAULT 1;
          WHILE counter <= count DO
          insert into t1 values(counter,counter);
          update t1 set col2=counter+1 where col1=counter;
          delete from t1 where col1=counter;
          SET counter = counter + 1;
          END WHILE;
          END$$
          DELIMITER ;

          call testdb.generateredo(10000000);
          {code}
          The mysql error log when recovering:
          {code:java}
          2021-12-20 18:55:50 0 [Note] RocksDB: 2 column families found
          2021-12-20 18:55:50 0 [Note] RocksDB: Column Families at start:
          2021-12-20 18:55:50 0 [Note] cf=default
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] cf=__system__
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
          2021-12-20 18:55:50 0 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
          2021-12-20 18:55:50 0 [Note] MyRocks storage engine plugin has been successfully initialized.
          2021-12-20 18:55:50 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
          2021-12-20 18:55:50 0 [Note] InnoDB: Number of pools: 1
          2021-12-20 18:55:50 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
          2021-12-20 18:55:50 0 [Note] InnoDB: Using Linux native AIO
          2021-12-20 18:55:50 0 [Note] InnoDB: Initializing buffer pool, total size = 49392123904, chunk size = 134217728
          2021-12-20 18:55:50 0 [Note] InnoDB: Completed initialization of buffer pool
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 was in the XA prepared state.
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 0 row operations to undo
          2021-12-20 18:55:50 0 [Note] InnoDB: Trx id counter is 210266
          2021-12-20 18:55:50 0 [Note] InnoDB: 128 rollback segments are active.
          2021-12-20 18:55:50 0 [Note] InnoDB: Creating shared tablespace for temporary tables
          2021-12-20 18:55:50 0 [Note] InnoDB: Setting file '/rdsdbdata/db/innodb/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
          2021-12-20 18:55:50 0 [Note] InnoDB: File '/rdsdbdata/db/innodb/ibtmp1' size is now 12 MB.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
          2021-12-20 18:55:50 0 [Note] InnoDB: Rollback of non-prepared transactions completed
          2021-12-20 18:55:50 0 [Note] InnoDB: 10.6.5 started; log sequence number 25054305; transaction id 210267
          2021-12-20 18:55:50 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-20 18:55:50 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-20 18:55:50 0 [Note] Starting table crash recovery...
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [Note] Crash table recovery finished.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-20 18:55:50 0 [ERROR] Aborting
          {code}
           

          *The issue is regarded as a regression because the same data files are able to recover with MariaDB 10.5.12 binaries.*
          h3. Some investigation:

          Based on the implementation of function {+}{*}init_server_components{*}(){+}, during the recovery, [+ha_recover()+|https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2703] function is called *twice*
           * Stack trace of the {*}first call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5292]
          {code:java}
          sql/handler.cc:2706(ha_recover(st_hash*, st_mem_root*))[0x5599477cf057]
          sql/log.cc:11169(MYSQL_BIN_LOG::recover(st_log_info*, char const*, st_io_cache*, Format_description_log_event*, bool))[0x5599478a80dd]
          sql/log.cc:11502(MYSQL_BIN_LOG::do_binlog_recovery(char const*, bool))[0x5599478a8931]
          sql/log.cc:10151(MYSQL_BIN_LOG::open(char const*))[0x5599478aceda]
          sql/mysqld.cc:5289(init_server_components())[0x5599475098ce]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55994750ed55]
          ??:0(__libc_start_main)[0x14f07bff20ba]
          {code}

           * Stack trace of the {*}second call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5298]
          {code:java}
          sql/handler.cc:2710(ha_recover(st_hash*, st_mem_root*))[0x55e57611247b]
          sql/mysqld.cc:5295(init_server_components())[0x55e575e4c8df]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55e575e51d55]
          ??:0(__libc_start_main)[0x1542b650e0ba]
          ??:0(_start)[0x55e575e4764a]
          {code}

          *However, in [the commit|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981] the +ha_recover()+ function call was moved from the [end of function|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8L10686-L10687] +TC_LOG_BINLOG::recover+ to [the beginning|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8R11174-R11177].*
           * Before the commit, in the second call of {+}ha_recover(){+}, no actual recover was done because +trx_sys.rw_trx_hash.iterate_no_dups()+ iterate doesn't have any items - [https://github.com/MariaDB/server/blob/10.6/storage/innobase/trx/trx0trx.cc#L1980] .
           * However after this commit, the second call of +ha_recover()+ also tries to do some recovery which eventually caused the error of:
           [https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2746-L2756]
          {code:java}
          [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          {code}

          h3. Tried a quick fix

          *I tried a simple fix of moving the +ha_recover()+ back to the end of +recover()+ then the issue was fixed:*
          {code:java}
          diff --git a/sql/log.cc b/sql/log.cc
          index 41eed80ef66..ddda2e5d330 100644
          --- a/sql/log.cc
          +++ b/sql/log.cc
          @@ -11166,8 +11166,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
             fdle->flags&= ~LOG_EVENT_BINLOG_IN_USE_F; // abort on the first error
           
             /* finds xids when root is not NULL */
          - if (do_xa && ha_recover(&xids, &mem_root))
          - goto err1;
          + //if (do_xa && ha_recover(&xids, &mem_root))
          + // goto err1;
           
             /*
               Scan the binlog for XIDs that need to be committed if still in the
          @@ -11371,6 +11374,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
           
             if (do_xa)
             {
          + if (ha_recover(&xids))
          + goto err2;
               if (binlog_checkpoint_found)
               {
           #ifndef HAVE_REPLICATION
          {code}
          *Error log before and after this fix*
           Before the fix, the error was in the second call of ha_recover():
          {code:java}
          2021-12-21 20:07:45 0 [Note] InnoDB: 10.6.5 started; log sequence number 15159225; transaction id 126557
          2021-12-21 20:07:45 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 20:07:45 0 [Note] Plugin 'FEEDBACK' is disabled.
          2021-12-21 20:07:45 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 20:07:45 0 [Note] Starting table crash recovery...
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [Note] Crash table recovery finished.
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-21 20:07:45 0 [ERROR] Aborting{code}
          After the fix, ha_recover() was still called twice(verified by adding some private log) but no error/note logs of the second call:
          {code:java}
          2021-12-21 1:00:20 0 [Note] InnoDB: 10.6.5 started; log sequence number 15158841; transaction id 126557
          2021-12-21 1:00:20 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 1:00:20 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 1:00:20 0 [Note] Starting table crash recovery...
          2021-12-21 1:00:20 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 1:00:20 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 1:00:20 0 [Note] Crash table recovery finished.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '0.0.0.0'.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '::'.
          2021-12-21 1:00:20 0 [Note] InnoDB: Buffer pool(s) load completed at 211221 1:00:20
          2021-12-21 1:00:20 0 [Note] /rdsdbbin/mysql/bin/mysqld: ready for connections.
          {code}

          Could you please investigate whether we should update the function or other conditions need to be considered?
          h3. Issue description:
          MariaDB is designed to be crash-safe and recent versions of MariaDB (tested with 10.5.12) are perfectly capable of starting and recovering if the /var/lib/mysql data directory is based on a filesystem level snapshot and there is a transaction in progress. However in 10.6.5 this is no longer the case and mariadbd fails to start and recover when the data directory is from a filesystem snapshot. Based on my investigation is seems to be a regression introduced in 6c39eaeb1.

          *MariaDB 10.6.5 failed to recover the database from a data directory snapshot which was taken while following transaction is running:*
          {code:java}
          create database testdb;
          use testdb;
          CREATE TABLE `t1` (`col1` int(11) NOT NULL,`col2` int(11) DEFAULT NULL,PRIMARY KEY (`col1`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
          DELIMITER $$
          CREATE PROCEDURE generateredo(count int)
          BEGIN
          DECLARE counter INT DEFAULT 1;
          WHILE counter <= count DO
          insert into t1 values(counter,counter);
          update t1 set col2=counter+1 where col1=counter;
          delete from t1 where col1=counter;
          SET counter = counter + 1;
          END WHILE;
          END$$
          DELIMITER ;

          call testdb.generateredo(10000000);
          {code}
          The mysql error log when recovering:
          {code:java}
          2021-12-20 18:55:50 0 [Note] RocksDB: 2 column families found
          2021-12-20 18:55:50 0 [Note] RocksDB: Column Families at start:
          2021-12-20 18:55:50 0 [Note] cf=default
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] cf=__system__
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
          2021-12-20 18:55:50 0 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
          2021-12-20 18:55:50 0 [Note] MyRocks storage engine plugin has been successfully initialized.
          2021-12-20 18:55:50 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
          2021-12-20 18:55:50 0 [Note] InnoDB: Number of pools: 1
          2021-12-20 18:55:50 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
          2021-12-20 18:55:50 0 [Note] InnoDB: Using Linux native AIO
          2021-12-20 18:55:50 0 [Note] InnoDB: Initializing buffer pool, total size = 49392123904, chunk size = 134217728
          2021-12-20 18:55:50 0 [Note] InnoDB: Completed initialization of buffer pool
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 was in the XA prepared state.
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 0 row operations to undo
          2021-12-20 18:55:50 0 [Note] InnoDB: Trx id counter is 210266
          2021-12-20 18:55:50 0 [Note] InnoDB: 128 rollback segments are active.
          2021-12-20 18:55:50 0 [Note] InnoDB: Creating shared tablespace for temporary tables
          2021-12-20 18:55:50 0 [Note] InnoDB: Setting file '/rdsdbdata/db/innodb/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
          2021-12-20 18:55:50 0 [Note] InnoDB: File '/rdsdbdata/db/innodb/ibtmp1' size is now 12 MB.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
          2021-12-20 18:55:50 0 [Note] InnoDB: Rollback of non-prepared transactions completed
          2021-12-20 18:55:50 0 [Note] InnoDB: 10.6.5 started; log sequence number 25054305; transaction id 210267
          2021-12-20 18:55:50 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-20 18:55:50 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-20 18:55:50 0 [Note] Starting table crash recovery...
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [Note] Crash table recovery finished.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-20 18:55:50 0 [ERROR] Aborting
          {code}
           

          *The issue is regarded as a regression because the same data files are able to recover with MariaDB 10.5.12 binaries.*
          h3. Some investigation:

          Based on the implementation of function {+}{*}init_server_components{*}(){+}, during the recovery, [+ha_recover()+|https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2703] function is called *twice*
           * Stack trace of the {*}first call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5292]
          {code:java}
          sql/handler.cc:2706(ha_recover(st_hash*, st_mem_root*))[0x5599477cf057]
          sql/log.cc:11169(MYSQL_BIN_LOG::recover(st_log_info*, char const*, st_io_cache*, Format_description_log_event*, bool))[0x5599478a80dd]
          sql/log.cc:11502(MYSQL_BIN_LOG::do_binlog_recovery(char const*, bool))[0x5599478a8931]
          sql/log.cc:10151(MYSQL_BIN_LOG::open(char const*))[0x5599478aceda]
          sql/mysqld.cc:5289(init_server_components())[0x5599475098ce]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55994750ed55]
          ??:0(__libc_start_main)[0x14f07bff20ba]
          {code}

           * Stack trace of the {*}second call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5298]
          {code:java}
          sql/handler.cc:2710(ha_recover(st_hash*, st_mem_root*))[0x55e57611247b]
          sql/mysqld.cc:5295(init_server_components())[0x55e575e4c8df]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55e575e51d55]
          ??:0(__libc_start_main)[0x1542b650e0ba]
          ??:0(_start)[0x55e575e4764a]
          {code}

          *However, in [the commit|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981] the +ha_recover()+ function call was moved from the [end of function|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8L10686-L10687] +TC_LOG_BINLOG::recover+ to [the beginning|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8R11174-R11177].*
           * Before the commit, in the second call of {+}ha_recover(){+}, no actual recover was done because +trx_sys.rw_trx_hash.iterate_no_dups()+ iterate doesn't have any items - [https://github.com/MariaDB/server/blob/10.6/storage/innobase/trx/trx0trx.cc#L1980] .
           * However after this commit, the second call of +ha_recover()+ also tries to do some recovery which eventually caused the error of:
           [https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2746-L2756]
          {code:java}
          [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          {code}

          h3. Tried a quick fix

          *I tried a simple fix of moving the +ha_recover()+ back to the end of +recover()+ then the issue was fixed:*
          {code:java}
          diff --git a/sql/log.cc b/sql/log.cc
          index 41eed80ef66..ddda2e5d330 100644
          --- a/sql/log.cc
          +++ b/sql/log.cc
          @@ -11166,8 +11166,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
             fdle->flags&= ~LOG_EVENT_BINLOG_IN_USE_F; // abort on the first error
           
             /* finds xids when root is not NULL */
          - if (do_xa && ha_recover(&xids, &mem_root))
          - goto err1;
          + //if (do_xa && ha_recover(&xids, &mem_root))
          + // goto err1;
           
             /*
               Scan the binlog for XIDs that need to be committed if still in the
          @@ -11371,6 +11374,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
           
             if (do_xa)
             {
          + if (ha_recover(&xids))
          + goto err2;
               if (binlog_checkpoint_found)
               {
           #ifndef HAVE_REPLICATION
          {code}
          *Error log before and after this fix*
           Before the fix, the error was in the second call of ha_recover():
          {code:java}
          2021-12-21 20:07:45 0 [Note] InnoDB: 10.6.5 started; log sequence number 15159225; transaction id 126557
          2021-12-21 20:07:45 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 20:07:45 0 [Note] Plugin 'FEEDBACK' is disabled.
          2021-12-21 20:07:45 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 20:07:45 0 [Note] Starting table crash recovery...
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [Note] Crash table recovery finished.
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-21 20:07:45 0 [ERROR] Aborting{code}
          After the fix, ha_recover() was still called twice(verified by adding some private log) but no error/note logs of the second call:
          {code:java}
          2021-12-21 1:00:20 0 [Note] InnoDB: 10.6.5 started; log sequence number 15158841; transaction id 126557
          2021-12-21 1:00:20 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 1:00:20 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 1:00:20 0 [Note] Starting table crash recovery...
          2021-12-21 1:00:20 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 1:00:20 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 1:00:20 0 [Note] Crash table recovery finished.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '0.0.0.0'.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '::'.
          2021-12-21 1:00:20 0 [Note] InnoDB: Buffer pool(s) load completed at 211221 1:00:20
          2021-12-21 1:00:20 0 [Note] /rdsdbbin/mysql/bin/mysqld: ready for connections.
          {code}

          Could you please investigate whether we should update the function or other conditions need to be considered?
          wenhug Hugo Wen made changes -
          Description h3. Issue description:
          MariaDB is designed to be crash-safe and recent versions of MariaDB (tested with 10.5.12) are perfectly capable of starting and recovering if the /var/lib/mysql data directory is based on a filesystem level snapshot and there is a transaction in progress. However in 10.6.5 this is no longer the case and mariadbd fails to start and recover when the data directory is from a filesystem snapshot. Based on my investigation is seems to be a regression introduced in 6c39eaeb1.

          *MariaDB 10.6.5 failed to recover the database from a data directory snapshot which was taken while following transaction is running:*
          {code:java}
          create database testdb;
          use testdb;
          CREATE TABLE `t1` (`col1` int(11) NOT NULL,`col2` int(11) DEFAULT NULL,PRIMARY KEY (`col1`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
          DELIMITER $$
          CREATE PROCEDURE generateredo(count int)
          BEGIN
          DECLARE counter INT DEFAULT 1;
          WHILE counter <= count DO
          insert into t1 values(counter,counter);
          update t1 set col2=counter+1 where col1=counter;
          delete from t1 where col1=counter;
          SET counter = counter + 1;
          END WHILE;
          END$$
          DELIMITER ;

          call testdb.generateredo(10000000);
          {code}
          The mysql error log when recovering:
          {code:java}
          2021-12-20 18:55:50 0 [Note] RocksDB: 2 column families found
          2021-12-20 18:55:50 0 [Note] RocksDB: Column Families at start:
          2021-12-20 18:55:50 0 [Note] cf=default
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] cf=__system__
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
          2021-12-20 18:55:50 0 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
          2021-12-20 18:55:50 0 [Note] MyRocks storage engine plugin has been successfully initialized.
          2021-12-20 18:55:50 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
          2021-12-20 18:55:50 0 [Note] InnoDB: Number of pools: 1
          2021-12-20 18:55:50 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
          2021-12-20 18:55:50 0 [Note] InnoDB: Using Linux native AIO
          2021-12-20 18:55:50 0 [Note] InnoDB: Initializing buffer pool, total size = 49392123904, chunk size = 134217728
          2021-12-20 18:55:50 0 [Note] InnoDB: Completed initialization of buffer pool
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 was in the XA prepared state.
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 0 row operations to undo
          2021-12-20 18:55:50 0 [Note] InnoDB: Trx id counter is 210266
          2021-12-20 18:55:50 0 [Note] InnoDB: 128 rollback segments are active.
          2021-12-20 18:55:50 0 [Note] InnoDB: Creating shared tablespace for temporary tables
          2021-12-20 18:55:50 0 [Note] InnoDB: Setting file '/rdsdbdata/db/innodb/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
          2021-12-20 18:55:50 0 [Note] InnoDB: File '/rdsdbdata/db/innodb/ibtmp1' size is now 12 MB.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
          2021-12-20 18:55:50 0 [Note] InnoDB: Rollback of non-prepared transactions completed
          2021-12-20 18:55:50 0 [Note] InnoDB: 10.6.5 started; log sequence number 25054305; transaction id 210267
          2021-12-20 18:55:50 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-20 18:55:50 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-20 18:55:50 0 [Note] Starting table crash recovery...
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [Note] Crash table recovery finished.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-20 18:55:50 0 [ERROR] Aborting
          {code}
           

          *The issue is regarded as a regression because the same data files are able to recover with MariaDB 10.5.12 binaries.*
          h3. Some investigation:

          Based on the implementation of function {+}{*}init_server_components{*}(){+}, during the recovery, [+ha_recover()+|https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2703] function is called *twice*
           * Stack trace of the {*}first call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5292]
          {code:java}
          sql/handler.cc:2706(ha_recover(st_hash*, st_mem_root*))[0x5599477cf057]
          sql/log.cc:11169(MYSQL_BIN_LOG::recover(st_log_info*, char const*, st_io_cache*, Format_description_log_event*, bool))[0x5599478a80dd]
          sql/log.cc:11502(MYSQL_BIN_LOG::do_binlog_recovery(char const*, bool))[0x5599478a8931]
          sql/log.cc:10151(MYSQL_BIN_LOG::open(char const*))[0x5599478aceda]
          sql/mysqld.cc:5289(init_server_components())[0x5599475098ce]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55994750ed55]
          ??:0(__libc_start_main)[0x14f07bff20ba]
          {code}

           * Stack trace of the {*}second call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5298]
          {code:java}
          sql/handler.cc:2710(ha_recover(st_hash*, st_mem_root*))[0x55e57611247b]
          sql/mysqld.cc:5295(init_server_components())[0x55e575e4c8df]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55e575e51d55]
          ??:0(__libc_start_main)[0x1542b650e0ba]
          ??:0(_start)[0x55e575e4764a]
          {code}

          *However, in [the commit|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981] the +ha_recover()+ function call was moved from the [end of function|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8L10686-L10687] +TC_LOG_BINLOG::recover+ to [the beginning|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8R11174-R11177].*
           * Before the commit, in the second call of {+}ha_recover(){+}, no actual recover was done because +trx_sys.rw_trx_hash.iterate_no_dups()+ iterate doesn't have any items - [https://github.com/MariaDB/server/blob/10.6/storage/innobase/trx/trx0trx.cc#L1980] .
           * However after this commit, the second call of +ha_recover()+ also tries to do some recovery which eventually caused the error of:
           [https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2746-L2756]
          {code:java}
          [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          {code}

          h3. Tried a quick fix

          *I tried a simple fix of moving the +ha_recover()+ back to the end of +recover()+ then the issue was fixed:*
          {code:java}
          diff --git a/sql/log.cc b/sql/log.cc
          index 41eed80ef66..ddda2e5d330 100644
          --- a/sql/log.cc
          +++ b/sql/log.cc
          @@ -11166,8 +11166,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
             fdle->flags&= ~LOG_EVENT_BINLOG_IN_USE_F; // abort on the first error
           
             /* finds xids when root is not NULL */
          - if (do_xa && ha_recover(&xids, &mem_root))
          - goto err1;
          + //if (do_xa && ha_recover(&xids, &mem_root))
          + // goto err1;
           
             /*
               Scan the binlog for XIDs that need to be committed if still in the
          @@ -11371,6 +11374,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
           
             if (do_xa)
             {
          + if (ha_recover(&xids))
          + goto err2;
               if (binlog_checkpoint_found)
               {
           #ifndef HAVE_REPLICATION
          {code}
          *Error log before and after this fix*
           Before the fix, the error was in the second call of ha_recover():
          {code:java}
          2021-12-21 20:07:45 0 [Note] InnoDB: 10.6.5 started; log sequence number 15159225; transaction id 126557
          2021-12-21 20:07:45 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 20:07:45 0 [Note] Plugin 'FEEDBACK' is disabled.
          2021-12-21 20:07:45 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 20:07:45 0 [Note] Starting table crash recovery...
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [Note] Crash table recovery finished.
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-21 20:07:45 0 [ERROR] Aborting{code}
          After the fix, ha_recover() was still called twice(verified by adding some private log) but no error/note logs of the second call:
          {code:java}
          2021-12-21 1:00:20 0 [Note] InnoDB: 10.6.5 started; log sequence number 15158841; transaction id 126557
          2021-12-21 1:00:20 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 1:00:20 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 1:00:20 0 [Note] Starting table crash recovery...
          2021-12-21 1:00:20 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 1:00:20 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 1:00:20 0 [Note] Crash table recovery finished.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '0.0.0.0'.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '::'.
          2021-12-21 1:00:20 0 [Note] InnoDB: Buffer pool(s) load completed at 211221 1:00:20
          2021-12-21 1:00:20 0 [Note] /rdsdbbin/mysql/bin/mysqld: ready for connections.
          {code}

          Could you please investigate whether we should update the function or other conditions need to be considered?
          h3. Issue description:
          MariaDB is designed to be crash-safe and recent versions of MariaDB (tested with 10.5.12) are perfectly capable of starting and recovering if the /var/lib/mysql data directory is based on a filesystem level snapshot and there is a transaction in progress. However in 10.6.5 this is no longer the case and mariadbd fails to start and recover when the data directory is from a filesystem snapshot. Based on my investigation is seems to be a regression introduced in [6c39eaeb1|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981].

          *MariaDB 10.6.5 failed to recover the database from a data directory snapshot which was taken while following transaction is running:*
          {code:java}
          create database testdb;
          use testdb;
          CREATE TABLE `t1` (`col1` int(11) NOT NULL,`col2` int(11) DEFAULT NULL,PRIMARY KEY (`col1`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
          DELIMITER $$
          CREATE PROCEDURE generateredo(count int)
          BEGIN
          DECLARE counter INT DEFAULT 1;
          WHILE counter <= count DO
          insert into t1 values(counter,counter);
          update t1 set col2=counter+1 where col1=counter;
          delete from t1 where col1=counter;
          SET counter = counter + 1;
          END WHILE;
          END$$
          DELIMITER ;

          call testdb.generateredo(10000000);
          {code}
          The mysql error log when recovering:
          {code:java}
          2021-12-20 18:55:50 0 [Note] RocksDB: 2 column families found
          2021-12-20 18:55:50 0 [Note] RocksDB: Column Families at start:
          2021-12-20 18:55:50 0 [Note] cf=default
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] cf=__system__
          2021-12-20 18:55:50 0 [Note] write_buffer_size=67108864
          2021-12-20 18:55:50 0 [Note] target_file_size_base=67108864
          2021-12-20 18:55:50 0 [Note] RocksDB: Table_store: loaded DDL data for 0 tables
          2021-12-20 18:55:50 0 [Note] RocksDB: global statistics using get_sched_indexer_t indexer
          2021-12-20 18:55:50 0 [Note] MyRocks storage engine plugin has been successfully initialized.
          2021-12-20 18:55:50 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
          2021-12-20 18:55:50 0 [Note] InnoDB: Number of pools: 1
          2021-12-20 18:55:50 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
          2021-12-20 18:55:50 0 [Note] InnoDB: Using Linux native AIO
          2021-12-20 18:55:50 0 [Note] InnoDB: Initializing buffer pool, total size = 49392123904, chunk size = 134217728
          2021-12-20 18:55:50 0 [Note] InnoDB: Completed initialization of buffer pool
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 was in the XA prepared state.
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 0 row operations to undo
          2021-12-20 18:55:50 0 [Note] InnoDB: Trx id counter is 210266
          2021-12-20 18:55:50 0 [Note] InnoDB: 128 rollback segments are active.
          2021-12-20 18:55:50 0 [Note] InnoDB: Creating shared tablespace for temporary tables
          2021-12-20 18:55:50 0 [Note] InnoDB: Setting file '/rdsdbdata/db/innodb/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
          2021-12-20 18:55:50 0 [Note] InnoDB: File '/rdsdbdata/db/innodb/ibtmp1' size is now 12 MB.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
          2021-12-20 18:55:50 0 [Note] InnoDB: Rollback of non-prepared transactions completed
          2021-12-20 18:55:50 0 [Note] InnoDB: 10.6.5 started; log sequence number 25054305; transaction id 210267
          2021-12-20 18:55:50 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-20 18:55:50 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-20 18:55:50 0 [Note] Starting table crash recovery...
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [Note] Crash table recovery finished.
          2021-12-20 18:55:50 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction 210265 in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-20 18:55:50 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-20 18:55:50 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-20 18:55:50 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-20 18:55:50 0 [ERROR] Aborting
          {code}
           

          *The issue is regarded as a regression because the same data files are able to recover with MariaDB 10.5.12 binaries.*
          h3. Some investigation:

          Based on the implementation of function {+}{*}init_server_components{*}(){+}, during the recovery, [+ha_recover()+|https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2703] function is called *twice*
           * Stack trace of the {*}first call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5292]
          {code:java}
          sql/handler.cc:2706(ha_recover(st_hash*, st_mem_root*))[0x5599477cf057]
          sql/log.cc:11169(MYSQL_BIN_LOG::recover(st_log_info*, char const*, st_io_cache*, Format_description_log_event*, bool))[0x5599478a80dd]
          sql/log.cc:11502(MYSQL_BIN_LOG::do_binlog_recovery(char const*, bool))[0x5599478a8931]
          sql/log.cc:10151(MYSQL_BIN_LOG::open(char const*))[0x5599478aceda]
          sql/mysqld.cc:5289(init_server_components())[0x5599475098ce]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55994750ed55]
          ??:0(__libc_start_main)[0x14f07bff20ba]
          {code}

           * Stack trace of the {*}second call{*}: [https://github.com/MariaDB/server/blob/10.6/sql/mysqld.cc#L5298]
          {code:java}
          sql/handler.cc:2710(ha_recover(st_hash*, st_mem_root*))[0x55e57611247b]
          sql/mysqld.cc:5295(init_server_components())[0x55e575e4c8df]
          sql/mysqld.cc:5655(mysqld_main(int, char**))[0x55e575e51d55]
          ??:0(__libc_start_main)[0x1542b650e0ba]
          ??:0(_start)[0x55e575e4764a]
          {code}

          *However, in [the commit|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981] the +ha_recover()+ function call was moved from the [end of function|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8L10686-L10687] +TC_LOG_BINLOG::recover+ to [the beginning|https://github.com/MariaDB/server/commit/6c39eaeb126328e7813b146ecf652d51e4508981#diff-484732279f81ad1700aa2f7f19305fffa88d8891ad52af5b103a2a9ee9f766b8R11174-R11177].*
           * Before the commit, in the second call of {+}ha_recover(){+}, no actual recover was done because +trx_sys.rw_trx_hash.iterate_no_dups()+ iterate doesn't have any items - [https://github.com/MariaDB/server/blob/10.6/storage/innobase/trx/trx0trx.cc#L1980] .
           * However after this commit, the second call of +ha_recover()+ also tries to do some recovery which eventually caused the error of:
           [https://github.com/MariaDB/server/blob/10.6/sql/handler.cc#L2746-L2756]
          {code:java}
          [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          {code}

          h3. Tried a quick fix

          *I tried a simple fix of moving the +ha_recover()+ back to the end of +recover()+ then the issue was fixed:*
          {code:java}
          diff --git a/sql/log.cc b/sql/log.cc
          index 41eed80ef66..ddda2e5d330 100644
          --- a/sql/log.cc
          +++ b/sql/log.cc
          @@ -11166,8 +11166,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
             fdle->flags&= ~LOG_EVENT_BINLOG_IN_USE_F; // abort on the first error
           
             /* finds xids when root is not NULL */
          - if (do_xa && ha_recover(&xids, &mem_root))
          - goto err1;
          + //if (do_xa && ha_recover(&xids, &mem_root))
          + // goto err1;
           
             /*
               Scan the binlog for XIDs that need to be committed if still in the
          @@ -11371,6 +11374,8 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
           
             if (do_xa)
             {
          + if (ha_recover(&xids))
          + goto err2;
               if (binlog_checkpoint_found)
               {
           #ifndef HAVE_REPLICATION
          {code}
          *Error log before and after this fix*
           Before the fix, the error was in the second call of ha_recover():
          {code:java}
          2021-12-21 20:07:45 0 [Note] InnoDB: 10.6.5 started; log sequence number 15159225; transaction id 126557
          2021-12-21 20:07:45 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 20:07:45 0 [Note] Plugin 'FEEDBACK' is disabled.
          2021-12-21 20:07:45 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 20:07:45 0 [Note] Starting table crash recovery...
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [Note] Crash table recovery finished.
          2021-12-21 20:07:45 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 20:07:45 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 20:07:45 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 20:07:45 0 [ERROR] Found 1 prepared transactions! It means that server was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start server with --tc-heuristic-recover switch to commit or rollback pending transactions.
          2021-12-21 20:07:45 0 [ERROR] Aborting{code}
          After the fix, ha_recover() was still called twice(verified by adding some private log) but no error/note logs of the second call:
          {code:java}
          2021-12-21 1:00:20 0 [Note] InnoDB: 10.6.5 started; log sequence number 15158841; transaction id 126557
          2021-12-21 1:00:20 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
          2021-12-21 1:00:20 0 [Note] Recovering after a crash using /rdsdbdata/log/binlog/mysql-bin-changelog
          2021-12-21 1:00:20 0 [Note] Starting table crash recovery...
          2021-12-21 1:00:20 0 [Note] InnoDB: Starting recovery for XA transactions...
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction 126555 in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] InnoDB: Transaction contains changes to 1 rows
          2021-12-21 1:00:20 0 [Note] InnoDB: 1 transactions in prepared state after recovery
          2021-12-21 1:00:20 0 [Note] Found 1 prepared transaction(s) in InnoDB
          2021-12-21 1:00:20 0 [Note] Crash table recovery finished.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '0.0.0.0'.
          2021-12-21 1:00:20 0 [Note] Server socket created on IP: '::'.
          2021-12-21 1:00:20 0 [Note] InnoDB: Buffer pool(s) load completed at 211221 1:00:20
          2021-12-21 1:00:20 0 [Note] /rdsdbbin/mysql/bin/mysqld: ready for connections.
          {code}

          Could you please investigate whether we should update the function or other conditions need to be considered?
          serg Sergei Golubchik made changes -
          Priority Minor [ 4 ] Major [ 3 ]
          serg Sergei Golubchik made changes -
          Fix Version/s 10.6 [ 24028 ]
          serg Sergei Golubchik made changes -
          Assignee Andrei Elkin [ elkin ]
          Elkin Andrei Elkin made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Elkin Andrei Elkin made changes -
          Fix Version/s 10.6.8 [ 27506 ]
          Fix Version/s 10.7.4 [ 27504 ]
          Fix Version/s 10.8.3 [ 27502 ]
          Fix Version/s 10.6 [ 24028 ]
          Resolution Fixed [ 1 ]
          Status In Progress [ 3 ] Closed [ 6 ]

          People

            Elkin Andrei Elkin
            wenhug Hugo Wen
            Votes:
            2 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.