Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-342

XA Crash recovery fails if server crashes in the middle of binlog rotate

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 5.5.24, 5.3.7
    • 5.5.25
    • None
    • None

    Description

      When we rotate the binlog, we first close the old log file, which marks the
      file as "not crashed". We then create the new log file, and last add it to the
      binlog index file.

      Suppose we crash after closing the old file but before the new file is added
      to the index. And suppose further than at least one transaction had time to
      enter the "prepared inside InnoDB" status before we crashed.

      Then on next server restart, the binlog code will not realise that we crashed,
      as the last file in the binlog index is closed properly. And ha_recover in
      handler.cc will fail the startup due to finding a prepared innodb transaction
      in a no-crash scenario:

      120615 14:41:02 [ERROR] Found 1 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start mysqld with --tc-heuristic-recover switch to commit or rollback pending transactions.

      To reproduce, apply following patch (to get needed debug_sync points) and run
      attached test case.

      === modified file 'sql/handler.cc'
      --- sql/handler.cc	2012-05-21 18:54:41 +0000
      +++ sql/handler.cc	2012-06-15 12:37:51 +0000
      @@ -1276,6 +1276,7 @@ int ha_commit_trans(THD *thd, bool all)
           need_prepare_ordered|= (ht->prepare_ordered != NULL);
           need_commit_ordered|= (ht->commit_ordered != NULL);
         }
      +  DEBUG_SYNC(thd, "ha_commit_trans_after_prepare");
         DBUG_EXECUTE_IF("crash_commit_after_prepare", DBUG_SUICIDE(););
       
         if (!is_real_trans)
       
      === modified file 'sql/log.cc'
      --- sql/log.cc	2012-05-21 18:54:41 +0000
      +++ sql/log.cc	2012-06-15 12:39:45 +0000
      @@ -3189,6 +3189,10 @@ bool MYSQL_BIN_LOG::open(const char *log
           if (write_file_name_to_index_file)
           {
       #ifdef HAVE_REPLICATION
      +#ifdef ENABLED_DEBUG_SYNC
      +      if (current_thd)
      +        DEBUG_SYNC(current_thd, "binlog_open_before_update_index");
      +#endif
             DBUG_EXECUTE_IF("crash_create_critical_before_update_index", DBUG_SUICIDE(););
       #endif

      Attachments

        Activity

          knielsen Kristian Nielsen created issue -
          serg Sergei Golubchik made changes -
          Field Original Value New Value
          Description When we rotate the binlog, we first close the old log file, which marks the
          file as "not crashed". We then create the new log file, and last add it to the
          binlog index file.

          Suppose we crash after closing the old file but before the new file is added
          to the index. And suppose further than at least one transaction had time to
          enter the "prepared inside InnoDB" status before we crashed.

          Then on next server restart, the binlog code will not realise that we crashed,
          as the last file in the binlog index is closed properly. And ha_recover in
          handler.cc will fail the startup due to finding a prepared innodb transaction
          in a no-crash scenario:

          120615 14:41:02 [ERROR] Found 1 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start mysqld with --tc-heuristic-recover switch to commit or rollback pending transactions.

          To reproduce, apply following patch (to get needed debug_sync points) and run
          attached test case.

          === modified file 'sql/handler.cc'
          --- sql/handler.cc 2012-05-21 18:54:41 +0000
          +++ sql/handler.cc 2012-06-15 12:37:51 +0000
          @@ -1276,6 +1276,7 @@ int ha_commit_trans(THD *thd, bool all)
               need_prepare_ordered|= (ht->prepare_ordered != NULL);
               need_commit_ordered|= (ht->commit_ordered != NULL);
             }
          + DEBUG_SYNC(thd, "ha_commit_trans_after_prepare");
             DBUG_EXECUTE_IF("crash_commit_after_prepare", DBUG_SUICIDE(););
           
             if (!is_real_trans)

          === modified file 'sql/log.cc'
          --- sql/log.cc 2012-05-21 18:54:41 +0000
          +++ sql/log.cc 2012-06-15 12:39:45 +0000
          @@ -3189,6 +3189,10 @@ bool MYSQL_BIN_LOG::open(const char *log
               if (write_file_name_to_index_file)
               {
           #ifdef HAVE_REPLICATION
          +#ifdef ENABLED_DEBUG_SYNC
          + if (current_thd)
          + DEBUG_SYNC(current_thd, "binlog_open_before_update_index");
          +#endif
                 DBUG_EXECUTE_IF("crash_create_critical_before_update_index", DBUG_SUICIDE(););
           #endif

          When we rotate the binlog, we first close the old log file, which marks the
          file as "not crashed". We then create the new log file, and last add it to the
          binlog index file.

          Suppose we crash after closing the old file but before the new file is added
          to the index. And suppose further than at least one transaction had time to
          enter the "prepared inside InnoDB" status before we crashed.

          Then on next server restart, the binlog code will not realise that we crashed,
          as the last file in the binlog index is closed properly. And ha_recover in
          handler.cc will fail the startup due to finding a prepared innodb transaction
          in a no-crash scenario:

          120615 14:41:02 [ERROR] Found 1 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start mysqld with --tc-heuristic-recover switch to commit or rollback pending transactions.

          To reproduce, apply following patch (to get needed debug_sync points) and run
          attached test case.
          {noformat}
          === modified file 'sql/handler.cc'
          --- sql/handler.cc 2012-05-21 18:54:41 +0000
          +++ sql/handler.cc 2012-06-15 12:37:51 +0000
          @@ -1276,6 +1276,7 @@ int ha_commit_trans(THD *thd, bool all)
               need_prepare_ordered|= (ht->prepare_ordered != NULL);
               need_commit_ordered|= (ht->commit_ordered != NULL);
             }
          + DEBUG_SYNC(thd, "ha_commit_trans_after_prepare");
             DBUG_EXECUTE_IF("crash_commit_after_prepare", DBUG_SUICIDE(););
           
             if (!is_real_trans)

          === modified file 'sql/log.cc'
          --- sql/log.cc 2012-05-21 18:54:41 +0000
          +++ sql/log.cc 2012-06-15 12:39:45 +0000
          @@ -3189,6 +3189,10 @@ bool MYSQL_BIN_LOG::open(const char *log
               if (write_file_name_to_index_file)
               {
           #ifdef HAVE_REPLICATION
          +#ifdef ENABLED_DEBUG_SYNC
          + if (current_thd)
          + DEBUG_SYNC(current_thd, "binlog_open_before_update_index");
          +#endif
                 DBUG_EXECUTE_IF("crash_create_critical_before_update_index", DBUG_SUICIDE(););
           #endif
          {noformat}
          knielsen Kristian Nielsen made changes -
          Status Open [ 1 ] In Progress [ 3 ]

          Fix committed - re-assigning for review

          knielsen Kristian Nielsen added a comment - Fix committed - re-assigning for review
          knielsen Kristian Nielsen made changes -
          Assignee Kristian Nielsen [ knielsen ] Sergei Golubchik [ serg ]
          knielsen Kristian Nielsen made changes -
          Assignee Sergei Golubchik [ serg ] Kristian Nielsen [ knielsen ]

          after-review fixes, push.
          Investigate a couple problems with the test case, fix, push corrections.

          knielsen Kristian Nielsen added a comment - after-review fixes, push. Investigate a couple problems with the test case, fix, push corrections.
          knielsen Kristian Nielsen made changes -
          Resolution Fixed [ 1 ]
          Status In Progress [ 3 ] Closed [ 6 ]
          knielsen Kristian Nielsen made changes -
          Resolution Fixed [ 1 ]
          Status Closed [ 6 ] Reopened [ 4 ]

          Fixed in 5.5.25

          knielsen Kristian Nielsen added a comment - Fixed in 5.5.25
          knielsen Kristian Nielsen made changes -
          Fix Version/s 5.5.25 [ 10901 ]
          Fix Version/s 5.5.26 [ 11100 ]
          Resolution Fixed [ 1 ]
          Status Reopened [ 4 ] Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow defaullt [ 12216 ] MariaDB v2 [ 45220 ]
          ratzpo Rasmus Johansson (Inactive) made changes -
          Workflow MariaDB v2 [ 45220 ] MariaDB v3 [ 65886 ]
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 65886 ] MariaDB v4 [ 144844 ]

          People

            knielsen Kristian Nielsen
            knielsen Kristian Nielsen
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.