MariaDB Server / MDEV-16125

Shutdown crash when innodb_force_recovery >= 2

Details

    Description

      During shutdown, the InnoDB master thread can observe the shutdown state as SRV_SHUTDOWN_FLUSH_PHASE or SRV_SHUTDOWN_LAST_PHASE.
      srv_active_wake_master_thread_low() is called during ha_innobase::close() and wakes the master thread.

      If the master thread is scheduled or woken only after the shutdown state has been set to SRV_SHUTDOWN_FLUSH_PHASE, the server crashes during shutdown.
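
The ordering problem can be illustrated with a small toy model. This is only a sketch and not InnoDB source: the enumerator names echo the shutdown states named in the description, while `Event`, `debug_assert_would_fire` and the notion of a "pending wake-up" are hypothetical names introduced for illustration. It assumes that the debug assertion is reached whenever the woken master thread observes the flush phase or the last phase.

```cpp
#include <cassert>
#include <vector>

// Toy model only. State names echo those in the description above;
// everything else here is hypothetical and not taken from InnoDB.
enum class ShutdownState { NONE, CLEANUP, FLUSH_PHASE, LAST_PHASE, EXIT_THREADS };

// Steps of one possible interleaving of the shutdown and master threads.
enum class Event {
    WAKE_MASTER,        // a wake-up is requested, e.g. while closing a table
    ENTER_FLUSH_PHASE,  // the shutdown thread advances the state
    ENTER_LAST_PHASE,   // the shutdown thread advances the state again
    MASTER_RUNS         // the master thread is scheduled and reads the state
};

// Returns true if, in the given schedule, the master thread consumes a
// pending wake-up while the state is FLUSH_PHASE or LAST_PHASE, which is
// the situation assumed here to trip the debug assertion.
bool debug_assert_would_fire(const std::vector<Event>& schedule) {
    ShutdownState state = ShutdownState::NONE;
    bool wake_pending = false;
    for (Event e : schedule) {
        switch (e) {
        case Event::WAKE_MASTER:
            wake_pending = true;
            break;
        case Event::ENTER_FLUSH_PHASE:
            state = ShutdownState::FLUSH_PHASE;
            break;
        case Event::ENTER_LAST_PHASE:
            state = ShutdownState::LAST_PHASE;
            break;
        case Event::MASTER_RUNS:
            if (wake_pending) {
                wake_pending = false;
                if (state == ShutdownState::FLUSH_PHASE
                    || state == ShutdownState::LAST_PHASE) {
                    return true;
                }
            }
            break;
        }
    }
    return false;
}
```

In this model, the schedule `WAKE_MASTER, MASTER_RUNS, ENTER_FLUSH_PHASE` is harmless, whereas `WAKE_MASTER, ENTER_FLUSH_PHASE, MASTER_RUNS` is the problematic late wake-up.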

          Activity

            Test case to repeat the issue:

            diff --git a/storage/innobase/log/log0log.cc b/storage/innobase/log/log0log.cc
            index 47c4f25..604d0b1 100644
            --- a/storage/innobase/log/log0log.cc
            +++ b/storage/innobase/log/log0log.cc
            @@ -2011,6 +2011,8 @@ logs_empty_and_mark_files_at_shutdown(void)
                    here to let it complete the flushing of the buffer pools
                    before proceeding further. */
             
            +       DBUG_EXECUTE_IF("delay_master_thread",
            +                       os_thread_sleep(1000000););
                    count = 0;
                    service_manager_extend_timeout(COUNT_INTERVAL * CHECK_INTERVAL/1000000 * 2,
                            "Waiting for page cleaner");
            diff --git a/storage/innobase/srv/srv0srv.cc b/storage/innobase/srv/srv0srv.cc
            index 2ad5064..4dad0d6 100644
            --- a/storage/innobase/srv/srv0srv.cc
            +++ b/storage/innobase/srv/srv0srv.cc
            @@ -930,6 +930,9 @@ srv_resume_thread(srv_slot_t* slot, int64_t sig_count = 0, bool wait = true,
                            os_event_wait_low(slot->event, sig_count);
                    }
             
            +       DBUG_EXECUTE_IF("delay_master_thread",
            +                       os_thread_sleep(1000000););
            +
                    srv_sys_mutex_enter();
                    ut_ad(slot->in_use);
                    ut_ad(slot->suspended);
            
            

            Apply the above diff and run the following test case:

            --source include/have_innodb.inc
            --source include/have_debug.inc
             
            --let $restart_parameters= --innodb-force-recovery=3
            --source include/restart_mysqld.inc
            CREATE TABLE t2(f1 INT NOT NULL)ENGINE=InnoDB;
            SHOW CREATE TABLE t2;
             
            SET global debug_dbug='+d,delay_master_thread';
            --source include/restart_mysqld.inc
            

            thiru Thirunarayanan Balathandayuthapani added a comment

            mdev-16125-10.2v1.patch

            The above patch solves the issue.

            thiru Thirunarayanan Balathandayuthapani added a comment

            I think that we can go for a minimal patch of removing the ut_ad(0) statement that causes the crash in a debug build, and adding a comment.

            If innodb_force_recovery>=2, srv_master_thread() does no useful work, so it might as well exit immediately. In fact, we could avoid creating the thread altogether in this case (in a separate fix).

            marko Marko Mäkelä added a comment
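
The early-exit idea from the last comment can be sketched as follows. This is a hedged illustration, not the attached patch and not InnoDB code: `kNoBackgroundThreshold`, `MasterDecision`, `master_thread_decision` and `should_create_master_thread` are hypothetical names, and the threshold of 2 is taken only from the issue title.

```cpp
#include <cassert>

// Hypothetical constant: the innodb_force_recovery level from the issue
// title at which the master thread is said to do no useful work.
constexpr unsigned kNoBackgroundThreshold = 2;

// Hypothetical outcome of the check at the start of the master thread.
enum class MasterDecision { EXIT_IMMEDIATELY, RUN_BACKGROUND_TASKS };

// Sketch of the "exit immediately" variant: with a high enough recovery
// level the thread never enters its work loop, so a late wake-up during
// shutdown has nothing to observe.
MasterDecision master_thread_decision(unsigned force_recovery) {
    return force_recovery >= kNoBackgroundThreshold
        ? MasterDecision::EXIT_IMMEDIATELY
        : MasterDecision::RUN_BACKGROUND_TASKS;
}

// Sketch of the "separate fix" variant: do not create the thread at all.
bool should_create_master_thread(unsigned force_recovery) {
    return force_recovery < kNoBackgroundThreshold;
}
```

With `--innodb-force-recovery=3`, as in the test case, both helpers in this sketch take the no-background path.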

            People

              thiru Thirunarayanan Balathandayuthapani
              thiru Thirunarayanan Balathandayuthapani