MDEV-33023

Crash in mariadb-backup --prepare --export after --prepare

Details

    Description

      The following change to an existing regression test causes a crash:

      diff --git a/mysql-test/suite/mariabackup/partial.test b/mysql-test/suite/mariabackup/partial.test
      index d0d07daf2ea..035024aac3d 100644
      --- a/mysql-test/suite/mariabackup/partial.test
      +++ b/mysql-test/suite/mariabackup/partial.test
      @@ -41,6 +41,7 @@ EOF
       
       echo # xtrabackup prepare;
       --disable_result_log
      +exec $XTRABACKUP --defaults-file=$server_cnf --defaults-group-suffix=.1 --prepare --target-dir=$targetdir;
       exec $XTRABACKUP --defaults-file=$server_cnf --defaults-group-suffix=.1 --prepare --export --target-dir=$targetdir;
       --enable_result_log
       
      

      10.5 e472b682e068319e2a27373903cd46fb93093286

      mariabackup: /mariadb/10.5/storage/innobase/fil/fil0fil.cc:441: bool fil_node_open_file(fil_node_t*): Assertion `fil_is_user_tablespace_id(node->space->id) || srv_operation == SRV_OPERATION_BACKUP || srv_operation == SRV_OPERATION_RESTORE || srv_operation == SRV_OPERATION_RESTORE_DELTA' failed.
      

      10.6 736a54f49c72d89fb82ef4165e96cddb506cf555

      mariabackup: /mariadb/10.6/storage/innobase/fil/fil0fil.cc:426: bool fil_node_open_file(fil_node_t*): Assertion `fil_is_user_tablespace_id(node->space->id) || srv_operation == SRV_OPERATION_BACKUP || srv_operation == SRV_OPERATION_RESTORE || srv_operation == SRV_OPERATION_RESTORE_DELTA' failed.
      

      The 10.4 branch does not crash.

        Activity

          Running mariadb-backup with the --prepare option can leave an empty
          redo log file. When that --prepare is followed by --prepare --export,
          srv_start() exits early without opening the ibdata1 tablespace. Later,
          while trying to read the rollback segment header page, we hit the
          debug assertion, which expects the system tablespace to have been
          opened already.

          The assertion is already exempted for the backup and restore
          operations (SRV_OPERATION_BACKUP, SRV_OPERATION_RESTORE and
          SRV_OPERATION_RESTORE_DELTA). It seems appropriate to add
          SRV_OPERATION_RESTORE_EXPORT to the assertion condition.

          https://github.com/MariaDB/server/pull/3009

          Comment added by Debarun Banerjee (debarun).
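The effect of the proposed change to the assertion condition can be sketched as a small standalone model. This is only an illustration, not the MariaDB source: the enum lists just the modes needed here, and is_user_tablespace_id(), open_check_before(), open_check_after() and example_assert_condition() are hypothetical names introduced for the sketch.

```cpp
// Simplified model of the debug check in fil_node_open_file().
// Assumptions: the enum lists only the modes needed for the illustration,
// and is_user_tablespace_id() treats every id except 0 (the system
// tablespace) as a user tablespace; the real fil_is_user_tablespace_id()
// may exclude further special tablespaces.
#include <cassert>
#include <cstdint>

enum srv_operation_mode {
  SRV_OPERATION_NORMAL,
  SRV_OPERATION_BACKUP,
  SRV_OPERATION_RESTORE,
  SRV_OPERATION_RESTORE_DELTA,
  SRV_OPERATION_RESTORE_EXPORT
};

// Hypothetical stand-in for fil_is_user_tablespace_id().
constexpr bool is_user_tablespace_id(std::uint32_t id) { return id != 0; }

// Condition as quoted in the assertion failure message.
constexpr bool open_check_before(std::uint32_t id, srv_operation_mode op) {
  return is_user_tablespace_id(id) ||
         op == SRV_OPERATION_BACKUP ||
         op == SRV_OPERATION_RESTORE ||
         op == SRV_OPERATION_RESTORE_DELTA;
}

// Proposed condition: additionally tolerate --prepare --export.
constexpr bool open_check_after(std::uint32_t id, srv_operation_mode op) {
  return open_check_before(id, op) || op == SRV_OPERATION_RESTORE_EXPORT;
}

// Usage: the system tablespace (id 0) being opened during --prepare --export.
inline void example_assert_condition() {
  // The old condition is false here, so the debug assertion would fire.
  assert(!open_check_before(0, SRV_OPERATION_RESTORE_EXPORT));
  // The relaxed condition holds for the same call.
  assert(open_check_after(0, SRV_OPERATION_RESTORE_EXPORT));
}
```

In this model, user tablespaces pass either version of the check, and a normal server start opening the system tablespace this way would still be flagged.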

          The previous fix was sufficient for 10.6, but 10.5 has a further issue.
          There are two assertion cases here.

          Issue-1: The system tablespace object is missing from the fil space hash, i.e. srv_sys_space.open_or_create() has not been called.
          Fix: Open the system tablespace before checking the redo log. This needs fixing only in 10.5; 10.6 already does it.

          Issue-2: The system tablespace data file ibdata1 is not opened, i.e. fil_system.sys_space->open() has not been called.
          Fix: The assertion is already exempted for the backup and restore operations. It seems appropriate to add SRV_OPERATION_RESTORE_EXPORT to the assertion condition.

          Updated https://github.com/MariaDB/server/pull/3009

          Comment added by Debarun Banerjee (debarun).
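The difference between the two issues, and why reordering the startup steps addresses only the first, can be illustrated with a toy model. This is one reading of the comment above, not the actual srv_start() logic: StartupState, start_old_order(), start_new_order() and example_startup_order() are hypothetical names, and the two flags merely stand in for srv_sys_space.open_or_create() and fil_system.sys_space->open().

```cpp
// Toy model of the startup ordering discussed for 10.5.
// All names are hypothetical; the real code paths are more involved.
#include <cassert>

struct StartupState {
  bool sys_space_registered = false;  // Issue-1: object present in the fil space hash
  bool sys_file_opened = false;       // Issue-2: ibdata1 data file actually opened
  bool exited_early = false;          // startup returned at the redo log check
};

// Assumed old 10.5 order: the redo log is checked first, so an empty log
// leaves the system tablespace neither registered nor opened.
inline StartupState start_old_order(bool redo_log_empty) {
  StartupState s;
  if (redo_log_empty) {
    s.exited_early = true;
    return s;
  }
  s.sys_space_registered = true;
  s.sys_file_opened = true;
  return s;
}

// Assumed fixed order: register the system tablespace before the redo log
// check. On early exit the file is still not opened eagerly, which is why
// the assertion condition also needs SRV_OPERATION_RESTORE_EXPORT.
inline StartupState start_new_order(bool redo_log_empty) {
  StartupState s;
  s.sys_space_registered = true;
  if (redo_log_empty) {
    s.exited_early = true;
    return s;
  }
  s.sys_file_opened = true;
  return s;
}

// Usage: compare both orders for an empty redo log.
inline void example_startup_order() {
  assert(!start_old_order(true).sys_space_registered);  // Issue-1 present
  assert(start_new_order(true).sys_space_registered);   // Issue-1 addressed
  assert(!start_new_order(true).sys_file_opened);       // Issue-2 remains
}
```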

          commit 7c170595d38938de516bfa0a9b7266a3a0edf001 (HEAD -> 10.5-MDEV-33023, origin/10.5-MDEV-33023)
          Author: mariadb-DebarunBanerjee <debarun.banerjee@mariadb.com>
          Date: Tue Jan 23 11:51:39 2024 +0530

          MDEV-33023 Crash in mariadb-backup --prepare --export after --prepare

          Address review comments from marko. Modified server_start to open
          system and undo tablespaces before checking redo log.

          Updated https://github.com/MariaDB/server/pull/3009

          The only possible impact would be on any case where mariabackup works without the system tablespace ibdata1.

          Comment added by Debarun Banerjee (debarun).

          The problem is caused by the fact that no ib_logfile0 would exist, and therefore the system and undo tablespaces would not be opened. In MDEV-14425 and MDEV-27199 the logic had been changed in such a way that an ib_logfile0 must always exist. I tested that there is no crash (but a clear error message) in 10.11, when the ib_logfile0 is missing or empty.

          Comment added by Marko Mäkelä (marko).
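The "missing or empty ib_logfile0" condition described above can be sketched as a simple standalone check. This is only an illustration of the distinction; it is not how InnoDB validates its redo log, and RedoLogStatus, check_redo_log() and example_redo_log_check() are hypothetical names.

```cpp
// Minimal sketch: classify a redo log file as present, missing, or empty.
// Assumption: "empty" simply means a file size of zero bytes.
#include <cassert>
#include <fstream>
#include <string>

enum class RedoLogStatus { Ok, Missing, Empty };

// Hypothetical helper; `path` is the full path to ib_logfile0.
inline RedoLogStatus check_redo_log(const std::string& path) {
  // Open at the end of the file so tellg() reports its size.
  std::ifstream f(path, std::ios::binary | std::ios::ate);
  if (!f.is_open()) {
    return RedoLogStatus::Missing;
  }
  const std::streamoff size = static_cast<std::streamoff>(f.tellg());
  return size > 0 ? RedoLogStatus::Ok : RedoLogStatus::Empty;
}

// Usage: a path that does not exist is reported as Missing.
inline void example_redo_log_check() {
  assert(check_redo_log("mdev33023_no_such_dir/ib_logfile0") ==
         RedoLogStatus::Missing);
}
```

A caller could map Missing and Empty to a clear error message instead of continuing with tablespaces that were never opened.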

          The patch has already been reviewed on GitHub. Please mark the review complete.

          Comment added by Debarun Banerjee (debarun).

          Pushed to 10.5.

          commit fb9da7f7518bc310aff4eac215e046d63acded62
          Author: mariadb-DebarunBanerjee <debarun.banerjee@mariadb.com>
          Date: Tue Jan 23 20:43:39 2024 +0530

          MDEV-33023 Crash in mariadb-backup --prepare --export after --prepare

          Comment added by Debarun Banerjee (debarun).

          The scenario for this bug hits a debug assertion and was reported on a debug build.

          storage/innobase/fil/fil0fil.cc (line 426 in 10.6)

          ut_ad(fil_is_user_tablespace_id(node->space->id) ||
                srv_operation == SRV_OPERATION_BACKUP ||
                srv_operation == SRV_OPERATION_RESTORE ||
                srv_operation == SRV_OPERATION_RESTORE_DELTA);
          

          Comment added by Debarun Banerjee (debarun).

          People

            debarun Debarun Banerjee
            marko Marko Mäkelä