MariaDB Server / MDEV-29996

Duplicated call to buf_page_t::set_ibuf_exist() on recovery

Details

    Description

      With the attached data directory, MariaDB 10.6 (and presumably 10.7) fails to recover:

      mariadbd --innodb-page-size=4k --innodb-log-file-size=100663296 --innodb-fast-shutdown=0 --datadir /dev/shm/fbackup/data
      

      10.6 fef9d6ef1db9a4648a54954c38ea4fbab2a6542c

      2022-11-10 14:13:40 0 [Note] InnoDB: Initializing buffer pool, total size = 10737418240, chunk size = 134217728
      2022-11-10 14:13:41 0 [Note] InnoDB: Completed initialization of buffer pool
      2022-11-10 14:13:41 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed
      2022-11-10 14:13:41 0 [Note] InnoDB: Opened 3 undo tablespaces
      2022-11-10 14:13:41 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
      2022-11-10 14:13:41 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=93468483,97150807
      2022-11-10 14:13:41 0 [Note] InnoDB: Opened 3 undo tablespaces
      2022-11-10 14:13:41 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
      2022-11-10 14:13:41 0 [Note] InnoDB: Starting final batch to recover 4771 pages from redo log.
      mariadbd: /mariadb/10.6m/storage/innobase/include/buf0buf.h:715: void buf_page_t::set_ibuf_exist(): Assertion `s < IBUF_EXIST || s >= REINIT' failed.
      

      I did not check what would happen in a non-debug build, but I think a hang is quite possible.

      I did check that 10.5 (a few changes ahead of MariaDB Server 10.5.18) recovers this data directory just fine.

      The following fixes it in 10.6:

      diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
      index b47b8d30c2a..546541e1082 100644
      --- a/storage/innobase/log/log0recv.cc
      +++ b/storage/innobase/log/log0recv.cc
      @@ -1137,6 +1137,7 @@ class mlog_init_t
       					continue;
       				}
       				mysql_mutex_unlock(&recv_sys.mutex);
      +				ut_ad(!block->page.is_ibuf_exist());
       				if (ibuf_page_exists(block->page.id(),
       						     block->zip_size())) {
       					block->page.set_ibuf_exist();
      @@ -1148,6 +1149,7 @@ class mlog_init_t
       		}
       
       		mtr.commit();
      +		clear();
       	}
       
       	/** Clear the data structure */
      

      It turns out that multiple threads can invoke recv_sys.apply(true) nearly simultaneously, causing mlog_init to be applied several times.

      mlog_init was already added in MDEV-12699, and the ibuf_page_exists() call was added in MDEV-19514. Applying mlog_init was ‘idempotent’ until the block descriptor data structure was refactored in MDEV-27058 in MariaDB Server 10.6.6.
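
      The failure mode described above can be illustrated with a minimal C++ sketch. It is not InnoDB code: Page, InitMap, mark_ibuf_exist(), and apply_twice() are hypothetical names, the repeated application is modeled as two sequential calls rather than as concurrent threads, and an exception stands in for the debug assertion. It only shows why a state transition that rejects a second call requires the pending set to be cleared once it has been applied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>

// Hypothetical stand-in for a buffer page descriptor; not InnoDB's buf_page_t.
struct Page {
  enum State { UNFIXED, IBUF_EXIST };
  State state = UNFIXED;

  // Models a non-idempotent transition: a second call is an error.
  // The real code uses a debug assertion; this sketch throws instead.
  void set_ibuf_exist() {
    if (state == IBUF_EXIST)
      throw std::logic_error("set_ibuf_exist() called twice");
    state = IBUF_EXIST;
  }
};

// Hypothetical stand-in for the bookkeeping of initialized pages.
class InitMap {
  std::map<std::uint32_t, Page*> inits;

public:
  void add(std::uint32_t page_no, Page* page) { inits[page_no] = page; }

  // Apply the pending transitions. With clear_after set, the entries are
  // forgotten afterwards, so a repeated call finds nothing left to apply.
  void mark_ibuf_exist(bool clear_after) {
    for (auto& entry : inits)
      entry.second->set_ibuf_exist();
    if (clear_after)
      inits.clear();
  }

  std::size_t pending() const { return inits.size(); }
};

// Returns true if applying the same map twice completes without error.
bool apply_twice(bool clear_after) {
  Page page;
  InitMap map;
  map.add(1, &page);
  assert(map.pending() == 1);
  try {
    map.mark_ibuf_exist(clear_after);
    map.mark_ibuf_exist(clear_after);  // models the duplicated application
  } catch (const std::logic_error&) {
    return false;
  }
  return true;
}
```

      In this sketch, apply_twice(false) hits the error on the second pass, while apply_twice(true) succeeds because the second pass iterates over an empty map. This mirrors the intent of the clear() call in the patch, under the assumptions stated above.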

      Note: The InnoDB change buffer was disabled by default in MDEV-27734.

          Activity

            Marko Mäkelä added a comment: I realized that the debug assertion in my patch above duplicates the failing assertion inside buf_page_t::set_ibuf_exist(). Only the call to clear() is needed.

            People

              marko Marko Mäkelä