MariaDB Server / MDEV-29435

CHECK TABLE forgets to release latches after reporting failure

    Description

      As part of MDEV-13542, the CHECK TABLE code was refactored to avoid crashes on corrupted data. mleich produced a core dump in which something had caused corruption and the subsequent shutdown crashed:

      10.6 92032499874259bae7455130958ea7f38c4d53a3

      # 2022-08-31T09:36:38 [1748421] | Version: '10.6.10-MariaDB-debug-log'  socket: '/dev/shm/rqg/1661956911/39/1_clone/mysql.sock'  port: 25364  Source distribution
      # 2022-08-31T09:36:38 [1748421] | 2022-08-31  9:34:54 5 [Warning] InnoDB: Cannot save statistics for table test.t8 because file ./test/t8.ibd cannot be decrypted.
      # 2022-08-31T09:36:38 [1748421] | 2022-08-31  9:34:55 5 [ERROR] InnoDB: In page 14 of index PRIMARY of table test.t8
      # 2022-08-31T09:36:38 [1748421] | InnoDB: broken FIL_PAGE_NEXT link
      # 2022-08-31T09:36:38 [1748421] | 2022-08-31  9:34:55 5 [ERROR] InnoDB: In page 42 of index k of table test.t8
      # 2022-08-31T09:36:38 [1748421] | InnoDB: broken FIL_PAGE_NEXT link
      # 2022-08-31T09:36:38 [1748421] | 2022-08-31  9:35:04 0 [Note] /data/Server_bin/bb-10.6-MDEV-29374_asan/bin/mysqld (initiated by: root[root] @ localhost [127.0.0.1]): Normal shutdown
      # 2022-08-31T09:36:38 [1748421] | 2022-08-31  9:35:04 0 [Note] InnoDB: FTS optimize thread exiting.
      # 2022-08-31T09:36:38 [1748421] | 2022-08-31  9:35:04 0 [Note] InnoDB: Starting shutdown...
      # 2022-08-31T09:36:38 [1748421] | 2022-08-31  9:35:04 0 [Note] InnoDB: Dumping buffer pool(s) to /dev/shm/rqg/1661956911/39/1_clone/data/ib_buffer_pool
      # 2022-08-31T09:36:38 [1748421] | 2022-08-31  9:35:04 0 [Note] InnoDB: Restricted to 95 pages due to innodb_buf_pool_dump_pct=25
      # 2022-08-31T09:36:38 [1748421] | 2022-08-31  9:35:04 0 [Note] InnoDB: Buffer pool(s) dump completed at 220831  9:35:04
      # 2022-08-31T09:36:38 [1748421] | mysqld: /data/Server/bb-10.6-MDEV-29374/storage/innobase/include/sux_lock.h:79: void sux_lock<ssux>::free() [with ssux = ssux_lock_impl<false>]: Assertion `!writer.load(std::memory_order_relaxed)' failed.
      

      The "broken FIL_PAGE_NEXT link" messages were reported by btr_validate_level(), which is executed as part of a non-QUICK CHECK TABLE. The crash appears to occur because we forgot to release the index latch after reporting the corruption, so the latch was still held when it was freed during shutdown:

      diff --git a/storage/innobase/btr/btr0btr.cc b/storage/innobase/btr/btr0btr.cc
      index 772ac99a5d5..3e48955e85a 100644
      --- a/storage/innobase/btr/btr0btr.cc
      +++ b/storage/innobase/btr/btr0btr.cc
      @@ -4879,6 +4879,7 @@ btr_validate_level(
       loop:
       	if (!block) {
       invalid_page:
      +		mtr.commit();
       func_exit:
       		mem_heap_free(heap);
       		return err;
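
The failure mode that the patch addresses can be illustrated with a small self-contained sketch. Everything here is a toy stand-in and an assumption for illustration: `ToyLatch`, `ToyMtr`, `validate_level()` and the `release_on_error` flag are hypothetical names, not InnoDB's `sux_lock`, `mtr_t` or `btr_validate_level()`. The sketch only shows how an error path that returns without committing leaves the latch marked as held, which is the condition the assertion in `sux_lock::free()` rejected at shutdown.

```cpp
#include <atomic>
#include <cassert>

// Toy latch (hypothetical; not InnoDB's sux_lock).
struct ToyLatch {
  std::atomic<bool> writer{false};

  void x_lock() {
    bool was_held = writer.exchange(true);
    assert(!was_held);  // the toy does not support recursive or contended locking
  }
  void x_unlock() { writer.store(false); }

  // Analogue of the shutdown check: a latch must not be held when it is freed.
  bool safe_to_free() const {
    return !writer.load(std::memory_order_relaxed);
  }
};

// Toy mini-transaction (hypothetical): remembers one latch and
// releases it only when commit() is called.
struct ToyMtr {
  ToyLatch* held = nullptr;

  void x_lock(ToyLatch& latch) {
    latch.x_lock();
    held = &latch;
  }
  void commit() {
    if (held) {
      held->x_unlock();
      held = nullptr;
    }
  }
};

enum class Err { OK, CORRUPTION };

// Hypothetical validator. release_on_error == false models the code
// before the patch; true models the code with the added mtr.commit().
Err validate_level(ToyLatch& index_lock, bool page_ok, bool release_on_error) {
  ToyMtr mtr;
  mtr.x_lock(index_lock);

  if (!page_ok) {
    // Error path: corruption is reported and the function returns early.
    if (release_on_error) {
      mtr.commit();  // corresponds to the line added by the patch
    }
    return Err::CORRUPTION;
  }

  mtr.commit();  // normal path always releases the latch
  return Err::OK;
}
```

Calling `validate_level(latch, false, false)` returns `Err::CORRUPTION` and leaves `latch.safe_to_free()` false, whereas `validate_level(latch, false, true)` returns the same error with the latch released. An RAII guard would be another way to make early returns safe in such a sketch; the patch above instead adds the explicit commit on the `invalid_page` path.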
      

      The messages about failing to save persistent statistics seem to be unrelated to this, because that code does not access dict_index_t::lock at all.

      As for the cause of the corruption itself, I think that an rr replay trace will be needed.

            People

              marko Marko Mäkelä