MariaDB Server / MDEV-18958

InnoDB buffer pool load is opening an inaccessible data file

Details

    Description

      Note: This possibly affects earlier versions as well. I have tentatively set fixVersion accordingly.

      The following test for the MDEV-18644 debug instrumentation fails to restrict access to the table after fault injection is enabled on restart:

      --source include/have_innodb.inc
      --source include/have_debug.inc
      SET GLOBAL innodb_checksum_algorithm=strict_full_crc32;
      create table t1(f1 int not null)page_compressed=1 engine=innodb;
      insert into t1 values(1);
      let $restart_parameters="--debug-dbug='+d,fil_comp_algo_validate_fail'";
      SELECT @@GLOBAL.innodb_checksum_algorithm;
      --source include/restart_mysqld.inc
      --disable_abort_on_error
      select * from t1;
      show create table t1;
      drop table t1;
      

      We do initially notice that the table is supposed to be inaccessible, but on a subsequent call we have space->size!=0, and so we skip the check.

      Wouldn’t it be better to drop the invalid tablespace metadata as soon as we notice that the file is inaccessible?

      Here is the stack trace for the first-time opening of the file:

      10.4 6b6fa3cdb16ae7b4bc9e307c7d9b9012a055548c

      #0  fil_node_t::read_page0 (this=0x5555579d83b0, first=<optimized out>)
          at /mariadb/10.4/storage/innobase/fil/fil0fil.cc:513
      #1  0x000055555621ed39 in fil_node_open_file (node=0x5555579d83b0)
          at /mariadb/10.4/storage/innobase/fil/fil0fil.cc:661
      #2  0x000055555622897c in fil_node_prepare_for_io (node=0x5555579d83b0, 
          space=0x5555579d80d0)
          at /mariadb/10.4/storage/innobase/fil/fil0fil.cc:4074
      #3  0x0000555556227cd9 in fil_io (type=..., sync=<optimized out>, 
          page_id=..., zip_size=0, byte_offset=0, len=16384, buf=0x7ffff07f4000, 
          message=0x7ffff02fc3a0, ignore_missing_space=<optimized out>)
          at /mariadb/10.4/storage/innobase/fil/fil0fil.cc:4313
      #4  0x00005555561cb77c in buf_read_page_low (err=<optimized out>, 
          sync=<optimized out>, type=<optimized out>, mode=<optimized out>, 
          page_id=..., zip_size=0, unzip=<optimized out>, 
          ignore_missing_space=<optimized out>)
          at /mariadb/10.4/storage/innobase/buf/buf0rea.cc:180
      #5  0x00005555561cc2c4 in buf_read_page_background (page_id=..., 
          zip_size=140736307369143, sync=<optimized out>)
          at /mariadb/10.4/storage/innobase/buf/buf0rea.cc:440
      #6  0x00005555561b2282 in buf_load ()
          at /mariadb/10.4/storage/innobase/buf/buf0dump.cc:719
      #7  0x00005555561b1c14 in buf_dump_thread ()
          at /mariadb/10.4/storage/innobase/buf/buf0dump.cc:825
      #8  0x00007ffff7f8cfa3 in start_thread (arg=<optimized out>)
          at pthread_create.c:486
      

      On the second attempt, we keep the file open, and so the user is also able to access the table, although it was supposed to be inaccessible:

      #0  0x000055555621ec39 in fil_node_open_file (node=0x5555579d83b0)
          at /mariadb/10.4/storage/innobase/include/os0file.ic:170
      #1  0x000055555622897c in fil_node_prepare_for_io (node=0x5555579d83b0, 
          space=0x5555579d80d0)
          at /mariadb/10.4/storage/innobase/fil/fil0fil.cc:4074
      #2  0x0000555556227cd9 in fil_io (type=..., sync=<optimized out>, 
          page_id=..., zip_size=0, byte_offset=0, len=16384, buf=0x7ffff07f4000, 
          message=0x7ffff02fc3a0, ignore_missing_space=<optimized out>)
          at /mariadb/10.4/storage/innobase/fil/fil0fil.cc:4313
      #3  0x00005555561cb77c in buf_read_page_low (err=<optimized out>, 
          sync=<optimized out>, type=<optimized out>, mode=<optimized out>, 
          page_id=..., zip_size=0, unzip=<optimized out>, 
          ignore_missing_space=<optimized out>)
          at /mariadb/10.4/storage/innobase/buf/buf0rea.cc:180
      #4  0x00005555561cc2c4 in buf_read_page_background (page_id=..., 
          zip_size=140736364446903, sync=<optimized out>)
          at /mariadb/10.4/storage/innobase/buf/buf0rea.cc:440
      #5  0x00005555561b2282 in buf_load ()
          at /mariadb/10.4/storage/innobase/buf/buf0dump.cc:719
      

      On this second access, we would skip the validity check, because node->size had been assigned on the first access.
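
The skipped check can be illustrated with a small stand-alone model. This is only a sketch, not the InnoDB code: ToyNode, toy_open_file and attempts_until_open are hypothetical names, and the assumption that the size is recorded before the validation fails is taken from the description above, not verified against fil0fil.cc.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a data file node (hypothetical, not fil_node_t).
struct ToyNode {
    std::uint32_t size = 0;  // 0 means "page 0 has never been read"
    bool is_open = false;
};

// Returns true if the file ends up open.
// validation_fails stands in for the fil_comp_algo_validate_fail injection.
bool toy_open_file(ToyNode& node, bool validation_fails) {
    const bool first_time_open = (node.size == 0);
    if (first_time_open) {
        node.size = 4;       // assumption: size is recorded while reading page 0
        if (validation_fails) {
            return false;    // the open is refused, but size stays nonzero
        }
    }
    node.is_open = true;     // later calls never re-run the validation
    return true;
}

// Number of attempts needed until the file is open, or -1 if it never opens.
int attempts_until_open(bool validation_fails, int max_attempts) {
    assert(max_attempts > 0);
    ToyNode node;
    for (int i = 1; i <= max_attempts; ++i) {
        if (toy_open_file(node, validation_fails)) {
            return i;
        }
    }
    return -1;
}
```

In this model, attempts_until_open(true, 5) yields 2: the first attempt is rejected, and the second one succeeds because the validation is tied to size == 0.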

      Note: Certainly we don’t want to fix this by assigning node->size=0 after detecting corruption. That would cause us to repeatedly open, read, and close this file on every access.
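
To make the cost argument concrete, here is a rough model that counts page 0 reads for a permanently invalid file under two policies: resetting the size to 0 after the failure, and dropping the tablespace metadata as suggested earlier in this description. All names (ToySpace, OnFailure, count_page0_reads) are hypothetical, and the map is only a stand-in for the tablespace cache; this is not a proposed patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for cached tablespace metadata (hypothetical).
struct ToySpace {
    std::uint32_t size = 0;  // 0 means "page 0 must be read on next access"
};

enum class OnFailure { ResetSize, DropMetadata };

// Counts how often page 0 of an always-invalid file is read
// over the given number of access attempts.
int count_page0_reads(OnFailure policy, int accesses) {
    assert(accesses >= 0);
    std::map<std::string, ToySpace> spaces;  // toy tablespace cache
    spaces["test/t1"] = ToySpace{};
    int reads = 0;
    for (int i = 0; i < accesses; ++i) {
        auto it = spaces.find("test/t1");
        if (it == spaces.end()) {
            continue;                        // metadata gone: fail without I/O
        }
        if (it->second.size == 0) {
            ++reads;                         // open, read page 0, validation fails
            it->second.size = 4;             // size learned from page 0
            if (policy == OnFailure::ResetSize) {
                it->second.size = 0;         // next access repeats open/read/close
            } else {
                spaces.erase(it);            // forget the tablespace entirely
            }
        }
    }
    return reads;
}
```

For 10 access attempts, the ResetSize policy reads page 0 ten times in this model, while DropMetadata reads it once and rejects the remaining attempts without touching the file.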

      While working on the fix, please write a test to also cover the space->recv_size recovery.

      Attachments

        1. mdev-18958-10.4v1.patch
          3 kB
          Thirunarayanan Balathandayuthapani

            People

              thiru Thirunarayanan Balathandayuthapani
              marko Marko Mäkelä