Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23456

fil_space_crypt_t::write_page0() is accessing an uninitialized page

    XMLWordPrintable

Details

    Description

      This bug was repeated with the test case of MDEV-17596, which was originally filed about something else. Here is the .opt file:

      --innodb-encrypt-tables=ON
      --innodb-encryption-threads=4
      --innodb-log-optimize-ddl=OFF
      

      And here the .test file:

      --source include/have_innodb.inc
      --source include/have_file_key_management_plugin.inc
       
      SET @scrub.save= @@innodb_background_scrub_data_uncompressed;
       
      --connect (con1,localhost,root,,)
      CREATE TABLE t1 (pk INT, a INT, PRIMARY KEY(pk)) ENGINE=InnoDB;
      INSERT INTO t1 VALUES (1,2),(2,3),(3,4),(4,5),(5,6),(6,7),(7,8),(8,9);
      SET GLOBAL innodb_background_scrub_data_uncompressed=ON;
      --disconnect con1
       
      --connection default
      --let $i= 1000
      while ($i)
      {
         --eval ALTER TABLE t1 FORCE /* $i */
         --dec $i
      }
       
      # Cleanup
      --delimiter ;
      DROP TABLE t1;
      EOF
      

      With a small tweak, I got an rr replay trace:

      diff --git a/mysql-test/lib/My/SafeProcess/safe_process.cc b/mysql-test/lib/My/SafeProcess/safe_process.cc
      index 13d09198a0c..1a20f5ad748 100644
      --- a/mysql-test/lib/My/SafeProcess/safe_process.cc
      +++ b/mysql-test/lib/My/SafeProcess/safe_process.cc
      @@ -97,7 +97,7 @@ static int kill_child(bool was_killed)
         message("Killing child: %d", child_pid);
         // Terminate whole process group
         if (! was_killed)
      -    kill(-child_pid, SIGKILL);
      +    kill(-child_pid, SIGABRT);
       
         pid_t ret_pid= waitpid(child_pid, &status, 0);
         if (ret_pid == child_pid)
      

      The page was freed earlier, and it had been reallocated to a new page 0 of an encrypted tablespace, but the initialization of that page frame was still in progress at the time of invoking fil_space_crypt_t::write_page0():

      10.2 01738d08f30315f10c14faa605d0087e54156cd0

      Thread 11 hit Hardware watchpoint 1: -location page[4]@4
       
      Old value = "\000\000\001}"
      New value = "\377\377\377\377"
      buf_LRU_block_remove_hashed (bpage=0x644315d20900, zip=true) at /mariadb/10.2o/storage/innobase/buf/buf0lru.cc:1860
      1860			memset(((buf_block_t*) bpage)->frame
      (rr) when
      Current event: 1220914
      (rr) bt
      #0  0x000055f5c18afa43 in buf_LRU_block_remove_hashed (bpage=0x644315d20900, zip=true) at /mariadb/10.2o/storage/innobase/buf/buf0lru.cc:1858
      #1  0x000055f5c18ae113 in buf_LRU_free_page (bpage=0x644315d20900, zip=true) at /mariadb/10.2o/storage/innobase/buf/buf0lru.cc:1476
      #2  0x000055f5c18a267d in buf_flush_LRU_list_batch (buf_pool=<optimized out>, max=100, n=<optimized out>) at /mariadb/10.2o/storage/innobase/buf/buf0flu.cc:1656
      #3  0x000055f5c18a3c84 in buf_do_LRU_batch (buf_pool=0x55f5c2df4780, max=100, n=0x30ec25dbcb40) at /mariadb/10.2o/storage/innobase/buf/buf0flu.cc:1727
      #4  buf_flush_batch (buf_pool=<optimized out>, flush_type=<optimized out>, min_n=100, lsn_limit=0, n=0x55f5c2e44d40) at /mariadb/10.2o/storage/innobase/buf/buf0flu.cc:1855
      #5  0x000055f5c18a7d86 in buf_flush_do_batch (type=BUF_FLUSH_LRU, min_n=100, lsn_limit=0, n=0x30ec25dbcb40, buf_pool=<optimized out>) at /mariadb/10.2o/storage/innobase/buf/buf0flu.cc:2028
      #6  buf_flush_LRU_list (buf_pool=<optimized out>) at /mariadb/10.2o/storage/innobase/buf/buf0flu.cc:2299
      #7  pc_flush_slot () at /mariadb/10.2o/storage/innobase/buf/buf0flu.cc:2820
      #8  0x000055f5c18a60d5 in buf_flush_page_cleaner_coordinator () at /mariadb/10.2o/storage/innobase/buf/buf0flu.cc:3256
      #9  0x000057bc2d6e0ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #10 0x00007fdd4f3abeaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      (rr) c
      Continuing.
       
      Thread 11 hit Hardware watchpoint 2: -location block.page.id
       
      Old value = {m_space = 0, m_page_no = 381}
      New value = {m_space = 4294967295, m_page_no = 4294967295}
      buf_LRU_block_free_hashed_page (block=0x644315d20900) at /mariadb/10.2o/storage/innobase/buf/buf0lru.cc:1943
      1943		buf_block_set_state(block, BUF_BLOCK_MEMORY);
      (rr) when
      Current event: 1220920
      (rr) c
      Continuing.
       
      Thread 11 hit Hardware watchpoint 1: -location page[4]@4
       
      Old value = "\377\377\377\377"
      New value = "\376\376\376\376"
      buf_LRU_block_free_non_file_page (block=0x644315d20900) at /mariadb/10.2o/storage/innobase/buf/buf0lru.cc:1666
      1666		memset(block->frame + FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID, 0xfe, 4);
      (rr) c
      Continuing.
      [Switching to Thread 33579.33908]
       
      Thread 32 hit Hardware watchpoint 2: -location block.page.id
       
      Old value = {m_space = 4294967295, m_page_no = 4294967295}
      New value = {m_space = 86, m_page_no = 0}
      buf_page_init (buf_pool=0x55f5c2df4780, page_id={m_space = 86, m_page_no = 0}, page_size=@0x7faf707e9248: {m_physical = 16384, m_logical = 16384, m_is_compressed = 0}, block=0x644315d20900)
          at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:5261
      5261		buf_block_init_low(block);
      (rr) bt
      #0  buf_page_init (buf_pool=0x55f5c2df4780, page_id={m_space = 86, m_page_no = 0}, page_size=@0x7faf707e9248: {m_physical = 16384, m_logical = 16384, m_is_compressed = 0}, block=0x644315d20900)
          at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:5261
      #1  0x000055f5c188f50b in buf_page_create (page_id=<optimized out>, page_size=@0x7faf707e9248: {m_physical = 16384, m_logical = 16384, m_is_compressed = 0}, mtr=0x7faf707e92d8)
          at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:5634
      #2  0x000055f5c1926011 in fsp_header_init (space_id=86, size=4, mtr=0x7faf707e92d8) at /mariadb/10.2o/storage/innobase/fsp/fsp0fsp.cc:703
      #3  0x000055f5c18c064c in dict_build_table_def_step (thr=0x7faf680d8e18, node=0x7faf6813aec0) at /mariadb/10.2o/storage/innobase/dict/dict0crea.cc:443
      #4  dict_create_table_step (thr=0x7faf680d8e18) at /mariadb/10.2o/storage/innobase/dict/dict0crea.cc:1321
      #5  0x000055f5c17209db in que_thr_step (thr=0x7faf680d8e18) at /mariadb/10.2o/storage/innobase/que/que0que.cc:1050
      #6  que_run_threads_low (thr=0x7faf680d8e18) at /mariadb/10.2o/storage/innobase/que/que0que.cc:1104
      #7  que_run_threads (thr=<optimized out>) at /mariadb/10.2o/storage/innobase/que/que0que.cc:1144
      #8  0x000055f5c1765e1f in row_create_table_for_mysql (table=<optimized out>, trx=0x22ef1f780268, mode=FIL_ENCRYPTION_DEFAULT, key_id=<optimized out>) at /mariadb/10.2o/storage/innobase/row/row0mysql.cc:2273
      #9  0x000055f5c166f692 in prepare_inplace_alter_table_dict (ha_alter_info=0x7faf707ed730, altered_table=0x7faf680d8088, old_table=0x7faf680c3de8, table_name=0x7faf680d1fc5 "t1", flags=33, 
          flags2=<optimized out>, fts_doc_id_col=<optimized out>, add_fts_doc_id=false, add_fts_doc_id_idx=<optimized out>) at /mariadb/10.2o/storage/innobase/handler/handler0alter.cc:4711
      #10 0x000055f5c166afd2 in ha_innobase::prepare_inplace_alter_table (this=<optimized out>, altered_table=<optimized out>, ha_alter_info=0x7faf707ed730)
          at /mariadb/10.2o/storage/innobase/handler/handler0alter.cc:6048
      #11 0x000055f5c1358032 in mysql_inplace_alter_table (thd=0x7faf68000d28, table_list=0x7faf68011a58, table=0x7faf680c3de8, altered_table=0x7faf680d8088, ha_alter_info=0x7faf707ed730, 
          inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=<optimized out>, alter_ctx=0x7faf707ec970) at /mariadb/10.2o/sql/sql_table.cc:7420
      #12 mysql_alter_table (thd=<optimized out>, new_db=<optimized out>, new_name=<optimized out>, create_info=<optimized out>, table_list=0x7faf68011a58, alter_info=<optimized out>, order_num=<optimized out>, 
          order=<optimized out>, ignore=<optimized out>) at /mariadb/10.2o/sql/sql_table.cc:9622
      #13 0x000055f5c13b1818 in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x7faf68000d28) at /mariadb/10.2o/sql/sql_alter.cc:333
      #14 0x000055f5c12b5796 in mysql_execute_command (thd=0x7faf68000d28) at /mariadb/10.2o/sql/sql_parse.cc:5964
      #15 0x000055f5c12b2e14 in mysql_parse (thd=0x7faf68000d28, rawbuf=0x7faf68011970 "ALTER TABLE t1 FORCE /* 919 */", length=<optimized out>, parser_state=0x7faf707ef6d0, is_com_multi=<optimized out>, 
          is_next_command=false) at /mariadb/10.2o/sql/sql_parse.cc:7733
      (rr) c
      Continuing.
       
      Thread 3 received signal SIGABRT, Aborted.
      [Switching to Thread 33579.33881]
      __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      50	../sysdeps/unix/sysv/linux/raise.c: Tiedostoa tai hakemistoa ei ole.
      (rr) f 8
      #8  0x000055f5c191e56f in fil_space_crypt_t::write_page0 (this=0x7faf680b0ca0, space=0x7faf680722e0, page=0x6443160a8000 "\351<\376\247\376\376\376\376\377\377\377\377\377\377\377\377", mtr=0x663c7a2b32d8)
          at /mariadb/10.2o/storage/innobase/fil/fil0crypt.cc:398
      398		mlog_write_string(page + offset, CRYPT_MAGIC, MAGIC_SZ, mtr);
      (rr) p *space
      $3 = {<ilist_node<unflushed_spaces_tag_t>> = {next = 0x0, prev = 0x0}, <ilist_node<rotation_list_tag_t>> = {next = 0x0, prev = 0x0}, id = 86, hash = 0x0, name = 0x7faf68081fb0 "test/#sql-832b_9", 
      (rr) up
      #9  0x000055f5c1922b21 in fil_crypt_flush_space (state=<optimized out>) at /mariadb/10.2o/storage/innobase/fil/fil0crypt.cc:2069
      2069			crypt_data->write_page0(space, block->frame, &mtr);
      (rr) 
      #10 fil_crypt_complete_rotate_space (key_state=<optimized out>, state=<optimized out>) at /mariadb/10.2o/storage/innobase/fil/fil0crypt.cc:2150
      2150				fil_crypt_flush_space(state);
      (rr) 
      #11 0x000055f5c1921af4 in fil_crypt_thread () at /mariadb/10.2o/storage/innobase/fil/fil0crypt.cc:2254
      2254					fil_crypt_complete_rotate_space(&new_state, &thr);
      (rr) thr 32
      (rr) thr bt
      #5  0x000055f5c17dd24b in rw_lock_debug_mutex_enter () at /mariadb/10.2o/storage/innobase/sync/sync0debug.cc:1264
      #6  0x000055f5c17d75a9 in rw_lock_own (lock=0x55f5c2e47928, lock_type=5) at /mariadb/10.2o/storage/innobase/sync/sync0rw.cc:1009
      #7  0x000055f5c1884361 in buf_page_hash_get_low (buf_pool=0x55f5c2df4780, page_id=<optimized out>) at /mariadb/10.2o/storage/innobase/include/buf0buf.ic:1117
      #8  0x000055f5c18877ee in buf_page_get_low (page_id={m_space = 86, m_page_no = 0}, page_size=@0x7faf707e9248: {m_physical = 16384, m_logical = 16384, m_is_compressed = 0}, rw_latch=4, guess=0x0, 
          mode=<optimized out>, file=0x55f5c1cd9678 "/mariadb/10.2o/storage/innobase/fsp/fsp0fsp.cc", line=704, mtr=0x7faf707e92d8, err=0x0) at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:4309
      #9  0x000055f5c188b481 in buf_page_get_gen (page_id=<optimized out>, page_size=<optimized out>, rw_latch=<optimized out>, guess=<optimized out>, mode=<optimized out>, file=<optimized out>, line=327713, 
          mtr=0x7faf680d8e18, err=0x0) at /mariadb/10.2o/storage/innobase/buf/buf0buf.cc:4907
      #10 0x000055f5c1926042 in fsp_header_init (space_id=86, size=4, mtr=0x7faf707e92d8) at /mariadb/10.2o/storage/innobase/fsp/fsp0fsp.cc:704
      

      So, the page was freed and then reused for something else. We can see that fil_space_crypt_t::write_page0() is accessing page 86:0, which has not been initialized yet by fsp_header_init().

      I think that to fix this remaining bug, we should not allow fil_crypt_thread to access tablespaces whose first page has not been initialized yet. This bug should affect 10.5 as well, even though no assertion should fail there. In 10.5, we could be writing redo log for a not-yet-initialized page 0, but thanks to MDEV-12353 we should get the correct page number straight from the buf_page_t.

      Attachments

        Issue Links

          Activity

            People

              thiru Thirunarayanan Balathandayuthapani
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.