Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21347

innodb_log_optimize_ddl=OFF is not crash safe

Details

    Description

      In MDEV-16809, MariaDB introduced a feature to enable redo log recording for bulk load index creation, which disabled redo log recording originally.

      This is a nice feature, it solves the issue for PXB backup with concurrent DDL, we ported this feature to our MySQL branch and report a feature request to MySQL upstream by https://bugs.mysql.com/bug.php?id=92099.

      But recently, we encountered several data corruption case in our pro env, and after some investigation, we found that this feature is not crash safe. If mysqld restart abnormally during DDL, crash happened during crash recovery. And I managed repro this bug using the latest mariadb (10.5, fetch source from github).

      Thread 7 "mysqld" received signal SIGSEGV, Segmentation fault.
       
      >>> bt
      #0  page_rec_find_owner_rec (rec=0x7fff7830c09d "\200") at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0page.ic:770
      #1  page_cur_insert_rec_low (cur=cur@entry=0x7fff33ffd6f0, index=index@entry=0x7fff2c0016f0, rec=rec@entry=0x7fff33ffdab7 "\200", offsets=<optimized out>, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1532
      #2  0x0000555556173aaa in page_cur_rec_insert (mtr=0x7fff33ffe420, offsets=<optimized out>, index=0x7fff2c0016f0, rec=0x7fff33ffdab7 "\200", cursor=0x7fff33ffd6f0) at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0cur.ic:319
      #3  page_cur_parse_insert_rec (is_short=is_short@entry=false, ptr=<optimized out>, end_ptr=end_ptr@entry=0x55555ab30b51 "", block=block@entry=0x7fff78001590, index=0x7fff2c0016f0, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1146
      #4  0x0000555556159559 in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_INSERT, ptr=<optimized out>, end_ptr=0x55555ab30b51 "", space_id=6, page_no=15, apply=apply@entry=true, block=block@entry=0x7fff78001590, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1497
      #5  0x0000555556159d7b in recv_recover_page (block=block@entry=0x7fff78001590, mtr=..., p=..., init=init@entry=0x0) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1951
      #6  0x0000555555b4fb25 in recv_recover_page (bpage=bpage@entry=0x7fff78001590) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:2048
      #7  0x000055555627945a in buf_page_io_complete (bpage=bpage@entry=0x7fff78001590, dblwr=dblwr@entry=true, evict=evict@entry=false) at /home/fungo/Projects/mariadb-server/storage/innobase/buf/buf0buf.cc:5993
      #8  0x00005555562d8098 in fil_aio_callback (cb=cb@entry=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/fil/fil0fil.cc:4375
      #9  0x000055555616e363 in io_callback (cb=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/os/os0file.cc:3880
      #10 0x0000555556350707 in tpool::task_group::execute (this=0x555557a3d480, t=0x555557a60ea8) at /home/fungo/Projects/mariadb-server/tpool/task_group.cc:55
      #11 0x000055555634efc1 in tpool::thread_pool_generic::worker_main (this=0x5555579fe090, thread_var=0x555557a0d8a0) at /home/fungo/Projects/mariadb-server/tpool/tpool_generic.cc:509
      #12 0x00007ffff6c36360 in ?? () from /lib64/libstdc++.so.6
      #13 0x00007ffff7bc6e25 in start_thread () from /lib64/libpthread.so.0
      #14 0x00007ffff6398f1d in clone () from /lib64/libc.so.6
      

      >>> p rec
      $1 = (rec_t *) 0x0
      
      

      Analysis:

      The reason is that
      1. redo log is recorded(page_cur_insert_rec_write_log())along with every record insertion in PageBulk::insert()
      2. the index page header info is fixed at BtrBulk::pageCommit() by invoking PageBulk::finishPage(), then we use PageBulk::commit() to commit the mtr (relese buf_fix_count, rwlock, and write local redo to global buffer).
      3. But we may commit the mtr too early by PageBulk::release()

      If the mtr is comimtted by PageBulk::release(), the rwlock is released, and the dirty page can be flush to disk. But the index page header info is not fixed.

      So if
      1. a checkpoint happened after PageBulk::release() and before PageBulk::commit().
      2. mysqld is killed (OOM or somehow) before InnoDB do a new checkpoint
      3. the crash recover after 2 will crash as showed in the previous stack

      Bellow is how I manually reproed this (need gdb asssitance):

      0. in my.cnf set innodb_sort_buffer_size=65536, make sure PageBulk::release() will be invoked.

      1. prepare data into t1
      create table t1(id int auto_increment, name varchar(30), primary key(id)) engine=innodb;

      insert into t1 values (1, "MySQL"), (2, "MariaDB"), (3, "AlisQL"), (4, "PolarDB"), (5, "hahaha");

      insert into t1(name) select a.name from t1 a, t1 b limit 5000;
      insert into t1(name) select a.name from t1 a, t1 b limit 5000;
      insert into t1(name) select a.name from t1 a, t1 b limit 5000;
      insert into t1(name) select a.name from t1 a, t1 b limit 5000;
      insert into t1(name) select a.name from t1 a, t1 b limit 5000;
      insert into t1(name) select a.name from t1 a, t1 b limit 5000;

      2. set gdb breakpoint at PageBulk::release()

      3. run
      optimize table t1;

      4. gdb will break at PageBulk::release()
      make PageBulk::release() finish, by 2 or more finish commands

      then using gdb maually call os_thread_sleep(30*1000000) to block current bulk load thread, and this will give page cleaner enougth time to flush all dirty pages and advance checkpoint

      5. after os_thread_sleep() return, using same gdb trick in step 4 to block page cleaner thread forever
      , such as call os_thread_sleep(3000*1000000)

      6. optimize query in step 3 will finish

      7. using `show engine innodb status`, we can see there is some un-checkpointed redo

      8. then kill -9 mysqld, and the crash recovery will crash

      Attachments

        Issue Links

          Activity

            MDEV-23156 suggests that that this does affect 10.5.

            marko Marko Mäkelä added a comment - MDEV-23156 suggests that that this does affect 10.5.

            mleich, please test bb-10.5-MDEV-21347. I hope that MDEV-23156 will be fixed by that.

            Apart from the test innodb.innodb-index, innodb.innodb_bug34300 was making some headache in the 10.2 version (bb-10.5-MDEV-21347^2^2^2). Apparently it is rebuilding a table and writing BLOBs in the process.

            marko Marko Mäkelä added a comment - mleich , please test bb-10.5-MDEV-21347 . I hope that MDEV-23156 will be fixed by that. Apart from the test innodb.innodb-index , innodb.innodb_bug34300 was making some headache in the 10.2 version ( bb-10.5-MDEV-21347^2^2^2 ). Apparently it is rebuilding a table and writing BLOBs in the process.

            origin/bb-10.2-marko bb1099f7e720aabcba127b77b0e5c98894f52a74 2020-07-15T19:41:01+03:00
            containing the fix for MDEV-21347 performed well during RQG testing based on the InnoDB
            standard test battery modified towards

            • go with innodb-log-optimize-ddl=OFF during the complete test
            • go with innodb-log-optimize-ddl=ON during the complete test
            • switch innodb-log-optimize-ddl during the ongoing test
              None of the tests failed with some unknown assert, data corruption or backup failure.
            mleich Matthias Leich added a comment - origin/bb-10.2-marko bb1099f7e720aabcba127b77b0e5c98894f52a74 2020-07-15T19:41:01+03:00 containing the fix for MDEV-21347 performed well during RQG testing based on the InnoDB standard test battery modified towards go with innodb-log-optimize-ddl=OFF during the complete test go with innodb-log-optimize-ddl=ON during the complete test switch innodb-log-optimize-ddl during the ongoing test None of the tests failed with some unknown assert, data corruption or backup failure.

            I merged this up to 10.5. In 10.5, the server always behaves as if innodb_log_optimize_ddl=OFF, and the parameter is deprecated and ignored.

            marko Marko Mäkelä added a comment - I merged this up to 10.5. In 10.5, the server always behaves as if innodb_log_optimize_ddl=OFF , and the parameter is deprecated and ignored.

            Also origin/bb-10.5-MDEV-21347 c97af34d6a9286c04076b1f25493bd52ecc6b459 2020-07-15T21:13:42+03:00
            behaved well during RQG testing.

            mleich Matthias Leich added a comment - Also origin/bb-10.5- MDEV-21347 c97af34d6a9286c04076b1f25493bd52ecc6b459 2020-07-15T21:13:42+03:00 behaved well during RQG testing.

            People

              marko Marko Mäkelä
              fungo Fungo Wang
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.