MariaDB Server / MDEV-21347

innodb_log_optimize_ddl=OFF is not crash safe

Details

    Description

      MDEV-16809 introduced an option to enable redo logging for bulk-load index creation, which previously was not redo-logged.

      This is a nice feature: it solves the problem of PXB backups taken with concurrent DDL. We ported it to our MySQL branch and filed a feature request with MySQL upstream at https://bugs.mysql.com/bug.php?id=92099.

      Recently, however, we encountered several data corruption cases in our production environment, and after some investigation we found that this feature is not crash safe. If mysqld restarts abnormally during DDL, the server crashes during crash recovery. I managed to reproduce the bug with the latest MariaDB (10.5, source fetched from GitHub):

      Thread 7 "mysqld" received signal SIGSEGV, Segmentation fault.
       
      >>> bt
      #0  page_rec_find_owner_rec (rec=0x7fff7830c09d "\200") at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0page.ic:770
      #1  page_cur_insert_rec_low (cur=cur@entry=0x7fff33ffd6f0, index=index@entry=0x7fff2c0016f0, rec=rec@entry=0x7fff33ffdab7 "\200", offsets=<optimized out>, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1532
      #2  0x0000555556173aaa in page_cur_rec_insert (mtr=0x7fff33ffe420, offsets=<optimized out>, index=0x7fff2c0016f0, rec=0x7fff33ffdab7 "\200", cursor=0x7fff33ffd6f0) at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0cur.ic:319
      #3  page_cur_parse_insert_rec (is_short=is_short@entry=false, ptr=<optimized out>, end_ptr=end_ptr@entry=0x55555ab30b51 "", block=block@entry=0x7fff78001590, index=0x7fff2c0016f0, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1146
      #4  0x0000555556159559 in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_INSERT, ptr=<optimized out>, end_ptr=0x55555ab30b51 "", space_id=6, page_no=15, apply=apply@entry=true, block=block@entry=0x7fff78001590, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1497
      #5  0x0000555556159d7b in recv_recover_page (block=block@entry=0x7fff78001590, mtr=..., p=..., init=init@entry=0x0) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1951
      #6  0x0000555555b4fb25 in recv_recover_page (bpage=bpage@entry=0x7fff78001590) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:2048
      #7  0x000055555627945a in buf_page_io_complete (bpage=bpage@entry=0x7fff78001590, dblwr=dblwr@entry=true, evict=evict@entry=false) at /home/fungo/Projects/mariadb-server/storage/innobase/buf/buf0buf.cc:5993
      #8  0x00005555562d8098 in fil_aio_callback (cb=cb@entry=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/fil/fil0fil.cc:4375
      #9  0x000055555616e363 in io_callback (cb=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/os/os0file.cc:3880
      #10 0x0000555556350707 in tpool::task_group::execute (this=0x555557a3d480, t=0x555557a60ea8) at /home/fungo/Projects/mariadb-server/tpool/task_group.cc:55
      #11 0x000055555634efc1 in tpool::thread_pool_generic::worker_main (this=0x5555579fe090, thread_var=0x555557a0d8a0) at /home/fungo/Projects/mariadb-server/tpool/tpool_generic.cc:509
      #12 0x00007ffff6c36360 in ?? () from /lib64/libstdc++.so.6
      #13 0x00007ffff7bc6e25 in start_thread () from /lib64/libpthread.so.0
      #14 0x00007ffff6398f1d in clone () from /lib64/libc.so.6
      

      >>> p rec
      $1 = (rec_t *) 0x0
      
      

      Analysis:

      The reason is that
      1. A redo log record is written (page_cur_insert_rec_write_log()) along with every record insertion in PageBulk::insert().
      2. The index page header info is fixed up at BtrBulk::pageCommit() by invoking PageBulk::finishPage(); then PageBulk::commit() commits the mtr (releases buf_fix_count and the rwlock, and writes the local redo to the global buffer).
      3. But the mtr may be committed too early, by PageBulk::release().

      If the mtr is committed by PageBulk::release(), the rwlock is released and the dirty page can be flushed to disk, but the index page header info has not been fixed up yet.

      So if
      1. a checkpoint happens after PageBulk::release() and before PageBulk::commit(),
      2. mysqld is killed (OOM or otherwise) before InnoDB makes a new checkpoint,
      3. then the crash recovery after step 2 will crash as shown in the stack trace above.
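
The sequence above can be sketched with a toy model. This is only an illustration under simplifying assumptions, not InnoDB code: ToyPage, ToyBulk, recover_insert() and crash_recovery_succeeds() are hypothetical names, the page header is reduced to a single record count, and the real crash (a bad pointer in page_rec_find_owner_rec) is represented by recovery returning false.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy page: the header's record count lags behind the body until it is fixed up.
struct ToyPage {
    std::vector<int> body;          // records physically present on the page
    std::size_t header_n_recs = 0;  // record count stored in the page header
    bool header_valid() const { return header_n_recs == body.size(); }
};

// Toy bulk loader: every insert writes a redo record but leaves the header alone.
struct ToyBulk {
    ToyPage page;           // buffer pool copy of the page
    std::vector<int> redo;  // one redo record per inserted row

    void insert(int rec) {
        page.body.push_back(rec);
        redo.push_back(rec);
    }
    // Header fix-up, standing in for PageBulk::finishPage().
    void finish() { page.header_n_recs = page.body.size(); }
    // Early release: the latch is dropped, so the page can be flushed as it is.
    ToyPage release_and_flush() const { return page; }
};

// Recovery re-applies one logged insert to the on-disk page. It can only do so
// if the header matches the body; false marks the point where the server crashed.
bool recover_insert(ToyPage& on_disk, int rec) {
    if (!on_disk.header_valid()) return false;
    on_disk.body.push_back(rec);
    on_disk.header_n_recs++;
    return true;
}

// Plays the scenario: release + flush + checkpoint, one more insert, then kill -9.
bool crash_recovery_succeeds(bool finish_before_release) {
    ToyBulk bulk;
    bulk.insert(1);
    bulk.insert(2);
    if (finish_before_release) bulk.finish();
    ToyPage on_disk = bulk.release_and_flush();  // page reaches disk here
    std::size_t checkpoint = bulk.redo.size();   // checkpoint covers redo so far
    bulk.insert(3);                              // logged after the checkpoint
    // After the kill, only on_disk and the redo written after the checkpoint remain.
    for (std::size_t i = checkpoint; i < bulk.redo.size(); ++i) {
        if (!recover_insert(on_disk, bulk.redo[i])) return false;
    }
    return true;
}
```

In this model, crash_recovery_succeeds(false) fails because the flushed page carries a stale header, while crash_recovery_succeeds(true) succeeds because the header was fixed up before the latch was dropped.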

      Below is how I manually reproduced this (gdb assistance is needed):

      0. In my.cnf, set innodb_sort_buffer_size=65536 to make sure PageBulk::release() will be invoked.

      1. Load data into t1:
      create table t1(id int auto_increment, name varchar(30), primary key(id)) engine=innodb;

      insert into t1 values (1, "MySQL"), (2, "MariaDB"), (3, "AlisQL"), (4, "PolarDB"), (5, "hahaha");

      insert into t1(name) select a.name from t1 a, t1 b limit 5000;
      insert into t1(name) select a.name from t1 a, t1 b limit 5000;
      insert into t1(name) select a.name from t1 a, t1 b limit 5000;
      insert into t1(name) select a.name from t1 a, t1 b limit 5000;
      insert into t1(name) select a.name from t1 a, t1 b limit 5000;
      insert into t1(name) select a.name from t1 a, t1 b limit 5000;

      2. Set a gdb breakpoint at PageBulk::release().

      3. Run
      optimize table t1;

      4. gdb will break at PageBulk::release().
      Let PageBulk::release() finish by issuing 2 or more finish commands.

      Then use gdb to manually call os_thread_sleep(30*1000000) to block the current bulk load thread. This gives the page cleaner enough time to flush all dirty pages and advance the checkpoint.

      5. After os_thread_sleep() returns, use the same gdb trick as in step 4 to block the page cleaner thread indefinitely, for example by calling os_thread_sleep(3000*1000000).

      6. The optimize statement from step 3 will finish.

      7. `show engine innodb status` then shows that there is some un-checkpointed redo.

      8. Then kill -9 mysqld; the subsequent crash recovery will crash.

          Activity

            fungo Fungo Wang created issue -
            fungo Fungo Wang added a comment - - edited

            I found that upstream MySQL 8.0.13 contains a bug fix that also fixes this crash:
            https://github.com/mysql/mysql-server/commit/d1254b947354e0f5b7223b09c521bd85f22e1e31

            commit d1254b947354e0f5b7223b09c521bd85f22e1e31
            Author: Krzysztof Kapuścik <krzysztof.kapuscik@oracle.com>
            Date:   Wed Apr 18 12:05:19 2018 +0200
             
                Bug #27802098 - BTRBULK OPERATIONS MIGHT LEAD A PAGE IN HALF INITIALIZED STATE
             
                Implementation of btr bulk changed so the page is valid when it
                is unlatched or commited. In such case proper values are set to
                page variables and valid dictionary is built.
             
                Also some code cleanup was done in the bulk code so it is easier
                to understand and provides better encapsulation.
             
                Added test that injects errors during bulk operations (that also
                checks page validity in debug builds).
             
                RB: 19484
            

            -/** Release block by commiting mtr
            +/** Release block by committing mtr
             Note: log_free_check requires holding no lock/latch in current thread. */
             void PageBulk::release() {
            +  /* Make sure page is valid before it is released. */
            +  if (m_modified) {
            +    finish();
            +    ut_ad(!m_modified);
            +  }
            +  ut_ad(page_validate(m_page, m_index));
            

            The index page header is fixed up before mtr_commit in PageBulk::release(), so the released page is always valid when it is flushed to disk.

            Curiously, the bug fix was not backported to MySQL 5.7. Perhaps this is because MySQL does not write redo log records during DDL, so it never encounters this crash. MySQL 8.0 may have made the fix to solve some other issue (it is not clear to me which), but it also solves our crash.
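
The effect of the quoted change can be illustrated with a small stand-in class. This is a sketch under assumptions, not the MySQL implementation: ToyPageBulk and header_matches_body() are hypothetical, the header is reduced to a record count, and only the m_modified / finish() / release() interplay from the diff is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for PageBulk (hypothetical, not server code).
class ToyPageBulk {
public:
    // Append a record; the header is stale until finish() runs.
    void insert(int rec) {
        m_recs.push_back(rec);
        m_modified = true;
    }

    // Bring the header in line with the page body.
    void finish() {
        m_header_n_recs = m_recs.size();
        m_modified = false;
    }

    // Mirrors the patched release(): make the page valid, then drop the latch.
    void release() {
        if (m_modified) {
            finish();
            assert(!m_modified);
        }
        assert(header_matches_body());  // simplified stand-in for page_validate()
        m_latched = false;              // from here on the page may be flushed
    }

    bool header_matches_body() const { return m_header_n_recs == m_recs.size(); }
    bool latched() const { return m_latched; }

private:
    std::vector<int> m_recs;          // records on the page
    std::size_t m_header_n_recs = 0;  // record count kept in the header
    bool m_modified = false;          // true while the header lags behind the body
    bool m_latched = true;            // page latch held by the bulk loader
};
```

With this ordering, a caller that inserts records and then calls release() always leaves behind a page whose header agrees with its body, which is the property the flush and later recovery depend on.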

            elenst Elena Stepanova made changes -
            Description In MDEV-16809, MariaDB introduced a feature to enable redo log recording for bulk load index creation, which disabled redo log recording originally.

            This is a nice feature, it solves the issue for PXB backup with concurrent DDL, we ported this feature to our MySQL branch and report a feature request to MySQL upstream by https://bugs.mysql.com/bug.php?id=92099.

            But recently, we encountered several data corruption case in our pro env, and after some investigation, we found that this feature is not crash safe. If mysqld restart abnormally during DDL, crash happened during crash recovery. And I managed repro this bug using the latest mariadb (10.5, fetch source from github).



            ```
            Thread 7 "mysqld" received signal SIGSEGV, Segmentation fault.

            >>> bt
            #0 page_rec_find_owner_rec (rec=0x7fff7830c09d "\200") at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0page.ic:770
            #1 page_cur_insert_rec_low (cur=cur@entry=0x7fff33ffd6f0, index=index@entry=0x7fff2c0016f0, rec=rec@entry=0x7fff33ffdab7 "\200", offsets=<optimized out>, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1532
            #2 0x0000555556173aaa in page_cur_rec_insert (mtr=0x7fff33ffe420, offsets=<optimized out>, index=0x7fff2c0016f0, rec=0x7fff33ffdab7 "\200", cursor=0x7fff33ffd6f0) at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0cur.ic:319
            #3 page_cur_parse_insert_rec (is_short=is_short@entry=false, ptr=<optimized out>, end_ptr=end_ptr@entry=0x55555ab30b51 "", block=block@entry=0x7fff78001590, index=0x7fff2c0016f0, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1146
            #4 0x0000555556159559 in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_INSERT, ptr=<optimized out>, end_ptr=0x55555ab30b51 "", space_id=6, page_no=15, apply=apply@entry=true, block=block@entry=0x7fff78001590, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1497
            #5 0x0000555556159d7b in recv_recover_page (block=block@entry=0x7fff78001590, mtr=..., p=..., init=init@entry=0x0) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1951
            #6 0x0000555555b4fb25 in recv_recover_page (bpage=bpage@entry=0x7fff78001590) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:2048
            #7 0x000055555627945a in buf_page_io_complete (bpage=bpage@entry=0x7fff78001590, dblwr=dblwr@entry=true, evict=evict@entry=false) at /home/fungo/Projects/mariadb-server/storage/innobase/buf/buf0buf.cc:5993
            #8 0x00005555562d8098 in fil_aio_callback (cb=cb@entry=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/fil/fil0fil.cc:4375
            #9 0x000055555616e363 in io_callback (cb=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/os/os0file.cc:3880
            #10 0x0000555556350707 in tpool::task_group::execute (this=0x555557a3d480, t=0x555557a60ea8) at /home/fungo/Projects/mariadb-server/tpool/task_group.cc:55
            #11 0x000055555634efc1 in tpool::thread_pool_generic::worker_main (this=0x5555579fe090, thread_var=0x555557a0d8a0) at /home/fungo/Projects/mariadb-server/tpool/tpool_generic.cc:509
            #12 0x00007ffff6c36360 in ?? () from /lib64/libstdc++.so.6
            #13 0x00007ffff7bc6e25 in start_thread () from /lib64/libpthread.so.0
            #14 0x00007ffff6398f1d in clone () from /lib64/libc.so.6



            >>> p rec
            $1 = (rec_t *) 0x0

            ```

            Analysis:

            The reason is that
            1. redo log is recorded(page_cur_insert_rec_write_log())along with every record insertion in PageBulk::insert()
            2. the index page header info is fixed at BtrBulk::pageCommit() by invoking PageBulk::finishPage(), then we use PageBulk::commit() to commit the mtr (relese buf_fix_count, rwlock, and write local redo to global buffer).
            3. But we may commit the mtr too early by PageBulk::release()



            If the mtr is comimtted by PageBulk::release(), the rwlock is released, and the dirty page can be flush to disk. But the index page header info is not fixed.

            So if
            1. a checkpoint happened after PageBulk::release() and before PageBulk::commit().
            2. mysqld is killed (OOM or somehow) before InnoDB do a new checkpoint
            3. the crash recover after 2 will crash as showed in the previous stack



            Bellow is how I manually reproed this (need gdb asssitance):

            0. in my.cnf set innodb_sort_buffer_size=65536, make sure PageBulk::release() will be invoked.

            1. prepare data into t1
            create table t1(id int auto_increment, name varchar(30), primary key(id)) engine=innodb;

            insert into t1 values (1, "MySQL"), (2, "MariaDB"), (3, "AlisQL"), (4, "PolarDB"), (5, "hahaha");


            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;

            2. set gdb breakpoint at PageBulk::release()

            3. run
            optimize table t1;

            4. gdb will break at PageBulk::release()
            make PageBulk::release() finish, by 2 or more finish commands

            then using gdb maually call os_thread_sleep(30*1000000) to block current bulk load thread, and this will give page cleaner enougth time to flush all dirty pages and advance checkpoint

            5. after os_thread_sleep() return, using same gdb trick in step 4 to block page cleaner thread forever
            , such as call os_thread_sleep(3000*1000000)

            6. optimize query in step 3 will finish

            7. using `show engine innodb status`, we can see there is some un-checkpointed redo

            8. then kill -9 mysqld, and the crash recovery will crash
            In MDEV-16809, MariaDB introduced a feature to enable redo log recording for bulk load index creation, which disabled redo log recording originally.

            This is a nice feature, it solves the issue for PXB backup with concurrent DDL, we ported this feature to our MySQL branch and report a feature request to MySQL upstream by https://bugs.mysql.com/bug.php?id=92099.

            But recently, we encountered several data corruption case in our pro env, and after some investigation, we found that this feature is not crash safe. If mysqld restart abnormally during DDL, crash happened during crash recovery. And I managed repro this bug using the latest mariadb (10.5, fetch source from github).

            {noformat}
            Thread 7 "mysqld" received signal SIGSEGV, Segmentation fault.

            >>> bt
            #0 page_rec_find_owner_rec (rec=0x7fff7830c09d "\200") at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0page.ic:770
            #1 page_cur_insert_rec_low (cur=cur@entry=0x7fff33ffd6f0, index=index@entry=0x7fff2c0016f0, rec=rec@entry=0x7fff33ffdab7 "\200", offsets=<optimized out>, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1532
            #2 0x0000555556173aaa in page_cur_rec_insert (mtr=0x7fff33ffe420, offsets=<optimized out>, index=0x7fff2c0016f0, rec=0x7fff33ffdab7 "\200", cursor=0x7fff33ffd6f0) at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0cur.ic:319
            #3 page_cur_parse_insert_rec (is_short=is_short@entry=false, ptr=<optimized out>, end_ptr=end_ptr@entry=0x55555ab30b51 "", block=block@entry=0x7fff78001590, index=0x7fff2c0016f0, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1146
            #4 0x0000555556159559 in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_INSERT, ptr=<optimized out>, end_ptr=0x55555ab30b51 "", space_id=6, page_no=15, apply=apply@entry=true, block=block@entry=0x7fff78001590, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1497
            #5 0x0000555556159d7b in recv_recover_page (block=block@entry=0x7fff78001590, mtr=..., p=..., init=init@entry=0x0) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1951
            #6 0x0000555555b4fb25 in recv_recover_page (bpage=bpage@entry=0x7fff78001590) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:2048
            #7 0x000055555627945a in buf_page_io_complete (bpage=bpage@entry=0x7fff78001590, dblwr=dblwr@entry=true, evict=evict@entry=false) at /home/fungo/Projects/mariadb-server/storage/innobase/buf/buf0buf.cc:5993
            #8 0x00005555562d8098 in fil_aio_callback (cb=cb@entry=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/fil/fil0fil.cc:4375
            #9 0x000055555616e363 in io_callback (cb=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/os/os0file.cc:3880
            #10 0x0000555556350707 in tpool::task_group::execute (this=0x555557a3d480, t=0x555557a60ea8) at /home/fungo/Projects/mariadb-server/tpool/task_group.cc:55
            #11 0x000055555634efc1 in tpool::thread_pool_generic::worker_main (this=0x5555579fe090, thread_var=0x555557a0d8a0) at /home/fungo/Projects/mariadb-server/tpool/tpool_generic.cc:509
            #12 0x00007ffff6c36360 in ?? () from /lib64/libstdc++.so.6
            #13 0x00007ffff7bc6e25 in start_thread () from /lib64/libpthread.so.0
            #14 0x00007ffff6398f1d in clone () from /lib64/libc.so.6



            >>> p rec
            $1 = (rec_t *) 0x0

            {noformat}

            Analysis:

            The reason is that
            1. redo log is recorded(page_cur_insert_rec_write_log())along with every record insertion in PageBulk::insert()
            2. the index page header info is fixed at BtrBulk::pageCommit() by invoking PageBulk::finishPage(), then we use PageBulk::commit() to commit the mtr (relese buf_fix_count, rwlock, and write local redo to global buffer).
            3. But we may commit the mtr too early by PageBulk::release()



            If the mtr is comimtted by PageBulk::release(), the rwlock is released, and the dirty page can be flush to disk. But the index page header info is not fixed.

            So if
            1. a checkpoint happened after PageBulk::release() and before PageBulk::commit().
            2. mysqld is killed (OOM or somehow) before InnoDB do a new checkpoint
            3. the crash recover after 2 will crash as showed in the previous stack



            Bellow is how I manually reproed this (need gdb asssitance):

            0. in my.cnf set innodb_sort_buffer_size=65536, make sure PageBulk::release() will be invoked.

            1. prepare data into t1
            create table t1(id int auto_increment, name varchar(30), primary key(id)) engine=innodb;

            insert into t1 values (1, "MySQL"), (2, "MariaDB"), (3, "AlisQL"), (4, "PolarDB"), (5, "hahaha");


            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;

            2. set gdb breakpoint at PageBulk::release()

            3. run
            optimize table t1;

            4. gdb will break at PageBulk::release()
            make PageBulk::release() finish, by 2 or more finish commands

            then using gdb maually call os_thread_sleep(30*1000000) to block current bulk load thread, and this will give page cleaner enougth time to flush all dirty pages and advance checkpoint

            5. after os_thread_sleep() return, using same gdb trick in step 4 to block page cleaner thread forever
            , such as call os_thread_sleep(3000*1000000)

            6. optimize query in step 3 will finish

            7. using `show engine innodb status`, we can see there is some un-checkpointed redo

            8. then kill -9 mysqld, and the crash recovery will crash
            elenst Elena Stepanova made changes -
            Description In MDEV-16809, MariaDB introduced a feature to enable redo log recording for bulk load index creation, which disabled redo log recording originally.

            This is a nice feature, it solves the issue for PXB backup with concurrent DDL, we ported this feature to our MySQL branch and report a feature request to MySQL upstream by https://bugs.mysql.com/bug.php?id=92099.

            But recently, we encountered several data corruption case in our pro env, and after some investigation, we found that this feature is not crash safe. If mysqld restart abnormally during DDL, crash happened during crash recovery. And I managed repro this bug using the latest mariadb (10.5, fetch source from github).

            {noformat}
            Thread 7 "mysqld" received signal SIGSEGV, Segmentation fault.

            >>> bt
            #0 page_rec_find_owner_rec (rec=0x7fff7830c09d "\200") at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0page.ic:770
            #1 page_cur_insert_rec_low (cur=cur@entry=0x7fff33ffd6f0, index=index@entry=0x7fff2c0016f0, rec=rec@entry=0x7fff33ffdab7 "\200", offsets=<optimized out>, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1532
            #2 0x0000555556173aaa in page_cur_rec_insert (mtr=0x7fff33ffe420, offsets=<optimized out>, index=0x7fff2c0016f0, rec=0x7fff33ffdab7 "\200", cursor=0x7fff33ffd6f0) at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0cur.ic:319
            #3 page_cur_parse_insert_rec (is_short=is_short@entry=false, ptr=<optimized out>, end_ptr=end_ptr@entry=0x55555ab30b51 "", block=block@entry=0x7fff78001590, index=0x7fff2c0016f0, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1146
            #4 0x0000555556159559 in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_INSERT, ptr=<optimized out>, end_ptr=0x55555ab30b51 "", space_id=6, page_no=15, apply=apply@entry=true, block=block@entry=0x7fff78001590, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1497
            #5 0x0000555556159d7b in recv_recover_page (block=block@entry=0x7fff78001590, mtr=..., p=..., init=init@entry=0x0) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1951
            #6 0x0000555555b4fb25 in recv_recover_page (bpage=bpage@entry=0x7fff78001590) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:2048
            #7 0x000055555627945a in buf_page_io_complete (bpage=bpage@entry=0x7fff78001590, dblwr=dblwr@entry=true, evict=evict@entry=false) at /home/fungo/Projects/mariadb-server/storage/innobase/buf/buf0buf.cc:5993
            #8 0x00005555562d8098 in fil_aio_callback (cb=cb@entry=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/fil/fil0fil.cc:4375
            #9 0x000055555616e363 in io_callback (cb=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/os/os0file.cc:3880
            #10 0x0000555556350707 in tpool::task_group::execute (this=0x555557a3d480, t=0x555557a60ea8) at /home/fungo/Projects/mariadb-server/tpool/task_group.cc:55
            #11 0x000055555634efc1 in tpool::thread_pool_generic::worker_main (this=0x5555579fe090, thread_var=0x555557a0d8a0) at /home/fungo/Projects/mariadb-server/tpool/tpool_generic.cc:509
            #12 0x00007ffff6c36360 in ?? () from /lib64/libstdc++.so.6
            #13 0x00007ffff7bc6e25 in start_thread () from /lib64/libpthread.so.0
            #14 0x00007ffff6398f1d in clone () from /lib64/libc.so.6



            >>> p rec
            $1 = (rec_t *) 0x0

            {noformat}

            Analysis:

            The reason is that
            1. redo log is recorded(page_cur_insert_rec_write_log())along with every record insertion in PageBulk::insert()
            2. the index page header info is fixed at BtrBulk::pageCommit() by invoking PageBulk::finishPage(), then we use PageBulk::commit() to commit the mtr (relese buf_fix_count, rwlock, and write local redo to global buffer).
            3. But we may commit the mtr too early by PageBulk::release()



            If the mtr is comimtted by PageBulk::release(), the rwlock is released, and the dirty page can be flush to disk. But the index page header info is not fixed.

            So if
            1. a checkpoint happened after PageBulk::release() and before PageBulk::commit().
            2. mysqld is killed (OOM or somehow) before InnoDB do a new checkpoint
            3. the crash recover after 2 will crash as showed in the previous stack



            Bellow is how I manually reproed this (need gdb asssitance):

            0. in my.cnf set innodb_sort_buffer_size=65536, make sure PageBulk::release() will be invoked.

            1. prepare data into t1
            create table t1(id int auto_increment, name varchar(30), primary key(id)) engine=innodb;

            insert into t1 values (1, "MySQL"), (2, "MariaDB"), (3, "AlisQL"), (4, "PolarDB"), (5, "hahaha");


            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;

            2. set gdb breakpoint at PageBulk::release()

            3. run
            optimize table t1;

            4. gdb will break at PageBulk::release()
            make PageBulk::release() finish, by 2 or more finish commands

            then using gdb maually call os_thread_sleep(30*1000000) to block current bulk load thread, and this will give page cleaner enougth time to flush all dirty pages and advance checkpoint

            5. after os_thread_sleep() return, using same gdb trick in step 4 to block page cleaner thread forever
            , such as call os_thread_sleep(3000*1000000)

            6. optimize query in step 3 will finish

            7. using `show engine innodb status`, we can see there is some un-checkpointed redo

            8. then kill -9 mysqld, and the crash recovery will crash
            In MDEV-16809, MariaDB introduced a feature to enable redo log recording for bulk load index creation, which disabled redo log recording originally.

            This is a nice feature, it solves the issue for PXB backup with concurrent DDL, we ported this feature to our MySQL branch and report a feature request to MySQL upstream by https://bugs.mysql.com/bug.php?id=92099.

            But recently, we encountered several data corruption case in our pro env, and after some investigation, we found that this feature is not crash safe. If mysqld restart abnormally during DDL, crash happened during crash recovery. And I managed repro this bug using the latest mariadb (10.5, fetch source from github).

            {noformat}
            Thread 7 "mysqld" received signal SIGSEGV, Segmentation fault.

            >>> bt
            #0 page_rec_find_owner_rec (rec=0x7fff7830c09d "\200") at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0page.ic:770
            #1 page_cur_insert_rec_low (cur=cur@entry=0x7fff33ffd6f0, index=index@entry=0x7fff2c0016f0, rec=rec@entry=0x7fff33ffdab7 "\200", offsets=<optimized out>, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1532
            #2 0x0000555556173aaa in page_cur_rec_insert (mtr=0x7fff33ffe420, offsets=<optimized out>, index=0x7fff2c0016f0, rec=0x7fff33ffdab7 "\200", cursor=0x7fff33ffd6f0) at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0cur.ic:319
            #3 page_cur_parse_insert_rec (is_short=is_short@entry=false, ptr=<optimized out>, end_ptr=end_ptr@entry=0x55555ab30b51 "", block=block@entry=0x7fff78001590, index=0x7fff2c0016f0, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1146
            #4 0x0000555556159559 in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_INSERT, ptr=<optimized out>, end_ptr=0x55555ab30b51 "", space_id=6, page_no=15, apply=apply@entry=true, block=block@entry=0x7fff78001590, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1497
            #5 0x0000555556159d7b in recv_recover_page (block=block@entry=0x7fff78001590, mtr=..., p=..., init=init@entry=0x0) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1951
            #6 0x0000555555b4fb25 in recv_recover_page (bpage=bpage@entry=0x7fff78001590) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:2048
            #7 0x000055555627945a in buf_page_io_complete (bpage=bpage@entry=0x7fff78001590, dblwr=dblwr@entry=true, evict=evict@entry=false) at /home/fungo/Projects/mariadb-server/storage/innobase/buf/buf0buf.cc:5993
            #8 0x00005555562d8098 in fil_aio_callback (cb=cb@entry=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/fil/fil0fil.cc:4375
            #9 0x000055555616e363 in io_callback (cb=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/os/os0file.cc:3880
            #10 0x0000555556350707 in tpool::task_group::execute (this=0x555557a3d480, t=0x555557a60ea8) at /home/fungo/Projects/mariadb-server/tpool/task_group.cc:55
            #11 0x000055555634efc1 in tpool::thread_pool_generic::worker_main (this=0x5555579fe090, thread_var=0x555557a0d8a0) at /home/fungo/Projects/mariadb-server/tpool/tpool_generic.cc:509
            #12 0x00007ffff6c36360 in ?? () from /lib64/libstdc++.so.6
            #13 0x00007ffff7bc6e25 in start_thread () from /lib64/libpthread.so.0
            #14 0x00007ffff6398f1d in clone () from /lib64/libc.so.6
            {noformat}

            {noformat}
            >>> p rec
            $1 = (rec_t *) 0x0

            {noformat}

            Analysis:

            The root cause is:
            1. Redo log is written (page_cur_insert_rec_write_log()) along with every record insertion in PageBulk::insert().
            2. The index page header info is fixed up in BtrBulk::pageCommit() by invoking PageBulk::finishPage(); PageBulk::commit() then commits the mtr (releasing buf_fix_count and the rwlock, and writing the local redo to the global log buffer).
            3. However, PageBulk::release() may commit the mtr too early, before the header has been fixed up.



            If the mtr is committed by PageBulk::release(), the rwlock is released and the dirty page can be flushed to disk, even though the index page header info has not been fixed up yet.

            So if:
            1. a checkpoint happens after PageBulk::release() and before PageBulk::commit(), and
            2. mysqld is killed (by the OOM killer or otherwise) before InnoDB makes a new checkpoint,
            3. then the crash recovery after step 2 will crash as shown in the stack trace above.
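
The sequence above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not InnoDB code: `Page`, `Redo`, `bulk_insert`, `finish_page`, `recover` and `crash_recovery_ok` are hypothetical names, the page header is reduced to a single record count, and recovery is assumed to replay every redo record newer than the checkpoint onto the on-disk page image.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy page: `recs` stands for the record bytes, `header_n_recs` for the
// header fields that the bulk loader only fills in at finishPage().
struct Page {
    std::vector<int> recs;
    std::size_t header_n_recs = 0;
};

// Toy redo record: "insert `value` as record number `pos`" (0-based).
struct Redo {
    std::size_t lsn;
    std::size_t pos;
    int value;
};

// Bulk-load insert: appends the record and logs it, but leaves the header
// untouched (modelled on the description of PageBulk::insert()).
void bulk_insert(Page& page, std::vector<Redo>& log, int value) {
    log.push_back({log.size() + 1, page.recs.size(), value});
    page.recs.push_back(value);
}

// Header fix-up, standing in for PageBulk::finishPage().
void finish_page(Page& page) { page.header_n_recs = page.recs.size(); }

// Recovery (simplified assumption): replay every redo record newer than the
// checkpoint onto the on-disk image, trusting the header for the position.
bool recover(Page disk, const std::vector<Redo>& log,
             std::size_t checkpoint_lsn) {
    for (const Redo& r : log) {
        if (r.lsn <= checkpoint_lsn) continue;          // already checkpointed
        if (r.pos != disk.header_n_recs) return false;  // header and log disagree
        disk.recs.push_back(r.value);
        ++disk.header_n_recs;
    }
    return true;
}

// Returns true if crash recovery succeeds. `early_flush` models a page flush
// plus checkpoint between PageBulk::release() and PageBulk::commit().
bool crash_recovery_ok(bool early_flush) {
    Page mem, disk;  // `disk` starts as a freshly initialized, empty page
    std::vector<Redo> log;
    std::size_t checkpoint_lsn = 0;

    bulk_insert(mem, log, 10);
    bulk_insert(mem, log, 20);
    // release(): the mtr commits and the latch is dropped; header still stale.
    if (early_flush) {
        disk = mem;                       // page cleaner writes the dirty page
        checkpoint_lsn = log.back().lsn;  // checkpoint covers the first mtr
    }
    bulk_insert(mem, log, 30);
    bulk_insert(mem, log, 40);
    finish_page(mem);  // header is fixed up in memory only
    assert(mem.header_n_recs == 4);
    // kill -9 here: `mem` is lost, only `disk` and `log` survive.
    return recover(disk, log, checkpoint_lsn);
}
```

In this model, `crash_recovery_ok(false)` succeeds because all inserts are replayed onto an empty page, while `crash_recovery_ok(true)` fails: the flushed image already holds two records but its header still claims none, so the first replayed insert does not match the page state.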



            Below is how I reproduced this manually (gdb assistance is needed):

            0. In my.cnf, set innodb_sort_buffer_size=65536 to make sure PageBulk::release() will be invoked.

            1. prepare data into t1
            create table t1(id int auto_increment, name varchar(30), primary key(id)) engine=innodb;

            insert into t1 values (1, "MySQL"), (2, "MariaDB"), (3, "AlisQL"), (4, "PolarDB"), (5, "hahaha");


            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;
            insert into t1(name) select a.name from t1 a, t1 b limit 5000;

            2. set gdb breakpoint at PageBulk::release()

            3. run
            optimize table t1;

            4. gdb will break at PageBulk::release()
            let PageBulk::release() return by issuing 2 or more finish commands

            then use gdb to manually call os_thread_sleep(30*1000000) to block the current bulk load thread; this gives the page cleaner enough time to flush all dirty pages and advance the checkpoint

            5. after os_thread_sleep() returns, use the same gdb trick as in step 4 to block the page cleaner thread forever, e.g. call os_thread_sleep(3000*1000000)

            6. the optimize query from step 3 will finish

            7. `show engine innodb status` shows that there is some un-checkpointed redo

            8. then kill -9 mysqld; the subsequent crash recovery will crash
            elenst Elena Stepanova made changes -
            Fix Version/s 10.5 [ 23123 ]
            Assignee Marko Mäkelä [ marko ]
            Bernardo Perez added a comment - edited

            We seem to have hit this same issue in 10.3.8.

            Thread 7 seems to show a stack similar to the one provided by Fungo in the initial bug report:

            (gdb) 
             
            Thread 18 (Thread 0x2b94f1e00700 (LWP 25428)):
            #0  0x00002b6ef9692a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681d431 in os_event::timed_wait (this=this@entry=0x2b6efafde1e0, abstime=abstime@entry=0x2b94f1dffdb0) at /local/MariaDB/storage/innobase/os/os0event.cc:283
            #2  0x000055c76681da59 in wait_time_low (reset_sig_count=1, time_in_usec=100000, this=0x2b6efafde1e0) at /local/MariaDB/storage/innobase/os/os0event.cc:405
            #3  os_event_wait_time_low (event=0x2b6efafde1e0, time_in_usec=time_in_usec@entry=100000, reset_sig_count=<optimized out>) at /local/MariaDB/storage/innobase/os/os0event.cc:505
            #4  0x000055c766804a20 in recv_writer_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/log/log0recv.cc:526
            #5  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #6  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 16 (Thread 0x2b94f0a00700 (LWP 25420)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b935f52a270) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=15, this=0x2b935f52a270) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=0x2b935f52a270, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766947b43 in buf_flush_page_cleaner_worker (arg=<optimized out>) at /local/MariaDB/storage/innobase/buf/buf0flu.cc:3498
            #5  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #6  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 15 (Thread 0x2b94f0400700 (LWP 25419)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b935f52a270) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=15, this=0x2b935f52a270) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=0x2b935f52a270, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766947b43 in buf_flush_page_cleaner_worker (arg=<optimized out>) at /local/MariaDB/storage/innobase/buf/buf0flu.cc:3498
            #5  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #6  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 14 (Thread 0x2b94efe00700 (LWP 25418)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b935f52a270) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=15, this=0x2b935f52a270) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=0x2b935f52a270, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766947b43 in buf_flush_page_cleaner_worker (arg=<optimized out>) at /local/MariaDB/storage/innobase/buf/buf0flu.cc:3498
            #5  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #6  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 13 (Thread 0x2b94ef800700 (LWP 25417)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b935f529fd0) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=13, this=0x2b935f529fd0) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=0x2b935f529fd0, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766949695 in buf_flush_page_cleaner_coordinator () at /local/MariaDB/storage/innobase/buf/buf0flu.cc:3092
            #5  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #6  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 12 (Thread 0x2b94ef200700 (LWP 25416)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b6efafdee20) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=1, this=0x2b6efafdee20, this@entry=<error reading variable: Cannot access memory at address 0xfffffffffffffea1>) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=event@entry=0x2b6efafdee20, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766819bda in os_aio_simulated_handler (type=0x2b94ef1ffcb0, m2=0x2b94ef1ffca8, m1=0x2b94ef1ffca0, global_segment=9) at /local/MariaDB/storage/innobase/os/os0file.cc:7233
            #5  os_aio_handler (segment=segment@entry=9, m1=m1@entry=0x2b94ef1ffca0, m2=m2@entry=0x2b94ef1ffca8, request=request@entry=0x2b94ef1ffcb0) at /local/MariaDB/storage/innobase/os/os0file.cc:5670
            #6  0x000055c76699b7ad in fil_aio_wait (segment=segment@entry=9) at /local/MariaDB/storage/innobase/fil/fil0fil.cc:4559
            #7  0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:331
            #8  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #9  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 11 (Thread 0x2b94eec00700 (LWP 25415)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b6efafdedb0) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=1, this=0x2b6efafdedb0, this@entry=<error reading variable: Cannot access memory at address 0xfffffffffffffea1>) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=event@entry=0x2b6efafdedb0, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766819bda in os_aio_simulated_handler (type=0x2b94eebffcb0, m2=0x2b94eebffca8, m1=0x2b94eebffca0, global_segment=8) at /local/MariaDB/storage/innobase/os/os0file.cc:7233
            #5  os_aio_handler (segment=segment@entry=8, m1=m1@entry=0x2b94eebffca0, m2=m2@entry=0x2b94eebffca8, request=request@entry=0x2b94eebffcb0) at /local/MariaDB/storage/innobase/os/os0file.cc:5670
            #6  0x000055c76699b7ad in fil_aio_wait (segment=segment@entry=8) at /local/MariaDB/storage/innobase/fil/fil0fil.cc:4559
            #7  0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:331
            #8  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #9  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 10 (Thread 0x2b94ee600700 (LWP 25414)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b6efafded40) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=1, this=0x2b6efafded40, this@entry=<error reading variable: Cannot access memory at address 0xfffffffffffffea1>) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=event@entry=0x2b6efafded40, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766819bda in os_aio_simulated_handler (type=0x2b94ee5ffcb0, m2=0x2b94ee5ffca8, m1=0x2b94ee5ffca0, global_segment=7) at /local/MariaDB/storage/innobase/os/os0file.cc:7233
            #5  os_aio_handler (segment=segment@entry=7, m1=m1@entry=0x2b94ee5ffca0, m2=m2@entry=0x2b94ee5ffca8, request=request@entry=0x2b94ee5ffcb0) at /local/MariaDB/storage/innobase/os/os0file.cc:5670
            #6  0x000055c76699b7ad in fil_aio_wait (segment=segment@entry=7) at /local/MariaDB/storage/innobase/fil/fil0fil.cc:4559
            #7  0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:331
            #8  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #9  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 9 (Thread 0x2b94ee000700 (LWP 25413)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b6efafdecd0) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=1, this=0x2b6efafdecd0, this@entry=<error reading variable: Cannot access memory at address 0xfffffffffffffea1>) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=event@entry=0x2b6efafdecd0, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766819bda in os_aio_simulated_handler (type=0x2b94edfffcb0, m2=0x2b94edfffca8, m1=0x2b94edfffca0, global_segment=6) at /local/MariaDB/storage/innobase/os/os0file.cc:7233
            #5  os_aio_handler (segment=segment@entry=6, m1=m1@entry=0x2b94edfffca0, m2=m2@entry=0x2b94edfffca8, request=request@entry=0x2b94edfffcb0) at /local/MariaDB/storage/innobase/os/os0file.cc:5670
            #6  0x000055c76699b7ad in fil_aio_wait (segment=segment@entry=6) at /local/MariaDB/storage/innobase/fil/fil0fil.cc:4559
            #7  0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:331
            #8  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #9  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 8 (Thread 0x2b94eda00700 (LWP 25412)):
            #0  0x00002b6ef9695a93 in pread64 () from /lib64/libpthread.so.0
            #1  0x000055c766818935 in pread (__offset=33390592, __nbytes=16384, __buf=0x2b6efdfd8000, __fd=10) at /usr/include/bits/unistd.h:100
            #2  execute (request=..., this=<synthetic pointer>) at /local/MariaDB/storage/innobase/os/os0file.cc:1613
            #3  os_file_io (in_type=..., file=file@entry=10, buf=buf@entry=0x2b6efdfd8000, n=n@entry=16384, offset=offset@entry=33390592, err=err@entry=0x2b94ed9ff92c) at /local/MariaDB/storage/innobase/os/os0file.cc:4869
            #4  0x000055c766819361 in os_file_pread (err=0x2b94ed9ff92c, offset=33390592, n=16384, buf=0x2b6efdfd8000, file=10, type=...) at /local/MariaDB/storage/innobase/os/os0file.cc:5038
            #5  os_file_read_page (type=..., file=file@entry=10, buf=buf@entry=0x2b6efdfd8000, offset=offset@entry=33390592, n=n@entry=16384, o=o@entry=0x0, exit_on_err=true) at /local/MariaDB/storage/innobase/os/os0file.cc:5072
            #6  0x000055c76681a215 in os_file_read_func (n=16384, offset=33390592, buf=0x2b6efdfd8000, file=<optimized out>, type=...) at /local/MariaDB/storage/innobase/os/os0file.cc:5431
            #7  pfs_os_file_read_func (src_file=0x55c766e695d0 "/local/MariaDB/storage/innobase/os/os0file.cc", src_line=6958, n=16384, offset=33390592, buf=0x2b6efdfd8000, type=..., file=...) at /local/MariaDB/storage/innobase/include/os0file.ic:296
            #8  read (this=<synthetic pointer>, slot=0x2b6efda42398) at /local/MariaDB/storage/innobase/os/os0file.cc:6953
            #9  io (this=<synthetic pointer>) at /local/MariaDB/storage/innobase/os/os0file.cc:6917
            #10 os_aio_simulated_handler (type=0x2b94ed9ffcb0, m2=0x2b94ed9ffca8, m1=0x2b94ed9ffca0, global_segment=5) at /local/MariaDB/storage/innobase/os/os0file.cc:7262
            #11 os_aio_handler (segment=segment@entry=5, m1=m1@entry=0x2b94ed9ffca0, m2=m2@entry=0x2b94ed9ffca8, request=request@entry=0x2b94ed9ffcb0) at /local/MariaDB/storage/innobase/os/os0file.cc:5670
            #12 0x000055c76699b7ad in fil_aio_wait (segment=segment@entry=5) at /local/MariaDB/storage/innobase/fil/fil0fil.cc:4559
            #13 0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:331
            #14 0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #15 0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 7 (Thread 0x2b94ed400700 (LWP 25411)):
            #0  0x00002b6ef9df05f7 in raise () from /lib64/libc.so.6
            #1  0x00002b6ef9df1ce8 in abort () from /lib64/libc.so.6
            #2  0x000055c766285687 in ut_dbg_assertion_failed (expr=expr@entry=0x0, file=file@entry=0x55c766e6a640 "/local/MariaDB/storage/innobase/page/page0cur.cc", line=line@entry=1189) at /local/MariaDB/storage/innobase/ut/ut0dbg.cc:61
            #3  0x000055c766821f53 in page_cur_parse_insert_rec (is_short=is_short@entry=0, ptr=<optimized out>, ptr@entry=0x2b6efdf00360 <incomplete sequence \366\201\241>, end_ptr=end_ptr@entry=0x2b6efdf00437 "1\b", block=block@entry=0x2b8ed5600000, index=0x2b94edc1f070, mtr=mtr@entry=0x2b94ed3ff590)
                at /local/MariaDB/storage/innobase/page/page0cur.cc:1189
            #4  0x000055c76680674f in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_INSERT, ptr=0x2b6efdf00360 <incomplete sequence \366\201\241>, ptr@entry=0x2b6efdf00330 "", end_ptr=0x2b6efdf00437 "1\b", space_id=<optimized out>, page_no=2482, apply=apply@entry=true, block=0x2b8ed5600000, mtr=0x2b94ed3ff590)
                at /local/MariaDB/storage/innobase/log/log0recv.cc:1248
            #5  0x000055c766807ce5 in recv_recover_page (just_read_in=just_read_in@entry=true, block=block@entry=0x2b8ed5600000) at /local/MariaDB/storage/innobase/log/log0recv.cc:1799
            #6  0x000055c7669324c2 in buf_page_io_complete (bpage=0x2b8ed5600000, dblwr=dblwr@entry=true, evict=evict@entry=false) at /local/MariaDB/storage/innobase/buf/buf0buf.cc:6241
            #7  0x000055c76699b8f7 in fil_aio_wait (segment=segment@entry=4) at /local/MariaDB/storage/innobase/fil/fil0fil.cc:4624
            #8  0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:331
            #9  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #10 0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 6 (Thread 0x2b94ece00700 (LWP 25410)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b6efafdeb80) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=3, this=0x2b6efafdeb80, this@entry=<error reading variable: Cannot access memory at address 0xfffffffffffffea3>) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=event@entry=0x2b6efafdeb80, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766819bda in os_aio_simulated_handler (type=0x2b94ecdffcb0, m2=0x2b94ecdffca8, m1=0x2b94ecdffca0, global_segment=3) at /local/MariaDB/storage/innobase/os/os0file.cc:7233
            #5  os_aio_handler (segment=segment@entry=3, m1=m1@entry=0x2b94ecdffca0, m2=m2@entry=0x2b94ecdffca8, request=request@entry=0x2b94ecdffcb0) at /local/MariaDB/storage/innobase/os/os0file.cc:5670
            #6  0x000055c76699b7ad in fil_aio_wait (segment=segment@entry=3) at /local/MariaDB/storage/innobase/fil/fil0fil.cc:4559
            #7  0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:331
            #8  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #9  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 5 (Thread 0x2b94ec800700 (LWP 25409)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b6efafdeb10) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=2, this=0x2b6efafdeb10, this@entry=<error reading variable: Cannot access memory at address 0xfffffffffffffea2>) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=event@entry=0x2b6efafdeb10, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766819bda in os_aio_simulated_handler (type=0x2b94ec7ffcb0, m2=0x2b94ec7ffca8, m1=0x2b94ec7ffca0, global_segment=2) at /local/MariaDB/storage/innobase/os/os0file.cc:7233
            #5  os_aio_handler (segment=segment@entry=2, m1=m1@entry=0x2b94ec7ffca0, m2=m2@entry=0x2b94ec7ffca8, request=request@entry=0x2b94ec7ffcb0) at /local/MariaDB/storage/innobase/os/os0file.cc:5670
            #6  0x000055c76699b7ad in fil_aio_wait (segment=segment@entry=2) at /local/MariaDB/storage/innobase/fil/fil0fil.cc:4559
            #7  0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:331
            #8  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #9  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 4 (Thread 0x2b94ec201700 (LWP 25408)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b6efafdeaa0) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=2, this=0x2b6efafdeaa0, this@entry=<error reading variable: Cannot access memory at address 0xfffffffffffffea2>) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=event@entry=0x2b6efafdeaa0, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766819bda in os_aio_simulated_handler (type=0x2b94ec200cb0, m2=0x2b94ec200ca8, m1=0x2b94ec200ca0, global_segment=1) at /local/MariaDB/storage/innobase/os/os0file.cc:7233
            #5  os_aio_handler (segment=segment@entry=1, m1=m1@entry=0x2b94ec200ca0, m2=m2@entry=0x2b94ec200ca8, request=request@entry=0x2b94ec200cb0) at /local/MariaDB/storage/innobase/os/os0file.cc:5670
            #6  0x000055c76699b7ad in fil_aio_wait (segment=segment@entry=1) at /local/MariaDB/storage/innobase/fil/fil0fil.cc:4559
            #7  0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:331
            #8  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #9  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 3 (Thread 0x2b94ec000700 (LWP 25407)):
            #0  0x00002b6ef96926d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c76681db9f in wait (this=0x2b6efafdea30) at /local/MariaDB/storage/innobase/os/os0event.cc:163
            #2  wait_low (reset_sig_count=1, this=0x2b6efafdea30, this@entry=<error reading variable: Cannot access memory at address 0xfffffffffffffea1>) at /local/MariaDB/storage/innobase/os/os0event.cc:333
            #3  os_event_wait_low (event=event@entry=0x2b6efafdea30, reset_sig_count=reset_sig_count@entry=0) at /local/MariaDB/storage/innobase/os/os0event.cc:522
            #4  0x000055c766819bda in os_aio_simulated_handler (type=0x2b94ebfffcb0, m2=0x2b94ebfffca8, m1=0x2b94ebfffca0, global_segment=0) at /local/MariaDB/storage/innobase/os/os0file.cc:7233
            #5  os_aio_handler (segment=segment@entry=0, m1=m1@entry=0x2b94ebfffca0, m2=m2@entry=0x2b94ebfffca8, request=request@entry=0x2b94ebfffcb0) at /local/MariaDB/storage/innobase/os/os0file.cc:5670
            #6  0x000055c76699b7ad in fil_aio_wait (segment=segment@entry=0) at /local/MariaDB/storage/innobase/fil/fil0fil.cc:4559
            #7  0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:331
            #8  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #9  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 2 (Thread 0x2b6efb200700 (LWP 25233)):
            #0  0x00002b6ef9692a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
            #1  0x000055c766a92d59 in inline_mysql_cond_timedwait (that=0x55c767fa6460 <COND_timer>, mutex=0x55c767fa64a0 <LOCK_timer>, src_file=0x55c766ea30d8 "/local/MariaDB/mysys/thr_timer.c", src_line=292, abstime=0x2b6efb1ffe10) at /local/MariaDB/include/mysql/psi/mysql_thread.h:1215
            #2  timer_handler (arg=<optimized out>) at /local/MariaDB/mysys/thr_timer.c:292
            #3  0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so.0
            #4  0x00002b6ef9eb1c9d in clone () from /lib64/libc.so.6
             
            Thread 1 (Thread 0x2b6ef87d1f00 (LWP 25232)):
            #0  0x00002b6ef969596d in nanosleep () from /lib64/libpthread.so.0
            #1  0x000055c76681ddec in os_thread_sleep (tm=tm@entry=500000) at /local/MariaDB/storage/innobase/os/os0thread.cc:230
            #2  0x000055c766808760 in recv_apply_hashed_log_recs (last_batch=last_batch@entry=true) at /local/MariaDB/storage/innobase/log/log0recv.cc:2029
            #3  0x000055c7668b5cbc in srv_start (create_new_db=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc:2033
            #4  0x000055c7667af722 in innodb_init (p=<optimized out>) at /local/MariaDB/storage/innobase/handler/ha_innodb.cc:4271
            #5  0x000055c766523887 in ha_initialize_handlerton (plugin=0x2b6efadb7d08) at /local/MariaDB/sql/handler.cc:522
            #6  0x000055c76636954b in plugin_initialize (tmp_root=tmp_root@entry=0x7ffff2724fe0, plugin=plugin@entry=0x2b6efadb7d08, argc=argc@entry=0x55c767718e28 <remaining_argc>, argv=argv@entry=0x2b6efabd9138, options_only=options_only@entry=false) at /local/MariaDB/sql/sql_plugin.cc:1432
            #7  0x000055c76636a69a in plugin_init (argc=argc@entry=0x55c767718e28 <remaining_argc>, argv=0x2b6efabd9138, flags=2) at /local/MariaDB/sql/sql_plugin.cc:1714
            #8  0x000055c7662a6051 in init_server_components () at /local/MariaDB/sql/mysqld.cc:5375
            #9  0x000055c7662ac095 in mysqld_main (argc=<optimized out>, argv=<optimized out>) at /local/MariaDB/sql/mysqld.cc:5982
            #10 0x00002b6ef9ddcb15 in __libc_start_main () from /lib64/libc.so.6
            #11 0x000055c76629f545 in _start ()
            

0 # 1 0x000055c76681db9f in wait ( this = 0x2b6efafdecd0 ) at /local/MariaDB/storage/innobase/os/os0event.cc: 163 # 2 wait_low (reset_sig_count= 1 , this = 0x2b6efafdecd0 , this @entry =<error reading variable: Cannot access memory at address 0xfffffffffffffea1 >) at /local/MariaDB/storage/innobase/os/os0event.cc: 333 # 3 os_event_wait_low (event=event @entry = 0x2b6efafdecd0 , reset_sig_count=reset_sig_count @entry = 0 ) at /local/MariaDB/storage/innobase/os/os0event.cc: 522 # 4 0x000055c766819bda in os_aio_simulated_handler (type= 0x2b94edfffcb0 , m2= 0x2b94edfffca8 , m1= 0x2b94edfffca0 , global_segment= 6 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 7233 # 5 os_aio_handler (segment=segment @entry = 6 , m1=m1 @entry = 0x2b94edfffca0 , m2=m2 @entry = 0x2b94edfffca8 , request=request @entry = 0x2b94edfffcb0 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 5670 # 6 0x000055c76699b7ad in fil_aio_wait (segment=segment @entry = 6 ) at /local/MariaDB/storage/innobase/fil/fil0fil.cc: 4559 # 7 0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc: 331 # 8 0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so. 0 # 9 0x00002b6ef9eb1c9d in clone () from /lib64/libc.so. 6   Thread 8 (Thread 0x2b94eda00700 (LWP 25412 )): # 0 0x00002b6ef9695a93 in pread64 () from /lib64/libpthread.so. 0 # 1 0x000055c766818935 in pread (__offset= 33390592 , __nbytes= 16384 , __buf= 0x2b6efdfd8000 , __fd= 10 ) at /usr/include/bits/unistd.h: 100 # 2 execute (request=..., this =<synthetic pointer>) at /local/MariaDB/storage/innobase/os/os0file.cc: 1613 # 3 os_file_io (in_type=..., file=file @entry = 10 , buf=buf @entry = 0x2b6efdfd8000 , n=n @entry = 16384 , offset=offset @entry = 33390592 , err=err @entry = 0x2b94ed9ff92c ) at /local/MariaDB/storage/innobase/os/os0file.cc: 4869 # 4 0x000055c766819361 in os_file_pread (err= 0x2b94ed9ff92c , offset= 33390592 , n= 16384 , buf= 0x2b6efdfd8000 , file= 10 , type=...) 
at /local/MariaDB/storage/innobase/os/os0file.cc: 5038 # 5 os_file_read_page (type=..., file=file @entry = 10 , buf=buf @entry = 0x2b6efdfd8000 , offset=offset @entry = 33390592 , n=n @entry = 16384 , o=o @entry = 0x0 , exit_on_err= true ) at /local/MariaDB/storage/innobase/os/os0file.cc: 5072 # 6 0x000055c76681a215 in os_file_read_func (n= 16384 , offset= 33390592 , buf= 0x2b6efdfd8000 , file=<optimized out>, type=...) at /local/MariaDB/storage/innobase/os/os0file.cc: 5431 # 7 pfs_os_file_read_func (src_file= 0x55c766e695d0 "/local/MariaDB/storage/innobase/os/os0file.cc" , src_line= 6958 , n= 16384 , offset= 33390592 , buf= 0x2b6efdfd8000 , type=..., file=...) at /local/MariaDB/storage/innobase/include/os0file.ic: 296 # 8 read ( this =<synthetic pointer>, slot= 0x2b6efda42398 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 6953 # 9 io ( this =<synthetic pointer>) at /local/MariaDB/storage/innobase/os/os0file.cc: 6917 # 10 os_aio_simulated_handler (type= 0x2b94ed9ffcb0 , m2= 0x2b94ed9ffca8 , m1= 0x2b94ed9ffca0 , global_segment= 5 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 7262 # 11 os_aio_handler (segment=segment @entry = 5 , m1=m1 @entry = 0x2b94ed9ffca0 , m2=m2 @entry = 0x2b94ed9ffca8 , request=request @entry = 0x2b94ed9ffcb0 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 5670 # 12 0x000055c76699b7ad in fil_aio_wait (segment=segment @entry = 5 ) at /local/MariaDB/storage/innobase/fil/fil0fil.cc: 4559 # 13 0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc: 331 # 14 0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so. 0 # 15 0x00002b6ef9eb1c9d in clone () from /lib64/libc.so. 6   Thread 7 (Thread 0x2b94ed400700 (LWP 25411 )): # 0 0x00002b6ef9df05f7 in raise () from /lib64/libc.so. 6 # 1 0x00002b6ef9df1ce8 in abort () from /lib64/libc.so. 
6 # 2 0x000055c766285687 in ut_dbg_assertion_failed (expr=expr @entry = 0x0 , file=file @entry = 0x55c766e6a640 "/local/MariaDB/storage/innobase/page/page0cur.cc" , line=line @entry = 1189 ) at /local/MariaDB/storage/innobase/ut/ut0dbg.cc: 61 # 3 0x000055c766821f53 in page_cur_parse_insert_rec (is_short=is_short @entry = 0 , ptr=<optimized out>, ptr @entry = 0x2b6efdf00360 <incomplete sequence \ 366 \ 201 \ 241 >, end_ptr=end_ptr @entry = 0x2b6efdf00437 "1\b" , block=block @entry = 0x2b8ed5600000 , index= 0x2b94edc1f070 , mtr=mtr @entry = 0x2b94ed3ff590 ) at /local/MariaDB/storage/innobase/page/page0cur.cc: 1189 # 4 0x000055c76680674f in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_INSERT, ptr= 0x2b6efdf00360 <incomplete sequence \ 366 \ 201 \ 241 >, ptr @entry = 0x2b6efdf00330 "" , end_ptr= 0x2b6efdf00437 "1\b" , space_id=<optimized out>, page_no= 2482 , apply=apply @entry = true , block= 0x2b8ed5600000 , mtr= 0x2b94ed3ff590 ) at /local/MariaDB/storage/innobase/log/log0recv.cc: 1248 # 5 0x000055c766807ce5 in recv_recover_page (just_read_in=just_read_in @entry = true , block=block @entry = 0x2b8ed5600000 ) at /local/MariaDB/storage/innobase/log/log0recv.cc: 1799 # 6 0x000055c7669324c2 in buf_page_io_complete (bpage= 0x2b8ed5600000 , dblwr=dblwr @entry = true , evict=evict @entry = false ) at /local/MariaDB/storage/innobase/buf/buf0buf.cc: 6241 # 7 0x000055c76699b8f7 in fil_aio_wait (segment=segment @entry = 4 ) at /local/MariaDB/storage/innobase/fil/fil0fil.cc: 4624 # 8 0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc: 331 # 9 0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so. 0 # 10 0x00002b6ef9eb1c9d in clone () from /lib64/libc.so. 6   Thread 6 (Thread 0x2b94ece00700 (LWP 25410 )): # 0 0x00002b6ef96926d5 in pthread_cond_wait@ @GLIBC_2 . 3.2 () from /lib64/libpthread.so. 
0 # 1 0x000055c76681db9f in wait ( this = 0x2b6efafdeb80 ) at /local/MariaDB/storage/innobase/os/os0event.cc: 163 # 2 wait_low (reset_sig_count= 3 , this = 0x2b6efafdeb80 , this @entry =<error reading variable: Cannot access memory at address 0xfffffffffffffea3 >) at /local/MariaDB/storage/innobase/os/os0event.cc: 333 # 3 os_event_wait_low (event=event @entry = 0x2b6efafdeb80 , reset_sig_count=reset_sig_count @entry = 0 ) at /local/MariaDB/storage/innobase/os/os0event.cc: 522 # 4 0x000055c766819bda in os_aio_simulated_handler (type= 0x2b94ecdffcb0 , m2= 0x2b94ecdffca8 , m1= 0x2b94ecdffca0 , global_segment= 3 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 7233 # 5 os_aio_handler (segment=segment @entry = 3 , m1=m1 @entry = 0x2b94ecdffca0 , m2=m2 @entry = 0x2b94ecdffca8 , request=request @entry = 0x2b94ecdffcb0 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 5670 # 6 0x000055c76699b7ad in fil_aio_wait (segment=segment @entry = 3 ) at /local/MariaDB/storage/innobase/fil/fil0fil.cc: 4559 # 7 0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc: 331 # 8 0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so. 0 # 9 0x00002b6ef9eb1c9d in clone () from /lib64/libc.so. 6   Thread 5 (Thread 0x2b94ec800700 (LWP 25409 )): # 0 0x00002b6ef96926d5 in pthread_cond_wait@ @GLIBC_2 . 3.2 () from /lib64/libpthread.so. 
0 # 1 0x000055c76681db9f in wait ( this = 0x2b6efafdeb10 ) at /local/MariaDB/storage/innobase/os/os0event.cc: 163 # 2 wait_low (reset_sig_count= 2 , this = 0x2b6efafdeb10 , this @entry =<error reading variable: Cannot access memory at address 0xfffffffffffffea2 >) at /local/MariaDB/storage/innobase/os/os0event.cc: 333 # 3 os_event_wait_low (event=event @entry = 0x2b6efafdeb10 , reset_sig_count=reset_sig_count @entry = 0 ) at /local/MariaDB/storage/innobase/os/os0event.cc: 522 # 4 0x000055c766819bda in os_aio_simulated_handler (type= 0x2b94ec7ffcb0 , m2= 0x2b94ec7ffca8 , m1= 0x2b94ec7ffca0 , global_segment= 2 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 7233 # 5 os_aio_handler (segment=segment @entry = 2 , m1=m1 @entry = 0x2b94ec7ffca0 , m2=m2 @entry = 0x2b94ec7ffca8 , request=request @entry = 0x2b94ec7ffcb0 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 5670 # 6 0x000055c76699b7ad in fil_aio_wait (segment=segment @entry = 2 ) at /local/MariaDB/storage/innobase/fil/fil0fil.cc: 4559 # 7 0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc: 331 # 8 0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so. 0 # 9 0x00002b6ef9eb1c9d in clone () from /lib64/libc.so. 6   Thread 4 (Thread 0x2b94ec201700 (LWP 25408 )): # 0 0x00002b6ef96926d5 in pthread_cond_wait@ @GLIBC_2 . 3.2 () from /lib64/libpthread.so. 
0 # 1 0x000055c76681db9f in wait ( this = 0x2b6efafdeaa0 ) at /local/MariaDB/storage/innobase/os/os0event.cc: 163 # 2 wait_low (reset_sig_count= 2 , this = 0x2b6efafdeaa0 , this @entry =<error reading variable: Cannot access memory at address 0xfffffffffffffea2 >) at /local/MariaDB/storage/innobase/os/os0event.cc: 333 # 3 os_event_wait_low (event=event @entry = 0x2b6efafdeaa0 , reset_sig_count=reset_sig_count @entry = 0 ) at /local/MariaDB/storage/innobase/os/os0event.cc: 522 # 4 0x000055c766819bda in os_aio_simulated_handler (type= 0x2b94ec200cb0 , m2= 0x2b94ec200ca8 , m1= 0x2b94ec200ca0 , global_segment= 1 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 7233 # 5 os_aio_handler (segment=segment @entry = 1 , m1=m1 @entry = 0x2b94ec200ca0 , m2=m2 @entry = 0x2b94ec200ca8 , request=request @entry = 0x2b94ec200cb0 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 5670 # 6 0x000055c76699b7ad in fil_aio_wait (segment=segment @entry = 1 ) at /local/MariaDB/storage/innobase/fil/fil0fil.cc: 4559 # 7 0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc: 331 # 8 0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so. 0 # 9 0x00002b6ef9eb1c9d in clone () from /lib64/libc.so. 6   Thread 3 (Thread 0x2b94ec000700 (LWP 25407 )): # 0 0x00002b6ef96926d5 in pthread_cond_wait@ @GLIBC_2 . 3.2 () from /lib64/libpthread.so. 
0 # 1 0x000055c76681db9f in wait ( this = 0x2b6efafdea30 ) at /local/MariaDB/storage/innobase/os/os0event.cc: 163 # 2 wait_low (reset_sig_count= 1 , this = 0x2b6efafdea30 , this @entry =<error reading variable: Cannot access memory at address 0xfffffffffffffea1 >) at /local/MariaDB/storage/innobase/os/os0event.cc: 333 # 3 os_event_wait_low (event=event @entry = 0x2b6efafdea30 , reset_sig_count=reset_sig_count @entry = 0 ) at /local/MariaDB/storage/innobase/os/os0event.cc: 522 # 4 0x000055c766819bda in os_aio_simulated_handler (type= 0x2b94ebfffcb0 , m2= 0x2b94ebfffca8 , m1= 0x2b94ebfffca0 , global_segment= 0 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 7233 # 5 os_aio_handler (segment=segment @entry = 0 , m1=m1 @entry = 0x2b94ebfffca0 , m2=m2 @entry = 0x2b94ebfffca8 , request=request @entry = 0x2b94ebfffcb0 ) at /local/MariaDB/storage/innobase/os/os0file.cc: 5670 # 6 0x000055c76699b7ad in fil_aio_wait (segment=segment @entry = 0 ) at /local/MariaDB/storage/innobase/fil/fil0fil.cc: 4559 # 7 0x000055c7668b06f0 in io_handler_thread (arg=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc: 331 # 8 0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so. 0 # 9 0x00002b6ef9eb1c9d in clone () from /lib64/libc.so. 6   Thread 2 (Thread 0x2b6efb200700 (LWP 25233 )): # 0 0x00002b6ef9692a82 in pthread_cond_timedwait@ @GLIBC_2 . 3.2 () from /lib64/libpthread.so. 0 # 1 0x000055c766a92d59 in inline_mysql_cond_timedwait (that= 0x55c767fa6460 <COND_timer>, mutex= 0x55c767fa64a0 <LOCK_timer>, src_file= 0x55c766ea30d8 "/local/MariaDB/mysys/thr_timer.c" , src_line= 292 , abstime= 0x2b6efb1ffe10 ) at /local/MariaDB/include/mysql/psi/mysql_thread.h: 1215 # 2 timer_handler (arg=<optimized out>) at /local/MariaDB/mysys/thr_timer.c: 292 # 3 0x00002b6ef968edc5 in start_thread () from /lib64/libpthread.so. 0 # 4 0x00002b6ef9eb1c9d in clone () from /lib64/libc.so. 
6   Thread 1 (Thread 0x2b6ef87d1f00 (LWP 25232 )): # 0 0x00002b6ef969596d in nanosleep () from /lib64/libpthread.so. 0 # 1 0x000055c76681ddec in os_thread_sleep (tm=tm @entry = 500000 ) at /local/MariaDB/storage/innobase/os/os0thread.cc: 230 # 2 0x000055c766808760 in recv_apply_hashed_log_recs (last_batch=last_batch @entry = true ) at /local/MariaDB/storage/innobase/log/log0recv.cc: 2029 # 3 0x000055c7668b5cbc in srv_start (create_new_db=<optimized out>) at /local/MariaDB/storage/innobase/srv/srv0start.cc: 2033 # 4 0x000055c7667af722 in innodb_init (p=<optimized out>) at /local/MariaDB/storage/innobase/handler/ha_innodb.cc: 4271 # 5 0x000055c766523887 in ha_initialize_handlerton (plugin= 0x2b6efadb7d08 ) at /local/MariaDB/sql/handler.cc: 522 # 6 0x000055c76636954b in plugin_initialize (tmp_root=tmp_root @entry = 0x7ffff2724fe0 , plugin=plugin @entry = 0x2b6efadb7d08 , argc=argc @entry = 0x55c767718e28 <remaining_argc>, argv=argv @entry = 0x2b6efabd9138 , options_only=options_only @entry = false ) at /local/MariaDB/sql/sql_plugin.cc: 1432 # 7 0x000055c76636a69a in plugin_init (argc=argc @entry = 0x55c767718e28 <remaining_argc>, argv= 0x2b6efabd9138 , flags= 2 ) at /local/MariaDB/sql/sql_plugin.cc: 1714 # 8 0x000055c7662a6051 in init_server_components () at /local/MariaDB/sql/mysqld.cc: 5375 # 9 0x000055c7662ac095 in mysqld_main (argc=<optimized out>, argv=<optimized out>) at /local/MariaDB/sql/mysqld.cc: 5982 # 10 0x00002b6ef9ddcb15 in __libc_start_main () from /lib64/libc.so. 6 # 11 0x000055c76629f545 in _start ()
            marko Marko Mäkelä made changes -
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Affects Version/s 10.4.0 [ 23115 ]
            Affects Version/s 10.3.9 [ 23114 ]
            Affects Version/s 10.2.17 [ 23111 ]
            Labels corruption

            Thank you, fungo! I would tend to believe that this bug does not affect MariaDB 10.5.2 or later, because MDEV-12353 changed BtrBulk to perform physical logging.

            If I understand correctly, porting the fix from MySQL 8.0 could allow us to finally remove buf_block_t::skip_flush_check. I think that it is worth the effort for that effect alone.

            The hang of thread 1 posted by Bernardo Perez might be due to page corruption. We fixed something in that area in MDEV-12699 and even later. Besides, I would not recommend using any 10.3 version older than 10.3.17 because of the corruption bug MDEV-19916.

            mleich, can you please try to reproduce the problem on the latest 10.2, 10.3, or 10.4, or even on mariadb-10.5.0? I think that we first need a somewhat repeatable test case so that we can be confident about applying the fix.

            marko Marko Mäkelä added the comment above.
            elenst Elena Stepanova made changes -
            Affects Version/s 10.2 [ 14601 ]
            marko Marko Mäkelä made changes -
            Affects Version/s 10.5 [ 23123 ]
            Affects Version/s 10.2 [ 14601 ]

            After I removed buf_block_t::skip_flush_check from 10.2, I finally reproduced a test failure:

            nice ./mtr --repeat=100 --parallel=auto --mysqld=--skip-innodb-log-optimize-ddl innodb.innodb-index{,,,,,,,,}
            

            10.2 83d0e72b34154dd24bb5b66f1732fb7753665d09 without skip_flush_check

            mysqltest: At line 715: query 'alter table t1 change f5 f2n int not null,change f2n f5 int not null,
            add column f8 int not null' failed: 2013: Lost connection to MySQL server during query
            …
            2020-06-02  9:48:34 139942911010560 [ERROR] [FATAL] InnoDB: Apparent corruption of an index page [page id: space=60, page number=6] to be written to data file. We intentionally crash the server to prevent corrupt data from ending up in data files.
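The `{,,,,,,,,}` suffix in the mtr command above is shell brace expansion: each comma-separated alternative (here all empty) yields one copy of the test name, presumably so that `--parallel=auto` can run several copies of the same test concurrently. A minimal Python sketch of that expansion follows. The helper names `expand_braces` and `build_mtr_args` are hypothetical, the sketch handles only a single, non-nested brace group, and it only builds the argument list without running anything.

```python
def expand_braces(word):
    """Expand one non-nested {a,b,...} group roughly the way bash does."""
    start = word.find("{")
    if start == -1:
        return [word]
    end = word.find("}", start)
    if end == -1:
        return [word]
    inner = word[start + 1:end]
    if "," not in inner:
        # bash leaves a brace group without a comma untouched
        return [word]
    prefix, suffix = word[:start], word[end + 1:]
    return [prefix + alternative + suffix for alternative in inner.split(",")]


def build_mtr_args(test_pattern, repeat=100):
    """Build the argument list of the command quoted in the comment."""
    return [
        "nice", "./mtr",
        f"--repeat={repeat}",
        "--parallel=auto",
        "--mysqld=--skip-innodb-log-optimize-ddl",
    ] + expand_braces(test_pattern)


if __name__ == "__main__":
    # Pattern copied from the comment; every expanded word is the plain test name.
    for arg in build_mtr_args("innodb.innodb-index{,,,,,,,,}"):
        print(arg)
```

Because every alternative is empty, all expanded words are identical, and the number of copies is the number of commas plus one.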
            

            Porting the fix from MySQL 8.0 will involve some effort. In 10.5, there will be conflicts due to our rewrite of PageBulk::finish() in MDEV-12353.

            fungo, can you please try to port a minimal version of the MySQL fix to MariaDB 10.2 (but including the removal of buf_block_t::skip_flush_check)? Do you think that we really need the m_last_slotted_rec and m_slotted_rec_no?

            marko Marko Mäkelä added the comment above.
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            marko Marko Mäkelä made changes -
            Summary changed from "Allow full redo logging for ALTER TABLE is not crash safe" to "innodb_log_optimize_ddl=OFF is not crash safe"
            marko Marko Mäkelä made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]

            MDEV-23156 suggests that this does affect 10.5.

            marko Marko Mäkelä added the comment above.

            mleich, please test bb-10.5-MDEV-21347. I hope that MDEV-23156 will be fixed by that.

            Apart from the test innodb.innodb-index, innodb.innodb_bug34300 was causing some headaches in the 10.2 version (bb-10.5-MDEV-21347^2^2^2). Apparently it is rebuilding a table and writing BLOBs in the process.

            marko Marko Mäkelä added the comment above.

            origin/bb-10.2-marko bb1099f7e720aabcba127b77b0e5c98894f52a74 2020-07-15T19:41:01+03:00,
            which contains the fix for MDEV-21347, performed well during RQG testing based on the InnoDB
            standard test battery, modified to:

            • run with innodb-log-optimize-ddl=OFF during the complete test
            • run with innodb-log-optimize-ddl=ON during the complete test
            • switch innodb-log-optimize-ddl during the ongoing test

            None of the tests failed with an unknown assertion, data corruption, or backup failure.
            mleich Matthias Leich added the comment above.

            I merged this up to 10.5. In 10.5, the server always behaves as if innodb_log_optimize_ddl=OFF, and the parameter is deprecated and ignored.

            marko Marko Mäkelä added the comment above.
            marko Marko Mäkelä made changes -
            issue.field.resolutiondate 2020-07-16 04:28:12.0 2020-07-16 04:28:12.358
            marko Marko Mäkelä made changes -
            Fix Version/s 10.2.33 [ 24307 ]
            Fix Version/s 10.3.24 [ 24306 ]
            Fix Version/s 10.4.14 [ 24305 ]
            Fix Version/s 10.5.5 [ 24423 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.5 [ 23123 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
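
The version fields of this issue can be condensed into a small lookup. The following Python sketch is illustrative only. It assumes that each release series is affected from the release listed under "Affects Version/s" (taking 10.5.0 for the unqualified "10.5" entry, which is an assumption) up to, but not including, the release listed under "Fix Version/s". The names `AFFECTED_RANGES`, `parse_version`, and `is_in_affected_range` are hypothetical. In 10.2 to 10.4 the problem additionally requires innodb_log_optimize_ddl=OFF, which the sketch does not check.

```python
# (first affected release, first fixed release) per series, taken from the
# "Affects Version/s" and "Fix Version/s" fields of this issue.
# The 10.5.0 lower bound is an assumption; the issue only says "10.5".
AFFECTED_RANGES = {
    (10, 2): ((10, 2, 17), (10, 2, 33)),
    (10, 3): ((10, 3, 9), (10, 3, 24)),
    (10, 4): ((10, 4, 0), (10, 4, 14)),
    (10, 5): ((10, 5, 0), (10, 5, 5)),
}


def parse_version(text):
    """Parse '10.3.17' or '10.3.17-MariaDB-log' into (10, 3, 17)."""
    numeric = text.split("-", 1)[0]
    return tuple(int(part) for part in numeric.split(".")[:3])


def is_in_affected_range(version_text):
    """Return True if the release lies in a range listed in this issue."""
    version = parse_version(version_text)
    bounds = AFFECTED_RANGES.get(version[:2])
    if bounds is None:
        return False  # series not mentioned in this issue
    first_affected, first_fixed = bounds
    return first_affected <= version < first_fixed


if __name__ == "__main__":
    for candidate in ("10.2.32", "10.2.33", "10.3.8", "10.5.4-MariaDB"):
        print(candidate, is_in_affected_range(candidate))
```

A result of False for a series that is not listed means only that this issue does not mention it, not that the series has been verified.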

            Also origin/bb-10.5-MDEV-21347 c97af34d6a9286c04076b1f25493bd52ecc6b459 2020-07-15T21:13:42+03:00
            behaved well during RQG testing.

            mleich Matthias Leich added the comment above.
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 102533 ] MariaDB v4 [ 157118 ]

            People

              marko Marko Mäkelä
              fungo Fungo Wang