Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21353

ut0byte.ic:91: void* ut_align_down(const void*, ulint): Assertion `ptr' failed.

Details

    Description

      Assert hit during RQG testing
      10.5.1 commit 44be8652c567ccaa0067830fa327cbffbe66b4cd 2019-12-18  compiled with debug
      Version: '10.5.1-MariaDB-debug-log'  socket: '... Source distribution
      mysqld: /home/mleich/Server/10.5/storage/innobase/include/ut0byte.ic:91: void* ut_align_down(const void*, ulint): Assertion `ptr' failed.
      ...
      Query (0x7ff33c0115a0): DELETE FROM t3 WHERE col2 = @MY_INT
      Status: NOT_KILLED
       
      The attached file 'prt' is the log of that MTR test run.
       
      The MTR test could also show some other assert
      mysqld: storage/innobase/btr/btr0pcur.cc:346: ulint btr_pcur_restore_position_func(ulint, btr_pcur_t*, const char*, unsigned int, mtr_t*): Assertion `!cmp_rec_rec(cursor->old_rec, rec, offsets1, offsets2, index)' failed.
      ...
      Query (0x7f26d80115a0): DELETE FROM t3 WHERE col2 = @MY_INT
      Status: NOT_KILLED
       
      The attached file 'prt_other' is the log that MTR test run.
       
       
      10.4.9 commit 576c96a9385f20cf4b6fe685e079dcf8026c3a24 2019-10-29
      10.4.12 commit 088de81d965f1b5ae2f50b1ed6e7cfccc0366098 2019-12-18
      did not replay that crash.
      
      

      Attachments

        1. MDEV-21353.tgz
          3 kB
        2. prt
          99 kB
        3. prt_other
          85 kB

        Activity

          mleich Matthias Leich added a comment - - edited

          How to replay the problem
          0. Use a Linux because the test needs a shellscript (in archive)
          1. cd <source_dir>/mysql-test
          2. tar xvzf MDEV-21353.tgz
          3. cd <bin_dir>/mysql-test
          4. mysql-test-run.pl --mem --parallel=8 MDEV-21353 MDEV-21353 MDEV-21353 MDEV-21353 MDEV-21353 MDEV-21353 MDEV-21353 MDEV-21353 2>&1 | tee prt
          Please adjust the 8 and the list with 8 "MDEV-21353" to the number of CPU's of your box (nproc)
          because it looks like exceptional CPU load is required for replay.
          4. In case of hitting an error mysql-test-run.pl will stop. But at least in my runs it was likely that certain mysqld and perl processes did not stop immediate (but a few minutes later).
          Some killall -9 perl ; killall -9 mysqld helps.
          But please be aware that all processes with such names get killed.

          mleich Matthias Leich added a comment - - edited How to replay the problem 0. Use a Linux because the test needs a shellscript (in archive) 1. cd <source_dir>/mysql-test 2. tar xvzf MDEV-21353 .tgz 3. cd <bin_dir>/mysql-test 4. mysql-test-run.pl --mem --parallel=8 MDEV-21353 MDEV-21353 MDEV-21353 MDEV-21353 MDEV-21353 MDEV-21353 MDEV-21353 MDEV-21353 2>&1 | tee prt Please adjust the 8 and the list with 8 " MDEV-21353 " to the number of CPU's of your box (nproc) because it looks like exceptional CPU load is required for replay. 4. In case of hitting an error mysql-test-run.pl will stop. But at least in my runs it was likely that certain mysqld and perl processes did not stop immediate (but a few minutes later). Some killall -9 perl ; killall -9 mysqld helps. But please be aware that all processes with such names get killed.

          In prt the problem is rec=NULL:

          10.5 44be8652c567ccaa0067830fa327cbffbe66b4cd

          #10 0x0000557505245500 in page_rec_is_comp (rec=0x0) at /home/mleich/Server/10.5/storage/innobase/include/page0page.h:274
          #11 0x00005575053e183e in row_search_mvcc (buf=0x7ff34801e4c8 "\372\233\f", mode=PAGE_CUR_G, prebuilt=0x7ff34801f488, match_mode=0, direction=1) at /home/mleich/Server/10.5/storage/innobase/row/row0sel.cc:4679
          #12 0x00005575051f2452 in ha_innobase::general_fetch (this=0x7ff34801e8b0, buf=0x7ff34801e4c8 "\372\233\f", direction=1, match_mode=0) at /home/mleich/Server/10.5/storage/innobase/handler/ha_innodb.cc:9321
          #13 0x00005575051f2997 in ha_innobase::rnd_next (this=0x7ff34801e8b0, buf=0x7ff34801e4c8 "\372\233\f") at /home/mleich/Server/10.5/storage/innobase/handler/ha_innodb.cc:9530
          #14 0x0000557504e19342 in handler::ha_rnd_next (this=0x7ff34801e8b0, buf=0x7ff34801e4c8 "\372\233\f") at /home/mleich/Server/10.5/sql/handler.cc:2816
          

          In prt_other the problem is a failure to restore a persistent cursor position:

          10.5

          #7  0x00007f27483bffc2 in __GI___assert_fail (assertion=0x5577f52e9da0 "!cmp_rec_rec(cursor->old_rec, rec, offsets1, offsets2, index)", file=0x5577f52e99b0 "/home/mleich/Server/10.5/storage/innobase/btr/btr0pcur.cc", line=346, function=0x5577f52ea9c0 <btr_pcur_restore_position_func(unsigned long, btr_pcur_t*, char const*, unsigned int, mtr_t*)::__PRETTY_FUNCTION__> "ulint btr_pcur_restore_position_func(ulint, btr_pcur_t*, const char*, unsigned int, mtr_t*)") at assert.c:101
          #8  0x00005577f4bd0fef in btr_pcur_restore_position_func (latch_mode=1, cursor=0x7f26d801f658, file=0x5577f5294d40 "/home/mleich/Server/10.5/storage/innobase/row/row0sel.cc", line=3511, mtr=0x7f2741744d60) at /home/mleich/Server/10.5/storage/innobase/btr/btr0pcur.cc:346
          #9  0x00005577f4adde4a in sel_restore_position_for_mysql (same_user_rec=0x7f2741744888, latch_mode=1, pcur=0x7f26d801f658, moves_up=1, mtr=0x7f2741744d60) at /home/mleich/Server/10.5/storage/innobase/row/row0sel.cc:3511
          #10 0x00005577f4ae128d in row_search_mvcc (buf=0x7f26d801e4c8 "\372l\021", mode=PAGE_CUR_G, prebuilt=0x7f26d801f488, match_mode=0, direction=1) at /home/mleich/Server/10.5/storage/innobase/row/row0sel.cc:4554
          #11 0x00005577f48f2452 in ha_innobase::general_fetch (this=0x7f26d801e8b0, buf=0x7f26d801e4c8 "\372l\021", direction=1, match_mode=0) at /home/mleich/Server/10.5/storage/innobase/handler/ha_innodb.cc:9321
          #12 0x00005577f48f2997 in ha_innobase::rnd_next (this=0x7f26d801e8b0, buf=0x7f26d801e4c8 "\372l\021") at /home/mleich/Server/10.5/storage/innobase/handler/ha_innodb.cc:9530
          

          marko Marko Mäkelä added a comment - In prt the problem is rec=NULL : 10.5 44be8652c567ccaa0067830fa327cbffbe66b4cd #10 0x0000557505245500 in page_rec_is_comp (rec=0x0) at /home/mleich/Server/10.5/storage/innobase/include/page0page.h:274 #11 0x00005575053e183e in row_search_mvcc (buf=0x7ff34801e4c8 "\372\233\f", mode=PAGE_CUR_G, prebuilt=0x7ff34801f488, match_mode=0, direction=1) at /home/mleich/Server/10.5/storage/innobase/row/row0sel.cc:4679 #12 0x00005575051f2452 in ha_innobase::general_fetch (this=0x7ff34801e8b0, buf=0x7ff34801e4c8 "\372\233\f", direction=1, match_mode=0) at /home/mleich/Server/10.5/storage/innobase/handler/ha_innodb.cc:9321 #13 0x00005575051f2997 in ha_innobase::rnd_next (this=0x7ff34801e8b0, buf=0x7ff34801e4c8 "\372\233\f") at /home/mleich/Server/10.5/storage/innobase/handler/ha_innodb.cc:9530 #14 0x0000557504e19342 in handler::ha_rnd_next (this=0x7ff34801e8b0, buf=0x7ff34801e4c8 "\372\233\f") at /home/mleich/Server/10.5/sql/handler.cc:2816 In prt_other the problem is a failure to restore a persistent cursor position: 10.5 #7 0x00007f27483bffc2 in __GI___assert_fail (assertion=0x5577f52e9da0 "!cmp_rec_rec(cursor->old_rec, rec, offsets1, offsets2, index)", file=0x5577f52e99b0 "/home/mleich/Server/10.5/storage/innobase/btr/btr0pcur.cc", line=346, function=0x5577f52ea9c0 <btr_pcur_restore_position_func(unsigned long, btr_pcur_t*, char const*, unsigned int, mtr_t*)::__PRETTY_FUNCTION__> "ulint btr_pcur_restore_position_func(ulint, btr_pcur_t*, const char*, unsigned int, mtr_t*)") at assert.c:101 #8 0x00005577f4bd0fef in btr_pcur_restore_position_func (latch_mode=1, cursor=0x7f26d801f658, file=0x5577f5294d40 "/home/mleich/Server/10.5/storage/innobase/row/row0sel.cc", line=3511, mtr=0x7f2741744d60) at /home/mleich/Server/10.5/storage/innobase/btr/btr0pcur.cc:346 #9 0x00005577f4adde4a in sel_restore_position_for_mysql (same_user_rec=0x7f2741744888, latch_mode=1, pcur=0x7f26d801f658, moves_up=1, mtr=0x7f2741744d60) at /home/mleich/Server/10.5/storage/innobase/row/row0sel.cc:3511 #10 0x00005577f4ae128d in row_search_mvcc (buf=0x7f26d801e4c8 "\372l\021", mode=PAGE_CUR_G, prebuilt=0x7f26d801f488, match_mode=0, direction=1) at /home/mleich/Server/10.5/storage/innobase/row/row0sel.cc:4554 #11 0x00005577f48f2452 in ha_innobase::general_fetch (this=0x7f26d801e8b0, buf=0x7f26d801e4c8 "\372l\021", direction=1, match_mode=0) at /home/mleich/Server/10.5/storage/innobase/handler/ha_innodb.cc:9321 #12 0x00005577f48f2997 in ha_innobase::rnd_next (this=0x7f26d801e8b0, buf=0x7f26d801e4c8 "\372l\021") at /home/mleich/Server/10.5/storage/innobase/handler/ha_innodb.cc:9530

          MDEV-21353.tgz failed to work in my environment, apparently something unexpected during the setup phase. With my help, mleich narrowed the cause down to the changes to page_cur_delete_rec() in befde6e97e8d14eb120fa28a012474dce2b7c185.

          marko Marko Mäkelä added a comment - MDEV-21353.tgz failed to work in my environment, apparently something unexpected during the setup phase. With my help, mleich narrowed the cause down to the changes to page_cur_delete_rec() in befde6e97e8d14eb120fa28a012474dce2b7c185 .

          I suspect that the patch added a condition that incorrectly skips a buf_block_modify_clock_inc() call. To rescue the situation, maybe that call could be added to the callers? Here is a patch that reverts the condition (and breaks ALTER TABLE…IMPORT TABLESPACE:

          diff --git a/storage/innobase/page/page0cur.cc b/storage/innobase/page/page0cur.cc
          index 771fbfae1ef..d7b0fa094f9 100644
          --- a/storage/innobase/page/page0cur.cc
          +++ b/storage/innobase/page/page0cur.cc
          @@ -2525,7 +2525,7 @@ page_cur_delete_rec(
           	/* The page gets invalid for optimistic searches: increment
           	the frame modify clock. Avoid this during IMPORT; the block is
           	not actually in the buffer pool. */
          -	if (mtr->get_log_mode() != MTR_LOG_NONE) {
          +	if (true||mtr->get_log_mode() != MTR_LOG_NONE) {
           		buf_block_modify_clock_inc(block);
           		page_cur_delete_rec_write_log(current_rec, index, mtr);
           	}
          

          mleich, can you please test with this patch? Since the buf_block_t::modify_clock affects optimistic btr_pcur_restore_position(), the failure to call buf_block_modify_clock_inc() should explain all the observed symptoms.

          marko Marko Mäkelä added a comment - I suspect that the patch added a condition that incorrectly skips a buf_block_modify_clock_inc() call. To rescue the situation, maybe that call could be added to the callers? Here is a patch that reverts the condition (and breaks ALTER TABLE…IMPORT TABLESPACE : diff --git a/storage/innobase/page/page0cur.cc b/storage/innobase/page/page0cur.cc index 771fbfae1ef..d7b0fa094f9 100644 --- a/storage/innobase/page/page0cur.cc +++ b/storage/innobase/page/page0cur.cc @@ -2525,7 +2525,7 @@ page_cur_delete_rec( /* The page gets invalid for optimistic searches: increment the frame modify clock. Avoid this during IMPORT; the block is not actually in the buffer pool. */ - if (mtr->get_log_mode() != MTR_LOG_NONE) { + if (true||mtr->get_log_mode() != MTR_LOG_NONE) { buf_block_modify_clock_inc(block); page_cur_delete_rec_write_log(current_rec, index, mtr); } mleich , can you please test with this patch? Since the buf_block_t::modify_clock affects optimistic btr_pcur_restore_position() , the failure to call buf_block_modify_clock_inc() should explain all the observed symptoms.

          People

            marko Marko Mäkelä
            mleich Matthias Leich
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.