Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14409

Assertion `page_rec_is_leaf(rec)' failed in lock_rec_validate_page upon SHOW ENGINE INNODB STATUS

Details

    Description

      10.2 3728b11f879

      mysqld: /data/src/10.2/storage/innobase/lock/lock0lock.cc:5653: ulint lock_rec_validate_page(const buf_block_t*): Assertion `page_rec_is_leaf(rec)' failed.
      181124 23:15:12 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007f6516333ee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
      #8  0x00005555f9151a5d in lock_rec_validate_page (block=0x7f64f39f2df8) at /data/src/10.2/storage/innobase/lock/lock0lock.cc:5653
      #9  0x00005555f91520a9 in lock_rec_block_validate (space_id=9, page_no=3) at /data/src/10.2/storage/innobase/lock/lock0lock.cc:5804
      #10 0x00005555f9152331 in lock_validate () at /data/src/10.2/storage/innobase/lock/lock0lock.cc:5857
      #11 0x00005555f9150729 in lock_print_info_all_transactions (file=0x5555fb540410) at /data/src/10.2/storage/innobase/lock/lock0lock.cc:5348
      #12 0x00005555f9277830 in srv_printf_innodb_monitor (file=0x5555fb540410, nowait=0, trx_start_pos=0x7f6500b09500, trx_end=0x7f6500b09508) at /data/src/10.2/storage/innobase/srv/srv0srv.cc:1303
      #13 0x00005555f90e2f5a in innodb_show_status (hton=0x5555fb4f8730, thd=0x7f64a4000b00, stat_print=0x5555f8dc844a <stat_print(THD*, char const*, size_t, char const*, size_t, char const*, size_t)>) at /data/src/10.2/storage/innobase/handler/ha_innodb.cc:15950
      #14 0x00005555f90e39a3 in innobase_show_status (hton=0x5555fb4f8730, thd=0x7f64a4000b00, stat_print=0x5555f8dc844a <stat_print(THD*, char const*, size_t, char const*, size_t, char const*, size_t)>, stat_type=HA_ENGINE_STATUS) at /data/src/10.2/storage/innobase/handler/ha_innodb.cc:16364
      #15 0x00005555f8dc87e4 in ha_show_status (thd=0x7f64a4000b00, db_type=0x5555fb4f8730, stat=HA_ENGINE_STATUS) at /data/src/10.2/sql/handler.cc:5619
      #16 0x00005555f8b42627 in mysql_execute_command (thd=0x7f64a4000b00) at /data/src/10.2/sql/sql_parse.cc:3725
      #17 0x00005555f8b4fb5d in mysql_parse (thd=0x7f64a4000b00, rawbuf=0x7f64a40131d8 "SHOW ENGINE INNODB STATUS /* QNO 66 CON_ID 12 */", length=48, parser_state=0x7f6500b0a250, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:8013
      #18 0x00005555f8b3d495 in dispatch_command (command=COM_QUERY, thd=0x7f64a4000b00, packet=0x7f64a444b941 "", packet_length=49, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:1824
      #19 0x00005555f8b3bdf8 in do_command (thd=0x7f64a4000b00) at /data/src/10.2/sql/sql_parse.cc:1378
      #20 0x00005555f8c8e537 in do_handle_one_connection (connect=0x5555fb4e4090) at /data/src/10.2/sql/sql_connect.cc:1335
      #21 0x00005555f8c8e2c4 in handle_one_connection (arg=0x5555fb4e4090) at /data/src/10.2/sql/sql_connect.cc:1241
      #22 0x00007f6517def494 in start_thread (arg=0x7f6500b0b700) at pthread_create.c:333
      #23 0x00007f65163f093f in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      Strangely, MDEV-14055 seems to claim that it was a bogus debug assertion, but it's still there even though MDEV-14055 has been closed.

      RQG grammar mdev14409.yy

      thread2_init:
        CREATE TABLE `t1` (
          `t1` TEXT,
          `v1` VARCHAR(1),
          `t2` TEXT,
          `v2` VARCHAR(1),
          UNIQUE (`v2`),
          UNIQUE (`t2`(255))
        ) ENGINE=InnoDB ROW_FORMAT=DYNAMIC
        ; CREATE TABLE `t2` LIKE `t1`
        ; CREATE TABLE `t3` LIKE `t1`
        ; CREATE TABLE `t4` LIKE `t1`
        ; INSERT IGNORE INTO `t1` VALUES vals,vals,vals,vals,vals,vals,vals,vals,vals,vals 
        ; INSERT IGNORE INTO `t2` VALUES vals,vals,vals,vals,vals,vals,vals,vals,vals,vals 
        ; INSERT IGNORE INTO `t3` VALUES vals,vals,vals,vals,vals,vals,vals,vals,vals,vals 
        ; INSERT IGNORE INTO `t4` VALUES vals,vals,vals,vals,vals,vals,vals,vals,vals,vals 
        ; SELECT * FROM `t1` INTO OUTFILE 'load.data.t1'
        ; SELECT * FROM `t2` INTO OUTFILE 'load.data.t2'
        ; SELECT * FROM `t3` INTO OUTFILE 'load.data.t3'
        ; SELECT * FROM `t4` INTO OUTFILE 'load.data.t4'
      ;
       
      vals:
        (_char(1024),_char(1),_char(1024),_char(1));
       
      thread1:
        SHOW ENGINE INNODB STATUS;
       
      query:
        DELETE FROM my_table LIMIT _digit |
        LOAD DATA INFILE { "'load.data.t".$prng->int(1,4)."'" } REPLACE INTO TABLE my_table
      ;
       
      my_table:
        t1 | t2 | t3 | t4 ;
      

      Run as

      perl ./runall-new.pl  --basedir=/data/bld/10.2 --vardir=/dev/shm/vardir --duration=300 --threads=2 --grammar=mdev14409.yy --skip-gendata
      

      Don't forget to use your basedir and check whether the vardir path is okay for you.

      Note: If something does not work with your version of RQG, try https://github.com/MariaDB/randgen branch elenst-mdev14409.

      Attachments

        Issue Links

          Activity

            The fix for MDEV-14055 was to make lock_validate() ignore record lock bitmaps that are empty. The assertion itself was not removed.

            In this case, it crashes on the root page of a table (page number is 3). The root page is no longer a leaf page; there are 2 child pages. There are no bits set in the record lock bitmap:

            (gdb) p *lock
            $1 = {trx = 0x7fe3f9fa2378, trx_locks = {prev = 0x7fe3f9fa2458, 
                next = 0x7fe3f9fa2708}, index = 0x7fe37c031728, hash = 0x0, 
              requested_time = 0, wait_time = 0, un_member = {tab_lock = {
                  table = 0x300000009, locks = {prev = 0x50, next = 0x0}}, rec_lock = {
                  space = 9, page_no = 3, n_bits = 80}}, type_mode = 1059}
            (gdb) p/x *(byte*)&lock[1]@10
            $2 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
            

            The problem is in the code:

            	for (i = nth_bit; i < lock_rec_get_n_bits(lock); i++) {
             
            		if (i == 1 || lock_rec_get_nth_bit(lock, i)) {
             
            			rec = page_find_rec_with_heap_no(block->frame, i);
            			ut_a(rec);
            			ut_ad(page_rec_is_leaf(rec));
            			lock_rec_queue_validate(
            				TRUE, block, rec, lock->index, offsets);
            

            Here, we have i == 1, that is, the supremum record of the page. That condition already existed in the very first published version of InnoDB.

            lock_rec_queue_validate() on the page supremum record would only execute one loop:

            	if (!page_rec_is_user_rec(rec)) {
             
            		for (lock = lock_rec_get_first(lock_sys->rec_hash,
            					       block, heap_no);
            		     lock != NULL;
            		     lock = lock_rec_get_next_const(heap_no, lock)) {
             
            			ut_ad(!trx_is_ac_nl_ro(lock->trx));
             
            			if (lock_get_wait(lock)) {
            				ut_a(lock_rec_has_to_wait_in_queue(lock));
            			}
             
            			if (index != NULL) {
            				ut_a(lock->index == index);
            			}
            		}
             
            		goto func_exit;
            	}
            

            It looks like the assertion should be moved below this code, and that the original assertion should be relaxed to the following:

            ut_ad(!lock_rec_get_nth_bit(lock, i) || page_rec_is_leaf(rec));
            

            With these changes, the test case does not fail.

            marko Marko Mäkelä added a comment - The fix for MDEV-14055 was to make lock_validate() ignore record lock bitmaps that are empty . The assertion itself was not removed. In this case, it crashes on the root page of a table (page number is 3). The root page is no longer a leaf page; there are 2 child pages. There are no bits set in the record lock bitmap: (gdb) p *lock $1 = {trx = 0x7fe3f9fa2378, trx_locks = {prev = 0x7fe3f9fa2458, next = 0x7fe3f9fa2708}, index = 0x7fe37c031728, hash = 0x0, requested_time = 0, wait_time = 0, un_member = {tab_lock = { table = 0x300000009, locks = {prev = 0x50, next = 0x0}}, rec_lock = { space = 9, page_no = 3, n_bits = 80}}, type_mode = 1059} (gdb) p/x *(byte*)&lock[1]@10 $2 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0} The problem is in the code: for (i = nth_bit; i < lock_rec_get_n_bits(lock); i++) {   if (i == 1 || lock_rec_get_nth_bit(lock, i)) {   rec = page_find_rec_with_heap_no(block->frame, i); ut_a(rec); ut_ad(page_rec_is_leaf(rec)); … lock_rec_queue_validate( TRUE, block, rec, lock->index, offsets); Here, we have i == 1 , that is, the supremum record of the page. That condition already existed in the very first published version of InnoDB . lock_rec_queue_validate() on the page supremum record would only execute one loop: if (!page_rec_is_user_rec(rec)) {   for (lock = lock_rec_get_first(lock_sys->rec_hash, block, heap_no); lock != NULL; lock = lock_rec_get_next_const(heap_no, lock)) {   ut_ad(!trx_is_ac_nl_ro(lock->trx));   if (lock_get_wait(lock)) { ut_a(lock_rec_has_to_wait_in_queue(lock)); }   if (index != NULL) { ut_a(lock->index == index); } }   goto func_exit; } It looks like the assertion should be moved below this code, and that the original assertion should be relaxed to the following: ut_ad(!lock_rec_get_nth_bit(lock, i) || page_rec_is_leaf(rec)); With these changes, the test case does not fail.

            People

              marko Marko Mäkelä
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.