Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17843

Assertion `page_rec_is_leaf(rec)' failed in lock_rec_queue_validate upon SHOW ENGINE INNODB STATUS

Details

    Description

      Note: MDEV-14409 has been fixed, and indeed, 10.2 does not fail anymore, but 10.3 and 10.4 still do, a bit differently.

      10.3 95f3c142a4f

      mysqld: /data/src/10.3/storage/innobase/lock/lock0lock.cc:4883: bool lock_rec_queue_validate(bool, const buf_block_t*, const rec_t*, const dict_index_t*, const ulint*): Assertion `page_rec_is_leaf(rec)' failed.
      181204 14:09:53 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007f8cc4682ee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
      #8  0x000055a03583663d in lock_rec_queue_validate (locked_lock_trx_sys=true, block=0x7f8c9fa4ee68, rec=0x7f8ca054c070 "supremum", index=0x7f8c5402eae8, offsets=0x7f8cc11af800) at /data/src/10.3/storage/innobase/lock/lock0lock.cc:4883
      #9  0x000055a0358377a3 in lock_rec_validate_page (block=0x7f8c9fa4ee68) at /data/src/10.3/storage/innobase/lock/lock0lock.cc:5096
      #10 0x000055a035837bed in lock_rec_block_validate (space_id=9, page_no=3) at /data/src/10.3/storage/innobase/lock/lock0lock.cc:5198
      #11 0x000055a035837fb9 in lock_validate () at /data/src/10.3/storage/innobase/lock/lock0lock.cc:5272
      #12 0x000055a035836072 in lock_print_info_all_transactions (file=0x55a037dafdd0) at /data/src/10.3/storage/innobase/lock/lock0lock.cc:4775
      #13 0x000055a035971397 in srv_printf_innodb_monitor (file=0x55a037dafdd0, nowait=0, trx_start_pos=0x7f8cc11b03b0, trx_end=0x7f8cc11b03b8) at /data/src/10.3/storage/innobase/srv/srv0srv.cc:1259
      #14 0x000055a0357bcc97 in innodb_show_status (hton=0x55a037c94280, thd=0x7f8c50000b00, stat_print=0x55a035496581 <stat_print(THD*, char const*, size_t, char const*, size_t, char const*, size_t)>) at /data/src/10.3/storage/innobase/handler/ha_innodb.cc:15665
      #15 0x000055a0357bd704 in innobase_show_status (hton=0x55a037c94280, thd=0x7f8c50000b00, stat_print=0x55a035496581 <stat_print(THD*, char const*, size_t, char const*, size_t, char const*, size_t)>, stat_type=HA_ENGINE_STATUS) at /data/src/10.3/storage/innobase/handler/ha_innodb.cc:16083
      #16 0x000055a03549691b in ha_show_status (thd=0x7f8c50000b00, db_type=0x55a037c94280, stat=HA_ENGINE_STATUS) at /data/src/10.3/sql/handler.cc:5896
      #17 0x000055a035191899 in mysql_execute_command (thd=0x7f8c50000b00) at /data/src/10.3/sql/sql_parse.cc:4007
      #18 0x000055a03519e7cb in mysql_parse (thd=0x7f8c50000b00, rawbuf=0x7f8c50015a68 "SHOW ENGINE INNODB STATUS /* QNO 107 CON_ID 12 */", length=49, parser_state=0x7f8cc11b1640, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:8091
      #19 0x000055a03518b97f in dispatch_command (command=COM_QUERY, thd=0x7f8c50000b00, packet=0x7f8c50421231 "", packet_length=50, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:1850
      #20 0x000055a03518a3a3 in do_command (thd=0x7f8c50000b00) at /data/src/10.3/sql/sql_parse.cc:1395
      #21 0x000055a0352f20a0 in do_handle_one_connection (connect=0x55a0387bbe60) at /data/src/10.3/sql/sql_connect.cc:1402
      #22 0x000055a0352f1e24 in handle_one_connection (arg=0x55a0387bbe60) at /data/src/10.3/sql/sql_connect.cc:1308
      #23 0x00007f8cc613e494 in start_thread (arg=0x7f8cc11b2700) at pthread_create.c:333
      #24 0x00007f8cc473f93f in clone () from /lib/x86_64-linux-gnu/libc.so.6
       
      Query (0x7f8c50015a68): SHOW ENGINE INNODB STATUS /* QNO 107 CON_ID 12 */
      Connection ID (thread ID): 12
      Status: NOT_KILLED
      

      The grammar and test are the same as in MDEV-14409:

      RQG grammar mdev14409.yy

      thread2_init:
        CREATE TABLE `t1` (
          `t1` TEXT,
          `v1` VARCHAR(1),
          `t2` TEXT,
          `v2` VARCHAR(1),
          UNIQUE (`v2`),
          UNIQUE (`t2`(255))
        ) ENGINE=InnoDB ROW_FORMAT=DYNAMIC
        ; CREATE TABLE `t2` LIKE `t1`
        ; CREATE TABLE `t3` LIKE `t1`
        ; CREATE TABLE `t4` LIKE `t1`
        ; INSERT IGNORE INTO `t1` VALUES vals,vals,vals,vals,vals,vals,vals,vals,vals,vals 
        ; INSERT IGNORE INTO `t2` VALUES vals,vals,vals,vals,vals,vals,vals,vals,vals,vals 
        ; INSERT IGNORE INTO `t3` VALUES vals,vals,vals,vals,vals,vals,vals,vals,vals,vals 
        ; INSERT IGNORE INTO `t4` VALUES vals,vals,vals,vals,vals,vals,vals,vals,vals,vals 
        ; SELECT * FROM `t1` INTO OUTFILE 'load.data.t1'
        ; SELECT * FROM `t2` INTO OUTFILE 'load.data.t2'
        ; SELECT * FROM `t3` INTO OUTFILE 'load.data.t3'
        ; SELECT * FROM `t4` INTO OUTFILE 'load.data.t4'
      ;
       
      vals:
        (_char(1024),_char(1),_char(1024),_char(1));
       
      thread1:
        SHOW ENGINE INNODB STATUS;
       
      query:
        DELETE FROM my_table LIMIT _digit |
        LOAD DATA INFILE { "'load.data.t".$prng->int(1,4)."'" } REPLACE INTO TABLE my_table
      ;
       
      my_table:
        t1 | t2 | t3 | t4 ;
      

      Run as

      perl ./runall-new.pl  --basedir=/data/bld/10.2 --vardir=/dev/shm/vardir --duration=300 --threads=2 --grammar=mdev14409.yy --skip-gendata
      

      Don't forget to use your basedir and check whether the vardir path is okay for you.

      Note: If something does not work with your version of RQG, try https://github.com/MariaDB/randgen branch elenst-mdev14409.

      Attachments

        1. MDEV-17843.cfg
          43 kB
        2. MDEV-17843.yy
          1 kB
        3. RR_REPLAY_SIMP.sh
          11 kB

        Issue Links

          Activity

            Exact that assert upon SHOW ENGINE INNODB STATUS observed
            - on 10.5 51fb39bf175b275c91468ea6cd1cc82c1395c16d 2019-11-04
            - the server claims to be a 10.5.0
            too.
            

            mleich Matthias Leich added a comment - Exact that assert upon SHOW ENGINE INNODB STATUS observed - on 10.5 51fb39bf175b275c91468ea6cd1cc82c1395c16d 2019-11-04 - the server claims to be a 10.5.0 too.

             
            Replay of the problem on
            origin/bb-10.5-MDEV-23855 82ffb426bcb4832631d86443ab3781bbd6de3b23 2020-10-14T17:28:26+03:00
            compiled with debug, asan and Og.
            The server claims to be a 10.5.7.
             
            [rr 2815157 156984]mysqld: /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/lock/lock0lock.cc:4663: bool lock_rec_queue_validate(bool, const buf_block_t*, const rec_t*, const dict_index_t*, const rec_offs*): Assertion `page_rec_is_leaf(rec)' failed.
            201014 17:03:47 [ERROR] mysqld got signal 6 ;
             
            Query (0x62b0002c3238): SHOW ENGINE INNODB STATUS
             
            Connection ID (thread ID): 47
            Status: NOT_KILLED
             
            rr:/home/mleich/RQG/storage/1602686246/tmp/dev/shm/vardir/1602686246/91/1/rr
            _RR_TRACE_DIR="." rr replay --mark-stdio
             
            Other files which might be of interest:
            /home/mleich/RQG/storage/1602686246/001535.tgz     <= Archive with the remainings of the test.
            /home/mleich/RQG/storage/1602686246/tmp/dev/shm/vardir/1602686246/91/1/mysql.err
            /home/mleich/RQG/storage/1602686246/tmp/rqg.log   <= RQG log
             
            RQG
            ====
            git clone https://github.com/mleich1/rqg --branch experimental RQG
             
            perl rqg.pl \
            --gendata=conf/mariadb/oltp.zz \
            --grammar=conf/mariadb/oltp.yy \
            --reporters=Mariabackup_linux \
            --mysqld=--innodb_use_native_aio=1 \
            --mysqld=--innodb_lock_schedule_algorithm=fcfs \
            --mysqld=--loose-idle_write_transaction_timeout=0 \
            --mysqld=--loose-idle_transaction_timeout=0 \
            --mysqld=--loose-idle_readonly_transaction_timeout=0 \
            --mysqld=--connect_timeout=60 \
            --mysqld=--interactive_timeout=28800 \
            --mysqld=--slave_net_timeout=60 \
            --mysqld=--net_read_timeout=30 \
            --mysqld=--net_write_timeout=60 \
            --mysqld=--loose-table_lock_wait_timeout=50 \
            --mysqld=--wait_timeout=28800 \
            --mysqld=--lock-wait-timeout=86400 \
            --mysqld=--innodb-lock-wait-timeout=50 \
            --no-mask \
            --queries=10000000 \
            --seed=random \
            --reporters=Backtrace \
            --reporters=ErrorLog \
            --reporters=Deadlock1 \
            --validators=None \
            --mysqld=--log_output=none \
            --mysqld=--log-bin \
            --mysqld=--log_bin_trust_function_creators=1 \
            --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
            --engine=InnoDB \
            --restart_timeout=120 \
            --duration=400 \
            --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=400 \
            --mysqld=--loose-innodb-sync-debug \
            --mysqld=--innodb_stats_persistent=off \
            --mysqld=--loose-max-statement-time=30 \
            --threads=33 \
            --mysqld=--innodb_page_size=8K \
            --mysqld=--innodb-buffer-pool-size=8M \
            --duration=400 \
            --no_mask \
            --workdir=<local settings> \
            --vardir=<local settings> \
            --mtr-build-thread=<local settings> \
            --basedir1=<local settings> \
            --script_debug=_nix_ \
            --rr=Server \
            --rr_options=--chaos
            

            mleich Matthias Leich added a comment -   Replay of the problem on origin/bb-10.5-MDEV-23855 82ffb426bcb4832631d86443ab3781bbd6de3b23 2020-10-14T17:28:26+03:00 compiled with debug, asan and Og. The server claims to be a 10.5.7.   [rr 2815157 156984]mysqld: /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/lock/lock0lock.cc:4663: bool lock_rec_queue_validate(bool, const buf_block_t*, const rec_t*, const dict_index_t*, const rec_offs*): Assertion `page_rec_is_leaf(rec)' failed. 201014 17:03:47 [ERROR] mysqld got signal 6 ;   Query (0x62b0002c3238): SHOW ENGINE INNODB STATUS   Connection ID (thread ID): 47 Status: NOT_KILLED   rr:/home/mleich/RQG/storage/1602686246/tmp/dev/shm/vardir/1602686246/91/1/rr _RR_TRACE_DIR="." rr replay --mark-stdio   Other files which might be of interest: /home/mleich/RQG/storage/1602686246/001535.tgz <= Archive with the remainings of the test. /home/mleich/RQG/storage/1602686246/tmp/dev/shm/vardir/1602686246/91/1/mysql.err /home/mleich/RQG/storage/1602686246/tmp/rqg.log <= RQG log   RQG ==== git clone https://github.com/mleich1/rqg --branch experimental RQG   perl rqg.pl \ --gendata=conf/mariadb/oltp.zz \ --grammar=conf/mariadb/oltp.yy \ --reporters=Mariabackup_linux \ --mysqld=--innodb_use_native_aio=1 \ --mysqld=--innodb_lock_schedule_algorithm=fcfs \ --mysqld=--loose-idle_write_transaction_timeout=0 \ --mysqld=--loose-idle_transaction_timeout=0 \ --mysqld=--loose-idle_readonly_transaction_timeout=0 \ --mysqld=--connect_timeout=60 \ --mysqld=--interactive_timeout=28800 \ --mysqld=--slave_net_timeout=60 \ --mysqld=--net_read_timeout=30 \ --mysqld=--net_write_timeout=60 \ --mysqld=--loose-table_lock_wait_timeout=50 \ --mysqld=--wait_timeout=28800 \ --mysqld=--lock-wait-timeout=86400 \ --mysqld=--innodb-lock-wait-timeout=50 \ --no-mask \ --queries=10000000 \ --seed=random \ --reporters=Backtrace \ --reporters=ErrorLog \ --reporters=Deadlock1 \ --validators=None \ --mysqld=--log_output=none \ --mysqld=--log-bin \ --mysqld=--log_bin_trust_function_creators=1 \ --mysqld=--loose-debug_assert_on_not_freed_memory=0 \ --engine=InnoDB \ --restart_timeout=120 \ --duration=400 \ --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=400 \ --mysqld=--loose-innodb-sync-debug \ --mysqld=--innodb_stats_persistent=off \ --mysqld=--loose-max-statement-time=30 \ --threads=33 \ --mysqld=--innodb_page_size=8K \ --mysqld=--innodb-buffer-pool-size=8M \ --duration=400 \ --no_mask \ --workdir=<local settings> \ --vardir=<local settings> \ --mtr-build-thread=<local settings> \ --basedir1=<local settings> \ --script_debug=_nix_ \ --rr=Server \ --rr_options=--chaos

            I analyzed a recent failure. The clustered index root page ceases to be a leaf page a little earlier:

            10.6 43b239a08151f053a536187a65a874bf3586ff00

            #1  0x00005578a963be33 in btr_page_empty (block=0x552c22fd2a00, 
                page_zip=<optimized out>, index=<optimized out>, level=<optimized out>, 
                mtr=0x23b047a85560)
                at /Server/bb-10.6-MDEV-24915/storage/innobase/btr/btr0btr.cc:1723
            #2  0x00005578a96414b6 in btr_root_raise_and_insert (flags=flags@entry=0, 
                cursor=cursor@entry=0x23b047a84d20, offsets=offsets@entry=0x23b047a84b60, 
                heap=heap@entry=0x23b047a84b40, tuple=tuple@entry=0x6170002994a0, 
                n_ext=<optimized out>, mtr=<optimized out>)
                at /Server/bb-10.6-MDEV-24915/storage/innobase/btr/btr0btr.cc:2022
            #3  0x00005578a966ec9a in btr_cur_pessimistic_insert (flags=flags@entry=0, 
                cursor=cursor@entry=0x23b047a84d20, offsets=offsets@entry=0x23b047a84b60, 
                heap=heap@entry=0x23b047a84b40, entry=entry@entry=0x6170002994a0, 
                rec=rec@entry=0x23b047a84b80, big_rec=<optimized out>, 
                n_ext=<optimized out>, thr=<optimized out>, mtr=<optimized out>)
                at /Server/bb-10.6-MDEV-24915/storage/innobase/btr/btr0cur.cc:3770
            

            At this time, the lock_validate() was trying to acquire the page latch for this very page:

            10.6 43b239a08151f053a536187a65a874bf3586ff00

            #14 mtr_t::page_lock (this=this@entry=0x4500136878d0, block=block@entry=0x552c22fd2a00, rw_latch=rw_latch@entry=2) at /Server/bb-10.6-MDEV-24915/storage/innobase/mtr/mtr0mtr.cc:1071
            #15 0x00005578a96e276f in buf_page_get_low (page_id=<optimized out>, zip_size=<optimized out>, rw_latch=<optimized out>, guess=<optimized out>, mode=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /Server/bb-10.6-MDEV-24915/storage/innobase/buf/buf0buf.cc:3278
            #16 0x00005578a92f33a5 in lock_rec_block_validate (page_id=<optimized out>) at /Server/bb-10.6-MDEV-24915/storage/innobase/include/fil0fil.h:682
            #17 0x00005578a93045a0 in lock_validate () at /Server/bb-10.6-MDEV-24915/storage/innobase/lock/lock0lock.cc:4640
            #18 0x00005578a930584a in lock_print_info_all_transactions (file=file@entry=0x615000002d80) at /Server/bb-10.6-MDEV-24915/storage/innobase/lock/lock0lock.cc:4194
            #19 0x00005578a954a41b in srv_printf_innodb_monitor (file=<optimized out>, nowait=nowait@entry=0, trx_start_pos=trx_start_pos@entry=0x4500136887a0, trx_end=trx_end@entry=0x4500136887c0) at /Server/bb-10.6-MDEV-24915/storage/innobase/srv/srv0srv.cc:812
            #20 0x00005578a91aecc9 in innodb_show_status (stat_print=<optimized out>, thd=<optimized out>, hton=<optimized out>) at /Server/bb-10.6-MDEV-24915/storage/innobase/handler/ha_innodb.cc:15619
            #21 innobase_show_status (hton=<optimized out>, thd=<optimized out>, stat_print=<optimized out>, stat_type=<optimized out>) at /Server/bb-10.6-MDEV-24915/storage/innobase/handler/ha_innodb.cc:15701
            

            The code in lock_rec_block_validate() seems invalid. Here is the 10.2 version:

            		block = buf_page_get_gen(
            			page_id_t(space_id, page_no),
            			page_size_t(space->flags),
            			RW_X_LATCH, NULL,
            			BUF_GET_POSSIBLY_FREED,
            			__FILE__, __LINE__, &mtr, &err);
             
            		if (err != DB_SUCCESS) {
            			ib::error() << "Lock rec block validate failed for tablespace "
            				   << space->name
            				   << " space_id " << space_id
            				   << " page_no " << page_no << " err " << err;
            		}
             
            		if (block) {
            			buf_block_dbg_add_level(block, SYNC_NO_ORDER_CHECK);
             
            			ut_ad(lock_rec_validate_page(block));
            		}
            

            At this point, because we had released the lock_sys->mutex (in 10.6, lock_sys.latch), there is no guarantee that locks actually exist for this page, or even that the tablespace still exists. The warning message is unnecessary. We fail to check block->page.file_page_was_freed after using BUF_GET_POSSIBLY_FREED.

            In lock_rec_queue_validate(), rec must not be dereferenced before we have ensured that locks actually exist for the page.

            Side note: lock_rec_queue_validate() only seems to cover B-tree record locks, not SPATIAL INDEX locks.

            marko Marko Mäkelä added a comment - I analyzed a recent failure. The clustered index root page ceases to be a leaf page a little earlier: 10.6 43b239a08151f053a536187a65a874bf3586ff00 #1 0x00005578a963be33 in btr_page_empty (block=0x552c22fd2a00, page_zip=<optimized out>, index=<optimized out>, level=<optimized out>, mtr=0x23b047a85560) at /Server/bb-10.6-MDEV-24915/storage/innobase/btr/btr0btr.cc:1723 #2 0x00005578a96414b6 in btr_root_raise_and_insert (flags=flags@entry=0, cursor=cursor@entry=0x23b047a84d20, offsets=offsets@entry=0x23b047a84b60, heap=heap@entry=0x23b047a84b40, tuple=tuple@entry=0x6170002994a0, n_ext=<optimized out>, mtr=<optimized out>) at /Server/bb-10.6-MDEV-24915/storage/innobase/btr/btr0btr.cc:2022 #3 0x00005578a966ec9a in btr_cur_pessimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x23b047a84d20, offsets=offsets@entry=0x23b047a84b60, heap=heap@entry=0x23b047a84b40, entry=entry@entry=0x6170002994a0, rec=rec@entry=0x23b047a84b80, big_rec=<optimized out>, n_ext=<optimized out>, thr=<optimized out>, mtr=<optimized out>) at /Server/bb-10.6-MDEV-24915/storage/innobase/btr/btr0cur.cc:3770 At this time, the lock_validate() was trying to acquire the page latch for this very page: 10.6 43b239a08151f053a536187a65a874bf3586ff00 #14 mtr_t::page_lock (this=this@entry=0x4500136878d0, block=block@entry=0x552c22fd2a00, rw_latch=rw_latch@entry=2) at /Server/bb-10.6-MDEV-24915/storage/innobase/mtr/mtr0mtr.cc:1071 #15 0x00005578a96e276f in buf_page_get_low (page_id=<optimized out>, zip_size=<optimized out>, rw_latch=<optimized out>, guess=<optimized out>, mode=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /Server/bb-10.6-MDEV-24915/storage/innobase/buf/buf0buf.cc:3278 #16 0x00005578a92f33a5 in lock_rec_block_validate (page_id=<optimized out>) at /Server/bb-10.6-MDEV-24915/storage/innobase/include/fil0fil.h:682 #17 0x00005578a93045a0 in lock_validate () at /Server/bb-10.6-MDEV-24915/storage/innobase/lock/lock0lock.cc:4640 #18 0x00005578a930584a in lock_print_info_all_transactions (file=file@entry=0x615000002d80) at /Server/bb-10.6-MDEV-24915/storage/innobase/lock/lock0lock.cc:4194 #19 0x00005578a954a41b in srv_printf_innodb_monitor (file=<optimized out>, nowait=nowait@entry=0, trx_start_pos=trx_start_pos@entry=0x4500136887a0, trx_end=trx_end@entry=0x4500136887c0) at /Server/bb-10.6-MDEV-24915/storage/innobase/srv/srv0srv.cc:812 #20 0x00005578a91aecc9 in innodb_show_status (stat_print=<optimized out>, thd=<optimized out>, hton=<optimized out>) at /Server/bb-10.6-MDEV-24915/storage/innobase/handler/ha_innodb.cc:15619 #21 innobase_show_status (hton=<optimized out>, thd=<optimized out>, stat_print=<optimized out>, stat_type=<optimized out>) at /Server/bb-10.6-MDEV-24915/storage/innobase/handler/ha_innodb.cc:15701 The code in lock_rec_block_validate() seems invalid. Here is the 10.2 version: block = buf_page_get_gen( page_id_t(space_id, page_no), page_size_t(space->flags), RW_X_LATCH, NULL, BUF_GET_POSSIBLY_FREED, __FILE__, __LINE__, &mtr, &err);   if (err != DB_SUCCESS) { ib::error() << "Lock rec block validate failed for tablespace " << space->name << " space_id " << space_id << " page_no " << page_no << " err " << err; }   if (block) { buf_block_dbg_add_level(block, SYNC_NO_ORDER_CHECK);   ut_ad(lock_rec_validate_page(block)); } At this point, because we had released the lock_sys->mutex (in 10.6, lock_sys.latch ), there is no guarantee that locks actually exist for this page, or even that the tablespace still exists. The warning message is unnecessary. We fail to check block->page.file_page_was_freed after using BUF_GET_POSSIBLY_FREED . In lock_rec_queue_validate() , rec must not be dereferenced before we have ensured that locks actually exist for the page. Side note: lock_rec_queue_validate() only seems to cover B-tree record locks, not SPATIAL INDEX locks.

            Additional testcase in MDEV-25711, which I marked as duplicate of this one.

            Roel Roel Van de Paar added a comment - Additional testcase in MDEV-25711 , which I marked as duplicate of this one.
            mleich Matthias Leich added a comment - - edited

            How to replay with my current version of RQG
             
            git clone https://github.com/mleich1/rqg --branch experimental RQG
            # GIT_SHOW: HEAD -> experimental, origin/experimental 358366308288eaadbcb62822ef4faadf6e7aebc8 2022-03-28T18:04:44+02:00
            # rqg.pl  : Version 4.0.4 (2021-12)
            Place the uploaded file in teh RQG top level directory.
            ./RR_REPLAY_SIMP.sh MDEV-17843.cfg <path to your binaries>
             
            The replay of the problem on some development tree
            origin/bb-10.6-MDEV-21423 705d1ffdec97929c90a1d7d4adf2a8c044864020 2022-04-08T19:54:48+03:00
            is quite fast.
             
            last_result_dir/result.txt containing around end some aggregation like
            STATISTICS: RQG runs -- Verdict
            STATISTICS:        1 -- 'replay'           -- Replay of desired effect ...
                                                   This means the text pattern typical for MDEV-17843 was observed.
            ...
            STATISTICS:      120 -- 'ignore_stopped'   -- RQG run stopped by rqg_batch.pl ...
                                                     Typical reason is that the RQG batch tool killed ongoing RQG runs
                                                      because we already have a replay.
            ...
            STATISTICS Frequency -- extra_info
            STATISTICS        57 -- 'ignore_stopped -- work flow 13'
            STATISTICS        63 -- 'ignore_stopped -- work flow 5'
            STATISTICS         1 -- 'ignore_unwanted -- STATUS_SERVER_CRASHED--ASAN-fail2--MDEV-17843--rr--TBR-1432'
                                            The text pattern for MDEV-17843 was observed but the rr tracing failed.
            STATISTICS         1 -- 'replay -- STATUS_SERVER_CRASHED--MDEV-17843--rr--TBR-1432'
                                            We have one replayer with good rr trace.
             
             grep STATUS_SERVER_CRASHED--MDEV-17843--rr--TBR-1432 last_result_dir/result.txt 
                   2022-04-12T17:28:05 |  ... | replay           | 000057.log ...
            "suggests" to look into
            last_result_dir/000057.log    -- Protocol of the RQG run
            last_result_dir/000057.tar.xz -- Archived remains (rr trace, datadir, ...) of that test run
             
                            
                                            
            
            

            mleich Matthias Leich added a comment - - edited How to replay with my current version of RQG   git clone https://github.com/mleich1/rqg --branch experimental RQG # GIT_SHOW: HEAD -> experimental, origin/experimental 358366308288eaadbcb62822ef4faadf6e7aebc8 2022-03-28T18:04:44+02:00 # rqg.pl : Version 4.0.4 (2021-12) Place the uploaded file in teh RQG top level directory. ./RR_REPLAY_SIMP.sh MDEV-17843.cfg <path to your binaries>   The replay of the problem on some development tree origin/bb-10.6-MDEV-21423 705d1ffdec97929c90a1d7d4adf2a8c044864020 2022-04-08T19:54:48+03:00 is quite fast.   last_result_dir/result.txt containing around end some aggregation like STATISTICS: RQG runs -- Verdict STATISTICS: 1 -- 'replay' -- Replay of desired effect ... This means the text pattern typical for MDEV-17843 was observed. ... STATISTICS: 120 -- 'ignore_stopped' -- RQG run stopped by rqg_batch.pl ... Typical reason is that the RQG batch tool killed ongoing RQG runs because we already have a replay. ... STATISTICS Frequency -- extra_info STATISTICS 57 -- 'ignore_stopped -- work flow 13' STATISTICS 63 -- 'ignore_stopped -- work flow 5' STATISTICS 1 -- 'ignore_unwanted -- STATUS_SERVER_CRASHED--ASAN-fail2--MDEV-17843--rr--TBR-1432' The text pattern for MDEV-17843 was observed but the rr tracing failed. STATISTICS 1 -- 'replay -- STATUS_SERVER_CRASHED--MDEV-17843--rr--TBR-1432' We have one replayer with good rr trace.   grep STATUS_SERVER_CRASHED--MDEV-17843--rr--TBR-1432 last_result_dir/result.txt 2022-04-12T17:28:05 | ... | replay | 000057.log ... "suggests" to look into last_result_dir/000057.log -- Protocol of the RQG run last_result_dir/000057.tar.xz -- Archived remains (rr trace, datadir, ...) of that test run  

            origin/bb-10.3-MDEV-17843-lock_rec_queue_validate debf02e344b0078fddc5186f9af5c9ff9cdecb24 2022-04-14T15:05:13+03:00
            behaved well in RQG testing. No new bad effect met.

            mleich Matthias Leich added a comment - origin/bb-10.3- MDEV-17843 -lock_rec_queue_validate debf02e344b0078fddc5186f9af5c9ff9cdecb24 2022-04-14T15:05:13+03:00 behaved well in RQG testing. No new bad effect met.

            People

              vlad.lesin Vladislav Lesin
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.