Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14711

Assertion `mode == 16 || mode == 12 || !fix_block->page.file_page_was_freed' failed in buf_page_get_gen (rollback requesting a freed undo page)

Details

    Description

      10.2 8f98835bb86550

      mysqld: /home/elenst/git/10.2/storage/innobase/buf/buf0buf.cc:4755: buf_block_t* buf_page_get_gen(const page_id_t&, const page_size_t&, ulint, buf_block_t*, ulint, const char*, unsigned int, mtr_t*, dberr_t*): Assertion `mode == 16 || mode == 12 || !fix_block->page.file_page_was_freed' failed.
      180307 17:45:23 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007f776c99fc82 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
      #8  0x000055e779cfc09d in buf_page_get_gen (page_id=..., page_size=..., rw_latch=1, guess=0x0, mode=10, file=0x55e77a258078 "/home/elenst/git/10.2/storage/innobase/include/trx0undo.ic", line=180, mtr=0x7f77551c5b40, err=0x0) at /home/elenst/git/10.2/storage/innobase/buf/buf0buf.cc:4755
      #9  0x000055e779c74b95 in trx_undo_page_get_s_latched (page_id=..., mtr=0x7f77551c5b40) at /home/elenst/git/10.2/storage/innobase/include/trx0undo.ic:179
      #10 0x000055e779c77864 in trx_roll_pop_top_rec (trx=0x7f7757217198, undo=0x7f76e403b960, mtr=0x7f77551c5b40) at /home/elenst/git/10.2/storage/innobase/trx/trx0roll.cc:991
      #11 0x000055e779c77e14 in trx_roll_pop_top_rec_of_trx (trx=0x7f7757217198, roll_ptr=0x7f76e415e610, heap=0x7f76e415eae0) at /home/elenst/git/10.2/storage/innobase/trx/trx0roll.cc:1077
      #12 0x000055e779c12aaa in row_undo (node=0x7f76e415e5c8, thr=0x7f76e415e410) at /home/elenst/git/10.2/storage/innobase/row/row0undo.cc:269
      #13 0x000055e779c12d20 in row_undo_step (thr=0x7f76e415e410) at /home/elenst/git/10.2/storage/innobase/row/row0undo.cc:352
      #14 0x000055e779b75215 in que_thr_step (thr=0x7f76e415e410) at /home/elenst/git/10.2/storage/innobase/que/que0que.cc:1047
      #15 0x000055e779b7542f in que_run_threads_low (thr=0x7f76e415e410) at /home/elenst/git/10.2/storage/innobase/que/que0que.cc:1111
      #16 0x000055e779b755d8 in que_run_threads (thr=0x7f76e415e410) at /home/elenst/git/10.2/storage/innobase/que/que0que.cc:1151
      #17 0x000055e779c7506e in trx_rollback_to_savepoint_low (trx=0x7f7757217198, savept=0x0) at /home/elenst/git/10.2/storage/innobase/trx/trx0roll.cc:112
      #18 0x000055e779c753d9 in trx_rollback_for_mysql_low (trx=0x7f7757217198) at /home/elenst/git/10.2/storage/innobase/trx/trx0roll.cc:174
      #19 0x000055e779c756d0 in trx_rollback_low (trx=0x7f7757217198) at /home/elenst/git/10.2/storage/innobase/trx/trx0roll.cc:206
      #20 0x000055e779c75a07 in trx_rollback_for_mysql (trx=0x7f7757217198) at /home/elenst/git/10.2/storage/innobase/trx/trx0roll.cc:282
      #21 0x000055e779a7befe in innobase_rollback (hton=0x55e77d2d9740, thd=0x7f76e4000b00, rollback_trx=true) at /home/elenst/git/10.2/storage/innobase/handler/ha_innodb.cc:4862
      #22 0x000055e77986ffe0 in ha_rollback_trans (thd=0x7f76e4000b00, all=true) at /home/elenst/git/10.2/sql/handler.cc:1672
      #23 0x000055e779759585 in xa_trans_force_rollback (thd=0x7f76e4000b00) at /home/elenst/git/10.2/sql/transaction.cc:141
      #24 0x000055e77975b488 in trans_xa_commit (thd=0x7f76e4000b00) at /home/elenst/git/10.2/sql/transaction.cc:922
      #25 0x000055e779604619 in mysql_execute_command (thd=0x7f76e4000b00) at /home/elenst/git/10.2/sql/sql_parse.cc:6074
      #26 0x000055e779609a82 in mysql_parse (thd=0x7f76e4000b00, rawbuf=0x7f76e40111c8 "XA COMMIT 'xid191'  /* QNO 11370 CON_ID 17 */", length=45, parser_state=0x7f77551c8250, is_com_multi=false, is_next_command=false) at /home/elenst/git/10.2/sql/sql_parse.cc:7902
      #27 0x000055e7795f76b3 in dispatch_command (command=COM_QUERY, thd=0x7f76e4000b00, packet=0x7f76e4008951 "XA COMMIT 'xid191'  /* QNO 11370 CON_ID 17 */ ", packet_length=46, is_com_multi=false, is_next_command=false) at /home/elenst/git/10.2/sql/sql_parse.cc:1806
      #28 0x000055e7795f600f in do_command (thd=0x7f76e4000b00) at /home/elenst/git/10.2/sql/sql_parse.cc:1360
      #29 0x000055e77974436f in do_handle_one_connection (connect=0x55e77dd44eb0) at /home/elenst/git/10.2/sql/sql_connect.cc:1335
      #30 0x000055e7797440ef in handle_one_connection (arg=0x55e77dd44eb0) at /home/elenst/git/10.2/sql/sql_connect.cc:1241
      #31 0x00007f776d5e36ba in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
      #32 0x00007f776ca7882d in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      I'm getting it after a while by running

      experimental f6acdbc55e206

      perl ./runall-trials.pl --trials=20 --duration=350 --threads=6 --seed=1520398074 --reporters=Backtrace,ErrorLog,Deadlock --validators=TransformerNoComparator --transformers=ExecuteAsCTE,ExecuteAsDeleteReturning,ExecuteAsExecuteImmediate,ExecuteAsInsertSelect,ExecuteAsUnion,ExecuteAsUpdateDelete,ExecuteAsView,ExecuteAsPreparedTwice,ExecuteAsSPTwice --redefine=conf/mariadb/general-workarounds.yy --mysqld=--log_output=FILE --views --vcols --redefine=conf/mariadb/alter_table.yy --redefine=conf/mariadb/bulk_insert.yy --redefine=conf/mariadb/xa.yy --mysqld=--log_bin_trust_function_creators=1 --mysqld=--log-bin --mysqld=--max-statement-time=30 --mysqld=--loose-debug_assert_on_not_freed_memory=0 --grammar=conf/mariadb/functions.yy --engine=InnoDB --basedir=<basedir> --vardir=<vardir>
      

      But it's not simplified and not guaranteed.

      Coredump and datadir are available.

      Attachments

        Issue Links

          Activity

            mleich Matthias Leich added a comment - - edited

            Assuming that the assert hit on 10.5.5 (origin/10.5 054f10365c49b70a3868edc0c955ffe187f8344c 2020-07-16T07:15:06+03:00) fits
            (rr) bt
            #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
            #1  0x000017a14145f8b1 in __GI_abort () at abort.c:79
            #2  0x000017a14144f42a in __assert_fail_base (fmt=0x17a1415d6a38 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x56247fc086e0 "mode == 16 || mode == 12 || fix_block->page.status != buf_page_t::FREED", 
                file=file@entry=0x56247fc00da0 "/home/mleich/10.5_A/storage/innobase/buf/buf0buf.cc", line=line@entry=3522, 
                function=function@entry=0x56247fc0ba80 <buf_page_get_low(page_id_t, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*, bool)::__PRETTY_FUNCTION__> "buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, const char*, unsigned int, mtr_t*, dberr_t*, bool)") at assert.c:92
            #3  0x000017a14144f4a2 in __GI___assert_fail (assertion=assertion@entry=0x56247fc086e0 "mode == 16 || mode == 12 || fix_block->page.status != buf_page_t::FREED", file=file@entry=0x56247fc00da0 "/home/mleich/10.5_A/storage/innobase/buf/buf0buf.cc", 
                line=line@entry=3522, 
                function=function@entry=0x56247fc0ba80 <buf_page_get_low(page_id_t, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*, bool)::__PRETTY_FUNCTION__> "buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, const char*, unsigned int, mtr_t*, dberr_t*, bool)") at assert.c:101
            #4  0x000056247ecbbf04 in buf_page_get_low (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=<optimized out>, guess@entry=0x0, mode=mode@entry=10, 
                file=file@entry=0x56247fbd8f00 "/home/mleich/10.5_A/storage/innobase/btr/btr0cur.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /home/mleich/10.5_A/storage/innobase/buf/buf0buf.cc:3522
            #5  0x000056247ecbc528 in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, file=file@entry=0x56247fbd8f00 "/home/mleich/10.5_A/storage/innobase/btr/btr0cur.cc", line=803, 
                mtr=0x68cb524390b0, err=0x68cb52436c20, allow_ibuf_merge=false) at /home/mleich/10.5_A/storage/innobase/buf/buf0buf.cc:3647
            #6  0x000056247ebea8bf in btr_block_get_func (index=..., page=page@entry=58, mode=mode@entry=1, merge=<optimized out>, file=file@entry=0x56247fbd8f00 "/home/mleich/10.5_A/storage/innobase/btr/btr0cur.cc", line=line@entry=803, mtr=<optimized out>)
                at /home/mleich/10.5_A/storage/innobase/include/btr0btr.h:240
            #7  0x000056247ec1ba32 in btr_cur_optimistic_latch_leaves (block=0x3e817fa41f70, modify_clock=<optimized out>, latch_mode=latch_mode@entry=0x68cb52436eb8, cursor=cursor@entry=0x6210000f3f80, file=<optimized out>, line=<optimized out>, 
                mtr=<optimized out>) at /home/mleich/10.5_A/storage/innobase/btr/btr0cur.cc:802
            #8  0x000056247ec660d5 in btr_pcur_restore_position_func (latch_mode=<optimized out>, latch_mode@entry=35, cursor=cursor@entry=0x6210000f3f80, file=<optimized out>, file@entry=0x56247fbf35e0 "/home/mleich/10.5_A/storage/innobase/btr/btr0pcur.cc", 
                line=<optimized out>, line@entry=547, mtr=mtr@entry=0x68cb524390b0) at /home/mleich/10.5_A/storage/innobase/btr/btr0pcur.cc:317
            #9  0x000056247ec6711b in btr_pcur_move_backward_from_page (cursor=cursor@entry=0x6210000f3f80, mtr=mtr@entry=0x68cb524390b0) at /home/mleich/10.5_A/storage/innobase/btr/btr0pcur.cc:547
            #10 0x000056247ec695be in btr_pcur_move_to_prev (cursor=cursor@entry=0x6210000f3f80, mtr=mtr@entry=0x68cb524390b0) at /home/mleich/10.5_A/storage/innobase/btr/btr0pcur.cc:602
            #11 0x000056247e9eecc7 in row_search_mvcc (buf=buf@entry=0x61a000b532b8 "\360\001\b\240", <incomplete sequence \313>, mode=PAGE_CUR_L, mode@entry=PAGE_CUR_UNSUPP, prebuilt=<optimized out>, match_mode=match_mode@entry=0, direction=direction@entry=2)
                at /home/mleich/10.5_A/storage/innobase/row/row0sel.cc:5577
            #12 0x000056247e53942f in ha_innobase::general_fetch (this=0x61d000960ab8, buf=<optimized out>, direction=direction@entry=2, match_mode=match_mode@entry=0) at /home/mleich/10.5_A/storage/innobase/handler/ha_innodb.cc:9226
            #13 0x000056247e539af7 in ha_innobase::index_prev (this=<optimized out>, buf=<optimized out>) at /home/mleich/10.5_A/storage/innobase/handler/ha_innodb.cc:9319
            #14 0x000056247d9a18b0 in handler::ha_index_prev (this=0x61d000960ab8, buf=<optimized out>) at /home/mleich/10.5_A/sql/handler.cc:3187
            #15 0x000056247deed1f7 in rr_index_desc (info=0x68cb52439e80) at /home/mleich/10.5_A/sql/records.cc:501
            #16 0x000056247d45b68a in READ_RECORD::read_record (this=0x68cb52439e80) at /home/mleich/10.5_A/sql/records.h:79
            #17 mysql_update (thd=thd@entry=0x62b0000af218, table_list=<optimized out>, fields=..., values=..., conds=<optimized out>, order_num=<optimized out>, order=<optimized out>, limit=1, ignore=<optimized out>, found_return=<optimized out>, 
                updated_return=<optimized out>) at /home/mleich/10.5_A/sql/sql_update.cc:843
            #18 0x000056247d102dc0 in mysql_execute_command (thd=thd@entry=0x62b0000af218) at /home/mleich/10.5_A/sql/sql_parse.cc:4393
            #19 0x000056247d125ed7 in mysql_parse (thd=thd@entry=0x62b0000af218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x68cb5243bff0, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false)
                at /home/mleich/10.5_A/sql/sql_parse.cc:7993
            #20 0x000056247d0eb366 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b0000af218, packet=packet@entry=0x629000906219 " UPDATE t4 SET col2 = 16 ORDER BY col1 DESC LIMIT 2  /* E_R Thread8 QNO 686 CON_ID 37 */ ", 
                packet_length=packet_length@entry=89, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /home/mleich/10.5_A/sql/sql_parse.cc:1866
            #21 0x000056247d0e61b7 in do_command (thd=0x62b0000af218) at /home/mleich/10.5_A/sql/sql_parse.cc:1347
            #22 0x000056247d586281 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x608000002c38, put_in_cache=put_in_cache@entry=true) at /home/mleich/10.5_A/sql/sql_connect.cc:1411
            #23 0x000056247d587525 in handle_one_connection (arg=arg@entry=0x608000002c38) at /home/mleich/10.5_A/sql/sql_connect.cc:1313
            #24 0x000056247e300ca2 in pfs_spawn_thread (arg=0x615000008c98) at /home/mleich/10.5_A/storage/perfschema/pfs.cc:2201
            #25 0x00007ff21bd9e6db in start_thread (arg=0x68cb5243e300) at pthread_create.c:463
            #26 0x000017a141540a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
            (rr)
             
            /home/mleich/RQG_mleich/rqg.pl \ 
            --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
            --gendata=conf/mariadb/table_stress.zz \
            --gendata_sql=conf/mariadb/table_stress.sql \
            --engine=Innodb \
            --reporters=Deadlock1,ErrorLog,Backtrace \
            --mysqld=--connect_timeout=60 \
            --mysqld=--net_read_timeout=30 \
            --mysqld=--net_write_timeout=60 \
            --mysqld=--loose-idle_readonly_transaction_timeout=0 \
            --mysqld=--loose-idle_transaction_timeout=0 \
            --mysqld=--loose-idle_write_transaction_timeout=0 \
            --mysqld=--interactive_timeout=28800 \
            --mysqld=--lock_wait_timeout=86400 \
            --mysqld=--innodb-lock-wait-timeout=50 \
            --mysqld=--loose-table_lock_wait_timeout=50 \
            --mysqld=--wait_timeout=28800 \
            --mysqld=--slave_net_timeout=60 \
            --mysqld=--loose-max-statement-time=30 \
            --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
            --mysqld=--log-output=none \
            --duration=300 \
            --seed=random \
            --sqltrace=MarkErrors \
            --grammar=conf/mariadb/table_stress_innodb_nocopy.yy \
            --threads=9 \
            --mysqld=--innodb_adaptive_hash_index=ON \
            --mysqld=--innodb_page_size=4K \
            --mysqld=--innodb-buffer-pool-size=8M \
            --duration=300 \
            --no_mask \
            ... certain settings specific to the testing box
            --script_debug=_nix_ \
            --rr=Server \
            --rr_options=--chaos
            
            

            mleich Matthias Leich added a comment - - edited Assuming that the assert hit on 10.5.5 (origin/10.5 054f10365c49b70a3868edc0c955ffe187f8344c 2020-07-16T07:15:06+03:00) fits (rr) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x000017a14145f8b1 in __GI_abort () at abort.c:79 #2 0x000017a14144f42a in __assert_fail_base (fmt=0x17a1415d6a38 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x56247fc086e0 "mode == 16 || mode == 12 || fix_block->page.status != buf_page_t::FREED", file=file@entry=0x56247fc00da0 "/home/mleich/10.5_A/storage/innobase/buf/buf0buf.cc", line=line@entry=3522, function=function@entry=0x56247fc0ba80 <buf_page_get_low(page_id_t, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*, bool)::__PRETTY_FUNCTION__> "buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, const char*, unsigned int, mtr_t*, dberr_t*, bool)") at assert.c:92 #3 0x000017a14144f4a2 in __GI___assert_fail (assertion=assertion@entry=0x56247fc086e0 "mode == 16 || mode == 12 || fix_block->page.status != buf_page_t::FREED", file=file@entry=0x56247fc00da0 "/home/mleich/10.5_A/storage/innobase/buf/buf0buf.cc", line=line@entry=3522, function=function@entry=0x56247fc0ba80 <buf_page_get_low(page_id_t, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*, bool)::__PRETTY_FUNCTION__> "buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, const char*, unsigned int, mtr_t*, dberr_t*, bool)") at assert.c:101 #4 0x000056247ecbbf04 in buf_page_get_low (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=<optimized out>, guess@entry=0x0, mode=mode@entry=10, file=file@entry=0x56247fbd8f00 "/home/mleich/10.5_A/storage/innobase/btr/btr0cur.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /home/mleich/10.5_A/storage/innobase/buf/buf0buf.cc:3522 #5 0x000056247ecbc528 in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, file=file@entry=0x56247fbd8f00 "/home/mleich/10.5_A/storage/innobase/btr/btr0cur.cc", line=803, mtr=0x68cb524390b0, err=0x68cb52436c20, allow_ibuf_merge=false) at /home/mleich/10.5_A/storage/innobase/buf/buf0buf.cc:3647 #6 0x000056247ebea8bf in btr_block_get_func (index=..., page=page@entry=58, mode=mode@entry=1, merge=<optimized out>, file=file@entry=0x56247fbd8f00 "/home/mleich/10.5_A/storage/innobase/btr/btr0cur.cc", line=line@entry=803, mtr=<optimized out>) at /home/mleich/10.5_A/storage/innobase/include/btr0btr.h:240 #7 0x000056247ec1ba32 in btr_cur_optimistic_latch_leaves (block=0x3e817fa41f70, modify_clock=<optimized out>, latch_mode=latch_mode@entry=0x68cb52436eb8, cursor=cursor@entry=0x6210000f3f80, file=<optimized out>, line=<optimized out>, mtr=<optimized out>) at /home/mleich/10.5_A/storage/innobase/btr/btr0cur.cc:802 #8 0x000056247ec660d5 in btr_pcur_restore_position_func (latch_mode=<optimized out>, latch_mode@entry=35, cursor=cursor@entry=0x6210000f3f80, file=<optimized out>, file@entry=0x56247fbf35e0 "/home/mleich/10.5_A/storage/innobase/btr/btr0pcur.cc", line=<optimized out>, line@entry=547, mtr=mtr@entry=0x68cb524390b0) at /home/mleich/10.5_A/storage/innobase/btr/btr0pcur.cc:317 #9 0x000056247ec6711b in btr_pcur_move_backward_from_page (cursor=cursor@entry=0x6210000f3f80, mtr=mtr@entry=0x68cb524390b0) at /home/mleich/10.5_A/storage/innobase/btr/btr0pcur.cc:547 #10 0x000056247ec695be in btr_pcur_move_to_prev (cursor=cursor@entry=0x6210000f3f80, mtr=mtr@entry=0x68cb524390b0) at /home/mleich/10.5_A/storage/innobase/btr/btr0pcur.cc:602 #11 0x000056247e9eecc7 in row_search_mvcc (buf=buf@entry=0x61a000b532b8 "\360\001\b\240", <incomplete sequence \313>, mode=PAGE_CUR_L, mode@entry=PAGE_CUR_UNSUPP, prebuilt=<optimized out>, match_mode=match_mode@entry=0, direction=direction@entry=2) at /home/mleich/10.5_A/storage/innobase/row/row0sel.cc:5577 #12 0x000056247e53942f in ha_innobase::general_fetch (this=0x61d000960ab8, buf=<optimized out>, direction=direction@entry=2, match_mode=match_mode@entry=0) at /home/mleich/10.5_A/storage/innobase/handler/ha_innodb.cc:9226 #13 0x000056247e539af7 in ha_innobase::index_prev (this=<optimized out>, buf=<optimized out>) at /home/mleich/10.5_A/storage/innobase/handler/ha_innodb.cc:9319 #14 0x000056247d9a18b0 in handler::ha_index_prev (this=0x61d000960ab8, buf=<optimized out>) at /home/mleich/10.5_A/sql/handler.cc:3187 #15 0x000056247deed1f7 in rr_index_desc (info=0x68cb52439e80) at /home/mleich/10.5_A/sql/records.cc:501 #16 0x000056247d45b68a in READ_RECORD::read_record (this=0x68cb52439e80) at /home/mleich/10.5_A/sql/records.h:79 #17 mysql_update (thd=thd@entry=0x62b0000af218, table_list=<optimized out>, fields=..., values=..., conds=<optimized out>, order_num=<optimized out>, order=<optimized out>, limit=1, ignore=<optimized out>, found_return=<optimized out>, updated_return=<optimized out>) at /home/mleich/10.5_A/sql/sql_update.cc:843 #18 0x000056247d102dc0 in mysql_execute_command (thd=thd@entry=0x62b0000af218) at /home/mleich/10.5_A/sql/sql_parse.cc:4393 #19 0x000056247d125ed7 in mysql_parse (thd=thd@entry=0x62b0000af218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x68cb5243bff0, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /home/mleich/10.5_A/sql/sql_parse.cc:7993 #20 0x000056247d0eb366 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b0000af218, packet=packet@entry=0x629000906219 " UPDATE t4 SET col2 = 16 ORDER BY col1 DESC LIMIT 2 /* E_R Thread8 QNO 686 CON_ID 37 */ ", packet_length=packet_length@entry=89, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /home/mleich/10.5_A/sql/sql_parse.cc:1866 #21 0x000056247d0e61b7 in do_command (thd=0x62b0000af218) at /home/mleich/10.5_A/sql/sql_parse.cc:1347 #22 0x000056247d586281 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x608000002c38, put_in_cache=put_in_cache@entry=true) at /home/mleich/10.5_A/sql/sql_connect.cc:1411 #23 0x000056247d587525 in handle_one_connection (arg=arg@entry=0x608000002c38) at /home/mleich/10.5_A/sql/sql_connect.cc:1313 #24 0x000056247e300ca2 in pfs_spawn_thread (arg=0x615000008c98) at /home/mleich/10.5_A/storage/perfschema/pfs.cc:2201 #25 0x00007ff21bd9e6db in start_thread (arg=0x68cb5243e300) at pthread_create.c:463 #26 0x000017a141540a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 (rr)   /home/mleich/RQG_mleich/rqg.pl \ --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \ --gendata=conf/mariadb/table_stress.zz \ --gendata_sql=conf/mariadb/table_stress.sql \ --engine=Innodb \ --reporters=Deadlock1,ErrorLog,Backtrace \ --mysqld=--connect_timeout=60 \ --mysqld=--net_read_timeout=30 \ --mysqld=--net_write_timeout=60 \ --mysqld=--loose-idle_readonly_transaction_timeout=0 \ --mysqld=--loose-idle_transaction_timeout=0 \ --mysqld=--loose-idle_write_transaction_timeout=0 \ --mysqld=--interactive_timeout=28800 \ --mysqld=--lock_wait_timeout=86400 \ --mysqld=--innodb-lock-wait-timeout=50 \ --mysqld=--loose-table_lock_wait_timeout=50 \ --mysqld=--wait_timeout=28800 \ --mysqld=--slave_net_timeout=60 \ --mysqld=--loose-max-statement-time=30 \ --mysqld=--loose-debug_assert_on_not_freed_memory=0 \ --mysqld=--log-output=none \ --duration=300 \ --seed=random \ --sqltrace=MarkErrors \ --grammar=conf/mariadb/table_stress_innodb_nocopy.yy \ --threads=9 \ --mysqld=--innodb_adaptive_hash_index=ON \ --mysqld=--innodb_page_size=4K \ --mysqld=--innodb-buffer-pool-size=8M \ --duration=300 \ --no_mask \ ... certain settings specific to the testing box --script_debug=_nix_ \ --rr=Server \ --rr_options=--chaos

            I suspect that this could be a bug in btr_cur_optimistic_latch_leaves(), which is called by btr_pcur_restore_position_func(). It is requesting a page with BTR_GET, but there might be no guarantee that the page is still available. The block parameter (page 61) looks like a leaf page (PAGE_LEVEL is 0), with a preceding page 90 and no succeeding page. Yet, we have left_page_no=58 in the code:

            		left_page_no = btr_page_get_prev(block->frame);
            		rw_lock_s_unlock(&block->lock);
             
            		cursor->left_block = left_page_no != FIL_NULL
            			? btr_block_get(*cursor->index, left_page_no, mode,
            					page_is_leaf(block->frame), mtr)
            			: NULL;
            

            Can you spot the mistake? I would claim that BUF_GET is wrong here, and we should perform an ‘optimistic’ read. After we released the block->lock, the block->frame is not guaranteed to be valid any more! Whatever freed the page 58 should have modified the block->frame as well. Let us ask rr replay for the proof:

            10.5 054f10365c49b70a3868edc0c955ffe187f8344c

            (rr) watch -l block->frame[8]@4
            Hardware watchpoint 1: -location block->frame[8]@4
            (rr) when
            Current event: 355531
            (rr) reverse-continue
            Continuing.
             
            Thread 2 received signal SIGABRT, Aborted.
            __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
            51	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
            (rr) reverse-continue
            Continuing.
            [Switching to Thread 73817.94395]
             
            Thread 154 hit Hardware watchpoint 1: -location block->frame[8]@4
             
            Old value = "\000\000\000Z"
            New value = "\000\000\000:"
            0x000056247ebec2ef in mtr_t::write<4u, (mtr_t::write_type)1, unsigned long> (
                val=2141462384, ptr=0x3e817fc3f008, block=..., this=0x6f6a5d4564e0)
                at /home/mleich/10.5_A/storage/innobase/include/mtr0log.h:209
            209	  ::memcpy(ptr, buf, l);
            (rr) when
            Current event: 355495
            (rr) where
            #0  0x000056247ebec2ef in mtr_t::write<4u, (mtr_t::write_type)1, unsigned long>
                (val=2141462384, ptr=0x3e817fc3f008, block=..., this=0x6f6a5d4564e0)
                at /home/mleich/10.5_A/storage/innobase/include/mtr0log.h:209
            #1  btr_page_set_prev (block=block@entry=0x3e817fa41f70, prev=prev@entry=90, 
                mtr=mtr@entry=0x6f6a5d4564e0)
                at /home/mleich/10.5_A/storage/innobase/include/btr0btr.ic:80
            #2  0x000056247ebc5367 in btr_level_list_remove (block=..., index=..., 
                mtr=mtr@entry=0x6f6a5d4564e0)
                at /home/mleich/10.5_A/storage/innobase/btr/btr0btr.cc:3179
            #3  0x000056247ebe635d in btr_compress (cursor=cursor@entry=0x61b000042290, 
                adjust=adjust@entry=0, mtr=mtr@entry=0x6f6a5d4564e0)
                at /home/mleich/10.5_A/storage/innobase/btr/btr0btr.cc:3560
            

            I do not think that we need to analyze this further. My suggested fix would go along the following lines:

            1. Replace the BUF_GET in the btr_cur_optimistic_latch_leaves() call stack with BUF_GET_POSSIBLY_FREED.
            2. Once we have a latch on cursor->left_block, we must check that its FIL_PAGE_NEXT matches our block number. If not, release cursor->left_block and return false.

            As far as I can tell, this bug might only affect reverse index scans (BTR_SEARCH_PREV, BTR_MODIFY_PREV).

            marko Marko Mäkelä added a comment - I suspect that this could be a bug in btr_cur_optimistic_latch_leaves() , which is called by btr_pcur_restore_position_func() . It is requesting a page with BTR_GET , but there might be no guarantee that the page is still available. The block parameter (page 61) looks like a leaf page ( PAGE_LEVEL is 0), with a preceding page 90 and no succeeding page. Yet, we have left_page_no=58 in the code: left_page_no = btr_page_get_prev(block->frame); rw_lock_s_unlock(&block->lock);   cursor->left_block = left_page_no != FIL_NULL ? btr_block_get(*cursor->index, left_page_no, mode, page_is_leaf(block->frame), mtr) : NULL; Can you spot the mistake? I would claim that BUF_GET is wrong here, and we should perform an ‘optimistic’ read. After we released the block->lock , the block->frame is not guaranteed to be valid any more! Whatever freed the page 58 should have modified the block->frame as well. Let us ask rr replay for the proof: 10.5 054f10365c49b70a3868edc0c955ffe187f8344c (rr) watch -l block->frame[8]@4 Hardware watchpoint 1: -location block->frame[8]@4 (rr) when Current event: 355531 (rr) reverse-continue Continuing.   Thread 2 received signal SIGABRT, Aborted. __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. (rr) reverse-continue Continuing. [Switching to Thread 73817.94395]   Thread 154 hit Hardware watchpoint 1: -location block->frame[8]@4   Old value = "\000\000\000Z" New value = "\000\000\000:" 0x000056247ebec2ef in mtr_t::write<4u, (mtr_t::write_type)1, unsigned long> ( val=2141462384, ptr=0x3e817fc3f008, block=..., this=0x6f6a5d4564e0) at /home/mleich/10.5_A/storage/innobase/include/mtr0log.h:209 209 ::memcpy(ptr, buf, l); (rr) when Current event: 355495 (rr) where #0 0x000056247ebec2ef in mtr_t::write<4u, (mtr_t::write_type)1, unsigned long> (val=2141462384, ptr=0x3e817fc3f008, block=..., this=0x6f6a5d4564e0) at /home/mleich/10.5_A/storage/innobase/include/mtr0log.h:209 #1 btr_page_set_prev (block=block@entry=0x3e817fa41f70, prev=prev@entry=90, mtr=mtr@entry=0x6f6a5d4564e0) at /home/mleich/10.5_A/storage/innobase/include/btr0btr.ic:80 #2 0x000056247ebc5367 in btr_level_list_remove (block=..., index=..., mtr=mtr@entry=0x6f6a5d4564e0) at /home/mleich/10.5_A/storage/innobase/btr/btr0btr.cc:3179 #3 0x000056247ebe635d in btr_compress (cursor=cursor@entry=0x61b000042290, adjust=adjust@entry=0, mtr=mtr@entry=0x6f6a5d4564e0) at /home/mleich/10.5_A/storage/innobase/btr/btr0btr.cc:3560 I do not think that we need to analyze this further. My suggested fix would go along the following lines: Replace the BUF_GET in the btr_cur_optimistic_latch_leaves() call stack with BUF_GET_POSSIBLY_FREED . Once we have a latch on cursor->left_block , we must check that its FIL_PAGE_NEXT matches our block number. If not, release cursor->left_block and return false . As far as I can tell, this bug might only affect reverse index scans ( BTR_SEARCH_PREV , BTR_MODIFY_PREV ).

            patch is in bb-10.2-thiru

            thiru Thirunarayanan Balathandayuthapani added a comment - patch is in bb-10.2-thiru

            Thanks! I think that it is a step to the right direction, but not enough. Because this error handling cannot adequately be covered by the regression tests, this must be extensively tested with RQG once you have addressed my review comments.

            marko Marko Mäkelä added a comment - Thanks! I think that it is a step to the right direction, but not enough. Because this error handling cannot adequately be covered by the regression tests, this must be extensively tested with RQG once you have addressed my review comments.

            origin/bb-10.2-MDEV-14711 commit 0ec30e0606d1795d020bab9f1a40744e57e6eb68
            "survived" ~ 16000 RQG tests. The issues observed are known and exist on other source trees too.
            

            mleich Matthias Leich added a comment - origin/bb-10.2-MDEV-14711 commit 0ec30e0606d1795d020bab9f1a40744e57e6eb68 "survived" ~ 16000 RQG tests. The issues observed are known and exist on other source trees too.

            People

              thiru Thirunarayanan Balathandayuthapani
              elenst Elena Stepanova
              Votes:
              1 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.