Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21217

innodb_force_recovery=2 may wrongly abort the rollback of recovered transactions

    XMLWordPrintable

    Details

      Description

      The assert hit during RQG testing was observed first time on
      origin/10.5 8ed646f0712b8f459439dfcdc3cf4ae45d79ec95 2019-12-02T13:35:54+03:00
      The corresponding remains (archive with core file etc.) of that failing test had to be deleted later.
       
      The server error log contained
      #0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
      #1  0x00005632ef370321 in my_write_core (sig=6) at mysys/stacktrace.c:518
      #2  0x00005632eea7659f in handle_fatal_signal (sig=6) at sql/signal_handler.cc:343
      #3  <signal handler called>
      #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
      #5  0x00007f8ef32fc37a in __GI_abort () at abort.c:89
      #6  0x00007f8ef32f2b47 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x5632ef83c7d4 "trx->lock.table_cached == 0", file=file@entry=0x5632ef83c3a0 "storage/innobase/trx/trx0trx.cc", 
          line=line@entry=381, function=function@entry=0x5632ef83e8d0 <trx_create()::__PRETTY_FUNCTION__> "trx_t* trx_create()") at assert.c:92
      #7  0x00007f8ef32f2bf2 in __GI___assert_fail (assertion=0x5632ef83c7d4 "trx->lock.table_cached == 0", file=0x5632ef83c3a0 "storage/innobase/trx/trx0trx.cc", line=381, 
          function=0x5632ef83e8d0 <trx_create()::__PRETTY_FUNCTION__> "trx_t* trx_create()") at assert.c:101
      #8  0x00005632ef0f14bc in trx_create () at storage/innobase/trx/trx0trx.cc:381
      #9  0x00005632eee729f2 in innobase_trx_allocate (thd=0x7f8eb8001bd8) at storage/innobase/handler/ha_innodb.cc:2526
      #10 0x00005632eee72ad1 in check_trx_exists (thd=0x7f8eb8001bd8) at storage/innobase/handler/ha_innodb.cc:2551
      #11 0x00005632eee9e303 in ha_innobase::update_thd (this=0x7f8eb800f3c0, thd=0x7f8eb8001bd8) at storage/innobase/handler/ha_innodb.cc:2767
      #12 0x00005632eee8d0dc in ha_innobase::info_low (this=0x7f8eb800f3c0, flag=282, is_analyze=false) at storage/innobase/handler/ha_innodb.cc:14493
      #13 0x00005632eee8dd1d in ha_innobase::info (this=0x7f8eb800f3c0, flag=282) at storage/innobase/handler/ha_innodb.cc:14806
      #14 0x00005632eee7a5ae in ha_innobase::open (this=0x7f8eb800f3c0, name=0x7f8eb800bda8 "./mysql/gtid_slave_pos") at storage/innobase/handler/ha_innodb.cc:6117
      #15 0x00005632eea7f3f8 in handler::ha_open (this=0x7f8eb800f3c0, table_arg=0x7f8eb800e1c8, name=0x7f8eb800bda8 "./mysql/gtid_slave_pos", mode=2, test_if_locked=18, mem_root=0x0, partitions_to_open=0x0)
          at sql/handler.cc:2736
      #16 0x00005632ee8722a9 in open_table_from_share (thd=0x7f8eb8001bd8, share=0x7f8eb800b810, alias=0x7f8eec21e3a8, db_stat=33, prgflag=8, ha_open_flags=18, outparam=0x7f8eb800e1c8, is_create_table=false, 
          partitions_to_open=0x0) at sql/table.cc:4094
      #17 0x00005632ee68dcc4 in open_table (thd=0x7f8eb8001bd8, table_list=0x7f8eec21e360, ot_ctx=0x7f8eec21e190) at sql/sql_base.cc:2074
      #18 0x00005632ee69175d in open_and_process_table (thd=0x7f8eb8001bd8, tables=0x7f8eec21e360, counter=0x7f8eec21e224, flags=0, prelocking_strategy=0x7f8eec21e2a8, has_prelocking_list=false, ot_ctx=0x7f8eec21e190)
          at sql/sql_base.cc:3837
      #19 0x00005632ee6929f7 in open_tables (thd=0x7f8eb8001bd8, options=..., start=0x7f8eec21e208, counter=0x7f8eec21e224, flags=0, prelocking_strategy=0x7f8eec21e2a8) at sql/sql_base.cc:4311
      #20 0x00005632ee694b74 in open_and_lock_tables (thd=0x7f8eb8001bd8, options=..., tables=0x7f8eec21e360, derived=false, flags=0, prelocking_strategy=0x7f8eec21e2a8) at sql/sql_base.cc:5204
      #21 0x00005632ee64ea15 in open_and_lock_tables (thd=0x7f8eb8001bd8, tables=0x7f8eec21e360, derived=false, flags=0) at sql/sql_base.h:506
      #22 0x00005632ee8bc86d in scan_one_gtid_slave_pos_table (thd=0x7f8eb8001bd8, hash=0x7f8eec21ee10, array=0x7f8eec21edb0, tablename=0x7f8eb800a6e0, out_hton=0x7f8eec21ea70) at sql/rpl_rli.cc:1531
      #23 0x00005632ee8bd633 in load_gtid_state_cb (thd=0x7f8eb8001bd8, table_name=0x7f8eb800a6e0, arg=0x7f8eec21edd0) at sql/rpl_rli.cc:1806
      #24 0x00005632ee8bd12a in scan_all_gtid_slave_pos_table (thd=0x7f8eb8001bd8, cb=0x5632ee8bd5f1 <load_gtid_state_cb(THD*, LEX_CSTRING*, void*)>, cb_data=0x7f8eec21edd0) at sql/rpl_rli.cc:1675
      #25 0x00005632ee8bd7f1 in rpl_load_gtid_slave_state (thd=0x7f8eb8001bd8) at sql/rpl_rli.cc:1843
      #26 0x00005632eebba326 in binlog_background_thread (arg=0x0) at sql/log.cc:9971
      #27 0x00007f8ef413f6da in start_thread (arg=0x7f8eec21f700) at pthread_create.c:456
      #28 0x00007f8ef33cdd7f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
       
      What follows are the results of another replay on
      origin/10.5 d2c593c2a66549abe67c4e73dbefeb876fa5ed9d 2020-06-12T16:22:03+03:00  build with ASAN
      which claims to be a 10.5.4.
      ...
      2020-06-12 16:35:45 0 [Note] Crash recovery finished.
      ...
      2020-06-12 16:35:50 0 [Note] /home/mleich/Server_bin/10.5_asan//bin/mysqld: ready for connections.
      2020-06-12T16:36:58 [67872] | Version: '10.5.4-MariaDB-debug-log'  socket: ...
      2020-06-12T16:36:58 [67872] | mysqld: /home/mleich/10.5/storage/innobase/trx/trx0trx.cc:376: trx_t* trx_create(): Assertion `trx->lock.table_cached == 0' failed.
      ....    "rr" together with ASAN etc. caused that I did not got the usual backtrace within the server error log 
       
      Backtrace based on rr trace:
      -------------------------------------------
      Thread 2 (Thread 59357.95740):
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
      #1  0x00007fda70738801 in __GI_abort () at abort.c:79
      #2  0x00007fda7072839a in __assert_fail_base (fmt=0x7fda708af7d8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x55db86822240 "trx->lock.table_cached == 0", 
          file=file@entry=0x55db86821a00 "/home/mleich/10.5/storage/innobase/trx/trx0trx.cc", line=line@entry=376, function=function@entry=0x55db86827220 <trx_create()::__PRETTY_FUNCTION__> "trx_t* trx_create()") at assert.c:92
      #3  0x00007fda70728412 in __GI___assert_fail (assertion=0x55db86822240 "trx->lock.table_cached == 0", file=0x55db86821a00 "/home/mleich/10.5/storage/innobase/trx/trx0trx.cc", line=376, 
          function=0x55db86827220 <trx_create()::__PRETTY_FUNCTION__> "trx_t* trx_create()") at assert.c:101
      #4  0x000055db8523576a in trx_create () at /home/mleich/10.5/storage/innobase/trx/trx0trx.cc:376
      #5  0x000055db84cf2b02 in innobase_trx_allocate (thd=0x62b00007e218) at /home/mleich/10.5/storage/innobase/handler/ha_innodb.cc:2465
      #6  0x000055db84cf2cea in check_trx_exists (thd=0x62b00007e218) at /home/mleich/10.5/storage/innobase/handler/ha_innodb.cc:2490
      #7  0x000055db84d5aa1f in ha_innobase::update_thd (this=0x61d0007e90b8, thd=0x62b00007e218) at /home/mleich/10.5/storage/innobase/handler/ha_innodb.cc:2706
      #8  0x000055db84d32ede in ha_innobase::info_low (this=0x61d0007e90b8, flag=282, is_analyze=false) at /home/mleich/10.5/storage/innobase/handler/ha_innodb.cc:14395
      #9  0x000055db84d34c73 in ha_innobase::info (this=0x61d0007e90b8, flag=282) at /home/mleich/10.5/storage/innobase/handler/ha_innodb.cc:14707
      #10 0x000055db84d041fd in ha_innobase::open (this=0x61d0007e90b8, name=0x61b000080250 "./test/E") at /home/mleich/10.5/storage/innobase/handler/ha_innodb.cc:6032
      #11 0x000055db84220eee in handler::ha_open (this=0x61d0007e90b8, table_arg=0x620000040098, name=0x61b000080250 "./test/E", mode=2, test_if_locked=18, mem_root=0x0, partitions_to_open=0x0) at /home/mleich/10.5/sql/handler.cc:2906
      #12 0x000055db83d862c7 in open_table_from_share (thd=0x62b00007e218, share=0x61b00007fcb8, alias=0x629000401280, db_stat=33, prgflag=8, ha_open_flags=18, outparam=0x620000040098, is_create_table=false, partitions_to_open=0x0)
          at /home/mleich/10.5/sql/table.cc:4185
      #13 0x000055db838ad314 in open_table (thd=0x62b00007e218, table_list=0x629000401238, ot_ctx=0x373f36466140) at /home/mleich/10.5/sql/sql_base.cc:1981
      #14 0x000055db838b6f22 in open_and_process_table (thd=0x62b00007e218, tables=0x629000401238, counter=0x373f36466260, flags=1090, prelocking_strategy=0x373f364662a0, has_prelocking_list=false, ot_ctx=0x373f36466140)
          at /home/mleich/10.5/sql/sql_base.cc:3780
      #15 0x000055db838b9aca in open_tables (thd=0x62b00007e218, options=..., start=0x373f36466230, counter=0x373f36466260, flags=1090, prelocking_strategy=0x373f364662a0) at /home/mleich/10.5/sql/sql_base.cc:4252
      #16 0x000055db838a2eae in open_tables (thd=0x62b00007e218, tables=0x373f36466230, counter=0x373f36466260, flags=1090, prelocking_strategy=0x373f364662a0) at /home/mleich/10.5/sql/sql_base.h:264
      #17 0x000055db838bf6de in open_normal_and_derived_tables (thd=0x62b00007e218, tables=0x629000401238, flags=1090, dt_phases=35) at /home/mleich/10.5/sql/sql_base.cc:5221
      #18 0x000055db838bfc5a in open_tables_only_view_structure (thd=0x62b00007e218, table_list=0x629000401238, can_deadlock=false) at /home/mleich/10.5/sql/sql_base.cc:5272
      #19 0x000055db83c0c0c2 in fill_schema_table_by_open (thd=0x62b00007e218, mem_root=0x62b000083ce0, is_show_fields_or_keys=true, table=0x624000488138, schema_table=0x55db87399260 <schema_tables+1920>, orig_db_name=0x62b000085800, 
          orig_table_name=0x62b000085810, open_tables_state_backup=0x373f36468510, can_deadlock=false) at /home/mleich/10.5/sql/sql_show.cc:4600
      #20 0x000055db83c0ebd4 in get_all_tables (thd=0x62b00007e218, tables=0x62b000087238, cond=0x0) at /home/mleich/10.5/sql/sql_show.cc:5125
      #21 0x000055db83c41bc3 in get_schema_tables_result (join=0x62b0000883a0, executed_place=PROCESSED_BY_JOIN_EXEC) at /home/mleich/10.5/sql/sql_show.cc:8673
      #22 0x000055db83b143b9 in JOIN::exec_inner (this=0x62b0000883a0) at /home/mleich/10.5/sql/sql_select.cc:4401
      #23 0x000055db83b12268 in JOIN::exec (this=0x62b0000883a0) at /home/mleich/10.5/sql/sql_select.cc:4225
      #24 0x000055db83b16322 in mysql_select (thd=0x62b00007e218, tables=0x62b000087238, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2684619520, result=0x62b000088370, unit=0x62b000082230, 
          select_lex=0x62b000082a30) at /home/mleich/10.5/sql/sql_select.cc:4649
      #25 0x000055db83ae84d3 in handle_select (thd=0x62b00007e218, lex=0x62b000082168, result=0x62b000088370, setup_tables_done_option=0) at /home/mleich/10.5/sql/sql_select.cc:417
      #26 0x000055db83a58ca5 in execute_sqlcom_select (thd=0x62b00007e218, all_tables=0x62b000087238) at /home/mleich/10.5/sql/sql_parse.cc:6208
      #27 0x000055db83a47fd4 in mysql_execute_command (thd=0x62b00007e218) at /home/mleich/10.5/sql/sql_parse.cc:3939
      #28 0x000055db83a63df9 in mysql_parse (thd=0x62b00007e218, rawbuf=0x62b000085238 "SHOW KEYS FROM `test`.`E`", length=25, parser_state=0x373f3646a8f0, is_com_multi=false, is_next_command=false) at /home/mleich/10.5/sql/sql_parse.cc:7992
      #29 0x000055db83a3a94d in dispatch_command (command=COM_QUERY, thd=0x62b00007e218, packet=0x6290003ed219 "SHOW KEYS FROM `test`.`E`", packet_length=25, is_com_multi=false, is_next_command=false) at /home/mleich/10.5/sql/sql_parse.cc:1874
      #30 0x000055db83a371cb in do_command (thd=0x62b00007e218) at /home/mleich/10.5/sql/sql_parse.cc:1355
      #31 0x000055db83e64708 in do_handle_one_connection (connect=0x6080045091b8, put_in_cache=true) at /home/mleich/10.5/sql/sql_connect.cc:1411
      #32 0x000055db83e64066 in handle_one_connection (arg=0x6080045091b8) at /home/mleich/10.5/sql/sql_connect.cc:1313
      #33 0x000055db84b1cac1 in pfs_spawn_thread (arg=0x615000008f18) at /home/mleich/10.5/storage/perfschema/pfs.cc:2201
      #34 0x0000104016d8d6db in start_thread (arg=0x373f3646c300) at pthread_create.c:463
      #35 0x00007fda7081988f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
       
      Work flow of the RQG Test
      0. Bootstrap
      1. Start the server with innodb_force_recovery=2
      2. Create and fill all tables which get used later.
      3. Several concurrent sessions run excessive DML on these tables
           ~ 280s after start of 3. the server process gets killed (SIGKILL)
      4. The attempt to restart that server with innodb_force_recovery=2 had success.
      5. Than come checks for he integrity of the data content and there a 
           SHOW KEYS FROM `test`.`E` caused the assert.
       
      RQG
      git clone https://github.com/mleich1/rqg --branch experimental RQG
      origin/experimental 611a31e38a3bc157392c0748c95991b6a248bb3b 2020-06-09T15:31:46+02:00
       
      perl rqg.pl \                        
      --grammar=conf/mariadb/innodb_compression_encryption.yy \
      --gendata=conf/mariadb/innodb_compression_encryption.zz \
      --mysqld=--plugin-load-add=file_key_management.so \
      --mysqld=--loose-file-key-management-filename=conf/mariadb/encryption_keys.txt \
      --mysqld=--innodb_use_native_aio=1 \
      --mysqld=--innodb_stats_persistent=off \
      --mysqld=--innodb_lock_schedule_algorithm=fcfs \
      --mysqld=--loose-idle_write_transaction_timeout=0 \
      --mysqld=--loose-idle_transaction_timeout=0 \
      --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      --mysqld=--connect_timeout=60 \
      --mysqld=--interactive_timeout=28800 \
      --mysqld=--slave_net_timeout=60 \
      --mysqld=--net_read_timeout=30 \
      --mysqld=--net_write_timeout=60 \
      --mysqld=--loose-table_lock_wait_timeout=50 \
      --mysqld=--wait_timeout=28800 \
      --mysqld=--lock-wait-timeout=86400 \
      --mysqld=--innodb-lock-wait-timeout=50 \
      --no-mask \
      --queries=10000000 \
      --duration=300 \
      --seed=random \
      --reporters=Backtrace \
      --reporters=ErrorLog \
      --reporters=Deadlock1 \
      --reporters=CrashRecovery1 \
      --validators=None \
      --mysqld=--log_output=none \
      --mysqld=--log-bin \
      --mysqld=--log_bin_trust_function_creators=1 \
      --mysqld=--loose-max-statement-time=30 \
      --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      --engine=InnoDB \
      --restart_timeout=120 \
      --threads=2 \
      --mysqld=--innodb_force_recovery=2 \
      --mysqld=--innodb_page_size=8K \
      --mysqld=--innodb-buffer-pool-size=256M \
      --duration=300 \
      --no_mask \
      --workdir=<local settings> \
      --vardir=<local settings> \
      --mtr-build-thread=<local settings> \
      --basedir1=<local settings> \
      --basedir2=<local settings> \
      --script_debug=_nix_ \
      --rr=Server \
      --rr_options=--chaos
       
      Certain options are not required for replay.
       
      
      

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              marko Marko Mäkelä
              Reporter:
              mleich Matthias Leich
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: