Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33575

Assertion `rec_get_trx_id(rec, clust_index) < trx->id' failed in row_merge_read_clustered_index

Details

    Description

      This is a continuation of MDEV-33547. It was closed as not fixable in 10.5; this one is for 10.6 (and probably higher versions) and a different failing assertion, so I've created a new one instead of re-opening MDEV-33547 to avoid confusion in case you decide to fix this. Still, the scenario is unlikely, so I'm setting it to Minor, feel free to close it as "Won't fix", too.

      The scenario again involves concurrent OPTIMIZE on a normal table and a statistical table. This time I couldn't get enough concurrency in MTR (It does still happen in MTR, but it takes too long and is too unreliable). The RQG test below reproduces it for me in the matter of seconds. It is also rr-able.

      As of now, I can only reproduce the assertion failure on 10.6. 10.5 fails on the assertion from MDEV-33547, while 10.6.17 release and current 10.11+ branches end up in a deadlock instead. Probably the difference comes from the MDEV-33462 fix which is now in 10.6 but not in higher branches yet.

      10.6 33dcf8155b4a215c9f24ab92f5274bc51503b6f2

      mysqld: /data/MDEV-33575/10.6/storage/innobase/row/row0merge.cc:2181: dberr_t row_merge_read_clustered_index(trx_t*, TABLE*, const dict_table_t*, dict_table_t*, bool, dict_index_t**, dict_index_t*, fts_psort_t*, merge_file_t*, const ulint*, ulint, const dtuple_t*, const dict_add_v_col_t*, const ulint*, ulint, ib_sequence_t&, row_merge_block_t*, bool, pfs_os_file_t*, ut_stage_alter_t*, double, row_merge_block_t*, TABLE*, bool, const col_collations*): Assertion `rec_get_trx_id(rec, clust_index) < trx->id' failed.
       
      (rr) bt
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #1  0x000044bd59d09859 in __GI_abort () at abort.c:79
      #2  0x000044bd59d09729 in __assert_fail_base (fmt=0x44bd59e9f588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55e1b213f830 "rec_get_trx_id(rec, clust_index) < trx->id", 
          file=0x55e1b213e5f0 "/data/MDEV-33575/10.6/storage/innobase/row/row0merge.cc", line=2181, function=<optimized out>) at assert.c:92
      #3  0x000044bd59d1afd6 in __GI___assert_fail (assertion=0x55e1b213f830 "rec_get_trx_id(rec, clust_index) < trx->id", file=0x55e1b213e5f0 "/data/MDEV-33575/10.6/storage/innobase/row/row0merge.cc", 
          line=2181, 
          function=0x55e1b213f3a8 "dberr_t row_merge_read_clustered_index(trx_t*, TABLE*, const dict_table_t*, dict_table_t*, bool, dict_index_t**, dict_index_t*, fts_psort_t*, merge_file_t*, const ulint*, ulint, const dtuple_t*, const"...) at assert.c:101
      #4  0x000055e1b1957e29 in row_merge_read_clustered_index (trx=0x6fec570f4b80, table=0x1f206b63ccf0, old_table=0x36de78031eb0, new_table=0x55e1b50cc150, online=false, index=0x36de78188608, 
          fts_sort_idx=0x0, psort_info=0x0, files=0x36de7817aaf0, key_numbers=0x36de78188610, n_index=1, defaults=0x0, add_v=0x0, col_map=0x36de781ae8e0, add_autoinc=18446744073709551615, sequence=..., 
          block=0x4960e8d4000 "", skip_pk_sort=true, tmpfd=0x1f206b63be58, stage=0x36de7812a270, pct_cost=50, crypt_block=0x0, eval_table=0x1f206b63ccf0, allow_not_null=false, col_collate=0x0)
          at /data/MDEV-33575/10.6/storage/innobase/row/row0merge.cc:2181
      #5  0x000055e1b195ff83 in row_merge_build_indexes (trx=0x6fec570f4b80, old_table=0x36de78031eb0, new_table=0x55e1b50cc150, online=false, indexes=0x36de78188608, key_numbers=0x36de78188610, n_indexes=1, 
          table=0x1f206b63ccf0, defaults=0x0, col_map=0x36de781ae8e0, add_autoinc=18446744073709551615, sequence=..., skip_pk_sort=true, stage=0x36de7812a270, add_v=0x0, eval_table=0x1f206b63ccf0, 
          allow_not_null=false, col_collate=0x0) at /data/MDEV-33575/10.6/storage/innobase/row/row0merge.cc:4554
      #6  0x000055e1b17f75a6 in ha_innobase::inplace_alter_table (this=0x36de781b5060, altered_table=0x1f206b63ccf0, ha_alter_info=0x1f206b63cc30)
          at /data/MDEV-33575/10.6/storage/innobase/handler/handler0alter.cc:8811
      #7  0x000055e1b11cf32d in handler::ha_inplace_alter_table (this=0x36de781b5060, altered_table=0x1f206b63ccf0, ha_alter_info=0x1f206b63cc30) at /data/MDEV-33575/10.6/sql/handler.h:4699
      #8  0x000055e1b11bfe4d in mysql_inplace_alter_table (thd=0x36de78001628, table_list=0x36de78013ba8, table=0x36de78180b78, altered_table=0x1f206b63ccf0, ha_alter_info=0x1f206b63cc30, 
          target_mdl_request=0x1f206b63d5b0, ddl_log_state=0x1f206b63cbf0, trigger_param=0x1f206b63d120, alter_ctx=0x1f206b63e170) at /data/MDEV-33575/10.6/sql/sql_table.cc:7529
      #9  0x000055e1b11c94a9 in mysql_alter_table (thd=0x36de78001628, new_db=0x55e1b1e63e30 <null_clex_str>, new_name=0x55e1b1e63e30 <null_clex_str>, create_info=0x1f206b63efc0, table_list=0x36de78013ba8, 
          recreate_info=0x1f206b63f400, alter_info=0x1f206b63ee50, order_num=0, order=0x0, ignore=false, if_exists=false) at /data/MDEV-33575/10.6/sql/sql_table.cc:10566
      #10 0x000055e1b11ccd2e in mysql_recreate_table (thd=0x36de78001628, table_list=0x36de78013ba8, recreate_info=0x1f206b63f400, table_copy=false) at /data/MDEV-33575/10.6/sql/sql_table.cc:11622
      #11 0x000055e1b127ddcf in admin_recreate_table (thd=0x36de78001628, table_list=0x36de78013ba8, recreate_info=0x1f206b63f400) at /data/MDEV-33575/10.6/sql/sql_admin.cc:72
      #12 0x000055e1b12813bc in mysql_admin_table (thd=0x36de78001628, tables=0x36de78013ba8, check_opt=0x36de78006b08, operator_name=0x55e1b27818b0 <msg_optimize>, lock_type=TL_WRITE, org_open_for_modify=true, 
          repair_table_use_frm=false, extra_open_options=0, prepare_func=0x0, operator_func=
          (int (handler::*)(class handler * const, class THD *, HA_CHECK_OPT *)) 0x55e1b144c0f0 <handler::ha_optimize(THD*, st_ha_check_opt*)>, view_operator_func=0x0, is_cmd_replicated=true)
          at /data/MDEV-33575/10.6/sql/sql_admin.cc:1229
      #13 0x000055e1b12827b6 in Sql_cmd_optimize_table::execute (this=0x36de780142b0, thd=0x36de78001628) at /data/MDEV-33575/10.6/sql/sql_admin.cc:1614
      #14 0x000055e1b10caf03 in mysql_execute_command (thd=0x36de78001628, is_called_from_prepared_stmt=false) at /data/MDEV-33575/10.6/sql/sql_parse.cc:6117
      #15 0x000055e1b10d0a3a in mysql_parse (thd=0x36de78001628, rawbuf=0x36de78013a70 "/* WRK-1 QNO 565 */ OPTIMIZE TABLE mysql.innodb_index_stats", length=59, parser_state=0x1f206b640350)
          at /data/MDEV-33575/10.6/sql/sql_parse.cc:8143
      #16 0x000055e1b10bd509 in dispatch_command (command=COM_QUERY, thd=0x36de78001628, packet=0x36de7800b699 "", packet_length=59, blocking=true) at /data/MDEV-33575/10.6/sql/sql_parse.cc:1896
      #17 0x000055e1b10bc106 in do_command (thd=0x36de78001628, blocking=true) at /data/MDEV-33575/10.6/sql/sql_parse.cc:1409
      #18 0x000055e1b12690b1 in do_handle_one_connection (connect=0x55e1b5083a78, put_in_cache=true) at /data/MDEV-33575/10.6/sql/sql_connect.cc:1415
      #19 0x000055e1b1268e3d in handle_one_connection (arg=0x55e1b5083a78) at /data/MDEV-33575/10.6/sql/sql_connect.cc:1317
      #20 0x00004d7f291da609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #21 0x000044bd59e06133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      randgen branch mdev33575

      perl ./run.pl --grammar=mdev33575.yy --threads=2 --duration=120 --mysqld=--max-statement-time=5 --vardir=/dev/shm/var-mdev33575 --nometadata-reload --gendata=simple --basedir=/data/MDEV-33575/10.6 --rr
      

      Attachments

        Issue Links

          Activity

            A record in the table mysql.innodb_index_stats is being unexpectedly modified while OPTIMIZE TABLE mysql.innodb_index_stats is running. This happens as part of executing OPTIMIZE TABLE simple_db.E, in dict_stats_save(). The assertion fails, because OPTIMIZE TABLE mysql.innodb_index_stats is supposed to be a locking variant and the DB_TRX_ID in the table is not supposed to be modified. Here is a related stack trace:

            10.6 33dcf8155b4a215c9f24ab92f5274bc51503b6f2

            #0  0x000055e1b1adef87 in dict_acquire_mdl_shared<false> (table=0x36de78031eb0, mdl_context=0x2d9358004e40, mdl=0x38775f9196f8, table_op=DICT_TABLE_OP_NORMAL)
                at /data/MDEV-33575/10.6/storage/innobase/dict/dict0dict.cc:708
            #1  0x000055e1b1adf35c in dict_acquire_mdl_shared<false> (table=0x36de78031eb0, thd=0x2d9358004ce8, mdl=0x38775f9196f8, table_op=DICT_TABLE_OP_NORMAL)
                at /data/MDEV-33575/10.6/storage/innobase/dict/dict0dict.cc:831
            #2  0x000055e1b1b015c0 in dict_stats_save (table_orig=0x209680b44d0, only_for_index=0x0) at /data/MDEV-33575/10.6/storage/innobase/dict/dict0stats.cc:3283
            #3  0x000055e1b1b03798 in dict_stats_update (table=0x209680b44d0, stats_upd_option=DICT_STATS_RECALC_PERSISTENT) at /data/MDEV-33575/10.6/storage/innobase/dict/dict0stats.cc:4091
            #4  0x000055e1b17bac28 in ha_innobase::info_low (this=0x2d93580b12a0, flag=28, is_analyze=true) at /data/MDEV-33575/10.6/storage/innobase/handler/ha_innodb.cc:14754
            #5  0x000055e1b17bb6ac in ha_innobase::analyze (this=0x2d93580b12a0) at /data/MDEV-33575/10.6/storage/innobase/handler/ha_innodb.cc:15046
            #6  0x000055e1b144c2c6 in handler::ha_analyze (this=0x2d93580b12a0, thd=0x2d9358004ce8, check_opt=0x2d935800a1c8) at /data/MDEV-33575/10.6/sql/handler.cc:5205
            #7  0x000055e1b1281561 in mysql_admin_table (thd=0x2d9358004ce8, tables=0x2d9358016fe8, check_opt=0x2d935800a1c8, operator_name=0x55e1b27818b0 <msg_optimize>, lock_type=TL_WRITE, org_open_for_modify=true, 
                repair_table_use_frm=false, extra_open_options=0, prepare_func=0x0, operator_func=(int (handler::*)(handler * const, THD *, HA_CHECK_OPT *)) 0x55e1b144c0f0 <handler::ha_optimize(THD*, st_ha_check_opt*)>, 
                view_operator_func=0x0, is_cmd_replicated=true) at /data/MDEV-33575/10.6/sql/sql_admin.cc:1255
            #8  0x000055e1b12827b6 in Sql_cmd_optimize_table::execute (this=0x2d93580176f0, thd=0x2d9358004ce8) at /data/MDEV-33575/10.6/sql/sql_admin.cc:1614
            #9  0x000055e1b10caf03 in mysql_execute_command (thd=0x2d9358004ce8, is_called_from_prepared_stmt=false) at /data/MDEV-33575/10.6/sql/sql_parse.cc:6117
            #10 0x000055e1b10d0a3a in mysql_parse (thd=0x2d9358004ce8, rawbuf=0x2d9358016ee0 "/* WRK-2 QNO 147 */ OPTIMIZE TABLE simple_db.E", length=46, parser_state=0x38775f91c350)
                at /data/MDEV-33575/10.6/sql/sql_parse.cc:8143
            

            This thread is acquiring a shared metadata lock (MDL) on the statistics tables, but it actually should acquire a stronger MDL that would conflict with the downgraded MDL that is held during handler::ha_inplace_alter_table() by the OPTIMIZE TABLE mysql.innodb_index_stats.

            As far as I can tell, fixing this would require an API change. I am reassigning this to the subsystem owner.

            marko Marko Mäkelä added a comment - A record in the table mysql.innodb_index_stats is being unexpectedly modified while OPTIMIZE TABLE mysql.innodb_index_stats is running. This happens as part of executing OPTIMIZE TABLE simple_db.E , in dict_stats_save() . The assertion fails, because OPTIMIZE TABLE mysql.innodb_index_stats is supposed to be a locking variant and the DB_TRX_ID in the table is not supposed to be modified. Here is a related stack trace: 10.6 33dcf8155b4a215c9f24ab92f5274bc51503b6f2 #0 0x000055e1b1adef87 in dict_acquire_mdl_shared<false> (table=0x36de78031eb0, mdl_context=0x2d9358004e40, mdl=0x38775f9196f8, table_op=DICT_TABLE_OP_NORMAL) at /data/MDEV-33575/10.6/storage/innobase/dict/dict0dict.cc:708 #1 0x000055e1b1adf35c in dict_acquire_mdl_shared<false> (table=0x36de78031eb0, thd=0x2d9358004ce8, mdl=0x38775f9196f8, table_op=DICT_TABLE_OP_NORMAL) at /data/MDEV-33575/10.6/storage/innobase/dict/dict0dict.cc:831 #2 0x000055e1b1b015c0 in dict_stats_save (table_orig=0x209680b44d0, only_for_index=0x0) at /data/MDEV-33575/10.6/storage/innobase/dict/dict0stats.cc:3283 #3 0x000055e1b1b03798 in dict_stats_update (table=0x209680b44d0, stats_upd_option=DICT_STATS_RECALC_PERSISTENT) at /data/MDEV-33575/10.6/storage/innobase/dict/dict0stats.cc:4091 #4 0x000055e1b17bac28 in ha_innobase::info_low (this=0x2d93580b12a0, flag=28, is_analyze=true) at /data/MDEV-33575/10.6/storage/innobase/handler/ha_innodb.cc:14754 #5 0x000055e1b17bb6ac in ha_innobase::analyze (this=0x2d93580b12a0) at /data/MDEV-33575/10.6/storage/innobase/handler/ha_innodb.cc:15046 #6 0x000055e1b144c2c6 in handler::ha_analyze (this=0x2d93580b12a0, thd=0x2d9358004ce8, check_opt=0x2d935800a1c8) at /data/MDEV-33575/10.6/sql/handler.cc:5205 #7 0x000055e1b1281561 in mysql_admin_table (thd=0x2d9358004ce8, tables=0x2d9358016fe8, check_opt=0x2d935800a1c8, operator_name=0x55e1b27818b0 <msg_optimize>, lock_type=TL_WRITE, org_open_for_modify=true, repair_table_use_frm=false, extra_open_options=0, prepare_func=0x0, operator_func=(int (handler::*)(handler * const, THD *, HA_CHECK_OPT *)) 0x55e1b144c0f0 <handler::ha_optimize(THD*, st_ha_check_opt*)>, view_operator_func=0x0, is_cmd_replicated=true) at /data/MDEV-33575/10.6/sql/sql_admin.cc:1255 #8 0x000055e1b12827b6 in Sql_cmd_optimize_table::execute (this=0x2d93580176f0, thd=0x2d9358004ce8) at /data/MDEV-33575/10.6/sql/sql_admin.cc:1614 #9 0x000055e1b10caf03 in mysql_execute_command (thd=0x2d9358004ce8, is_called_from_prepared_stmt=false) at /data/MDEV-33575/10.6/sql/sql_parse.cc:6117 #10 0x000055e1b10d0a3a in mysql_parse (thd=0x2d9358004ce8, rawbuf=0x2d9358016ee0 "/* WRK-2 QNO 147 */ OPTIMIZE TABLE simple_db.E", length=46, parser_state=0x38775f91c350) at /data/MDEV-33575/10.6/sql/sql_parse.cc:8143 This thread is acquiring a shared metadata lock (MDL) on the statistics tables, but it actually should acquire a stronger MDL that would conflict with the downgraded MDL that is held during handler::ha_inplace_alter_table() by the OPTIMIZE TABLE mysql.innodb_index_stats . As far as I can tell, fixing this would require an API change. I am reassigning this to the subsystem owner.

            The MDEV-35443 test case triggers this bug in 10.6:

            diff --git a/mysql-test/suite/innodb/t/stats_persistent.test b/mysql-test/suite/innodb/t/stats_persistent.test
            index 8561298c4d3..870808993f4 100644
            --- a/mysql-test/suite/innodb/t/stats_persistent.test
            +++ b/mysql-test/suite/innodb/t/stats_persistent.test
            @@ -14,14 +14,25 @@ SET DEBUG_SYNC='dict_stats_update_persistent SIGNAL stop WAIT_FOR go';
             --connect(con1, localhost, root)
             SET DEBUG_SYNC='now WAIT_FOR stop';
             
            +--connect(con2, localhost, root)
            +SET DEBUG_SYNC='innodb_inplace_alter_table_enter SIGNAL astop WAIT_FOR ago';
            +send ALTER TABLE mysql.innodb_index_stats FORCE;
            +
            +--connection con1
             --replace_column 1 SUM
             SELECT SUM(DATA_LENGTH+INDEX_LENGTH) FROM information_schema.TABLES WHERE ENGINE='InnoDB';
             
            +SET DEBUG_SYNC='now WAIT_FOR astop';
             SET DEBUG_SYNC='now SIGNAL go';
             --disconnect con1
             
             --connection default
             --reap
            +SET DEBUG_SYNC='now SIGNAL ago';
            +--connection con2
            +reap;
            +--disconnect con2
            +--connection default
             SET DEBUG_SYNC= 'RESET';
             DROP TABLE t1;
             
            

            10.6 f20ee931d8e36af01841752ce0dd06da29004835

            mariadbd: /mariadb/10.6/storage/innobase/row/row0merge.cc:2161: dberr_t row_merge_read_clustered_index(trx_t*, TABLE*, const dict_table_t*, dict_table_t*, bool, dict_index_t**, dict_index_t*, fts_psort_t*, merge_file_t*, const ulint*, ulint, const dtuple_t*, const dict_add_v_col_t*, const ulint*, ulint, ib_sequence_t&, row_merge_block_t*, bool, pfs_os_file_t*, ut_stage_alter_t*, double, row_merge_block_t*, TABLE*, bool, const col_collations*): Assertion `rec_get_trx_id(rec, clust_index) < trx->id' failed.
            

            I also tried to simulate the impact of the MDEV-33462 fix in the 10.5 version of the test:

            diff --git a/mysql-test/suite/innodb/t/stats_persistent.test b/mysql-test/suite/innodb/t/stats_persistent.test
            index 870808993f4..cf4700e8c52 100644
            --- a/mysql-test/suite/innodb/t/stats_persistent.test
            +++ b/mysql-test/suite/innodb/t/stats_persistent.test
            @@ -16,7 +16,7 @@ SET DEBUG_SYNC='now WAIT_FOR stop';
             
             --connect(con2, localhost, root)
             SET DEBUG_SYNC='innodb_inplace_alter_table_enter SIGNAL astop WAIT_FOR ago';
            -send ALTER TABLE mysql.innodb_index_stats FORCE;
            +send ALTER TABLE mysql.innodb_index_stats FORCE, LOCK=SHARED;
             
             --connection con1
             --replace_column 1 SUM
            

            However, the test would not trigger any assertion failure. (10.5 has other problems; see MDEV-15020.)

            marko Marko Mäkelä added a comment - The MDEV-35443 test case triggers this bug in 10.6: diff --git a/mysql-test/suite/innodb/t/stats_persistent.test b/mysql-test/suite/innodb/t/stats_persistent.test index 8561298c4d3..870808993f4 100644 --- a/mysql-test/suite/innodb/t/stats_persistent.test +++ b/mysql-test/suite/innodb/t/stats_persistent.test @@ -14,14 +14,25 @@ SET DEBUG_SYNC='dict_stats_update_persistent SIGNAL stop WAIT_FOR go'; --connect(con1, localhost, root) SET DEBUG_SYNC='now WAIT_FOR stop'; +--connect(con2, localhost, root) +SET DEBUG_SYNC='innodb_inplace_alter_table_enter SIGNAL astop WAIT_FOR ago'; +send ALTER TABLE mysql.innodb_index_stats FORCE; + +--connection con1 --replace_column 1 SUM SELECT SUM(DATA_LENGTH+INDEX_LENGTH) FROM information_schema.TABLES WHERE ENGINE='InnoDB'; +SET DEBUG_SYNC='now WAIT_FOR astop'; SET DEBUG_SYNC='now SIGNAL go'; --disconnect con1 --connection default --reap +SET DEBUG_SYNC='now SIGNAL ago'; +--connection con2 +reap; +--disconnect con2 +--connection default SET DEBUG_SYNC= 'RESET'; DROP TABLE t1; 10.6 f20ee931d8e36af01841752ce0dd06da29004835 mariadbd: /mariadb/10.6/storage/innobase/row/row0merge.cc:2161: dberr_t row_merge_read_clustered_index(trx_t*, TABLE*, const dict_table_t*, dict_table_t*, bool, dict_index_t**, dict_index_t*, fts_psort_t*, merge_file_t*, const ulint*, ulint, const dtuple_t*, const dict_add_v_col_t*, const ulint*, ulint, ib_sequence_t&, row_merge_block_t*, bool, pfs_os_file_t*, ut_stage_alter_t*, double, row_merge_block_t*, TABLE*, bool, const col_collations*): Assertion `rec_get_trx_id(rec, clust_index) < trx->id' failed. I also tried to simulate the impact of the MDEV-33462 fix in the 10.5 version of the test: diff --git a/mysql-test/suite/innodb/t/stats_persistent.test b/mysql-test/suite/innodb/t/stats_persistent.test index 870808993f4..cf4700e8c52 100644 --- a/mysql-test/suite/innodb/t/stats_persistent.test +++ b/mysql-test/suite/innodb/t/stats_persistent.test @@ -16,7 +16,7 @@ SET DEBUG_SYNC='now WAIT_FOR stop'; --connect(con2, localhost, root) SET DEBUG_SYNC='innodb_inplace_alter_table_enter SIGNAL astop WAIT_FOR ago'; -send ALTER TABLE mysql.innodb_index_stats FORCE; +send ALTER TABLE mysql.innodb_index_stats FORCE, LOCK=SHARED; --connection con1 --replace_column 1 SUM However, the test would not trigger any assertion failure. (10.5 has other problems; see MDEV-15020 .)

            People

              nikitamalyavin Nikita Malyavin
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.