Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-24564

Statistics are lost after ALTER TABLE

Details

    Description

      Merging to 10.5 the following code change of MDEV-23632 causes the last step of the test gcol.innodb_virtual_stats to fail:

       @@ -11189,44 +11237,19 @@ ha_innobase::commit_inplace_alter_table(
       	Currently dict_load_column_low() is the only place where
       	num_base for virtual columns is assigned to nonzero. */
       	if (ctx0->num_to_drop_vcol || ctx0->num_to_add_vcol
      +	    || (ctx0->new_table->n_v_cols && !new_clustered
      +		&& (ha_alter_info->alter_info->drop_list.elements
      +		    || ha_alter_info->alter_info->create_list.elements))
       	    || (ctx0->is_instant()
       		&& m_prebuilt->table->n_v_cols
       		&& ha_alter_info->handler_flags & ALTER_STORED_COLUMN_ORDER)) {
      ...
      

      The table will lose all persistent statistics:

      CURRENT_TEST: gcol.innodb_virtual_stats
      --- /mariadb/10.5m/mysql-test/suite/gcol/r/innodb_virtual_stats.result	2020-11-18 08:34:35.136411687 +0200
      +++ /mariadb/10.5m/mysql-test/suite/gcol/r/innodb_virtual_stats.reject	2021-01-11 12:57:24.756571844 +0200
      @@ -121,19 +121,4 @@
       FROM mysql.innodb_index_stats
       WHERE database_name = 'test' AND table_name = 't';
       index_name	stat_name	stat_description
      -GEN_CLUST_INDEX	n_diff_pfx01	DB_ROW_ID
      -GEN_CLUST_INDEX	n_leaf_pages	Number of leaf pages in the index
      -GEN_CLUST_INDEX	size	Number of pages in the index
      -idxb	n_diff_pfx01	b
      -idxb	n_diff_pfx02	b,DB_ROW_ID
      -idxb	n_leaf_pages	Number of leaf pages in the index
      -idxb	size	Number of pages in the index
      -vidxe	n_diff_pfx01	e
      -vidxe	n_diff_pfx02	e,DB_ROW_ID
      -vidxe	n_leaf_pages	Number of leaf pages in the index
      -vidxe	size	Number of pages in the index
      -vidxf	n_diff_pfx01	f
      -vidxf	n_diff_pfx02	f,DB_ROW_ID
      -vidxf	n_leaf_pages	Number of leaf pages in the index
      -vidxf	size	Number of pages in the index
       DROP TABLE t;
       
      mysqltest: Result length mismatch
      

      In both 10.4 and 10.5, we counter-intuitively have ha_alter_info->alter_info->create_list.elements == 5 for the preceding statement:

      ALTER TABLE t DROP INDEX vidxcd;
      

      The reason why this fails in 10.5 appears to be a change to "Remove not needed open/close call at end of inline alter table." The ha_innobase::open() call is needed for initializing the persistent statistics after a native ALTER TABLE:

      	if (ib_table->is_readable()) {
      		dict_stats_init(ib_table);
      	} else {
      		ib_table->stat_initialized = 1;
      	}
      

      I agree that this is not ideal, but that work-around was added in MySQL 5.7 and is not easy to remove without a major rewrite (MDEV-22363).

      Attachments

        Issue Links

          Activity

            Why can't innodb do the above init of persistent statistics in handler::notify_tabledef_has_changed()?

            We cannot do things as they where before as that would interfere with discoverable tables.
            In other words, we have to inform the engine that the table has changed BEFORE the table is closed and used again. The old code opened the new table and after it open informed the engine that the definition has been changed (which is too late)

            I do not understand the comment about ha_alter_info->alter_info->create_list.elements == 5

            After mysql_prepare_alter_table() call as part of the above ALTER TABLE we have:
            p alter_info->create_list
            $9 = {<base_list> = {<Sql_alloc> =

            {dummy_for_valgrind = false}

            , first = 0x7fffc01de730, last = 0x7fffc021ccd8, elements = 4}, Which gives us 4 elements that matches fields a,b,d,e in the table

            monty Michael Widenius added a comment - Why can't innodb do the above init of persistent statistics in handler::notify_tabledef_has_changed()? We cannot do things as they where before as that would interfere with discoverable tables. In other words, we have to inform the engine that the table has changed BEFORE the table is closed and used again. The old code opened the new table and after it open informed the engine that the definition has been changed (which is too late) I do not understand the comment about ha_alter_info->alter_info->create_list.elements == 5 After mysql_prepare_alter_table() call as part of the above ALTER TABLE we have: p alter_info->create_list $9 = {<base_list> = {<Sql_alloc> = {dummy_for_valgrind = false} , first = 0x7fffc01de730, last = 0x7fffc021ccd8, elements = 4}, Which gives us 4 elements that matches fields a,b,d,e in the table

            There is no handler::notify_tabledef_has_changed(). The virtual member function handler::notify_table_changed() was replaced with the function pointer handlerton::notify_tabledef_changed in the change that was already mentioned in the description.

            What might work is that the handlerton::notify_tabledef_changed() is extended with new parameters, so that InnoDB can pass the call to ha_innobase::open(). At the minimum, we would seem to need handler* and const TABLE_SHARE*. The handler::table_share or handler::table->s are pointing to the original table definition as it was before the ALTER TABLE statement, while InnoDB would need the new table definition.

            marko Marko Mäkelä added a comment - There is no handler::notify_tabledef_has_changed() . The virtual member function handler::notify_table_changed() was replaced with the function pointer handlerton::notify_tabledef_changed in the change that was already mentioned in the description. What might work is that the handlerton::notify_tabledef_changed() is extended with new parameters, so that InnoDB can pass the call to ha_innobase::open() . At the minimum, we would seem to need handler* and const TABLE_SHARE* . The handler::table_share or handler::table->s are pointing to the original table definition as it was before the ALTER TABLE statement, while InnoDB would need the new table definition.

            I should be able to fix this with a new function innodb_notify_tabledef_changed() that will invoke another new function ha_innobase::reload_statistics().

            marko Marko Mäkelä added a comment - I should be able to fix this with a new function innodb_notify_tabledef_changed() that will invoke another new function ha_innobase::reload_statistics() .

            I noticed that many tests for partitioned tables fail due to heap-use-after-free, even after this additional change:

            diff --git a/sql/sql_table.cc b/sql/sql_table.cc
            index 5a51dd10732..49681137f34 100644
            --- a/sql/sql_table.cc
            +++ b/sql/sql_table.cc
            @@ -7934,7 +7934,6 @@ static bool mysql_inplace_alter_table(THD *thd,
               Alter_info *alter_info= ha_alter_info->alter_info;
               bool reopen_tables= false;
               bool res;
            -  handlerton *hton;
             
               const enum_alter_inplace_result inplace_supported=
                 ha_alter_info->inplace_supported;
            @@ -8145,9 +8144,9 @@ static bool mysql_inplace_alter_table(THD *thd,
             
               /* Notify the engine that the table definition has changed */
             
            -  hton= table->file->partition_ht();
            -  if (hton->notify_tabledef_changed)
            +  if (table->file->partition_ht()->notify_tabledef_changed)
               {
            +    handlerton *hton= table->file->ht;
                 char db_buff[FN_REFLEN], table_buff[FN_REFLEN];
                 LEX_CSTRING tmp_db, tmp_table;
                 tmp_db.str=    db_buff;
            diff --git a/storage/innobase/handler/handler0alter.cc b/storage/innobase/handler/handler0alter.cc
            index bd4a56ee6da..4248442f3da 100644
            --- a/storage/innobase/handler/handler0alter.cc
            +++ b/storage/innobase/handler/handler0alter.cc
            @@ -11263,14 +11263,18 @@ ha_innobase::commit_inplace_alter_table(
             		&& m_prebuilt->table->n_v_cols
             		&& ha_alter_info->handler_flags & ALTER_STORED_COLUMN_ORDER)) {
             		DBUG_ASSERT(ctx0->old_table->get_ref_count() == 1);
            +		ut_ad(ctx0->prebuilt == m_prebuilt);
             		trx_commit_for_mysql(m_prebuilt->trx);
             
            -		m_prebuilt->table = innobase_reload_table(m_user_thd,
            -							  m_prebuilt->table,
            -							  table->s->table_name,
            -							  *ctx0);
            -		innobase_copy_frm_flags_from_table_share(
            -			m_prebuilt->table, altered_table->s);
            +		for (inplace_alter_handler_ctx** pctx = ctx_array; *pctx;
            +		     pctx++) {
            +			auto ctx= static_cast<ha_innobase_inplace_ctx*>(*pctx);
            +			ctx->prebuilt->table = innobase_reload_table(
            +				m_user_thd, ctx->prebuilt->table,
            +				table->s->table_name, *ctx0);
            +			innobase_copy_frm_flags_from_table_share(
            +				ctx->prebuilt->table, altered_table->s);
            +		}
             
             		row_mysql_unlock_data_dictionary(trx);
             		trx->free();
            

            Here is an example failure:

            CURRENT_TEST: parts.partition_auto_increment_innodb
            mysqltest: In included file "./suite/parts/inc/partition_auto_increment.inc": 
            included from /mariadb/10.5/mysql-test/suite/parts/t/partition_auto_increment_innodb.test at line 35:
            At line 353: query 'OPTIMIZE TABLE t1' failed: 2013: Lost connection to MySQL server during query
            …
            ==63664==ERROR: AddressSanitizer: use-after-poison on address 0x616000d431cc at pc 0x000002891f3e bp 0x7fb79355f3b0 sp 0x7fb79355f3a8
            READ of size 1 at 0x616000d431cc thread T11
                #0 0x2891f3d in strxnmov /mariadb/10.5m/strings/strxnmov.c:66:22
                #1 0x175a726 in create_partition_name(char*, unsigned long, char const*, char const*, unsigned int, bool) /mariadb/10.5m/sql/sql_partition.cc:8659:10
                #2 0x1a1adf7 in ha_partition::notify_tabledef_changed(st_mysql_const_lex_string*, st_mysql_const_lex_string*, st_mysql_const_unsigned_lex_string*, st_mysql_const_unsigned_lex_string*) /mariadb/10.5m/sql/ha_partition.cc:114:9
                #3 0xd191a4 in mysql_inplace_alter_table(THD*, TABLE_LIST*, TABLE*, TABLE*, Alter_inplace_info*, MDL_request*, Alter_table_ctx*) /mariadb/10.5m/sql/sql_table.cc:8158:9
            …
            SUMMARY: AddressSanitizer: use-after-poison /mariadb/10.5m/strings/strxnmov.c:66:22 in strxnmov
            Shadow bytes around the buggy address:
              0x0c2c801a05e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
            …
            =>0x0c2c801a0630: 00 00 00 00 00 00 f7 f7 f7[f7]f7 f7 f7 f7 f7 f7
            

            The above was generated with

            ASAN_OPTIONS=abort_on_error=1 ./mtr --rr parts.partition_auto_increment_innodb
            

            and I was able to obtain the stack trace for the poisoning as follows:

            _RR_TRACE_DIR=var/rr.1 rr replay
            continue
            watch *(char*)0x0c2c801a0639
            reverse-continue
            reverse-continue
            backtrace
            quit
            

            Note: the watch expression refers to the ASAN shadow byte that was part of the preceding output. The reported stack trace is as follows:

            #0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:179
            #1  0x000000000079fef4 in __asan_poison_memory_region ()
            #2  0x0000000002761af3 in free_root (root=0x61d0003d59a8, MyFlags=<optimized out>) at /mariadb/10.5m/mysys/my_alloc.c:423
            #3  0x0000000001a33321 in ha_partition::clear_handler_file (this=0x61d0003d54b8) at /mariadb/10.5m/sql/ha_partition.cc:2967
            #4  ha_partition::open (this=0x61d0003d54b8, name=<optimized out>, mode=<optimized out>, test_if_locked=<optimized out>) at /mariadb/10.5m/sql/ha_partition.cc:3759
            #5  0x00000000011fde82 in handler::ha_open (this=0x61d0003d54b8, table_arg=<optimized out>, name=0x61b0000df850 "./test/t1", mode=2, test_if_locked=18, mem_root=0x0, partitions_to_open=0x0)
                at /mariadb/10.5m/sql/handler.cc:2983
            #6  0x0000000000da5d0c in open_table_from_share (thd=<optimized out>, share=<optimized out>, alias=<optimized out>, db_stat=<optimized out>, prgflag=<optimized out>, ha_open_flags=<optimized out>, 
                outparam=<optimized out>, is_create_table=<optimized out>, partitions_to_open=<optimized out>) at /mariadb/10.5m/sql/table.cc:4218
            #7  0x000000000093523b in open_table (thd=<optimized out>, table_list=0x62b0000a1318, ot_ctx=<optimized out>) at /mariadb/10.5m/sql/sql_base.cc:2001
            #8  0x000000000093d977 in open_and_process_table (thd=0x62b00009a218, tables=0x62b0000a1318, counter=0x7fb793566ec0, flags=2, prelocking_strategy=0x7fb793567140, has_prelocking_list=<optimized out>, 
                ot_ctx=0x7fb793566ba0) at /mariadb/10.5m/sql/sql_base.cc:3801
            #9  open_tables (thd=<optimized out>, options=<optimized out>, start=<optimized out>, counter=<optimized out>, flags=<optimized out>, prelocking_strategy=<optimized out>) at /mariadb/10.5m/sql/sql_base.cc:4275
            #10 0x0000000000945e55 in open_and_lock_tables (thd=<optimized out>, options=<optimized out>, tables=<optimized out>, derived=<optimized out>, flags=<optimized out>, prelocking_strategy=<optimized out>)
                at /mariadb/10.5m/sql/sql_base.cc:5211
            #11 0x0000000000ec198d in open_and_lock_tables (thd=0x62b00009a218, tables=0x62b0000a1318, derived=false, flags=64) at /mariadb/10.5m/sql/sql_base.h:507
            #12 Sql_cmd_truncate_table::handler_truncate (this=<optimized out>, thd=<optimized out>, table_ref=<optimized out>, is_tmp_table=<optimized out>) at /mariadb/10.5m/sql/sql_truncate.cc:230
            #13 0x0000000000ec3ac8 in Sql_cmd_truncate_table::truncate_table (this=<optimized out>, thd=<optimized out>, table_ref=<optimized out>) at /mariadb/10.5m/sql/sql_truncate.cc:475
            #14 0x0000000000ec4256 in Sql_cmd_truncate_table::execute (this=0x62b0000a1a00, thd=0x62b00009a218) at /mariadb/10.5m/sql/sql_truncate.cc:531
            #15 0x0000000000ab6131 in mysql_execute_command (thd=0x62b00009a218) at /mariadb/10.5m/sql/sql_parse.cc:6023
            

            marko Marko Mäkelä added a comment - I noticed that many tests for partitioned tables fail due to heap-use-after-free, even after this additional change: diff --git a/sql/sql_table.cc b/sql/sql_table.cc index 5a51dd10732..49681137f34 100644 --- a/sql/sql_table.cc +++ b/sql/sql_table.cc @@ -7934,7 +7934,6 @@ static bool mysql_inplace_alter_table(THD *thd, Alter_info *alter_info= ha_alter_info->alter_info; bool reopen_tables= false; bool res; - handlerton *hton; const enum_alter_inplace_result inplace_supported= ha_alter_info->inplace_supported; @@ -8145,9 +8144,9 @@ static bool mysql_inplace_alter_table(THD *thd, /* Notify the engine that the table definition has changed */ - hton= table->file->partition_ht(); - if (hton->notify_tabledef_changed) + if (table->file->partition_ht()->notify_tabledef_changed) { + handlerton *hton= table->file->ht; char db_buff[FN_REFLEN], table_buff[FN_REFLEN]; LEX_CSTRING tmp_db, tmp_table; tmp_db.str= db_buff; diff --git a/storage/innobase/handler/handler0alter.cc b/storage/innobase/handler/handler0alter.cc index bd4a56ee6da..4248442f3da 100644 --- a/storage/innobase/handler/handler0alter.cc +++ b/storage/innobase/handler/handler0alter.cc @@ -11263,14 +11263,18 @@ ha_innobase::commit_inplace_alter_table( && m_prebuilt->table->n_v_cols && ha_alter_info->handler_flags & ALTER_STORED_COLUMN_ORDER)) { DBUG_ASSERT(ctx0->old_table->get_ref_count() == 1); + ut_ad(ctx0->prebuilt == m_prebuilt); trx_commit_for_mysql(m_prebuilt->trx); - m_prebuilt->table = innobase_reload_table(m_user_thd, - m_prebuilt->table, - table->s->table_name, - *ctx0); - innobase_copy_frm_flags_from_table_share( - m_prebuilt->table, altered_table->s); + for (inplace_alter_handler_ctx** pctx = ctx_array; *pctx; + pctx++) { + auto ctx= static_cast<ha_innobase_inplace_ctx*>(*pctx); + ctx->prebuilt->table = innobase_reload_table( + m_user_thd, ctx->prebuilt->table, + table->s->table_name, *ctx0); + innobase_copy_frm_flags_from_table_share( + ctx->prebuilt->table, altered_table->s); + } row_mysql_unlock_data_dictionary(trx); trx->free(); Here is an example failure: CURRENT_TEST: parts.partition_auto_increment_innodb mysqltest: In included file "./suite/parts/inc/partition_auto_increment.inc": included from /mariadb/10.5/mysql-test/suite/parts/t/partition_auto_increment_innodb.test at line 35: At line 353: query 'OPTIMIZE TABLE t1' failed: 2013: Lost connection to MySQL server during query … ==63664==ERROR: AddressSanitizer: use-after-poison on address 0x616000d431cc at pc 0x000002891f3e bp 0x7fb79355f3b0 sp 0x7fb79355f3a8 READ of size 1 at 0x616000d431cc thread T11 #0 0x2891f3d in strxnmov /mariadb/10.5m/strings/strxnmov.c:66:22 #1 0x175a726 in create_partition_name(char*, unsigned long, char const*, char const*, unsigned int, bool) /mariadb/10.5m/sql/sql_partition.cc:8659:10 #2 0x1a1adf7 in ha_partition::notify_tabledef_changed(st_mysql_const_lex_string*, st_mysql_const_lex_string*, st_mysql_const_unsigned_lex_string*, st_mysql_const_unsigned_lex_string*) /mariadb/10.5m/sql/ha_partition.cc:114:9 #3 0xd191a4 in mysql_inplace_alter_table(THD*, TABLE_LIST*, TABLE*, TABLE*, Alter_inplace_info*, MDL_request*, Alter_table_ctx*) /mariadb/10.5m/sql/sql_table.cc:8158:9 … SUMMARY: AddressSanitizer: use-after-poison /mariadb/10.5m/strings/strxnmov.c:66:22 in strxnmov Shadow bytes around the buggy address: 0x0c2c801a05e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd … =>0x0c2c801a0630: 00 00 00 00 00 00 f7 f7 f7[f7]f7 f7 f7 f7 f7 f7 The above was generated with ASAN_OPTIONS=abort_on_error=1 ./mtr --rr parts.partition_auto_increment_innodb and I was able to obtain the stack trace for the poisoning as follows: _RR_TRACE_DIR=var/rr.1 rr replay continue watch *(char*)0x0c2c801a0639 reverse-continue reverse-continue backtrace quit Note: the watch expression refers to the ASAN shadow byte that was part of the preceding output. The reported stack trace is as follows: #0 __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:179 #1 0x000000000079fef4 in __asan_poison_memory_region () #2 0x0000000002761af3 in free_root (root=0x61d0003d59a8, MyFlags=<optimized out>) at /mariadb/10.5m/mysys/my_alloc.c:423 #3 0x0000000001a33321 in ha_partition::clear_handler_file (this=0x61d0003d54b8) at /mariadb/10.5m/sql/ha_partition.cc:2967 #4 ha_partition::open (this=0x61d0003d54b8, name=<optimized out>, mode=<optimized out>, test_if_locked=<optimized out>) at /mariadb/10.5m/sql/ha_partition.cc:3759 #5 0x00000000011fde82 in handler::ha_open (this=0x61d0003d54b8, table_arg=<optimized out>, name=0x61b0000df850 "./test/t1", mode=2, test_if_locked=18, mem_root=0x0, partitions_to_open=0x0) at /mariadb/10.5m/sql/handler.cc:2983 #6 0x0000000000da5d0c in open_table_from_share (thd=<optimized out>, share=<optimized out>, alias=<optimized out>, db_stat=<optimized out>, prgflag=<optimized out>, ha_open_flags=<optimized out>, outparam=<optimized out>, is_create_table=<optimized out>, partitions_to_open=<optimized out>) at /mariadb/10.5m/sql/table.cc:4218 #7 0x000000000093523b in open_table (thd=<optimized out>, table_list=0x62b0000a1318, ot_ctx=<optimized out>) at /mariadb/10.5m/sql/sql_base.cc:2001 #8 0x000000000093d977 in open_and_process_table (thd=0x62b00009a218, tables=0x62b0000a1318, counter=0x7fb793566ec0, flags=2, prelocking_strategy=0x7fb793567140, has_prelocking_list=<optimized out>, ot_ctx=0x7fb793566ba0) at /mariadb/10.5m/sql/sql_base.cc:3801 #9 open_tables (thd=<optimized out>, options=<optimized out>, start=<optimized out>, counter=<optimized out>, flags=<optimized out>, prelocking_strategy=<optimized out>) at /mariadb/10.5m/sql/sql_base.cc:4275 #10 0x0000000000945e55 in open_and_lock_tables (thd=<optimized out>, options=<optimized out>, tables=<optimized out>, derived=<optimized out>, flags=<optimized out>, prelocking_strategy=<optimized out>) at /mariadb/10.5m/sql/sql_base.cc:5211 #11 0x0000000000ec198d in open_and_lock_tables (thd=0x62b00009a218, tables=0x62b0000a1318, derived=false, flags=64) at /mariadb/10.5m/sql/sql_base.h:507 #12 Sql_cmd_truncate_table::handler_truncate (this=<optimized out>, thd=<optimized out>, table_ref=<optimized out>, is_tmp_table=<optimized out>) at /mariadb/10.5m/sql/sql_truncate.cc:230 #13 0x0000000000ec3ac8 in Sql_cmd_truncate_table::truncate_table (this=<optimized out>, thd=<optimized out>, table_ref=<optimized out>) at /mariadb/10.5m/sql/sql_truncate.cc:475 #14 0x0000000000ec4256 in Sql_cmd_truncate_table::execute (this=0x62b0000a1a00, thd=0x62b00009a218) at /mariadb/10.5m/sql/sql_truncate.cc:531 #15 0x0000000000ab6131 in mysql_execute_command (thd=0x62b00009a218) at /mariadb/10.5m/sql/sql_parse.cc:6023

            Review of code done. Fixed reported issues in the new code. Marko has tested the new code and things looks good.

            monty Michael Widenius added a comment - Review of code done. Fixed reported issues in the new code. Marko has tested the new code and things looks good.

            monty, please check a failure of the S3 storage engine:

            CURRENT_TEST: s3.replication_partition
            analyze: sync_with_master
            mysqltest: At line 106: sync_slave_with_master failed: 'select master_pos_wait('master-bin.000001', 4054, 300, '')' returned NULL indicating slave SQL thread failure
            

            marko Marko Mäkelä added a comment - monty , please check a failure of the S3 storage engine : CURRENT_TEST: s3.replication_partition analyze: sync_with_master mysqltest: At line 106: sync_slave_with_master failed: 'select master_pos_wait('master-bin.000001', 4054, 300, '')' returned NULL indicating slave SQL thread failure

            The s3.replication_partition failure looks like MDEV-24087.

            marko Marko Mäkelä added a comment - The s3.replication_partition failure looks like MDEV-24087 .

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.