[MDEV-35443] During an online table rebuild of an InnoDB statistics table, opt_search_plan_for_table() may degrade to full table scan - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
Fix Version/s: 10.5.28
Component/s: Data Definition - Alter Table, Storage Engine - InnoDB
Labels:
- performance
- upstream
Environment:
LINUX

Description

We've see a server crash due to long semaphore wait. The innodb stats update thread updates stats(dict_stats_save_index_stat()) on a table using InnoDB internal parser, holding dict_operation_lock and dict_sys->mutext, but the plan is full table scan(innodb_index_stats) other than clust index scan. There are more than 200 thousands records in stats table, and there are multiple udpates for the table stats. So it takes more than 900s to finish.

The reason why stats updates use full table scan is that there is a running SQL: ALTER TABLE mysql.innodb_index_stats ENGINE=InnoDB STATS_PERSISTENT=0. The online DDL set cluster index of innodb_index_stats to ONLINE_INDEX_CREATION.
callstack:
prepare_inplace_alter_table_dict
--row_log_allocate
----dict_index_set_online_status

In function opt_calc_index_goodness():
343 /* At least for now we don't support using FTS indexes for queries
344 done through InnoDB's own SQL parser. */
345 if (dict_index_is_online_ddl(index) || (index->type & DICT_FTS))

{ 346 return(0); 347 }

In this case we should use the online index.

Attachments

Issue Links

relates to

MDEV-33575 Assertion `rec_get_trx_id(rec, clust_index) < trx->id' failed in row_merge_read_clustered_index

Confirmed

MDEV-33462 Server aborts while altering an InnoDB statistics table

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2024-12-09 06:20

It turns out that I had introduced this condition back in MySQL 5.6 when implementing WL#5526 Online ADD INDEX. The condition was almost correct at that point of time, but I forgot to revise it later that year when implementing WL#6255 online table rebuild.

I believe that the condition dict_index_is_online_ddl(index) needs to be replaced with index->is_committed(). That would also fix a corner case where a secondary index has been fully built, but the ALTER TABLE or CREATE INDEX operation has not been committed yet, possibly because an upgrade of a metadata lock (MDL) to exclusive is in progress. In particular, if this happened during CREATE UNIQUE INDEX, it is possible that the DDL operation will need to be rolled back because some DML operation would introduce a duplicate entry in the not-yet-committed unique index.

Correcting the condition to index->is_committed() also has the benefit that the condition will never hold on the clustered index of a table, that is, the clustered index will never be disqualified from use in the InnoDB internal SQL interpreter.

Also the condition on index->type is inaccurate. index->is_btree() would properly disqualify SPATIAL INDEX, which were introduced for InnoDB in MySQL 5.7 and MariaDB Server 10.2.

Marko Mäkelä added a comment - 2024-12-09 06:20 It turns out that I had introduced this condition back in MySQL 5.6 when implementing WL#5526 Online ADD INDEX . The condition was almost correct at that point of time, but I forgot to revise it later that year when implementing WL#6255 online table rebuild. I believe that the condition dict_index_is_online_ddl(index) needs to be replaced with index->is_committed() . That would also fix a corner case where a secondary index has been fully built, but the ALTER TABLE or CREATE INDEX operation has not been committed yet, possibly because an upgrade of a metadata lock (MDL) to exclusive is in progress. In particular, if this happened during CREATE UNIQUE INDEX , it is possible that the DDL operation will need to be rolled back because some DML operation would introduce a duplicate entry in the not-yet-committed unique index. Correcting the condition to index->is_committed() also has the benefit that the condition will never hold on the clustered index of a table, that is, the clustered index will never be disqualified from use in the InnoDB internal SQL interpreter. Also the condition on index->type is inaccurate. index->is_btree() would properly disqualify SPATIAL INDEX , which were introduced for InnoDB in MySQL 5.7 and MariaDB Server 10.2.

Marko Mäkelä added a comment - 2024-12-09 07:05

I was curious about the claim about an online table rebuild. I ran the following test with ./mtr --rr:

--source include/have_innodb.inc

ALTER TABLE mysql.innodb_index_stats STATS_PERSISTENT=1;

ALTER TABLE mysql.innodb_index_stats ENGINE=InnoDB STATS_PERSISTENT=0;

ALTER TABLE mysql.innodb_index_stats STATS_PERSISTENT=0 STATS_AUTO_RECALC=0;

It turns out that for the second ALTER TABLE, with the seemingly redundant ENGINE=InnoDB, the ha_alter_info->handler_flags will include the flag ALTER_RECREATE_TABLE. Before the FORCE keyword was introduced for this purpose, it was customary to write ALTER TABLE tablename ENGINE=InnoDB instead of OPTIMIZE TABLE to have an InnoDB table rebuilt. For all these ALTER TABLE statements, also the flag ALTER_OPTIONS will be set.

I think that opt_calc_index_goodness() had better also disqualify any indexes that are built on any VIRTUAL columns, because the InnoDB internal SQL parser was not in any way adapted for the introduction of indexed virtual columns (~~MDEV-5800~~).

Marko Mäkelä added a comment - 2024-12-09 07:05 I was curious about the claim about an online table rebuild. I ran the following test with ./mtr --rr : --source include/have_innodb.inc ALTER TABLE mysql.innodb_index_stats STATS_PERSISTENT=1; ALTER TABLE mysql.innodb_index_stats ENGINE=InnoDB STATS_PERSISTENT=0; ALTER TABLE mysql.innodb_index_stats STATS_PERSISTENT=0 STATS_AUTO_RECALC=0; It turns out that for the second ALTER TABLE , with the seemingly redundant ENGINE=InnoDB , the ha_alter_info->handler_flags will include the flag ALTER_RECREATE_TABLE . Before the FORCE keyword was introduced for this purpose, it was customary to write ALTER TABLE tablename ENGINE=InnoDB instead of OPTIMIZE TABLE to have an InnoDB table rebuilt. For all these ALTER TABLE statements, also the flag ALTER_OPTIONS will be set. I think that opt_calc_index_goodness() had better also disqualify any indexes that are built on any VIRTUAL columns, because the InnoDB internal SQL parser was not in any way adapted for the introduction of indexed virtual columns ( MDEV-5800 ).

Marko Mäkelä added a comment - 2024-12-09 07:50

As far as I can tell, in any version of MariaDB Server, there is no secondary index defined on the tables mysql.innodb_index_stats or mysql.innodb_table_stats. Also in MySQL 5.6 and 5.7, only a PRIMARY KEY had been defined on these tables. I wrote a DEBUG_SYNC test case that would exercise this code in MariaDB Server 10.5:

diff --git a/mysql-test/suite/innodb/t/stats_persistent.test b/mysql-test/suite/innodb/t/stats_persistent.test

index 8561298c4d3..870808993f4 100644

--- a/mysql-test/suite/innodb/t/stats_persistent.test

+++ b/mysql-test/suite/innodb/t/stats_persistent.test

@@ -14,14 +14,25 @@ SET DEBUG_SYNC='dict_stats_update_persistent SIGNAL stop WAIT_FOR go';

 --connect(con1, localhost, root)

 SET DEBUG_SYNC='now WAIT_FOR stop';

+--connect(con2, localhost, root)

+SET DEBUG_SYNC='innodb_inplace_alter_table_enter SIGNAL astop WAIT_FOR ago';

+send ALTER TABLE mysql.innodb_index_stats FORCE;

+--connection con1

 --replace_column 1 SUM

 SELECT SUM(DATA_LENGTH+INDEX_LENGTH) FROM information_schema.TABLES WHERE ENGINE='InnoDB';

+SET DEBUG_SYNC='now WAIT_FOR astop';

 SET DEBUG_SYNC='now SIGNAL go';

 --disconnect con1

 --connection default

 --reap

+SET DEBUG_SYNC='now SIGNAL ago';

+--connection con2

+reap;

+--disconnect con2

+--connection default

 SET DEBUG_SYNC= 'RESET';

 DROP TABLE t1;

With the following I can observe with the following that the "goodness" 0 was returned:

./mtr --rr innodb.stats_persistent

rr replay var/log/mysqld.1.rr/latest-trace

break dict_stats_update_persistent

continue

break opt_calc_index_goodness thread 2

continue

continue

finish

next

next

next

The penultimate breakpoint would be for the other statistics table mysql.innodb_table_stats.

Because there only is a clustered index defined for these tables, only one index can be chosen. What makes the difference is the following:

opt_search_plan_for_table() in storage/innobase/pars/pars0opt.cc
n_fields = opt_calc_n_fields_from_goodness(best_goodness);

if (n_fields == 0) {
plan->tuple = NULL;
plan->n_exact_match = 0;
} else {
plan->tuple = dtuple_create(pars_sym_tab_global->heap,
n_fields);

That is, when the incorrect condition is present, a table scan will be executed, instead of searching for a key (database name, table name, index name). Therefore, I can confirm this bug.

Marko Mäkelä added a comment - 2024-12-09 07:50 As far as I can tell, in any version of MariaDB Server, there is no secondary index defined on the tables mysql.innodb_index_stats or mysql.innodb_table_stats . Also in MySQL 5.6 and 5.7, only a PRIMARY KEY had been defined on these tables. I wrote a DEBUG_SYNC test case that would exercise this code in MariaDB Server 10.5: diff --git a/mysql-test/suite/innodb/t/stats_persistent.test b/mysql-test/suite/innodb/t/stats_persistent.test index 8561298c4d3..870808993f4 100644 --- a/mysql-test/suite/innodb/t/stats_persistent.test +++ b/mysql-test/suite/innodb/t/stats_persistent.test @@ -14,14 +14,25 @@ SET DEBUG_SYNC='dict_stats_update_persistent SIGNAL stop WAIT_FOR go'; --connect(con1, localhost, root) SET DEBUG_SYNC='now WAIT_FOR stop'; +--connect(con2, localhost, root) +SET DEBUG_SYNC='innodb_inplace_alter_table_enter SIGNAL astop WAIT_FOR ago'; +send ALTER TABLE mysql.innodb_index_stats FORCE; + +--connection con1 --replace_column 1 SUM SELECT SUM(DATA_LENGTH+INDEX_LENGTH) FROM information_schema.TABLES WHERE ENGINE='InnoDB'; +SET DEBUG_SYNC='now WAIT_FOR astop'; SET DEBUG_SYNC='now SIGNAL go'; --disconnect con1 --connection default --reap +SET DEBUG_SYNC='now SIGNAL ago'; +--connection con2 +reap; +--disconnect con2 +--connection default SET DEBUG_SYNC= 'RESET'; DROP TABLE t1; With the following I can observe with the following that the "goodness" 0 was returned: ./mtr --rr innodb.stats_persistent rr replay var/log/mysqld.1.rr/latest-trace break dict_stats_update_persistent continue break opt_calc_index_goodness thread 2 continue continue finish next next next The penultimate breakpoint would be for the other statistics table mysql.innodb_table_stats . Because there only is a clustered index defined for these tables, only one index can be chosen. What makes the difference is the following: opt_search_plan_for_table() in storage/innobase/pars/pars0opt.cc n_fields = opt_calc_n_fields_from_goodness(best_goodness); if (n_fields == 0) { plan->tuple = NULL; plan->n_exact_match = 0; } else { plan->tuple = dtuple_create(pars_sym_tab_global->heap, n_fields); That is, when the incorrect condition is present, a table scan will be executed, instead of searching for a key (database name, table name, index name). Therefore, I can confirm this bug.

Vladislav Lesin added a comment - 2024-12-19 11:51

Looks to me. Thank you marko for explaining the way to reproduce and debug the issue in rr.

Vladislav Lesin added a comment - 2024-12-19 11:51 Looks to me. Thank you marko for explaining the way to reproduce and debug the issue in rr.

Marko Mäkelä added a comment - 2025-01-03 07:06

It turns out that in ~~MDEV-33462~~ in MariaDB Server 10.6.18, the online table rebuild for InnoDB statistics tables was disabled. Therefore, this bug should not affect more recent MariaDB releases.

In MariaDB Server 10.6, the MariaDB Server 10.5 test case for this bug would trigger the bug MDEV-33575 during the execution of ALTER TABLE mysql.innodb_index_stats.

Marko Mäkelä added a comment - 2025-01-03 07:06 It turns out that in MDEV-33462 in MariaDB Server 10.6.18, the online table rebuild for InnoDB statistics tables was disabled. Therefore, this bug should not affect more recent MariaDB releases. In MariaDB Server 10.6, the MariaDB Server 10.5 test case for this bug would trigger the bug MDEV-33575 during the execution of ALTER TABLE mysql.innodb_index_stats .

MariaDB Server

During an online table rebuild of an InnoDB statistics table, opt_search_plan_for_table() may degrade to full table scan

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration