[MDEV-19574] innodb_stats_method is not honored when innodb_stats_persistent=ON - Jira

Details

Type: Bug
Status: Stalled (View Workflow)
Priority: Critical
Resolution: Unresolved
Affects Version/s: 10.3.15
Fix Version/s: 10.6
Component/s: Storage Engine - InnoDB
Labels:
- upstream

Description

The value of innodb_stats_method is not honored when innodb_stats_persistent=ON.

Steps to reproduce:
Let's first try with non-persistent statistics:

set global innodb_stats_persistent=off;

set global innodb_stats_transient_sample_pages=100;

set global innodb_stats_persistent_sample_pages=100;

mysql> show variables like 'innodb%stats%';

+--------------------------------------+-------------+

| Variable_name                        | Value       |

+--------------------------------------+-------------+

| innodb_defragment_stats_accuracy     | 0           |

| innodb_stats_auto_recalc             | ON          |

| innodb_stats_include_delete_marked   | OFF         |

| innodb_stats_method                  | nulls_equal |

| innodb_stats_modified_counter        | 0           |

| innodb_stats_on_metadata             | OFF         |

| innodb_stats_persistent              | OFF         |

| innodb_stats_persistent_sample_pages | 100         |

| innodb_stats_sample_pages            | 100         |

| innodb_stats_traditional             | ON          |

| innodb_stats_transient_sample_pages  | 100         |

+--------------------------------------+-------------+

create table ten(a int);

insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

create table one_k(a int);

insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;

create table t1 (pk int primary key , a int, key(a)) engine=innodb;

# 400 K rows with NULLs:

insert into t1 select A.a+1000*B.a, null from one_k A , one_k B where B.a< 400;

# 101K rows with different non-NULL values:

insert into t1

select

  A.a+1000*B.a,A.a+1000*B.a from one_k A, one_k B

where B.a between 400 and 500;

analyze table t1;

Now, let's see what we've got:

mysql> show keys from t1;

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| t1    |          0 | PRIMARY  |            1 | pk          | A         |      506705 |     NULL | NULL   |      | BTREE      |         |               |

| t1    |          1 | a        |            1 | a           | A         |      253352 |     NULL | NULL   | YES  | BTREE      |         |               |

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Cardinality(a)=253K. There are actually 101K different values, but this is an estimate, so I'm fine with this value (for this MDEV at least).

Now, let's treat all NULLs as unequal, which means 'a' will have 500K different values:

set global innodb_stats_method=nulls_unequal;

analyze table t1;

mysql> show keys from t1;

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| t1    |          0 | PRIMARY  |            1 | pk          | A         |      498951 |     NULL | NULL   |      | BTREE      |         |               |

| t1    |          1 | a        |            1 | a           | A         |      498951 |     NULL | NULL   | YES  | BTREE      |         |               |

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Cardinality(a)=498K. Good - same as PK, close to reality.

Now, let's try to repeat the above steps with persistent statistics.

drop table t1;

set global innodb_stats_persistent=on;

set global innodb_stats_method=nulls_equal;

create table t1 (pk int primary key , a int, key(a)) engine=innodb;

insert into t1 select A.a+1000*B.a, null from one_k A , one_k B where B.a< 400;

insert into t1

select

  A.a+1000*B.a,A.a+1000*B.a from one_k A, one_k B

where B.a between 400 and 500;

analyze table t1;

mysql> show keys from t1;

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| t1    |          0 | PRIMARY  |            1 | pk          | A         |      501337 |     NULL | NULL   |      | BTREE      |         |               |

| t1    |          1 | a        |            1 | a           | A         |      250668 |     NULL | NULL   | YES  | BTREE      |         |               |

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Cardinality(a)=250K. Close to what we've got for non-persistent stats, ok.

set global innodb_stats_method=nulls_unequal;

analyze table t1;

mysql> show keys from t1;

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| t1    |          0 | PRIMARY  |            1 | pk          | A         |      500745 |     NULL | NULL   |      | BTREE      |         |               |

| t1    |          1 | a        |            1 | a           | A         |      250372 |     NULL | NULL   | YES  | BTREE      |         |               |

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Still, Cardinality(a)=250K! Now, this is NOT acceptable.
with nulls_unequal, all values of 'a' are different, there is no reason for innodb to produce cardinality(a) < cardinality(PK)!

Attachments

Issue Links

relates to

MDEV-17295 Wrong cardinality with innodb_stats_method=nulls_ignored

Open

Activity

Ascending order - Click to sort in descending order

View 4 older comments

Sergei Petrunia added a comment - 2019-05-23 21:41

A better paste showing it being reproduced on MySQL 8.0: https://gist.github.com/spetrunia/7be5a577f46635b7f254c2ed70e50332

Sergei Petrunia added a comment - 2019-05-23 21:41 A better paste showing it being reproduced on MySQL 8.0: https://gist.github.com/spetrunia/7be5a577f46635b7f254c2ed70e50332

Hans Kristian Rosbach added a comment - 2019-09-05 07:33 - edited

This issue is linked from MDEV-17295
Just making sure they are connected both ways in case someone works on this

Hans Kristian Rosbach added a comment - 2019-09-05 07:33 - edited This issue is linked from MDEV-17295 Just making sure they are connected both ways in case someone works on this

Marko Mäkelä added a comment - 2022-08-02 08:47

psergei, thank you for the analysis. Another quick observation is that null pointer checks are duplicated both at the start of btr_record_not_null_field_in_rec() and before its calls in btr_estimate_number_of_different_key_vals().

In the elements of the vector that is returned by btr_estimate_number_of_different_key_vals(), there are index_field_stats_t::n_non_null_key_vals that will be copied to dict_index_t::stat_n_non_null_key_vals in dict_stats_update_persistent(). But, that data field will be skipped by dict_stats_save() and set to 0 in dict_stats_fetch_index_stats_step().

In ~~MDEV-15020~~ I previously wrote some ideas for replacing the InnoDB statistics tables. In the end, nothing was done to the data structures yet. Only the InnoDB hangs related to updating persistent statistics were fixed.

A minimal data format change to implement this fix would seem to be to introduce mysql.innodb_index_stats.stat_name keys similar to n_diff_pfx%02u, for persisting the index_field_stats_t::n_non_null_key_vals. While file format changes are something that should be avoided in GA versions, maybe we could make an exception here. At least starting with 10.6, dict_stats_delete_from_index_stats() would drop all statistics for a being-dropped table, for any stat_name.

thiru, can you implement the following?

remove the null pointer checks before the btr_record_not_null_field_in_rec() calls
introduce new stat_name, say, n_nonnull_pfx%02u

I think that this should target the 10.6 series, because before ~~MDEV-15020~~ was fixed in 10.6, the InnoDB persistent statistics code was prone to hangs, and thus it would be hard to test this change in earlier releases.

Marko Mäkelä added a comment - 2022-08-02 08:47 psergei , thank you for the analysis. Another quick observation is that null pointer checks are duplicated both at the start of btr_record_not_null_field_in_rec() and before its calls in btr_estimate_number_of_different_key_vals() . In the elements of the vector that is returned by btr_estimate_number_of_different_key_vals() , there are index_field_stats_t::n_non_null_key_vals that will be copied to dict_index_t::stat_n_non_null_key_vals in dict_stats_update_persistent() . But, that data field will be skipped by dict_stats_save() and set to 0 in dict_stats_fetch_index_stats_step() . In MDEV-15020 I previously wrote some ideas for replacing the InnoDB statistics tables. In the end, nothing was done to the data structures yet. Only the InnoDB hangs related to updating persistent statistics were fixed. A minimal data format change to implement this fix would seem to be to introduce mysql.innodb_index_stats.stat_name keys similar to n_diff_pfx%02u , for persisting the index_field_stats_t::n_non_null_key_vals . While file format changes are something that should be avoided in GA versions, maybe we could make an exception here. At least starting with 10.6, dict_stats_delete_from_index_stats() would drop all statistics for a being-dropped table, for any stat_name . thiru , can you implement the following? remove the null pointer checks before the btr_record_not_null_field_in_rec() calls introduce new stat_name , say, n_nonnull_pfx%02u I think that this should target the 10.6 series, because before MDEV-15020 was fixed in 10.6, the InnoDB persistent statistics code was prone to hangs, and thus it would be hard to test this change in earlier releases.

Thirunarayanan Balathandayuthapani added a comment - 2025-03-10 10:23

https://github.com/MariaDB/server/pull/3886

Thirunarayanan Balathandayuthapani added a comment - 2025-03-10 10:23 https://github.com/MariaDB/server/pull/3886

Marko Mäkelä added a comment - 2025-03-11 07:28

Can we make the persistent statistics collection independent of the current value of innodb_stats_method and make it collect the necessary information for all values of that parameter? That is, only the retrieval or the use of statistics in ha_innobase::info() would take innodb_stats_method into account. This would seem to only affect indexes where some of the columns are not NOT NULL.

How would this idea affect the performance of collecting statistics? What is the minimum amount of additional data that we would need to collect, and what kind of a data layout would you propose?

Marko Mäkelä added a comment - 2025-03-11 07:28 Can we make the persistent statistics collection independent of the current value of innodb_stats_method and make it collect the necessary information for all values of that parameter? That is, only the retrieval or the use of statistics in ha_innobase::info() would take innodb_stats_method into account. This would seem to only affect indexes where some of the columns are not NOT NULL . How would this idea affect the performance of collecting statistics? What is the minimum amount of additional data that we would need to collect, and what kind of a data layout would you propose?

People

Assignee:: Thirunarayanan Balathandayuthapani

Reporter:: Sergei Petrunia

Votes:: 3 Vote for this issue

Watchers:: 11 Start watching this issue

Dates

Created:: 2019-05-23 16:03

Updated:: 2025-03-14 15:28

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server