[MDEV-10790] InnoDB statistics update may temporarily cause wrong index cardinalities Created: 2016-09-11  Updated: 2018-01-01

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.1.17
Fix Version/s: 10.1

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 1
Labels: upstream

Attachments: File mdev10790-mysql-5.7.diff     File mdev10790-mysql-5.7.log     File psergey-sept11-stop-at-index-update.diff    
Issue Links:
Relates
relates to MDEV-10649 Optimizer sometimes use "index" inste... Closed
Sprint: 10.1.18

 Description   

This is a continuation of MDEV-10649, but this time, it is about index cardinalities.

Under concurrent load, if you're lucky, you may get a very wrong index cardinality estimates.

Here's a way to observe:
Apply psergey-sept11-stop-at-index-update.diff. This patch adds capability to stop the index cardinality update code at the right place.

Then, start the server and create the test dataset:

create table ten(a int) engine=myisam;
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
 
create table one_k(a int) engine=myisam;
insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
 
create table t0 (a int) engine=myisam;
insert into t0 values (1),(2),(3);
 
create table t1 (
  pk int primary key,
  col1 int not null,
  col2 int not null,
  key(col1)
) engine= innodb;
 
insert into t1 select a,a,a from one_k;
analyze table t1;

Run a query and observe t1.rows=1 which is a precise index statistics:

 
explain select * from t0, t1 where t1.col1=t0.a;
+------+-------------+-------+------+---------------+------+---------+---------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref     | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+---------+------+-------------+
|    1 | SIMPLE      | t0    | ALL  | NULL          | NULL | NULL    | NULL    |    3 | Using where |
|    1 | SIMPLE      | t1    | ref  | col1          | col1 | 4       | j5.t0.a |    1 |             |
+------+-------------+-------+------+---------------+------+---------+---------+------+-------------+

Now, make innodb's statistics update stop in the middle of update:

system touch /tmp/dict_stats_analyze_index_must_stop

Do something to cause auto statistics update (we are running with innodb_stats_auto_recalc=ON, which is the default).

insert into t1 select 5000+pk,5000+col1, 5000+col2 from t1 limit 2000;
Query OK, 1000 rows affected (0.37 sec)
Records: 1000  Duplicates: 0  Warnings: 0

OPTIONAL: You can check the server stderr and see this line:

  AAA: dict_stats_analyze_index called dict_stats_empty_index for col1

Then, run the test query again:

flush tables;
explain select * from t0, t1 where t1.col1=t0.a;
+------+-------------+-------+------+---------------+------+---------+---------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref     | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+---------+------+-------------+
|    1 | SIMPLE      | t0    | ALL  | NULL          | NULL | NULL    | NULL    |    3 | Using where |
|    1 | SIMPLE      | t1    | ref  | col1          | col1 | 4       | j5.t0.a | 1000 |             |
+------+-------------+-------+------+---------------+------+---------+---------+------+-------------+

Note rows=1000 ! The real value is rows=1, both before and after the update.
The value one is seeing is about table_rows/2.

Make the stats calculation finish:

system rm  /tmp/dict_stats_analyze_index_must_stop

wait a few seconds, and check again:

flush tables;
explain select * from t0, t1 where t1.col1=t0.a;
+------+-------------+-------+------+---------------+------+---------+---------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref     | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+---------+------+-------------+
|    1 | SIMPLE      | t0    | ALL  | NULL          | NULL | NULL    | NULL    |    3 | Using where |
|    1 | SIMPLE      | t1    | ref  | col1          | col1 | 4       | j5.t0.a |    1 |             |
+------+-------------+-------+------+---------------+------+---------+---------+------+-------------+

rows=1, the correct value is back.



 Comments   
Comment by Sergei Petrunia [ 2016-09-11 ]

If you have innodb_stats_auto_recalc=OFF, you can hit the same when running ANALYZE TABLE. (albeit the chance is smaller as ANALYZE TABLE is not as frequent).

Comment by Sergei Petrunia [ 2016-09-12 ]

Reproducible on MySQL-5.7. mdev10790-mysql-5.7.diff , mdev10790-mysql-5.7.log .

Generated at Thu Feb 08 07:44:56 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.