[MDEV-32901] innodb.mdev-14846 fails in 11.0 Created: 2023-11-28  Updated: 2023-12-11  Resolved: 2023-12-05

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 11.0
Fix Version/s: 10.4.33, 10.5.24, 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2, 11.4.1

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: innodb-noisy-estimates

Attachments: File bad-trace.log     File good-trace.log    
Issue Links:
Relates
relates to MDEV-32785 background innodb stats recalculation... Open

 Description   

The failure looks like this:

CURRENT_TEST: innodb.mdev-14846
--- /home/buildbot/buildbot/build/mariadb-11.1.4/mysql-test/suite/innodb/r/mdev-14846.result	2023-11-24 15:03:29.000000000 +0000
+++ /home/buildbot/buildbot/build/mariadb-11.1.4/mysql-test/suite/innodb/r/mdev-14846.reject	2023-11-24 17:55:45.075140470 +0000
@@ -38,7 +38,7 @@
 id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
 1	PRIMARY	alias1	ALL	NULL	NULL	NULL	NULL	#	
 1	PRIMARY	alias2	ALL	NULL	NULL	NULL	NULL	#	Using where
-2	DEPENDENT SUBQUERY	t1	ALL	NULL	NULL	NULL	NULL	#	Using where
+2	MATERIALIZED	t1	ALL	NULL	NULL	NULL	NULL	#	
 UPDATE t3 AS alias1 LEFT JOIN t3 AS alias2 ON ( alias1.f1 <> alias1.f2 ) SET alias1.f3 = 59 WHERE ( EXISTS ( SELECT t1.f3 FROM t1 IGNORE INDEX (f1) WHERE t1.f1 = alias1.f1 ) ) OR alias2.f1 = 'h';
 connect  con2,localhost,root,,test;
 set debug_sync='now WAIT_FOR con2_dml';
@@ -51,8 +51,8 @@
 1	PRIMARY	t1	index	NULL	f1	12	NULL	#	Using index
 1	PRIMARY	t1	ALL	NULL	NULL	NULL	NULL	#	
 1	PRIMARY	<subquery2>	eq_ref	distinct_key	distinct_key	13	func	#	
-2	MATERIALIZED	t3	ALL	NULL	NULL	NULL	NULL	#	
 2	MATERIALIZED	t2	ALL	NULL	NULL	NULL	NULL	#	
+2	MATERIALIZED	t3	ALL	NULL	NULL	NULL	NULL	#	
 UPDATE v4, t1 SET t1.pk = 76 WHERE t1.f2 IN ( SELECT t2.f FROM t2 INNER JOIN t3 );
 connection default;
 ERROR 40001: Deadlock found when trying to get lock; try restarting transaction



 Comments   
Comment by Sergei Petrunia [ 2023-11-28 ]

Reproducible with something like

./mtr --mem --repeat=200 --parallel=30 innodb.mdev-14846{,,,,,,,,,,,,,,,,,,}

Comment by Sergei Petrunia [ 2023-11-28 ]

Test harness to look at trace difference:

diff --git a/mysql-test/suite/innodb/t/mdev-14846.test b/mysql-test/suite/innodb/t/mdev-14846.test
index a576d244007..4af8cc1cc6d 100644
--- a/mysql-test/suite/innodb/t/mdev-14846.test
+++ b/mysql-test/suite/innodb/t/mdev-14846.test
@@ -2,6 +2,7 @@
 --source include/count_sessions.inc
 --source include/have_debug_sync.inc
 
+--exec touch $MYSQLTEST_VARDIR/tmp/trace.log
 CREATE TABLE t1 (
  pk INT,
  f1 VARCHAR(10) NOT NULL,
@@ -39,8 +40,12 @@ SET DEBUG_SYNC='now SIGNAL con2_dml';
 
 --connection default
 SET DEBUG_SYNC='now WAIT_FOR default_dml';
+--remove_file $MYSQLTEST_VARDIR/tmp/trace.log
+set optimizer_trace=1;
 --replace_column 9 #
 explain UPDATE t3 AS alias1 LEFT JOIN t3 AS alias2 ON ( alias1.f1 <> alias1.f2 ) SET alias1.f3 = 59 WHERE ( EXISTS ( SELECT t1.f3 FROM t1 IGNORE INDEX (f1) WHERE t1.f1 = alias1.f1 ) ) OR alias2.f1 = 'h';
+evalp select trace from information_schema.optimizer_trace INTO OUTFILE '$MYSQLTEST_VARDIR/tmp/trace.log';
+
 --send UPDATE t3 AS alias1 LEFT JOIN t3 AS alias2 ON ( alias1.f1 <> alias1.f2 ) SET alias1.f3 = 59 WHERE ( EXISTS ( SELECT t1.f3 FROM t1 IGNORE INDEX (f1) WHERE t1.f1 = alias1.f1 ) ) OR alias2.f1 = 'h'
 # It holds the lock of all record in t3 and tries to acquire record lock for the table  t1.
 

Comment by Sergei Petrunia [ 2023-11-28 ]

Diffing the .trace files between a passing and a failing run: bad-trace.log good-trace.log ...

some off-by-1 estimate differences produce off-by-many difference in join output cardinality:

@@ -183,8 +183,8 @@
                   {\
                     "plan_prefix": "alias1",\
                     "table": "alias2",\
-                    "rows_for_plan": 16,\
-                    "cost_for_plan": 0.0550764\
+                    "rows_for_plan": 9,\
+                    "cost_for_plan": 0.0437368\
                   }\
                 ]\

which causes one query plan to be picked instead of the other

@@ -316,9 +316,9 @@
                 {\
                   "subquery_plan": {\
                     "rows": 5,\
-                    "materialization_cost": 0.038299536,\
-                    "in_exist_cost": 0.0466792,\
-                    "choosen": "materialization"\
+                    "materialization_cost": 0.038136232,\
+                    "in_exist_cost": 0.0350094,\
+                    "choosen": "in_to_exists"\
                   }\
                 },\

Comment by Sergei Petrunia [ 2023-12-04 ]

Take-aways from discussion with marko:

There is no global O(1) solution.

There seems to be no acceptable global setting that we could use for the whole testsuite to make all InnoDB's estimates stable.

I have actually tried this by setting innodb_stats_persistent=0 in MDEV-4750. This caused pushback from InnoDB team and Marko has reverted it in 10.6. The rationale is that tests should use execution paths close to the defaults.

Can do it on a per-table basis

Setting stats_auto_recalc=0 or stats_persistent=0 makes the problem go away.
Or, we could add ANALYZE TABLE statements after the table has been filled and before any EXPLAIN.
(a variant from the Optimizer call: or we could add a --analyze table mtr command. )

Can do it on a per-testcase basis

We can add a .opt file for the affected tests.
(Or try adding something like '--source include/requre-innodb-stable-stats.inc')

Comment by Sergei Petrunia [ 2023-12-04 ]

bb-11.0-mdev-32901-v2. oleg.smirnov, could you review this please?

Comment by Oleg Smirnov [ 2023-12-05 ]

Just a couple of cosmetic comments at GitHub, otherwise legitimate.

Comment by Sergei Petrunia [ 2023-12-05 ]

Review input addressed

Generated at Thu Feb 08 10:34:53 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.