[MDEV-13266] Race condition in ANALYZE TABLE / statistics collection, main.stat_tables_par failed in buildbot with wrong result Created: 2017-07-06  Updated: 2020-05-05  Resolved: 2020-05-05

Status: Closed
Project: MariaDB Server
Component/s: Admin statements, Tests
Affects Version/s: 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.2.32, 10.3.23, 10.4.13, 10.5.3

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Varun Gupta (Inactive)
Resolution: Fixed Votes: 1
Labels: None

Attachments: File mdev13266.result     File mdev13266.test    
Issue Links:
Duplicate
duplicates MDEV-13544 main.stat_tables_par failed in buildbot Closed
Relates
relates to MDEV-10804 main.stat_tables_par fails sporadical... Closed
Sprint: 10.2.12

 Description   

http://buildbot.askmonty.org/buildbot/builders/p8-rhel71-bintar-debug/builds/2017/steps/test/logs/stdio

Also reproducible locally by running the test with --repeat=N with a big enough N.

main.stat_tables_par                     w1 [ fail ]
        Test ended at 2017-07-05 14:50:31
 
CURRENT_TEST: main.stat_tables_par
--- /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/mysql-test/r/stat_tables_par.result	2017-07-05 11:32:08.057165801 -0400
+++ /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/mysql-test/r/stat_tables_par.reject	2017-07-05 14:50:30.258223458 -0400
@@ -176,7 +176,6 @@
 dbt3_s001	lineitem	i_l_orderkey_quantity	2	1.0404
 dbt3_s001	lineitem	i_l_partkey	1	30.0250
 dbt3_s001	lineitem	i_l_receiptdate	1	2.6477
-dbt3_s001	lineitem	i_l_shipdate	1	2.6500
 dbt3_s001	lineitem	i_l_suppkey	1	600.5000
 dbt3_s001	lineitem	i_l_suppkey_partkey	1	30.0250
 dbt3_s001	lineitem	i_l_suppkey_partkey	2	8.5786
@@ -191,7 +190,6 @@
 dbt3_s001	lineitem	i_l_orderkey_quantity	2	1.0404
 dbt3_s001	lineitem	i_l_partkey	1	30.0250
 dbt3_s001	lineitem	i_l_receiptdate	1	2.6477
-dbt3_s001	lineitem	i_l_shipdate	1	2.6500
 dbt3_s001	lineitem	i_l_suppkey	1	600.5000
 dbt3_s001	lineitem	i_l_suppkey_partkey	1	30.0250
 dbt3_s001	lineitem	i_l_suppkey_partkey	2	8.5786
 
mysqltest: Result length mismatch

First recorded occurrences in buildbot are of August 2016.

Not reproducible on 10.1.



 Comments   
Comment by Elena Stepanova [ 2017-08-05 ]

The failure starts from the timed out sync point. We don't see it normally because warnings and result sets are disabled there:

 connection con2;
 Table	Op	Msg_type	Msg_text
 dbt3_s001.lineitem	analyze	status	Engine-independent statistics collected
+dbt3_s001.lineitem	analyze	Warning	debug sync point wait timed out
 dbt3_s001.lineitem	analyze	status	Table is already up to date
 connection default;
 disconnect con1;
@@ -182,7 +183,6 @@
 dbt3_s001	lineitem	i_l_orderkey_quantity	2	1.0404
 dbt3_s001	lineitem	i_l_partkey	1	30.0250
 dbt3_s001	lineitem	i_l_receiptdate	1	2.6477
-dbt3_s001	lineitem	i_l_shipdate	1	2.6500
 dbt3_s001	lineitem	i_l_suppkey	1	600.5000
 dbt3_s001	lineitem	i_l_suppkey_partkey	1	30.0250
 dbt3_s001	lineitem	i_l_suppkey_partkey	2	8.5786
@@ -197,7 +197,6 @@
 dbt3_s001	lineitem	i_l_orderkey_quantity	2	1.0404
 dbt3_s001	lineitem	i_l_partkey	1	30.0250
 dbt3_s001	lineitem	i_l_receiptdate	1	2.6477
-dbt3_s001	lineitem	i_l_shipdate	1	2.6500
 dbt3_s001	lineitem	i_l_suppkey	1	600.5000
 dbt3_s001	lineitem	i_l_suppkey_partkey	1	30.0250
 dbt3_s001	lineitem	i_l_suppkey_partkey	2	8.5786

+-----+------+-----------------+-----------+---------+------+------------------------------------------------+---------------------------------------------------------------------------+----------+
| Id  | User | Host            | db        | Command | Time | State                                          | Info                                                                      | Progress |
+-----+------+-----------------+-----------+---------+------+------------------------------------------------+---------------------------------------------------------------------------+----------+
| 316 | root | localhost       | dbt3_s001 | Sleep   |   69 |                                                | NULL                                                                      |    0.000 |
| 319 | root | localhost       | dbt3_s001 | Sleep   |   69 |                                                | NULL                                                                      |    0.000 |
| 320 | root | localhost       | dbt3_s001 | Query   |   69 | debug sync point: statistics_collection_start2 | analyze table lineitem persistent for columns() indexes (i_l_receiptdate) |    0.000 |
| 322 | root | localhost:40710 | NULL      | Query   |    0 | init                                           | show processlist                                                          |    0.000 |
+-----+------+-----------------+-----------+---------+------+------------------------------------------------+---------------------------------------------------------------------------+----------+

However, sync points are irrelevant. The attached simplified version of the same test without sync points still hits the same error.

CURRENT_TEST: bug.mdev13266
--- /data/bld/10.2/mysql-test/suite/bug/mdev13266.result	2017-08-06 19:15:00.826455463 +0300
+++ /data/bld/10.2/mysql-test/suite/bug/mdev13266.reject	2017-08-06 19:52:47.598437878 +0300
@@ -32,5 +32,4 @@
 select * from mysql.index_stats where table_name='lineitem' order by index_name, prefix_arity;
 db_name	table_name	index_name	prefix_arity	avg_frequency
 dbt3_s001	lineitem	i_l_partkey	1	30.0250
-dbt3_s001	lineitem	i_l_shipdate	1	2.6500
 DROP DATABASE dbt3_s001;
 
mysqltest: Result length mismatch

I think it's not a test problem, but a genuine race condition in the code, which requires proper debugging.

Comment by Alice Sherepa [ 2017-10-28 ]

now on 10.3 http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest2/builds/10176/steps/test_2/logs/stdio

Comment by Aleksey Midenkov [ 2019-10-17 ]

Reproduced locally on 10.5.

Comment by Varun Gupta (Inactive) [ 2020-04-12 ]

Patch
http://lists.askmonty.org/pipermail/commits/2020-April/014242.html

Comment by Sergei Petrunia [ 2020-04-30 ]

Ok to push after the request in the review email is done.

Generated at Thu Feb 08 08:04:13 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.