Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13266

Race condition in ANALYZE TABLE / statistics collection, main.stat_tables_par failed in buildbot with wrong result

Details

    • 10.2.12

    Description

      http://buildbot.askmonty.org/buildbot/builders/p8-rhel71-bintar-debug/builds/2017/steps/test/logs/stdio

      Also reproducible locally by running the test with --repeat=N with a big enough N.

      main.stat_tables_par                     w1 [ fail ]
              Test ended at 2017-07-05 14:50:31
       
      CURRENT_TEST: main.stat_tables_par
      --- /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/mysql-test/r/stat_tables_par.result	2017-07-05 11:32:08.057165801 -0400
      +++ /home/buildbot/maria-slave/power8-vlp06-bintar-debug/build/mysql-test/r/stat_tables_par.reject	2017-07-05 14:50:30.258223458 -0400
      @@ -176,7 +176,6 @@
       dbt3_s001	lineitem	i_l_orderkey_quantity	2	1.0404
       dbt3_s001	lineitem	i_l_partkey	1	30.0250
       dbt3_s001	lineitem	i_l_receiptdate	1	2.6477
      -dbt3_s001	lineitem	i_l_shipdate	1	2.6500
       dbt3_s001	lineitem	i_l_suppkey	1	600.5000
       dbt3_s001	lineitem	i_l_suppkey_partkey	1	30.0250
       dbt3_s001	lineitem	i_l_suppkey_partkey	2	8.5786
      @@ -191,7 +190,6 @@
       dbt3_s001	lineitem	i_l_orderkey_quantity	2	1.0404
       dbt3_s001	lineitem	i_l_partkey	1	30.0250
       dbt3_s001	lineitem	i_l_receiptdate	1	2.6477
      -dbt3_s001	lineitem	i_l_shipdate	1	2.6500
       dbt3_s001	lineitem	i_l_suppkey	1	600.5000
       dbt3_s001	lineitem	i_l_suppkey_partkey	1	30.0250
       dbt3_s001	lineitem	i_l_suppkey_partkey	2	8.5786
       
      mysqltest: Result length mismatch
      

      First recorded occurrences in buildbot are of August 2016.

      Not reproducible on 10.1.

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - - edited

            The failure starts from the timed out sync point. We don't see it normally because warnings and result sets are disabled there:

             connection con2;
             Table	Op	Msg_type	Msg_text
             dbt3_s001.lineitem	analyze	status	Engine-independent statistics collected
            +dbt3_s001.lineitem	analyze	Warning	debug sync point wait timed out
             dbt3_s001.lineitem	analyze	status	Table is already up to date
             connection default;
             disconnect con1;
            @@ -182,7 +183,6 @@
             dbt3_s001	lineitem	i_l_orderkey_quantity	2	1.0404
             dbt3_s001	lineitem	i_l_partkey	1	30.0250
             dbt3_s001	lineitem	i_l_receiptdate	1	2.6477
            -dbt3_s001	lineitem	i_l_shipdate	1	2.6500
             dbt3_s001	lineitem	i_l_suppkey	1	600.5000
             dbt3_s001	lineitem	i_l_suppkey_partkey	1	30.0250
             dbt3_s001	lineitem	i_l_suppkey_partkey	2	8.5786
            @@ -197,7 +197,6 @@
             dbt3_s001	lineitem	i_l_orderkey_quantity	2	1.0404
             dbt3_s001	lineitem	i_l_partkey	1	30.0250
             dbt3_s001	lineitem	i_l_receiptdate	1	2.6477
            -dbt3_s001	lineitem	i_l_shipdate	1	2.6500
             dbt3_s001	lineitem	i_l_suppkey	1	600.5000
             dbt3_s001	lineitem	i_l_suppkey_partkey	1	30.0250
             dbt3_s001	lineitem	i_l_suppkey_partkey	2	8.5786
            

            +-----+------+-----------------+-----------+---------+------+------------------------------------------------+---------------------------------------------------------------------------+----------+
            | Id  | User | Host            | db        | Command | Time | State                                          | Info                                                                      | Progress |
            +-----+------+-----------------+-----------+---------+------+------------------------------------------------+---------------------------------------------------------------------------+----------+
            | 316 | root | localhost       | dbt3_s001 | Sleep   |   69 |                                                | NULL                                                                      |    0.000 |
            | 319 | root | localhost       | dbt3_s001 | Sleep   |   69 |                                                | NULL                                                                      |    0.000 |
            | 320 | root | localhost       | dbt3_s001 | Query   |   69 | debug sync point: statistics_collection_start2 | analyze table lineitem persistent for columns() indexes (i_l_receiptdate) |    0.000 |
            | 322 | root | localhost:40710 | NULL      | Query   |    0 | init                                           | show processlist                                                          |    0.000 |
            +-----+------+-----------------+-----------+---------+------+------------------------------------------------+---------------------------------------------------------------------------+----------+
            

            However, sync points are irrelevant. The attached simplified version of the same test without sync points still hits the same error.

            CURRENT_TEST: bug.mdev13266
            --- /data/bld/10.2/mysql-test/suite/bug/mdev13266.result	2017-08-06 19:15:00.826455463 +0300
            +++ /data/bld/10.2/mysql-test/suite/bug/mdev13266.reject	2017-08-06 19:52:47.598437878 +0300
            @@ -32,5 +32,4 @@
             select * from mysql.index_stats where table_name='lineitem' order by index_name, prefix_arity;
             db_name	table_name	index_name	prefix_arity	avg_frequency
             dbt3_s001	lineitem	i_l_partkey	1	30.0250
            -dbt3_s001	lineitem	i_l_shipdate	1	2.6500
             DROP DATABASE dbt3_s001;
             
            mysqltest: Result length mismatch
            

            I think it's not a test problem, but a genuine race condition in the code, which requires proper debugging.

            elenst Elena Stepanova added a comment - - edited The failure starts from the timed out sync point. We don't see it normally because warnings and result sets are disabled there: connection con2; Table Op Msg_type Msg_text dbt3_s001.lineitem analyze status Engine-independent statistics collected +dbt3_s001.lineitem analyze Warning debug sync point wait timed out dbt3_s001.lineitem analyze status Table is already up to date connection default; disconnect con1; @@ -182,7 +183,6 @@ dbt3_s001 lineitem i_l_orderkey_quantity 2 1.0404 dbt3_s001 lineitem i_l_partkey 1 30.0250 dbt3_s001 lineitem i_l_receiptdate 1 2.6477 -dbt3_s001 lineitem i_l_shipdate 1 2.6500 dbt3_s001 lineitem i_l_suppkey 1 600.5000 dbt3_s001 lineitem i_l_suppkey_partkey 1 30.0250 dbt3_s001 lineitem i_l_suppkey_partkey 2 8.5786 @@ -197,7 +197,6 @@ dbt3_s001 lineitem i_l_orderkey_quantity 2 1.0404 dbt3_s001 lineitem i_l_partkey 1 30.0250 dbt3_s001 lineitem i_l_receiptdate 1 2.6477 -dbt3_s001 lineitem i_l_shipdate 1 2.6500 dbt3_s001 lineitem i_l_suppkey 1 600.5000 dbt3_s001 lineitem i_l_suppkey_partkey 1 30.0250 dbt3_s001 lineitem i_l_suppkey_partkey 2 8.5786 +-----+------+-----------------+-----------+---------+------+------------------------------------------------+---------------------------------------------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +-----+------+-----------------+-----------+---------+------+------------------------------------------------+---------------------------------------------------------------------------+----------+ | 316 | root | localhost | dbt3_s001 | Sleep | 69 | | NULL | 0.000 | | 319 | root | localhost | dbt3_s001 | Sleep | 69 | | NULL | 0.000 | | 320 | root | localhost | dbt3_s001 | Query | 69 | debug sync point: statistics_collection_start2 | analyze table lineitem persistent for columns() indexes (i_l_receiptdate) | 0.000 | | 322 | root | localhost:40710 | NULL | Query | 0 | init | show processlist | 0.000 | +-----+------+-----------------+-----------+---------+------+------------------------------------------------+---------------------------------------------------------------------------+----------+ However, sync points are irrelevant. The attached simplified version of the same test without sync points still hits the same error. CURRENT_TEST: bug.mdev13266 --- /data/bld/10.2/mysql-test/suite/bug/mdev13266.result 2017-08-06 19:15:00.826455463 +0300 +++ /data/bld/10.2/mysql-test/suite/bug/mdev13266.reject 2017-08-06 19:52:47.598437878 +0300 @@ -32,5 +32,4 @@ select * from mysql.index_stats where table_name='lineitem' order by index_name, prefix_arity; db_name table_name index_name prefix_arity avg_frequency dbt3_s001 lineitem i_l_partkey 1 30.0250 -dbt3_s001 lineitem i_l_shipdate 1 2.6500 DROP DATABASE dbt3_s001;   mysqltest: Result length mismatch I think it's not a test problem, but a genuine race condition in the code, which requires proper debugging.
            alice Alice Sherepa added a comment - now on 10.3 http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest2/builds/10176/steps/test_2/logs/stdio

            Reproduced locally on 10.5.

            midenok Aleksey Midenkov added a comment - Reproduced locally on 10.5.
            varun Varun Gupta (Inactive) added a comment - Patch http://lists.askmonty.org/pipermail/commits/2020-April/014242.html

            Ok to push after the request in the review email is done.

            psergei Sergei Petrunia added a comment - Ok to push after the request in the review email is done.

            People

              varun Varun Gupta (Inactive)
              elenst Elena Stepanova
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.