Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30540

Wrong result with IN list length reaching IN_PREDICATE_CONVERSION_THRESHOLD

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • N/A
    • 11.0.0
    • Optimizer
    • None

    Description

      I could only reproduce it on bb-11.0 so far.

      --source include/have_sequence.inc
       
      CREATE TABLE t (a INT PRIMARY KEY);
      INSERT INTO t SELECT seq FROM seq_1_to_30;
       
      ANALYZE TABLE t PERSISTENT FOR ALL;
       
      SET IN_PREDICATE_CONVERSION_THRESHOLD=4;
      SELECT a FROM t WHERE a IN ( 1, 1, 2, 194 );
      SET IN_PREDICATE_CONVERSION_THRESHOLD=100;
      SELECT a FROM t WHERE a IN ( 1, 1, 2, 194 );
       
      drop table t;
      

      bb-11.0

      SET IN_PREDICATE_CONVERSION_THRESHOLD=4;
      SELECT a FROM t WHERE a IN ( 1, 1, 2, 194 );
      a
      1
      1
      2
      SET IN_PREDICATE_CONVERSION_THRESHOLD=100;
      SELECT a FROM t WHERE a IN ( 1, 1, 2, 194 );
      a
      1
      2
      

      The second one is of course the correct result.

      Plan with the wrong result:

      explain extended SELECT a FROM t WHERE a IN ( 1, 1, 2, 194 );
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	PRIMARY	<derived3>	ALL	NULL	NULL	NULL	NULL	4	100.00	
      1	PRIMARY	t	eq_ref	PRIMARY	PRIMARY	4	tvc_0._col_1	1	100.00	Using index
      3	DERIVED	NULL	NULL	NULL	NULL	NULL	NULL	NULL	NULL	No tables used
      Warnings:
      Note	1003	/* select#1 */ select `test`.`t`.`a` AS `a` from (values (1),(1),(2),(194)) `tvc_0` join `test`.`t` where `test`.`t`.`a` = `tvc_0`.`_col_1`
      

      Reproducible with MyISAM, InnoDB, Aria.

      Attachments

        Issue Links

          Activity

            The scenario which causes the bug is as follows:

            • IN-predicate is converted into a derived table, (SELECT ... FROM temp_table)
            • Derived table is created with a unique key (Good so far).
            • derived_with_keys optimization starts. It adds another "potential" index (This is a known deficiency, it could actually re-use the key that already exists).
            • join optimization decides to use a full scan on the derived table.
            • derived_with_keys sees that indexes it has added are not necessary.
            • It removes all indexes, including the unique one (This is where the bug is)
            • Now, when the query is run, there is no unique key on the temp_table, duplicates are not removed and so duplicate rows are produced in the output.
            psergei Sergei Petrunia added a comment - The scenario which causes the bug is as follows: IN-predicate is converted into a derived table, (SELECT ... FROM temp_table) Derived table is created with a unique key (Good so far). derived_with_keys optimization starts. It adds another "potential" index (This is a known deficiency, it could actually re-use the key that already exists). join optimization decides to use a full scan on the derived table. derived_with_keys sees that indexes it has added are not necessary. It removes all indexes, including the unique one (This is where the bug is) Now, when the query is run, there is no unique key on the temp_table, duplicates are not removed and so duplicate rows are produced in the output.

            Stack trace of the location where all indexes are removed:

            #0  TABLE::use_index (this=0x7fff9cb411c0, key_to_save=-1) at /home/psergey/dev-git/11.0-before-rebase/sql/table.
            cc:8469
            #1  0x0000555555fe1cd7 in JOIN::drop_unused_derived_keys (this=0x7fff9c018500) at /home/psergey/dev-git/11.0-befo
            re-rebase/sql/sql_select.cc:13872
            #2  0x0000555555fbe31f in JOIN::optimize_stage2 (this=0x7fff9c018500) at /home/psergey/dev-git/11.0-before-rebase
            /sql/sql_select.cc:2632
            #3  0x0000555555fbe0dc in JOIN::optimize_inner (this=0x7fff9c018500) at /home/psergey/dev-git/11.0-before-rebase/
            sql/sql_select.cc:2597
            #4  0x0000555555fbb7af in JOIN::optimize (this=0x7fff9c018500) at /home/psergey/dev-git/11.0-before-rebase/sql/sq
            l_select.cc:1899
            

            psergei Sergei Petrunia added a comment - Stack trace of the location where all indexes are removed: #0 TABLE::use_index (this=0x7fff9cb411c0, key_to_save=-1) at /home/psergey/dev-git/11.0-before-rebase/sql/table. cc:8469 #1 0x0000555555fe1cd7 in JOIN::drop_unused_derived_keys (this=0x7fff9c018500) at /home/psergey/dev-git/11.0-befo re-rebase/sql/sql_select.cc:13872 #2 0x0000555555fbe31f in JOIN::optimize_stage2 (this=0x7fff9c018500) at /home/psergey/dev-git/11.0-before-rebase /sql/sql_select.cc:2632 #3 0x0000555555fbe0dc in JOIN::optimize_inner (this=0x7fff9c018500) at /home/psergey/dev-git/11.0-before-rebase/ sql/sql_select.cc:2597 #4 0x0000555555fbb7af in JOIN::optimize (this=0x7fff9c018500) at /home/psergey/dev-git/11.0-before-rebase/sql/sq l_select.cc:1899

            Developed a fix for this, but I get a failure in another test. It shows that the fix exposes another bug which is not easy to fix.

            Pushed a commit which temporarily disables the Derived tables and union can now create distinct keys feature of bb-11.0 tree. It makes this bug go away.

            commit 60f08a7d49c4ff1838319cdd84be064ab0a9b4c8
            Author: Sergei Petrunia <sergey@mariadb.com>
            Date:   Sat Feb 4 13:26:30 2023 +0300
             
                MDEV-30540: Temporarily revert : Derived tables and union can now create distinct keys
                
                Revert this as it was causing the issue:
                
                commit 5e5a8eda1641eda1d915a7eb5736e494d2179795
                Author: Monty <monty@mariadb.org>
                Date:   Wed May 4 17:26:43 2022 +0300
                
                    Derived tables and union can now create distinct keys
                
                    The idea is that instead of marking all select_lex's with DISTINCT, we
                    only mark those that really need distinct result.
                
            

            psergei Sergei Petrunia added a comment - Developed a fix for this, but I get a failure in another test. It shows that the fix exposes another bug which is not easy to fix. Pushed a commit which temporarily disables the Derived tables and union can now create distinct keys feature of bb-11.0 tree. It makes this bug go away. commit 60f08a7d49c4ff1838319cdd84be064ab0a9b4c8 Author: Sergei Petrunia <sergey@mariadb.com> Date: Sat Feb 4 13:26:30 2023 +0300   MDEV-30540: Temporarily revert : Derived tables and union can now create distinct keys Revert this as it was causing the issue: commit 5e5a8eda1641eda1d915a7eb5736e494d2179795 Author: Monty <monty@mariadb.org> Date: Wed May 4 17:26:43 2022 +0300 Derived tables and union can now create distinct keys The idea is that instead of marking all select_lex's with DISTINCT, we only mark those that really need distinct result.
            psergei Sergei Petrunia added a comment - - edited

            The patch which makes derived_with_keys optimization not to remove unique keys: fix-unique-attempt.diff

            It causes this test failure:

            mysql-test/main/derived.result    2023-02-03 15:02:03.784632772 +0300
            mysql-test/main/derived.reject    2023-02-09 10:50:42.571006411 +0300
            @@ -355,43 +355,41 @@
             create table t1(a int);
             create table t2(a int);
             create table t3(a int);
             insert into t1 values(1),(1);
             insert into t2 values(2),(2);
             insert into t3 values(3),(3);
             select * from t1 union distinct select * from t2 union all select * from t3;
             a
             1
             2
             3
             3
             select * from (select * from t1 union distinct select * from t2 union all select * from t3) X;
             a
             1
             2
             3
            -3
             set @save2_derived_optimizer_switch_bug=@@optimizer_switch;
             set @@optimizer_switch=default;
             select * from (select * from t1 union distinct select * from t2 union all select * from t3) X;
             a
             1
             2
             3
            -3
            

            psergei Sergei Petrunia added a comment - - edited The patch which makes derived_with_keys optimization not to remove unique keys: fix-unique-attempt.diff It causes this test failure: mysql-test/main/derived.result 2023-02-03 15:02:03.784632772 +0300 mysql-test/main/derived.reject 2023-02-09 10:50:42.571006411 +0300 @@ -355,43 +355,41 @@ create table t1(a int); create table t2(a int); create table t3(a int); insert into t1 values(1),(1); insert into t2 values(2),(2); insert into t3 values(3),(3); select * from t1 union distinct select * from t2 union all select * from t3; a 1 2 3 3 select * from (select * from t1 union distinct select * from t2 union all select * from t3) X; a 1 2 3 -3 set @save2_derived_optimizer_switch_bug=@@optimizer_switch; set @@optimizer_switch=default; select * from (select * from t1 union distinct select * from t2 union all select * from t3) X; a 1 2 3 -3

            Worked with Petruna to fix this one.

            The problem was the mysql_derived_prepare() did not correctly set
            'distinct' when creating a temporary derived table.

            Fixed by separating checking for distinct for queries with and without UNION.

            monty Michael Widenius added a comment - Worked with Petruna to fix this one. The problem was the mysql_derived_prepare() did not correctly set 'distinct' when creating a temporary derived table. Fixed by separating checking for distinct for queries with and without UNION.

            The patch was not pushed at once as buildbot discovered a crash on Windows after the patch was pushed.
            The crash was caused by 2 other independent bugs (one windows only and one general).
            These are now fixed.

            monty Michael Widenius added a comment - The patch was not pushed at once as buildbot discovered a crash on Windows after the patch was pushed. The crash was caused by 2 other independent bugs (one windows only and one general). These are now fixed.

            People

              psergei Sergei Petrunia
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.