Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5161

Wrong result (missing rows) with semijoin, LEFT JOIN, ORDER BY, constant table

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.0.4, 5.3.12, 5.5.33a
    • 5.5.35, 10.0.7
    • None
    • None

    Description

      The test case is nearly identical to the one provided in the comment to MDEV-5059 (https://mariadb.atlassian.net/browse/MDEV-5059?focusedCommentId=35418&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-35418); the only difference is that table t3 lost the index, and the wrong result now affects the default join_cache_level=2.

      Test case:

      SET optimizer_switch = 'semijoin=on';
       
      CREATE TABLE t1 (pk INT PRIMARY KEY, c1 VARCHAR(1)) ENGINE=MyISAM;
      INSERT INTO t1 VALUES (1,'v'),(2,'v'),(3,'c'),(4,NULL),(5,'x');
       
      CREATE TABLE t2 (c2 VARCHAR(1)) ENGINE=MyISAM;
      INSERT INTO t2 VALUES ('x');
       
      CREATE TABLE t3 (c3 VARCHAR(1)) ENGINE=MyISAM;
      INSERT INTO t3 VALUES ('x'),('d');
       
      SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;

      Actual result:

      pk	c1	c2
      4	NULL	x

      Expected result:

      pk	c1	c2
      4	NULL	x
      3	c	x
      1	v	x
      2	v	x
      5	x	x

      revision-id: bar@mnogosearch.org-20131016141313-63jt6geakz1e6dg2
      revno: 3708
      branch-nick: 5.3

      EXPLAIN EXTENDED
      SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	PRIMARY	t2	system	NULL	NULL	NULL	NULL	1	100.00	Using filesort
      1	PRIMARY	t1	ALL	PRIMARY	NULL	NULL	NULL	5	100.00	
      1	PRIMARY	t1	eq_ref	PRIMARY	PRIMARY	4	test.t1.pk	1	100.00	
      1	PRIMARY	t3	ALL	NULL	NULL	NULL	NULL	2	100.00	Using where; Start temporary; End temporary
      Warnings:
      Note	1003	select `test`.`t1`.`pk` AS `pk`,`test`.`t1`.`c1` AS `c1`,'x' AS `c2` from `test`.`t1` semi join (`test`.`t3`) join `test`.`t1` join `test`.`t2` where (`test`.`t1`.`pk` = `test`.`t1`.`pk`) order by 'x',`test`.`t1`.`c1`

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova created issue -
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            psergei Sergei Petrunia made changes -
            Status Open [ 1 ] In Progress [ 3 ]

            FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) )

            EXPLAIN EXTENDED shows:

            ... from `j2`.`t1` semi join (`j2`.`t3`) join `j2`.`t1` where (`j2`.`t1`.`pk` = `j2`.`t1`.`pk`)

            note that

            • t1 is pulled out of the semi-join,
            • t3 is not pulled out of the semi-join.
            • the LEFT JOIN from "t1 LEFT JOIN t3" is gone.
            psergei Sergei Petrunia added a comment - FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) EXPLAIN EXTENDED shows: ... from `j2`.`t1` semi join (`j2`.`t3`) join `j2`.`t1` where (`j2`.`t1`.`pk` = `j2`.`t1`.`pk`) note that t1 is pulled out of the semi-join, t3 is not pulled out of the semi-join. the LEFT JOIN from "t1 LEFT JOIN t3" is gone.

            Another interesting observation is that table elimination could theoretically remove the LEFT JOIN from the subquery, but that didn't happen.

            psergei Sergei Petrunia added a comment - Another interesting observation is that table elimination could theoretically remove the LEFT JOIN from the subquery, but that didn't happen.

            If I disallow pullout of parts of outer joins out of semi-joins, I get EXPLAIN like this:

            +EXPLAIN EXTENDED
            +SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
            +id select_type table type possible_keys key key_len ref rows filtered Extra
            +1 PRIMARY t2 system NULL NULL NULL NULL 1 100.00 Using filesort
            +1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 5 100.00
            +1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 test.t1.pk 1 100.00 Start temporary
            +1 PRIMARY t3 ALL NULL NULL NULL NULL 2 100.00 Using where; End temporary
            +Warnings:
            +Note 1003 select `test`.`t1`.`pk` AS `pk`,`test`.`t1`.`c1` AS `c1`,'x' AS `c2` from `test`.`t1` semi join (`test`.`t1` left join `test`.`t3` on((`test`.`t1`.`c1` = `test`.`t3`.`c3`))) where (`test`.`t1`.`pk` = `test`.`t1`.`pk`) order by 'x',`test`.`t1`.`c1`

            The warning text shows that the LEFT JOIN is here. However, the query result is still incorrect.

            psergei Sergei Petrunia added a comment - If I disallow pullout of parts of outer joins out of semi-joins, I get EXPLAIN like this: +EXPLAIN EXTENDED +SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1; +id select_type table type possible_keys key key_len ref rows filtered Extra +1 PRIMARY t2 system NULL NULL NULL NULL 1 100.00 Using filesort +1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 5 100.00 +1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 test.t1.pk 1 100.00 Start temporary +1 PRIMARY t3 ALL NULL NULL NULL NULL 2 100.00 Using where; End temporary +Warnings: +Note 1003 select `test`.`t1`.`pk` AS `pk`,`test`.`t1`.`c1` AS `c1`,'x' AS `c2` from `test`.`t1` semi join (`test`.`t1` left join `test`.`t3` on((`test`.`t1`.`c1` = `test`.`t3`.`c3`))) where (`test`.`t1`.`pk` = `test`.`t1`.`pk`) order by 'x',`test`.`t1`.`c1` The warning text shows that the LEFT JOIN is here. However, the query result is still incorrect.

            Tracing this EXPLAIN

            MariaDB [j2]> explain SELECT * FROM t1a, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
            ----------------------------------------------------------------------------------------------------------------------------------+

            id select_type table type possible_keys key key_len ref rows Extra

            ----------------------------------------------------------------------------------------------------------------------------------+

            1 PRIMARY t2 system NULL NULL NULL NULL 1 Using filesort
            1 PRIMARY t1a ALL PRIMARY NULL NULL NULL 5  
            1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 j2.t1a.pk 1 Start temporary
            1 PRIMARY t3 ALL NULL NULL NULL NULL 2 Using where; End temporary; Using join buffer (flat, BNL join)

            ----------------------------------------------------------------------------------------------------------------------------------+
            4 rows in set (0.00 sec)

            it seems that SJ_TMP_TABLE::sj_weedout_check_row() is invoked correclty for
            table t3. However, the check doesn't work: it checks rowid of table t1a, which
            always has the same value.

            psergei Sergei Petrunia added a comment - Tracing this EXPLAIN MariaDB [j2] > explain SELECT * FROM t1a, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1; ----- ----------- ----- ------ ------------- ------- ------- --------- ---- ---------------------------------------------------------------+ id select_type table type possible_keys key key_len ref rows Extra ----- ----------- ----- ------ ------------- ------- ------- --------- ---- ---------------------------------------------------------------+ 1 PRIMARY t2 system NULL NULL NULL NULL 1 Using filesort 1 PRIMARY t1a ALL PRIMARY NULL NULL NULL 5   1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 j2.t1a.pk 1 Start temporary 1 PRIMARY t3 ALL NULL NULL NULL NULL 2 Using where; End temporary; Using join buffer (flat, BNL join) ----- ----------- ----- ------ ------------- ------- ------- --------- ---- ---------------------------------------------------------------+ 4 rows in set (0.00 sec) it seems that SJ_TMP_TABLE::sj_weedout_check_row() is invoked correclty for table t3. However, the check doesn't work: it checks rowid of table t1a, which always has the same value.

            The issue doesn't depend on join buffering. I can get a wrong result when it is not used, too:
            set optimizer_switch='materialization=off,mrr=on,mrr_sort_keys=on,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off';
            set join_cache_level=1;

            MariaDB [j4]> explain SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
            ---------------------------------------------------------------------------------------------+

            id select_type table type possible_keys key key_len ref rows Extra

            ---------------------------------------------------------------------------------------------+

            1 PRIMARY t2 system NULL NULL NULL NULL 1 Using filesort
            1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 5  
            1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 j4.t1.pk 1 Start temporary
            1 PRIMARY t3 ALL NULL NULL NULL NULL 2 Using where; End temporary

            ---------------------------------------------------------------------------------------------+
            4 rows in set (0.00 sec)

            MariaDB [j4]> SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
            ------------

            pk c1 c2

            ------------

            4 NULL x

            ------------
            1 row in set (0.00 sec)

            psergei Sergei Petrunia added a comment - The issue doesn't depend on join buffering. I can get a wrong result when it is not used, too: set optimizer_switch='materialization=off,mrr=on,mrr_sort_keys=on,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off'; set join_cache_level=1; MariaDB [j4] > explain SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1; ----- ----------- ----- ------ ------------- ------- ------- -------- ---- ---------------------------+ id select_type table type possible_keys key key_len ref rows Extra ----- ----------- ----- ------ ------------- ------- ------- -------- ---- ---------------------------+ 1 PRIMARY t2 system NULL NULL NULL NULL 1 Using filesort 1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 5   1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 j4.t1.pk 1 Start temporary 1 PRIMARY t3 ALL NULL NULL NULL NULL 2 Using where; End temporary ----- ----------- ----- ------ ------------- ------- ------- -------- ---- ---------------------------+ 4 rows in set (0.00 sec) MariaDB [j4] > SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1; --- ---- ----- pk c1 c2 --- ---- ----- 4 NULL x --- ---- ----- 1 row in set (0.00 sec)

            Without join buffering, the problem is still the same: SJ_TMP_TABLE::sj_weedout_check_row() checks rowid of table t1a (the outer t1), however, that table always has:

            (gdb) p h->ref_length
            $101 = 6
            (gdb) x/6x h->ref
            0x7fffb4c584b0: 0xff 0xff 0xff 0xff 0xff 0xff

            and that's why the query produces only one row instead of five.
            Note that the SJ-Weedout table is not degenerate:
            (gdb) p is_degenerate
            $102 = false

            psergei Sergei Petrunia added a comment - Without join buffering, the problem is still the same: SJ_TMP_TABLE::sj_weedout_check_row() checks rowid of table t1a (the outer t1), however, that table always has: (gdb) p h->ref_length $101 = 6 (gdb) x/6x h->ref 0x7fffb4c584b0: 0xff 0xff 0xff 0xff 0xff 0xff and that's why the query produces only one row instead of five. Note that the SJ-Weedout table is not degenerate: (gdb) p is_degenerate $102 = false

            Ok, I think I get it. The problem seems to be caused by doing filesort on the first table. After that, join execution reads table rows with rr_unpack_from_buffer(). However, that function doesn't update the rowid at join->join_tab[1].table->ref.

            Duplicate Weedout code sees the same rowid all the time, and concludes that all row combinations except for the first one are duplicates.

            psergei Sergei Petrunia added a comment - Ok, I think I get it. The problem seems to be caused by doing filesort on the first table. After that, join execution reads table rows with rr_unpack_from_buffer(). However, that function doesn't update the rowid at join->join_tab [1] .table->ref. Duplicate Weedout code sees the same rowid all the time, and concludes that all row combinations except for the first one are duplicates.

            Two thoughts:

            1. If the first table is doing a post-create_sort_index() read, we need to
            provide either rowids of rows that are read, or some substitute (e.g. a counter)

            2. Duplicate Elimination and create_sort_index.

            2.1 Suppose there is no join buffering.
            Then, there should be no reason to include first table's rowid in Duplicate
            Elimination's weedout table. The first table is part of the prefix of outer
            tables and should not be included.

            2.2 Suppose, join buffering is used somewhere within the duplicate elimination
            range. Then, all tables, including the very first one, should be included in
            the Duplicate Elimination Weedout table.
            However, join buffering destroys ordering. Why would we need to run
            create_sort_index to produce ordered resultset but after that use join
            buffering to destroy it?

            psergei Sergei Petrunia added a comment - Two thoughts: 1. If the first table is doing a post-create_sort_index() read, we need to provide either rowids of rows that are read, or some substitute (e.g. a counter) 2. Duplicate Elimination and create_sort_index. 2.1 Suppose there is no join buffering. Then, there should be no reason to include first table's rowid in Duplicate Elimination's weedout table. The first table is part of the prefix of outer tables and should not be included. 2.2 Suppose, join buffering is used somewhere within the duplicate elimination range. Then, all tables, including the very first one, should be included in the Duplicate Elimination Weedout table. However, join buffering destroys ordering. Why would we need to run create_sort_index to produce ordered resultset but after that use join buffering to destroy it?

            Exploring thought#2.

            When I try to debug the testcase that is not using the join buffering, I see
            that setup_semijoin_dups_elimination()

            • concludes that join buffering will be used (and so assigns
              first_table= join->const_tables)
            • calls join->get_sort_by_join_tab() which returns false, from where it
              concludes that no sorting of first table will be done.
            psergei Sergei Petrunia added a comment - Exploring thought#2. When I try to debug the testcase that is not using the join buffering, I see that setup_semijoin_dups_elimination() concludes that join buffering will be used (and so assigns first_table= join->const_tables) calls join->get_sort_by_join_tab() which returns false, from where it concludes that no sorting of first table will be done.

            The case where no join buffering is used could be solved with this approach:
            After the optimizer has made final decisions about whether to use join buffering,
            walk through the plans to do Duplicate Elimination and move the JOIN_TAB that flushes the temporary to be as late as possible.

            psergei Sergei Petrunia added a comment - The case where no join buffering is used could be solved with this approach: After the optimizer has made final decisions about whether to use join buffering, walk through the plans to do Duplicate Elimination and move the JOIN_TAB that flushes the temporary to be as late as possible.

            What is not clear is what should we do when join buffering is used.

            Look at the EXPLAIN in the comment for this bug here:
            https://mariadb.atlassian.net/browse/MDEV-5161?focusedCommentId=35977&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-35977

            It has both join buffering and "Using filesort" (which means: run filesort() for the first table, then do join with other tables). "Using join buffer" will break the ordering that was produced by filesort. Apparently, the bug is not [only] in Duplicate Elimination. join buffer and order by optimizations also work incorrectly.

            psergei Sergei Petrunia added a comment - What is not clear is what should we do when join buffering is used. Look at the EXPLAIN in the comment for this bug here: https://mariadb.atlassian.net/browse/MDEV-5161?focusedCommentId=35977&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-35977 It has both join buffering and "Using filesort" (which means: run filesort() for the first table, then do join with other tables). "Using join buffer" will break the ordering that was produced by filesort. Apparently, the bug is not [only] in Duplicate Elimination. join buffer and order by optimizations also work incorrectly.

            Created a testcase for the problem mentioned in the previous comment. Filed it as MDEV-5293.

            psergei Sergei Petrunia added a comment - Created a testcase for the problem mentioned in the previous comment. Filed it as MDEV-5293 .
            psergei Sergei Petrunia made changes -
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            Fix Version/s 5.5.35 [ 14000 ]
            Fix Version/s 5.3.13 [ 12602 ]
            Fix Version/s 5.5.34 [ 13700 ]

            Now I have a patch for MDEV-5293, continuing to work on this bug.

            psergei Sergei Petrunia added a comment - Now I have a patch for MDEV-5293 , continuing to work on this bug.

            Interesting.. there is code in setup_semijoin_dups_elimination() which disables "Using filesort":

            Fixed LP bug #901478.
            If the duplicate elimination strategy is used for a semi-join and potentially
            one of the block-based join algorithms can be employed to join the inner
            tables of the semi-join then sorting of the head (first non-constant) table
            for a query with ORDER BY / GROUP BY cannot be used.

            It seems, a problem similar to this one was encountered before?

            psergei Sergei Petrunia added a comment - Interesting.. there is code in setup_semijoin_dups_elimination() which disables "Using filesort": Fixed LP bug #901478. If the duplicate elimination strategy is used for a semi-join and potentially one of the block-based join algorithms can be employed to join the inner tables of the semi-join then sorting of the head (first non-constant) table for a query with ORDER BY / GROUP BY cannot be used. It seems, a problem similar to this one was encountered before?

            The fix looks like this:

            Make sure that possible sorting of rows from the head table
            is not to be employed.
            */
            if (join->get_sort_by_join_tab())

            { join->simple_order= 0; join->simple_group= 0; join->need_tmp= join->test_if_need_tmp_table(); }

            and it doesn't work for our testcase, because join->get_sort_by_join_tab()=NULL, which is caused by JOIN::sort_by_table being NULL.

            psergei Sergei Petrunia added a comment - The fix looks like this: Make sure that possible sorting of rows from the head table is not to be employed. */ if (join->get_sort_by_join_tab()) { join->simple_order= 0; join->simple_group= 0; join->need_tmp= join->test_if_need_tmp_table(); } and it doesn't work for our testcase, because join->get_sort_by_join_tab()=NULL, which is caused by JOIN::sort_by_table being NULL.
            serg Sergei Golubchik made changes -
            Fix Version/s 10.0.7 [ 14100 ]
            Fix Version/s 10.0.6 [ 13202 ]

            I've made a fix that runs get_sort_by_table() after const. tables have been detected and read. This caused get_sort_by_join_tab() to return the first JOIN_TAB in this example.

            Tests pass, including valgrind.

            psergei Sergei Petrunia added a comment - I've made a fix that runs get_sort_by_table() after const. tables have been detected and read. This caused get_sort_by_join_tab() to return the first JOIN_TAB in this example. Tests pass, including valgrind.
            psergei Sergei Petrunia made changes -
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            psergei Sergei Petrunia made changes -
            dbart Daniel Bartholomew added a comment - http://bazaar.launchpad.net/~maria-captains/maria/5.5/revision/3963.1.2
            serg Sergei Golubchik made changes -
            Workflow defaullt [ 29331 ] MariaDB v2 [ 43556 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Workflow MariaDB v2 [ 43556 ] MariaDB v3 [ 62690 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 62690 ] MariaDB v4 [ 147145 ]

            People

              psergei Sergei Petrunia
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.