Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21781

Wrong result (extra or missing rows, wrong values) with mrr=on,not_null_range_scan=on and LEFT JOINs

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Cannot Reproduce
    • 10.3.20, 10.4.10, 10.5.1
    • N/A
    • Optimizer
    • None

    Description

      CREATE TABLE t1 (a INT) ENGINE=MyISAM;
      INSERT INTO t1 VALUES (1),(2);
       
      CREATE TABLE t2 (b INT, c INT, KEY(c)) ENGINE=MyISAM;
      INSERT INTO t2 VALUES (1,NULL),(2,10);
       
      CREATE TABLE t3 (d INT) ENGINE=MyISAM;
      INSERT INTO t3 VALUES (11),(12);
       
      CREATE TABLE t4 (e INT, f INT, KEY(f)) ENGINE=MyISAM;
       
      SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      SET optimizer_switch='mrr=on,not_null_range_scan=on';
      SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      

      10.5 9fd30949

      SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      a	b	c	d	e	f
      1	NULL	NULL	NULL	NULL	NULL
      2	NULL	NULL	NULL	NULL	NULL
      SET optimizer_switch='mrr=on,not_null_range_scan=on';
      SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      a	b	c	d	e	f
      1	1	NULL	NULL	NULL	NULL
      2	1	NULL	NULL	NULL	NULL
      1	2	10	NULL	NULL	NULL
      2	2	10	NULL	NULL	NULL
      

      PostgreSQL, MySQL 5.7 and previous versions of MariaDB return the same result as the first query (two rows, all NULLs except for column a).

      Execution plans:

      EXPLAIN EXTENDED SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	SIMPLE	t1	ALL	NULL	NULL	NULL	NULL	2	100.00	
      1	SIMPLE	t2	ALL	c	NULL	NULL	NULL	2	100.00	Using where; Using join buffer (flat, BNL join)
      1	SIMPLE	t4	ALL	f	NULL	NULL	NULL	0	0.00	Using where; Using join buffer (incremental, BNL join)
      1	SIMPLE	t3	ALL	NULL	NULL	NULL	NULL	2	100.00	Using where; Using join buffer (incremental, BNL join)
      Warnings:
      Note	1003	select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t2`.`c` AS `c`,`test`.`t3`.`d` AS `d`,`test`.`t4`.`e` AS `e`,`test`.`t4`.`f` AS `f` from `test`.`t1` left join (`test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(`test`.`t4`.`e` = `test`.`t2`.`b` and `test`.`t3`.`d` = `test`.`t4`.`f`)) on(`test`.`t2`.`c` = `test`.`t1`.`a`) where 1
      SET optimizer_switch='mrr=on,not_null_range_scan=on';
      EXPLAIN EXTENDED SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	SIMPLE	t4	const	f	NULL	NULL	NULL	1	100.00	Impossible ON condition
      1	SIMPLE	t3	const	NULL	NULL	NULL	NULL	1	100.00	Impossible ON condition
      1	SIMPLE	t1	ALL	NULL	NULL	NULL	NULL	2	100.00	
      1	SIMPLE	t2	ALL	c	NULL	NULL	NULL	2	100.00	Using join buffer (flat, BNL join)
      Warnings:
      Note	1003	select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t2`.`c` AS `c`,NULL AS `d`,NULL AS `e`,NULL AS `f` from `test`.`t1` left join (`test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(multiple equal(`test`.`t2`.`b`, NULL) and multiple equal(NULL, NULL))) on(`test`.`t2`.`c` = `test`.`t1`.`a`) where 1
      

      If table t2 is changed from MyISAM to Aria, the result is incorrect in a different way – the second query returns an empty result set:

      CREATE TABLE t1 (a INT) ENGINE=MyISAM;
      INSERT INTO t1 VALUES (1),(2);
       
      CREATE TABLE t2 (b INT, c INT, KEY(c)) ENGINE=Aria;
      INSERT INTO t2 VALUES (1,NULL),(2,10);
       
      CREATE TABLE t3 (d INT) ENGINE=MyISAM;
      INSERT INTO t3 VALUES (11),(12);
       
      CREATE TABLE t4 (e INT, f INT, KEY(f)) ENGINE=MyISAM;
       
      SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      SET optimizer_switch='mrr=on,not_null_range_scan=on';
      SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      

      CREATE TABLE t1 (a INT) ENGINE=MyISAM;
      INSERT INTO t1 VALUES (1),(2);
      CREATE TABLE t2 (b INT, c INT, KEY(c)) ENGINE=Aria;
      INSERT INTO t2 VALUES (1,NULL),(2,10);
      CREATE TABLE t3 (d INT) ENGINE=MyISAM;
      INSERT INTO t3 VALUES (11),(12);
      CREATE TABLE t4 (e INT, f INT, KEY(f)) ENGINE=MyISAM;
      SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      a	b	c	d	e	f
      1	NULL	NULL	NULL	NULL	NULL
      2	NULL	NULL	NULL	NULL	NULL
      SET optimizer_switch='mrr=on,not_null_range_scan=on';
      SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      a	b	c	d	e	f
      

      Execution plans for test case with Aria for t2:

      EXPLAIN EXTENDED SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	SIMPLE	t1	ALL	NULL	NULL	NULL	NULL	2	100.00	
      1	SIMPLE	t2	ref	c	c	5	test.t1.a	2	100.00	Using where
      1	SIMPLE	t4	ALL	f	NULL	NULL	NULL	0	0.00	Using where
      1	SIMPLE	t3	ALL	NULL	NULL	NULL	NULL	2	100.00	Using where
      Warnings:
      Note	1003	select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t2`.`c` AS `c`,`test`.`t3`.`d` AS `d`,`test`.`t4`.`e` AS `e`,`test`.`t4`.`f` AS `f` from `test`.`t1` left join (`test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(`test`.`t4`.`e` = `test`.`t2`.`b` and `test`.`t3`.`d` = `test`.`t4`.`f`)) on(`test`.`t2`.`c` = `test`.`t1`.`a` and `test`.`t1`.`a` is not null) where 1
      SET optimizer_switch='mrr=on,not_null_range_scan=on';
      EXPLAIN EXTENDED SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 JOIN t4 ON t3.d = t4.f ON t2.b = t4.e ON t1.a = t2.c;
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	SIMPLE	t4	const	f	NULL	NULL	NULL	1	100.00	Impossible ON condition
      1	SIMPLE	t3	const	NULL	NULL	NULL	NULL	1	100.00	Impossible ON condition
      1	SIMPLE	t1	ALL	NULL	NULL	NULL	NULL	2	100.00	
      1	SIMPLE	t2	ref	c	c	5	test.t1.a	2	100.00	
      Warnings:
      Note	1003	select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t2`.`c` AS `c`,NULL AS `d`,NULL AS `e`,NULL AS `f` from `test`.`t1` left join (`test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(multiple equal(`test`.`t2`.`b`, NULL) and multiple equal(NULL, NULL))) on(`test`.`t2`.`c` = `test`.`t1`.`a` and `test`.`t1`.`a` is not null) where 1
      

      Attachments

        Activity

          igor Igor Babaev added a comment -

          However in 10.2 the bug cannot be reproduced even with the last query.

          MariaDB [test]> SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t4 FORCE INDEX(f) ON t2.b=t4.f AND t4.f < 1 AND t4.f > 3 ON t1.a = t2.c;
          +------+------+------+------+------+
          | a    | b    | c    | e    | f    |
          +------+------+------+------+------+
          |    1 | NULL | NULL | NULL | NULL |
          |    2 |    2 |    2 | NULL | NULL |
          +------+------+------+------+------+
          2 rows in set (0.002 sec)
           
          MariaDB [test]> explain extended SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t4 FORCE INDEX(f) ON t2.b=t4.f AND t4.f < 1 AND t4.f > 3 ON t1.a = t2.c;
          +------+-------------+-------+------+---------------+------+---------+-----------+------+----------+-------------+
          | id   | select_type | table | type | possible_keys | key  | key_len | ref       | rows | filtered | Extra       |
          +------+-------------+-------+------+---------------+------+---------+-----------+------+----------+-------------+
          |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL      |    2 |   100.00 |             |
          |    1 | SIMPLE      | t2    | ALL  | NULL          | NULL | NULL    | NULL      |    3 |   100.00 | Using where |
          |    1 | SIMPLE      | t4    | ref  | f             | f    | 5       | test.t2.b |    2 |   100.00 | Using where |
          +------+-------------+-------+------+---------------+------+---------+-----------+------+----------+-------------+
          3 rows in set, 1 warning (0.002 sec)
          

          igor Igor Babaev added a comment - However in 10.2 the bug cannot be reproduced even with the last query. MariaDB [test]> SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t4 FORCE INDEX(f) ON t2.b=t4.f AND t4.f < 1 AND t4.f > 3 ON t1.a = t2.c; +------+------+------+------+------+ | a | b | c | e | f | +------+------+------+------+------+ | 1 | NULL | NULL | NULL | NULL | | 2 | 2 | 2 | NULL | NULL | +------+------+------+------+------+ 2 rows in set (0.002 sec)   MariaDB [test]> explain extended SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t4 FORCE INDEX(f) ON t2.b=t4.f AND t4.f < 1 AND t4.f > 3 ON t1.a = t2.c; +------+-------------+-------+------+---------------+------+---------+-----------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+-------+------+---------------+------+---------+-----------+------+----------+-------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | | | 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where | | 1 | SIMPLE | t4 | ref | f | f | 5 | test.t2.b | 2 | 100.00 | Using where | +------+-------------+-------+------+---------------+------+---------+-----------+------+----------+-------------+ 3 rows in set, 1 warning (0.002 sec)
          igor Igor Babaev added a comment -

          Debugging shows that we have difference in the plan due to the following change in the code

          @@ -4871,39 +4905,80 @@ make_join_statistics(JOIN *join, List<TABLE_LIST> &tables_list,
               
               /*
                 Perform range analysis if there are keys it could use (1). 
          -      Don't do range analysis if we're on the inner side of an outer join (2).
          -      Do range analysis if we're on the inner side of a semi-join (3).
          -      Don't do range analysis for materialized subqueries (4).
          -      Don't do range analysis for materialized derived tables (5)
          +      Don't do range analysis for materialized subqueries (2).
          +      Don't do range analysis for materialized derived tables (3)
               */
               if ((!s->const_keys.is_clear_all() ||
                   !bitmap_is_clear_all(&s->table->cond_set)) &&              // (1)
          -        (!s->table->pos_in_table_list->embedding ||                 // (2)
          -         (s->table->pos_in_table_list->embedding &&                 // (3)
          -          s->table->pos_in_table_list->embedding->sj_on_expr)) &&   // (3)
          -        !s->table->is_filled_at_execution() &&                      // (4)
          -        !(s->table->pos_in_table_list->derived &&                   // (5)
          -          s->table->pos_in_table_list->is_materialized_derived()))  // (5)
          +        !s->table->is_filled_at_execution() &&                      // (2)
          +        !(s->table->pos_in_table_list->derived &&                   // (3)
          +          s->table->pos_in_table_list->is_materialized_derived()))  // (3)
               {
                 bool impossible_range= FALSE;
                 ha_rows records= HA_POS_ERROR;
          

          introduced in the commit 03680a9b4fda9fa15675e137d46521628553c0eb

          MDEV-17518: Range optimization doesn't use ON expressions from nested outer joins

          igor Igor Babaev added a comment - Debugging shows that we have difference in the plan due to the following change in the code @@ -4871,39 +4905,80 @@ make_join_statistics(JOIN *join, List<TABLE_LIST> &tables_list, /* Perform range analysis if there are keys it could use (1). - Don't do range analysis if we're on the inner side of an outer join (2). - Do range analysis if we're on the inner side of a semi-join (3). - Don't do range analysis for materialized subqueries (4). - Don't do range analysis for materialized derived tables (5) + Don't do range analysis for materialized subqueries (2). + Don't do range analysis for materialized derived tables (3) */ if ((!s->const_keys.is_clear_all() || !bitmap_is_clear_all(&s->table->cond_set)) && // (1) - (!s->table->pos_in_table_list->embedding || // (2) - (s->table->pos_in_table_list->embedding && // (3) - s->table->pos_in_table_list->embedding->sj_on_expr)) && // (3) - !s->table->is_filled_at_execution() && // (4) - !(s->table->pos_in_table_list->derived && // (5) - s->table->pos_in_table_list->is_materialized_derived())) // (5) + !s->table->is_filled_at_execution() && // (2) + !(s->table->pos_in_table_list->derived && // (3) + s->table->pos_in_table_list->is_materialized_derived())) // (3) { bool impossible_range= FALSE; ha_rows records= HA_POS_ERROR; introduced in the commit 03680a9b4fda9fa15675e137d46521628553c0eb MDEV-17518 : Range optimization doesn't use ON expressions from nested outer joins
          alice Alice Sherepa added a comment -

          the results are correct on the current 10.3 (4e9206736c403206915c09d)-10.10

          alice Alice Sherepa added a comment - the results are correct on the current 10.3 (4e9206736c403206915c09d)-10.10

          Automated message:
          ----------------------------
          Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

          julien.fritsch Julien Fritsch added a comment - Automated message: ---------------------------- Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.
          JIraAutomate JiraAutomate added a comment -

          Automated message:
          ----------------------------
          Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

          JIraAutomate JiraAutomate added a comment - Automated message: ---------------------------- Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

          People

            alice Alice Sherepa
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.