Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-36169

If strategies in hint SEMIJOIN are named but inapplicable for the statement, DUPSWEEDOUT shoud be used, but is does not work for engines Aria and InnoDB

Details

    • Bug
    • Status: In Progress (View Workflow)
    • Major
    • Resolution: Unresolved
    • 12.0
    • 12.0
    • Optimizer
    • None

    Description

      From MDEV-34888: For SEMIJOIN hints, if no strategies are named, semijoin is used if possible based on the strategies enabled according to the optimizer_switch system variable. If strategies are named but inapplicable for the statement, DUPSWEEDOUT is used.

      Testcase:

      CREATE TABLE t1 (a INTEGER NOT NULL, b INT, PRIMARY KEY (a));
      CREATE TABLE t2 (a INTEGER NOT NULL, KEY (a));
      CREATE TABLE t3 (a INTEGER NOT NULL, b INT, KEY (a));
      INSERT INTO t1 VALUES (1,10), (2,20), (3,30), (4,40);
      INSERT INTO t2 VALUES (2), (2), (3), (3), (4), (5);
      INSERT INTO t3 VALUES (10,3), (15,3), (20,4), (30,5);
       
      EXPLAIN EXTENDED
      SELECT /*+ SEMIJOIN(@subq1 LOOSESCAN) SEMIJOIN(@subq2 LOOSESCAN) */ * FROM t1
      WHERE t1.a IN (SELECT /*+ QB_NAME(subq1) */ a FROM t3)
        AND t1.b IN (SELECT /*+ QB_NAME(subq2) */ a FROM t2);
       
      DROP TABLE t1,t2,t3;
      

      This query assumes that LooseScan is not possible for both subqueries at the same time, so in the second case it should be replaced with DuplicateWeedout.

      For engine=MyISAM everything looks good:

      EXPLAIN EXTENDED
      SELECT /*+ SEMIJOIN(@subq1 LOOSESCAN) SEMIJOIN(@subq2 LOOSESCAN) */ * FROM t1
      WHERE t1.a IN (SELECT /*+ QB_NAME(subq1) */ a FROM t3)
      AND t1.b IN (SELECT /*+ QB_NAME(subq2) */ a FROM t2);
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	PRIMARY	t3	index	a	a	4	NULL	4	100.00	Using index; LooseScan
      1	PRIMARY	t1	eq_ref	PRIMARY	PRIMARY	4	test.t3.a	1	25.00	Using where
      1	PRIMARY	t2	ref	a	a	4	test.t1.b	1	66.67	Using index; Start temporary; End temporary
      Warnings:
      Note	1003	select /*+ SEMIJOIN(@`subq1` LOOSESCAN) SEMIJOIN(@`subq2` LOOSESCAN) */ `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b` from `test`.`t1` semi join (`test`.`t3`) semi join (`test`.`t2`) where `test`.`t1`.`a` = `test`.`t3`.`a` and `test`.`t2`.`a` = `test`.`t1`.`b`
      

      But for engine=InnoDB (Aria) DuplicateWeedout is not used:

      XPLAIN EXTENDED
      SELECT /*+ SEMIJOIN(@subq1 LOOSESCAN) SEMIJOIN(@subq2 LOOSESCAN) */ * FROM t1
      WHERE t1.a IN (SELECT /*+ QB_NAME(subq1) */ a FROM t3)
      AND t1.b IN (SELECT /*+ QB_NAME(subq2) */ a FROM t2);
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	PRIMARY	t2	index	a	a	4	NULL	6	100.00	Using index
      1	PRIMARY	t3	index	a	a	4	NULL	4	100.00	Using index; LooseScan
      1	PRIMARY	t1	eq_ref	PRIMARY	PRIMARY	4	test.t3.a	1	25.00	Using where
      Warnings:
      Note	1003	select /*+ SEMIJOIN(@`subq1` LOOSESCAN) SEMIJOIN(@`subq2` LOOSESCAN) */ `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b` from `test`.`t1` semi join (`test`.`t3`) semi join (`test`.`t2`) where `test`.`t1`.`a` = `test`.`t3`.`a` and `test`.`t1`.`b` = `test`.`t2`.`a`
      

      But if we use the DuplicateWeedout strategy in both cases, we can see that its use is acceptable:

      EXPLAIN EXTENDED
      SELECT /*+ SEMIJOIN(@subq1 DUPSWEEDOUT) SEMIJOIN(@subq2 DUPSWEEDOUT) */ * FROM t1
      WHERE t1.a IN (SELECT /*+ QB_NAME(subq1) */ a FROM t3)
      AND t1.b IN (SELECT /*+ QB_NAME(subq2) */ a FROM t2);
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	PRIMARY	t3	index	a	a	4	NULL	4	100.00	Using index; Start temporary
      1	PRIMARY	t1	eq_ref	PRIMARY	PRIMARY	4	test.t3.a	1	25.00	Using where; End temporary
      1	PRIMARY	t2	ref	a	a	4	test.t1.b	1	66.67	Using index; Start temporary; End temporary
      Warnings:
      Note	1003	select /*+ SEMIJOIN(@`subq1` DUPSWEEDOUT) SEMIJOIN(@`subq2` DUPSWEEDOUT) */ `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b` from `test`.`t1` semi join (`test`.`t3`) semi join (`test`.`t2`) where `test`.`t1`.`a` = `test`.`t3`.`a` and `test`.`t2`.`a` = `test`.`t1`.`b`
      

      Attachments

        Issue Links

          Activity

            There are no comments yet on this issue.

            People

              oleg.smirnov Oleg Smirnov
              lstartseva Lena Startseva
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.