Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8857

[Upstream too] EXPLAIN incorrectly shows Distinct for tables using join buffer

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0, 10.1
    • Fix Version/s: 10.2.0
    • Component/s: Optimizer
    • Labels:
      None

      Description

      EXPLAIN may show "Distinct" for tables that use join buffer. This is not a possible execution strategy, one can debug and confirm that Distinct optimization is not used in this case.

      Copying from my report to upstream:

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
       
      create table twenty (a int, filler char(200), key(a));
      insert into twenty select A.a + B.a* 10, 'AAAAAAAAAAAAAAAAAAAA' from ten A, ten B where B.a in (0,1);

      MySQL [test]>  explain select A.a from ten A, twenty B where A.a+B.a> 0;
      +----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------------+
      | id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                                           |
      +----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------------+
      |  1 | SIMPLE      | A     | ALL   | NULL          | NULL | NULL    | NULL |   10 | NULL                                                            |
      |  1 | SIMPLE      | B     | index | NULL          | a    | 5       | NULL |   20 | Using where; Using index; Using join buffer (Block Nested Loop) |
      +----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------------+

      MySQL [test]> explain select distinct A.a from ten A, twenty B where A.a+B.a> 0; 
      +----+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------------+
      | id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                                                     |
      +----+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------------+
      |  1 | SIMPLE      | A     | ALL   | NULL          | NULL | NULL    | NULL |   10 | Using temporary                                                           |
      |  1 | SIMPLE      | B     | index | NULL          | a    | 5       | NULL |   20 | Using where; Using index; Distinct; Using join buffer (Block Nested Loop) |
      +----+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------------+

      Look at the second query. It has "Distinct". I think this wrong. "Distinct"
      doesn't work with "Using join buffer".

      To make sure we are on the same page: "Distinct" in table B means that as soon
      as we've got a match for current record in table A, we don't have to look for
      any other matches. The query's SELECT list is "SELECT DISTINCT A.a", there is
      no point to look for additional A.row-B.row pairs with the same A.row.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                psergey Sergei Petrunia
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: