Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27442

Wrong result upon query with DISTINCT and EXISTS subquery

Details

    Description

      CREATE TABLE t (a int, b int, KEY b (b,a)) ENGINE=MyISAM;
      INSERT INTO t VALUES (0,100),(2,100),(2,101),(3,102);
       
      SELECT DISTINCT b FROM t WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 );
       
      # Cleanup
      DROP TABLE t;
      

      Actual result on 10.5 4c3ad244

      SELECT DISTINCT b FROM t WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 );
      b
      101
      102
      

      This is naturally wrong, 100 should also be there.

      The failure appeared in 10.5 branch after this commit in 10.5.3:

      commit eb483c5181ab430877c135c16224284cfc517b3d
      Author: Monty
      Date:   Fri Feb 28 12:59:30 2020 +0200
       
          Updated optimizer costs in multi_range_read_info_const() and sql_select.cc
      

      Plan on the current branch (wrong result)

      {
        "query_block": {
          "select_id": 1,
          "table": {
            "table_name": "t",
            "access_type": "range",
            "key": "b",
            "key_length": "5",
            "used_key_parts": ["b"],
            "rows": 5,
            "filtered": 100,
            "attached_condition": "<in_optimizer>(1,exists(subquery#2))",
            "using_index_for_group_by": true
          },
          "subqueries": [
            {
              "expression_cache": {
                "state": "uninitialized",
                "query_block": {
                  "select_id": 2,
                  "table": {
                    "message": "No tables used"
                  }
                }
              }
            }
          ]
        }
      }
      

      Plan before the change (correct result)

      {
        "query_block": {
          "select_id": 1,
          "table": {
            "table_name": "t",
            "access_type": "index",
            "key": "b",
            "key_length": "10",
            "used_key_parts": ["b", "a"],
            "rows": 4,
            "filtered": 100,
            "attached_condition": "<in_optimizer>(1,exists(subquery#2))",
            "using_index": true
          },
          "subqueries": [
            {
              "expression_cache": {
                "state": "uninitialized",
                "query_block": {
                  "select_id": 2,
                  "table": {
                    "message": "No tables used"
                  }
                }
              }
            }
          ]
        }
      }
      

      Attachments

        Activity

          This has nothing to do with my commit.
          One has the same problem with 10.4, with this slight modified test:

          CREATE TABLE t1 (a int, b int, KEY b (b,a)) ENGINE=MyISAM;
          INSERT INTO t1 VALUES (0,100), (2,100),(2,101),(3,102);
          INSERT INTO t1 SELECT -seq,-seq from seq_1_to_100;
          SELECT DISTINCT b FROM t1 WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 );

          monty Michael Widenius added a comment - This has nothing to do with my commit. One has the same problem with 10.4, with this slight modified test: CREATE TABLE t1 (a int, b int, KEY b (b,a)) ENGINE=MyISAM; INSERT INTO t1 VALUES (0,100), (2,100),(2,101),(3,102); INSERT INTO t1 SELECT -seq,-seq from seq_1_to_100; SELECT DISTINCT b FROM t1 WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 );

          The problem was that get_best_group_min_max() did not check if fields used
          by the "group_min_max optimization" where used in sub queries.
          Because of this, it did not detect that a key (b,a) was used in the WHERE
          clause for the statement:
          SELECT DISTINCT b FROM t1 WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 ).

          Fixed by also traversing the sub queries when checking if a field is used.
          This disables group_min_max_optimization for the above query.

          Fix created by Sergei Petrunia

          monty Michael Widenius added a comment - The problem was that get_best_group_min_max() did not check if fields used by the "group_min_max optimization" where used in sub queries. Because of this, it did not detect that a key (b,a) was used in the WHERE clause for the statement: SELECT DISTINCT b FROM t1 WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 ). Fixed by also traversing the sub queries when checking if a field is used. This disables group_min_max_optimization for the above query. Fix created by Sergei Petrunia

          People

            monty Michael Widenius
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.