Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-35180

ref_to_range rewrite causes poor query plan

Details

    Description

      Consider a testcase:

      create table t1 (a int);
      insert into t1 select seq from seq_1_to_100;
       
      create table t2 (
        kp1 int,
        kp2 int,
        filler char(100),
        key(kp1, kp2)
      );
      insert into t2
      select 
        seq, seq,
        'filler-data'
      from seq_1_to_10000;
      

      Suppose table t2 has

      • a ref(non-const) access
      • a range access that uses more keyparts than ref:

      Query Q1:

      explain
      select * 
      from t1, t2 
      where 
       t2.kp1=t1.a and t2.kp1<=100 and t2.kp2<=20
      

      The query plan is:

      +------+-------------+-------+-------+---------------+------+---------+------+------+------------------------------------------------------------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                                                  |
      +------+-------------+-------+-------+---------------+------+---------+------+------+------------------------------------------------------------------------+
      |    1 | SIMPLE      | t1    | ALL   | NULL          | NULL | NULL    | NULL | 100  | Using where                                                            |
      |    1 | SIMPLE      | t2    | range | kp1           | kp1  | 10      | NULL | 999  | Using index condition; Using where; Using join buffer (flat, BNL join) |
      +------+-------------+-------+-------+---------------+------+---------+------+------+------------------------------------------------------------------------+
      

      One can get a much better if one prevents the optimizer from constructing range access with two keyparts:
      Query Q2:

      explain
      select *
      from t1, t2
      where
        t2.kp1=t1.a and t2.kp1<=1000 and t2.kp2+1<=20; -- make kp2 unusable for range.
      

      +------+-------------+-------+------+---------------+------+---------+-----------+------+-----------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref       | rows | Extra                 |
      +------+-------------+-------+------+---------------+------+---------+-----------+------+-----------------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL      | 100  | Using where           |
      |    1 | SIMPLE      | t2    | ref  | kp1           | kp1  | 5       | test.t1.a | 1    | Using index condition |
      +------+-------------+-------+------+---------------+------+---------+-----------+------+-----------------------+
      

      Examining the optimizer trace for Q1, one can see that the optimizer picks ref access for table t2, but then the rewrite makes it range access:

                    "attached_conditions_computation": [
                      {
                        "ref_to_range": true,
                        "cause": "range uses longer key"
                      }
                    ],
      

      Attachments

        Activity

          commit 3a38d99cbd03893f9be0b6d14fcfe1d98b89e395 (HEAD -> bb-10.6-MDEV-35180, origin/bb-10.6-MDEV-35180)
          Author: Sergei Petrunia <sergey@mariadb.com>
          Date:   Wed Oct 16 18:35:37 2024 +0300
           
              MDEV-35180: ref_to_range rewrite causes poor query plan
              
              (Variant 2: only allow rewrite for ref(const))
          

          psergei Sergei Petrunia added a comment - commit 3a38d99cbd03893f9be0b6d14fcfe1d98b89e395 (HEAD -> bb-10.6-MDEV-35180, origin/bb-10.6-MDEV-35180) Author: Sergei Petrunia <sergey@mariadb.com> Date: Wed Oct 16 18:35:37 2024 +0300   MDEV-35180: ref_to_range rewrite causes poor query plan (Variant 2: only allow rewrite for ref(const))

          Ok to push. Looks good

          monty Michael Widenius added a comment - Ok to push. Looks good

          For the release notes:
          The optimizer could pick a poor query plan if the considered join order
          allowed for some table A. a ref access, and B. a range access
          over the same index which used more key parts, but scanned a much
          wider range. In this case, ref_to_range rewrite would pick the range
          access despite it having a much higher execution time than ref.

          Fixed by also requiring that ref is ref(const). In this case, one can
          prove that range access would read a subset of rows that ref would read.

          psergei Sergei Petrunia added a comment - For the release notes: The optimizer could pick a poor query plan if the considered join order allowed for some table A. a ref access, and B. a range access over the same index which used more key parts, but scanned a much wider range. In this case, ref_to_range rewrite would pick the range access despite it having a much higher execution time than ref. Fixed by also requiring that ref is ref(const). In this case, one can prove that range access would read a subset of rows that ref would read.

          People

            psergei Sergei Petrunia
            psergei Sergei Petrunia
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.