Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17425

Improve range optimizer to stop evaluating cost using index dive when reaching the cost of any competing EQU_REF range

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Open (View Workflow)
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 10.1.24
    • Fix Version/s: 10.1
    • Component/s: Optimizer
    • Labels:
      None
    • Environment:
      GNU/Linux

      Description

      The following query pattern:
      SELECT c1 FROM t WHERE i_eq_ref IN (c1....c1000) as R1 and i_ref IN (c1...c200) as R2
      on an InnoDB table with around 200 million rows, i_eq_ref being the primary key and i_ref a secondary key with only barely more than 200 possible values (application-level access group identifiers, present on each row, so providing 200 values means the user is member of most groups, and they can see about the totality of the 200 million rows) makes query planner spend a lot of time, and ends up being using the largest amount of overall query execution time: EXPLAIN on a cold cache takes ~2 seconds, and actually running the query takes nearly the same amount of time (so query optimiser ends up picking the correct plan, it just takes too long to do so).

      I would like to thank Stephane Varoqui for his huge help in getting me to understand this, and for providing me with a template on how this should be reported, as it is still quite a bit above my head. Hopefully it will all make sense to mariadb developers.

      Cost R1 is cap by the cost of 1000 rows.

      Cost of R2 have minimum constant cost of the size of the range 200(index dive cost) + SUM(index dives result).

      For each index dive the reduction should be cancelled when it reach R1 cost if not the range can consume possibly in the worth case 200 IO on disk and end up picking the R1 anyway.

      Back porting https://dev.mysql.com/worklog/task/?id=7170 could also help if index does not stay in memory the cost to add 200 index dive on disk is worth the cost of 1000 pk lookup.

      How to reproduce (I stripped several columns which are not involved in the query, i_eq_ref is "uid" and i_ref is "security_uid"):

      > show create table catalog\G
      Create Table: CREATE TABLE `catalog` (
        `uid` bigint(20) unsigned NOT NULL,
        `security_uid` int(10) unsigned DEFAULT NULL,
        `agent_security_uid` int(10) unsigned DEFAULT NULL,
        `viewable_owner` varbinary(255) NOT NULL DEFAULT '',
        `viewable_associate` varbinary(255) NOT NULL DEFAULT '',
        `path` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT ''
        PRIMARY KEY (`uid`),
        KEY `security_uid` (`security_uid`),
        KEY `viewable_owner` (`viewable_owner`),
        KEY `viewable_associate` (`viewable_associate`),
        KEY `Path` (`path`),
        KEY `agent_security_uid` (`agent_security_uid`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
      

      > show index from catalog\G
      *************************** 1. row ***************************
              Table: catalog
         Non_unique: 0
           Key_name: PRIMARY
       Seq_in_index: 1
        Column_name: uid
          Collation: A
        Cardinality: 212787844
           Sub_part: NULL
             Packed: NULL
               Null:
         Index_type: BTREE
            Comment:
      Index_comment:
      *************************** 2. row ***************************
              Table: catalog
         Non_unique: 1
           Key_name: security_uid
       Seq_in_index: 1
        Column_name: security_uid
          Collation: A
        Cardinality: 30148
           Sub_part: NULL
             Packed: NULL
               Null: YES
         Index_type: BTREE
            Comment:
      Index_comment:
      *************************** 4. row ***************************
              Table: catalog
         Non_unique: 1
           Key_name: viewable_owner
       Seq_in_index: 1
        Column_name: viewable_owner
          Collation: A
        Cardinality: 95980
           Sub_part: NULL
             Packed: NULL
               Null:
         Index_type: BTREE
            Comment:
      Index_comment:
      *************************** 5. row ***************************
              Table: catalog
         Non_unique: 1
           Key_name: viewable_associate
       Seq_in_index: 1
        Column_name: viewable_associate
          Collation: A
        Cardinality: 1364024
           Sub_part: NULL
             Packed: NULL
               Null:
         Index_type: BTREE
            Comment:
      Index_comment:
      *************************** 7. row ***************************
              Table: catalog
         Non_unique: 1
           Key_name: Path
       Seq_in_index: 1
        Column_name: path
          Collation: A
        Cardinality: 212787844
           Sub_part: NULL
             Packed: NULL
               Null:
         Index_type: BTREE
            Comment:
      Index_comment:
      *************************** 25. row ***************************
              Table: catalog
         Non_unique: 1
           Key_name: agent_security_uid
       Seq_in_index: 1
        Column_name: agent_security_uid
          Collation: A
        Cardinality: 2
           Sub_part: NULL
             Packed: NULL
               Null: YES
         Index_type: BTREE
            Comment:
      Index_comment:
      

      All the timeings below are with a hot cache (to be fair to each measure), but in reality this query will virtually always run on cold cache: it is scanning a pre-determined set of uid values, so each query will pull a fresh set of rows. With around 30 such queries running concurrently, "explain" and execution time increase to around 20 seconds (io starvation ?).

      Explain without "force index":

      +------+-------------+-------------------------------------------------+-------+---------------------------------------------------------------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
      | id   | select_type | table                                           | type  | possible_keys                                                             | key     | key_len | ref                                                                       | rows | Extra                        |
      +------+-------------+-------------------------------------------------+-------+---------------------------------------------------------------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
      |    1 | SIMPLE      | catalog                                         | range | PRIMARY,security_uid,viewable_owner,viewable_associate,agent_security_uid | PRIMARY | 8       | NULL                                                                      | 1000 | Using where; Using temporary |
      |    1 | SIMPLE      | related_destination_section_role_uid_category   | ref   | PRIMARY,Membership                                                        | PRIMARY | 8       | foobarbazprod2.catalog.uid                                                |    2 | Using where; Using index     |
      |    1 | SIMPLE      | related_destination_section_role_uid_1_category | ref   | PRIMARY                                                                   | PRIMARY | 8       | foobarbazprod2.related_destination_section_role_uid_category.category_uid |    2 | Using where; Using index     |
      |    1 | SIMPLE      | related_strict_specialise_uid_1_category        | ref   | PRIMARY                                                                   | PRIMARY | 8       | foobarbazprod2.catalog.uid                                                |    2 | Using where; Using index     |
      +------+-------------+-------------------------------------------------+-------+---------------------------------------------------------------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
      4 rows in set (0.77 sec)
      

      Select without "force index": 1.22 sec

      Explain with "force index (PRIMARY)", which is the column the optimiser ended up picking above:
      +------+-------------+-------------------------------------------------+-------+--------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
      | id   | select_type | table                                           | type  | possible_keys      | key     | key_len | ref                                                                       | rows | Extra                        |
      +------+-------------+-------------------------------------------------+-------+--------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
      |    1 | SIMPLE      | catalog                                         | range | PRIMARY            | PRIMARY | 8       | NULL                                                                      | 1000 | Using where; Using temporary |
      |    1 | SIMPLE      | related_destination_section_role_uid_category   | ref   | PRIMARY,Membership | PRIMARY | 8       | foobarbazprod2.catalog.uid                                                |    2 | Using where; Using index     |
      |    1 | SIMPLE      | related_destination_section_role_uid_1_category | ref   | PRIMARY            | PRIMARY | 8       | foobarbazprod2.related_destination_section_role_uid_category.category_uid |    2 | Using where; Using index     |
      |    1 | SIMPLE      | related_strict_specialise_uid_1_category        | ref   | PRIMARY            | PRIMARY | 8       | foobarbazprod2.catalog.uid                                                |    2 | Using where; Using index     |
      +------+-------------+-------------------------------------------------+-------+--------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
      4 rows in set (0.00 sec)
      

      Select with "force index (PRIMARY)": 0.04 sec

      [EDIT]: damn, formating is hard here

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              psergei Sergei Petrunia
              Reporter:
              vpelletier_nxd Vincent Pelletier
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

                Dates

                Created:
                Updated:

                  Git Integration