Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30532

Wrong index chosen by the optimizer for ORDER BY

    XMLWordPrintable

Details

    Description

      With the optimizer switch:

      index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=off,condition_pushdown_from_having=on

      and optimizer_use_condition_selectivity set to 1, and task_small1 table loaded from the attachment, we get the following:

      analyze format=json SELECT task0.sys_id
      FROM   task_small1 task0
      WHERE 
          ( task0.sys_created_on >= '2020-01-01 06:00:00'           
          AND task0.sys_created_on <= '2020-08-11 04:59:59' )       AND task0.a_ref_4 = '1d979f'  
             AND task0.sys_class_name = 'c'       ORDER  BY task0.sys_created_on DESC LIMIT  0, 20;
       
      ANALYZE
      {
        "query_block": {
          "select_id": 1,
          "r_loops": 1,
          "r_total_time_ms": 2.8276,
          "table": {
            "table_name": "task0",
            "access_type": "range",
            "possible_keys": [
              "task_index_created",
              "release_task_ref2",
              "task_task_class_created"
            ],
            "key": "task_index_created",
            "key_length": "6",
            "used_key_parts": ["sys_created_on"],
            "r_loops": 1,
            "rows": 35807,
            "r_rows": 2004,
            "r_total_time_ms": 2.7249,
            "filtered": 22.336,
            "r_filtered": 0.998,
            "attached_condition": "task0.a_ref_4 <=> '1d979f' and task0.sys_created_on >= '2020-01-01 06:00:00' and task0.sys_created_on <= '2020-08-11 04:59:59' and task0.a_ref_4 = '1d979f' and task0.sys_class_name = 'c'"
          }
        }
      

      It should be using task_task_class_created, which is more specific and still can be used for ORDER BY. I have debugged it with optimizer tracing and in gdb, and got to the point that test_if_skip_sort_order() for some reason is getting called with release_task_ref2 key in tab->ref.key. I attempted to recreate the bug artificially with hand-crafted tables and generated data, but was not successful. The case worked properly choosing the more specific key, and test_if_skip_sort_order was getting called with tab->ref.key set to the more specific key, equivalent of task_task_class_created in this case.

      In this example, the query still runs fast in spite of using the wrong key, but this is a trimmed/obfuscated production data. On the actual production data the choice of the wrong key results in a factor of 100 slower performance.

      Attachments

        1. task_small1_dump.sql
          4.98 MB
        2. good.test.xz
          641 kB
        3. bad.test.xz
          845 kB
        4. bad.test.txz
          844 kB

        Issue Links

          Activity

            People

              rob.schwyzer@mariadb.com Rob Schwyzer
              spachev Sasha Pachev
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.