Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21626

Optimizer misses the details about the picked join order

Details

    Description

      Take the DBT-3 dataset, make sure you have these indexes:

      CREATE INDEX i_o_orderdate ON orders (o_orderdate);
      CREATE INDEX i_o_custkey ON orders (o_custkey);
      

      Run these two queries. The queries are the same, except for the ORDER BY list.
      However this difference causes different join orders to be picked:

      MariaDB [dbt3]> explain select * from orders,customer where o_custkey=c_custkey order by o_orderdate limit 10;
      +------+-------------+----------+--------+---------------+---------------+---------+-----------------------+------+-------------+
      | id   | select_type | table    | type   | possible_keys | key           | key_len | ref                   | rows | Extra       |
      +------+-------------+----------+--------+---------------+---------------+---------+-----------------------+------+-------------+
      |    1 | SIMPLE      | orders   | index  | i_o_custkey   | i_o_orderdate | 4       | NULL                  | 10   | Using where |
      |    1 | SIMPLE      | customer | eq_ref | PRIMARY       | PRIMARY       | 4       | dbt3.orders.o_custkey | 1    |             |
      +------+-------------+----------+--------+---------------+---------------+---------+-----------------------+------+-------------+
      

      MariaDB [dbt3]> explain select * from orders,customer where o_custkey=c_custkey order by c_acctbal limit 10; 
      +------+-------------+----------+-------+---------------+-------------+---------+-------------------------+------+-------+
      | id   | select_type | table    | type  | possible_keys | key         | key_len | ref                     | rows | Extra |
      +------+-------------+----------+-------+---------------+-------------+---------+-------------------------+------+-------+
      |    1 | SIMPLE      | customer | index | PRIMARY       | c_acctbal   | 9       | NULL                    | 1    |       |
      |    1 | SIMPLE      | orders   | ref   | i_o_custkey   | i_o_custkey | 5       | dbt3.customer.c_custkey | 15   |       |
      +------+-------------+----------+-------+---------------+-------------+---------+-------------------------+------+-------+
      

      However, looking at the optimizer trace, one can see that the join optimization parts are identical:

       MariaDB [dbt3]> select * from information_schema.optimizer_trace\G
       *************************** 1. row ***************************
      -                            QUERY: explain select * from orders,customer where o_custkey=c_custkey order by o_orderdate limit 10
      +                            QUERY: explain select * from orders,customer where o_custkey=c_custkey order by c_acctbal limit 10
                                   TRACE: {
         "steps": [
           {
      @@ -18,7 +17,7 @@
               "select_id": 1,
               "steps": [
                 {
      -            "expanded_query": ...
      +            "expanded_query": ...
                 } 
               ] 
             } 
      @@ -177,15 +176,15 @@
                 },
                 {
                   "attaching_conditions_to_tables": {
      -              "original_condition": "customer.c_custkey = orders.o_custkey",
      +              "original_condition": "orders.o_custkey = customer.c_custkey",
                     "attached_conditions_computation": [],
                     "attached_conditions_summary": [
                       {
      -                  "table": "orders",
      -                  "attached": "orders.o_custkey is not null"
      +                  "table": "customer",
      +                  "attached": null
                       },
      

      The first difference is in "attaching_conditions_to_tables" ?

      Attachments

        Activity

          psergei Sergei Petrunia created issue -
          psergei Sergei Petrunia made changes -
          Field Original Value New Value
          Affects Version/s 10.5 [ 23123 ]
          Affects Version/s 10.4 [ 22408 ]

          The test was done on the mdev8306-orig branch... were there any improvements to the trace since then?

          psergei Sergei Petrunia added a comment - The test was done on the mdev8306-orig branch... were there any improvements to the trace since then?
          varun Varun Gupta (Inactive) made changes -
          Comment [ [~psergey] I don't quite understand the problem, are you saying the best_join_order is not shown in the optimizer trace? ]
          psergei Sergei Petrunia made changes -
          Labels optimizer_trace
          elenst Elena Stepanova made changes -
          Fix Version/s 10.5 [ 23123 ]

          The issue here is the cost of adding a temp table is done when the first table doesn't resolve the ORDER BY clause

                { 
                  if (join->sort_by_table &&
                      join->sort_by_table !=
                      join->positions[join->const_tables].table->table)
                    /*
                       We may have to make a temp table, note that this is only a
                       heuristic since we cannot know for sure at this point.
                       Hence it may be wrong.
                    */
                    current_read_time= COST_ADD(current_read_time, current_record_count);
          

          So I guess it would be ok to add this cost to the trace.

          varun Varun Gupta (Inactive) added a comment - The issue here is the cost of adding a temp table is done when the first table doesn't resolve the ORDER BY clause { if ( join ->sort_by_table && join ->sort_by_table != join ->positions[ join ->const_tables]. table -> table ) /* We may have to make a temp table, note that this is only a heuristic since we cannot know for sure at this point. Hence it may be wrong. */ current_read_time= COST_ADD(current_read_time, current_record_count); So I guess it would be ok to add this cost to the trace.
          varun Varun Gupta (Inactive) made changes -
          Status Open [ 1 ] In Progress [ 3 ]

          Discussed this with psergey, it would be fine to add cost of sorting to 10.4 (the one seen in the code snippet) and we can improve this in 10.6 (with MDEV-8306)

          varun Varun Gupta (Inactive) added a comment - Discussed this with psergey , it would be fine to add cost of sorting to 10.4 (the one seen in the code snippet) and we can improve this in 10.6 (with MDEV-8306 )
          psergei Sergei Petrunia made changes -
          Priority Major [ 3 ] Minor [ 4 ]
          varun Varun Gupta (Inactive) added a comment - Patch https://github.com/MariaDB/server/commit/6404645980db51fdc1e5dae2ac94eca57804284b
          varun Varun Gupta (Inactive) made changes -
          Assignee Varun Gupta [ varun ] Sergei Petrunia [ psergey ]
          Status In Progress [ 3 ] In Review [ 10002 ]

          Ok to push

          psergei Sergei Petrunia added a comment - Ok to push
          psergei Sergei Petrunia made changes -
          Assignee Sergei Petrunia [ psergey ] Varun Gupta [ varun ]
          Status In Review [ 10002 ] Stalled [ 10000 ]
          varun Varun Gupta (Inactive) made changes -
          Fix Version/s 10.5.4 [ 24264 ]
          Fix Version/s 10.4.14 [ 24305 ]
          Fix Version/s 10.5 [ 23123 ]
          Resolution Fixed [ 1 ]
          Status Stalled [ 10000 ] Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 103584 ] MariaDB v4 [ 157276 ]

          People

            varun Varun Gupta (Inactive)
            psergei Sergei Petrunia
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.