[MDEV-7946] Optimizer merges derived table in case it would be better to keep it Created: 2015-04-09  Updated: 2023-11-28

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.5, 10.0, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11
Fix Version/s: 10.4, 10.5, 10.6, 10.11

Type: Bug Priority: Major
Reporter: Jiri Kavalik Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 1
Labels: optimizer, outer-join, verified
Environment:

Ubuntu 12.04, CentOS 5.11


Attachments: File derived_merge_test.sql.gz     File mdev-7946-1.diff    

 Description   

When derived_merge optimization is ON, server optimizes away derived table even in case when resulting join is much slower than materializing derived table and joining to it. On production database it is difference between 0.37s without merge and 11.85s with merge on warmed up caches and buffer pool, but slow log shows even cases with 5-20 minutes.

I prepared a test case with simpler structure (will upload dump). I use group by primary to make subquery not-mergeable - it does not change results in this case.

explain
SELECT a.handle,
       Sum(IF(TYPE IS NULL AND created >= '2015-04-07 06:00:00' AND created < '2015-04-08 06:00:00',1,0)) AS `count`,
       Sum(IF(TYPE IS NULL AND created >= '2015-04-07 06:00:00' AND created < '2015-04-08 06:00:00',val,0)) AS `amount`,
       Sum(IF(TYPE IS NULL AND created >= '2015-03-31 22:00:00' AND created < '2015-04-08 06:00:00',1,0)) AS `count_month`,
       Sum(IF(TYPE IS NULL AND created >= '2015-03-31 22:00:00' AND created < '2015-04-08 06:00:00',val,0)) AS `amount_month`,
       Sum(IF(collected IS NOT NULL AND collected >= '2015-03-31 22:00:00' AND collected < '2015-04-08 06:00:00',-val,0)) AS `payout_month`
FROM a AS a
LEFT JOIN
  (SELECT b.`type`,b.`created`, b.`val`, b.`collected`, b.`a_id`
   FROM b AS b
   WHERE ((created >= '2015-04-07 06:00:00' AND created < '2015-04-08 06:00:00')
          OR (created >= '2015-03-31 22:00:00' AND created < '2015-04-08 06:00:00')
          OR (collected IS NOT NULL AND collected >= '2015-03-31 22:00:00' AND collected < '2015-04-08 06:00:00'))
  ) AS `T` ON T.a_id = a.id
WHERE (a.handle <> 1000)
GROUP BY a.handle;

+------+-------------+-------+-------+------------------------+--------+---------+-------------------------+------+--------------------------+
| id   | select_type | table | type  | possible_keys          | key    | key_len | ref                     | rows | Extra                    |
+------+-------------+-------+-------+------------------------+--------+---------+-------------------------+------+--------------------------+
|    1 | SIMPLE      | a     | range | handle                 | handle | 5       | NULL                    |  399 | Using where; Using index |
|    1 | SIMPLE      | b     | ref   | a_id,created,collected | a_id   | 5       | derived_merge_test.a.id |  365 | Using where              |
+------+-------------+-------+-------+------------------------+--------+---------+-------------------------+------+--------------------------+
RUN:
399 rows in set (0.41 sec)

explain
SELECT a.handle,
       Sum(IF(TYPE IS NULL AND created >= '2015-04-07 06:00:00' AND created < '2015-04-08 06:00:00',1,0)) AS `count`,
       Sum(IF(TYPE IS NULL AND created >= '2015-04-07 06:00:00' AND created < '2015-04-08 06:00:00',val,0)) AS `amount`,
       Sum(IF(TYPE IS NULL AND created >= '2015-03-31 22:00:00' AND created < '2015-04-08 06:00:00',1,0)) AS `count_month`,
       Sum(IF(TYPE IS NULL  AND created >= '2015-03-31 22:00:00' AND created < '2015-04-08 06:00:00',val,0)) AS `amount_month`,
       Sum(IF(collected IS NOT NULL AND collected >= '2015-03-31 22:00:00'  AND collected < '2015-04-08 06:00:00',-val,0)) AS `payout_month`
FROM a AS a
LEFT JOIN
  (SELECT b.`type`, b.`created`, b.`val`, b.`collected`, b.`a_id`
   FROM b AS b
   WHERE ((created >= '2015-04-07 06:00:00' AND created < '2015-04-08 06:00:00')
          OR (created >= '2015-03-31 22:00:00' AND created < '2015-04-08 06:00:00')
          OR (collected IS NOT NULL AND collected >= '2015-03-31 22:00:00' AND collected < '2015-04-08 06:00:00'))
   GROUP BY id) AS `T` ON T.a_id = a.id
WHERE (a.handle <> 1000)
GROUP BY a.handle;

+------+-------------+------------+-------------+-------------------+-------------------+---------+-------------------------+------+------------------------------------------------------------------+
| id   | select_type | table      | type        | possible_keys     | key               | key_len | ref                     | rows | Extra                                                            |
+------+-------------+------------+-------------+-------------------+-------------------+---------+-------------------------+------+------------------------------------------------------------------+
|    1 | PRIMARY     | a          | range       | handle            | handle            | 5       | NULL                    |  399 | Using where; Using index                                         |
|    1 | PRIMARY     | <derived2> | ref         | key0              | key0              | 5       | derived_merge_test.a.id |   10 |                                                                  |
|    2 | DERIVED     | b          | index_merge | created,collected | created,collected | 8,9     | NULL                    | 3918 | Using sort_union(created,collected); Using where; Using filesort |
+------+-------------+------------+-------------+-------------------+-------------------+---------+-------------------------+------+------------------------------------------------------------------+
RUN:
399 rows in set (0.05 sec)



 Comments   
Comment by Elena Stepanova [ 2015-04-14 ]

Thanks for the report and the test case. Reproducible as described.
The easier way to avoid derived_merge (both for the test case and as a workaround for the problem) is to set optimizer_switch='derived_merge=off'.

Comment by Sergei Petrunia [ 2015-06-14 ]

ANALYZE output from 10.1 (the query plans are the same, so I'm using 10.1 to be able to use ANALYZE-stmt command):

Merged:

{
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 5853.4,
    "table": {
      "table_name": "a",
      "access_type": "range",
      "possible_keys": ["handle"],
      "key": "handle",
      "key_length": "5",
      "used_key_parts": ["handle"],
      "r_loops": 1,
      "rows": 399,
      "r_rows": 399,
      "r_total_time_ms": 1.5039,
      "filtered": 100,
      "r_filtered": 100,
      "attached_condition": "(a.handle <> 1000)",
      "using_index": true
    },
    "table": {
      "table_name": "b",
      "access_type": "ref",
      "possible_keys": ["a_id", "created", "collected"],
      "key": "a_id",
      "key_length": "5",
      "used_key_parts": ["a_id"],
      "ref": ["j5.a.id"],
      "r_loops": 399,
      "rows": 458,
      "r_rows": 493.75,
      "r_total_time_ms": 5604.1,
      "filtered": 100,
      "r_filtered": 1.9324,
      "attached_condition": "trigcond((((b.created >= '2015-04-07 06:00:00') and (b.created < '2015-04-08 06:00:00')) or 
((b.created >= '2015-03-31 22:00:00') and (b.created < '2015-04-08 06:00:00')) or ((b.collected is not null) and 
(b.collected >= '2015-03-31 22:00:00') and (b.collected < '2015-04-08 06:00:00'))))"
    }
  }
}

Non-merged case:

{
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 83.951,
    "table": {
      "table_name": "a",
      "access_type": "range",
      "possible_keys": ["handle"],
      "key": "handle",
      "key_length": "5",
      "used_key_parts": ["handle"],
      "r_loops": 1,
      "rows": 399,
      "r_rows": 399,
      "r_total_time_ms": 1.6208,
      "filtered": 100,
      "r_filtered": 100,
      "attached_condition": "(a.handle <> 1000)",
      "using_index": true
    },
    "table": {
      "table_name": "<derived2>",
      "access_type": "ref",
      "possible_keys": ["key0"],
      "key": "key0",
      "key_length": "5",
      "used_key_parts": ["a_id"],
      "ref": ["j5.a.id"],
      "r_loops": 399,
      "rows": 10,
      "r_rows": 9.5414,
      "r_total_time_ms": 3.9563,
      "filtered": 100,
      "r_filtered": 100,
      "materialized": {
        "query_block": {
          "select_id": 2,
          "r_loops": 1,
          "r_total_time_ms": 67.639,
          "table": {
            "table_name": "b",
            "access_type": "index_merge",
            "possible_keys": ["created", "collected"],
            "key_length": "8,9",
            "index_merge": {
              "sort_union": {
                "range": {
                  "key": "created",
                  "used_key_parts": ["created"]
                },
                "range": {
                  "key": "collected",
                  "used_key_parts": ["collected"]
                }
              }
            },
            "r_loops": 1,
            "rows": 3918,
            "r_rows": 3875,
            "r_total_time_ms": 46.42,
            "filtered": 100,
            "r_filtered": 100,
            "attached_condition": "(((b.created >= '2015-04-07 06:00:00') and (b.created < '2015-04-08 06:00:00')) 
or ((b.created >= '2015-03-31 22:00:00') and (b.created < '2015-04-08 06:00:00')) or ((b.collected is not null) and
(b.collected >= '2015-03-31 22:00:00') and (b.collected < '2015-04-08 06:00:00')))"
          }
        }
      }
    }
  }
}

Comment by Sergei Petrunia [ 2015-06-14 ]

Important parts of ANALYZE output:

Merged query:

table a {
  r_total_time_ms= 1.5039
}
table b {
  "r_loops": 399,
  "r_total_time_ms": 5604.1,
}

Non-merged query:

table a {
  "r_total_time_ms": 1.6208,
}
 
table "<derived2>" {
  "r_loops": 399,
  "r_total_time_ms": 3.9563,
   ...
   materialized { 
    table b { 
      "rows": 3918,
      "r_rows": 3875,
      "r_total_time_ms": 46.42,
    }
}

rows ~= r_rows anywhere. The expensive part is nested-loops join with table b. Individual lookups are not expensive: 5604.1 / 399. = 14 ms, but it adds up.

Comment by Sergei Petrunia [ 2015-06-14 ]

Debugging, I see that the second table does not have a quick select when derived_merge=on. This happens, because range optimizer is not invoked for it. This branch in make_join_statistics:

    /*
      Perform range analysis if there are keys it could use (1). 
      Don't do range analysis if we're on the inner side of an outer join (2).
      Do range analysis if we're on the inner side of a semi-join (3).
      Don't do range analysis for materialized subqueries (4).
      Don't do range analysis for materialized derived tables (5)
    */
    if ((!s->const_keys.is_clear_all() ||
	 !bitmap_is_clear_all(&s->table->cond_set)) &&              // (1)
        (!s->table->pos_in_table_list->embedding ||                 // (2)
         (s->table->pos_in_table_list->embedding &&                 // (3)
          s->table->pos_in_table_list->embedding->sj_on_expr)) &&   // (3)
        !s->table->is_filled_at_execution() &&                      // (4)
        !(s->table->pos_in_table_list->derived &&                   // (5)
          s->table->pos_in_table_list->is_materialized_derived()))  // (5)

is not taken, because

(gdb) p s->table->pos_in_table_list->embedding
  $72 = (TABLE_LIST *) 0x7fff8c0109a8

(all other parts are satisfied)

Comment by Sergei Petrunia [ 2015-06-14 ]

A smaller example:

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
 
create table one_k(a int);
insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
 
create table t2 (a int, b int, c int, key(a), key(b));
 
insert into t2 
select 
  A.a,
  A.a + B.a* 10 + C.a*100+D.a*1000,
  A.a + B.a* 10 + C.a*100+D.a*1000
from ten A, ten B, ten C, ten D;
analyze table t2;

explain select * 
from 
  ten left join (select a,b,c from t2 where b<20) X on ten.a=X.c;
+------+-------------+-------+------+---------------+------+---------+------+-------+-------------------------------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra                                           |
+------+-------------+-------+------+---------------+------+---------+------+-------+-------------------------------------------------+
|    1 | SIMPLE      | ten   | ALL  | NULL          | NULL | NULL    | NULL |    10 |                                                 |
|    1 | SIMPLE      | t2    | ALL  | b             | NULL | NULL    | NULL | 10557 | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+------+---------------+------+---------+------+-------+-------------------------------------------------+

Let's try the inner join part separately:

MariaDB [j5a]> explain select a,b,c from t2 where b<20;
+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |
+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
|    1 | SIMPLE      | t2    | range | b             | b    | 5       | NULL |   19 | Using index condition |
+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

Comment by Sergei Petrunia [ 2015-06-16 ]

Made a patch that makes range optimizer to be invoked for the inner sides of
outer joins. It has a gap (we don't do 100% of what we could do when the
range optimizer returns "impossible where"), but it passes the test suite.

With this patch, the "smaller example" is fixed:

MariaDB [j5a]> explain select * 
    -> from 
    ->   ten left join (select a,b,c from t2 where b<20) X on ten.a=X.c;
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------------------------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                           |
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------------------------------------------+
|    1 | SIMPLE      | ten   | ALL   | NULL          | NULL | NULL    | NULL |   10 |                                                 |
|    1 | SIMPLE      | t2    | range | b             | b    | 5       | NULL |   19 | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------------------------------------------+

Comment by Sergei Petrunia [ 2015-06-16 ]

... but the bug reporter's example is not. It still picks a bad plan (one can now get the good query plan by putting FROM b AS b ignore index (a_id) into the subquery, though).

The reason that good query plan is not picked is in best_access_path, here:

  if ((records >= s->found_records || best > s->read_time) &&            // (1)
  ....
  {
    if (s->quick)
    {
      /*
        For each record we:
        - read record range through 'quick'
        - skip rows which does not satisfy WHERE constraints
        TODO: 
        We take into account possible use of join cache for ALL/index
        access (see first else-branch below), but we don't take it into 
        account here for range/index_merge access. Find out why this is so.
      */
      tmp= record_count *
        (s->quick->read_time +
         (s->found_records - rnd_records)/(double) TIME_FOR_COMPARE);

Note the above comment. For some reason, this ages-old code does not take into account join buffering when quick select is present.
Because of that, ref access wins with this plan:

+------+-------------+-------+-------+------------------------+--------+---------+---------+------+--------------------------+
| id   | select_type | table | type  | possible_keys          | key    | key_len | ref     | rows | Extra                    |
+------+-------------+-------+-------+------------------------+--------+---------+---------+------+--------------------------+
|    1 | SIMPLE      | a     | range | handle                 | handle | 5       | NULL    |  399 | Using where; Using index |
|    1 | SIMPLE      | b     | ref   | a_id,created,collected | a_id   | 5       | j5.a.id | 1113 | Using where              |
+------+-------------+-------+-------+------------------------+--------+---------+---------+------+--------------------------+

only with ignore index (a_id) I can get this plan:

+------+-------------+-------+-------------+-------------------+-------------------+---------+------+------+--------------------------------------------------------------------------------------+
| id   | select_type | table | type        | possible_keys     | key               | key_len | ref  | rows | Extra                                                                                |
+------+-------------+-------+-------------+-------------------+-------------------+---------+------+------+--------------------------------------------------------------------------------------+
|    1 | SIMPLE      | a     | range       | handle            | handle            | 5       | NULL |  399 | Using where; Using index; Using temporary; Using filesort                            |
|    1 | SIMPLE      | b     | index_merge | created,collected | created,collected | 8,9     | NULL | 3918 | Using sort_union(created,collected); Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+-------------+-------------------+-------------------+---------+------+------+--------------------------------------------------------------------------------------+

Comment by Sergei Petrunia [ 2015-06-16 ]

... tried to make a quick fix to solve the above:

@@ -6259,9 +6285,21 @@ double matching_candidates_in_table(JOIN_TAB *s, bool with_found_constraint,
         access (see first else-branch below), but we don't take it into 
         account here for range/index_merge access. Find out why this is so.
       */
-      tmp= record_count *
-        (s->quick->read_time +
-         (s->found_records - rnd_records)/(double) TIME_FOR_COMPARE);
+      if (disable_jbuf)
+      { 
+        tmp= record_count *
+          (s->quick->read_time +
+           (s->found_records - rnd_records)/(double) TIME_FOR_COMPARE);
+      }
+      else
+      {
+        tmp=s->quick->read_time;
+
+        tmp*= (1.0 + floor((double) cache_record_length(join,idx) *
+                           record_count /
+                           (double) thd->variables.join_buff_size));
+        tmp+= (s->quick->records - rnd_records)/(double) TIME_FOR_COMPARE;
+      }

It didn't help. Investigating why, looking at
best_access_path( record_count=399, idx=1, ...):

ref access plan:

rows= key_info->rec_per_key[...]= 381
tmp= read_time(key, 1, 381) = 382
tmp *=record_count // = 152,418

best_time= tmp + records/(double) TIME_FOR_COMPARE;
best_time= 152494.2

range (actually index merge) access plan:

s->quick->read_time=4880.7
s->quick->records= 3918

multiplication of tmp due to join buffering = 1 (everything will fit into jbuf)

tmp+= (s->quick->records - rnd_records)/(double) TIME_FOR_COMPARE;
tmp=5076

rnd_records=2939

Comparison of ref vs range(i.e index_merge):

tmp + record_count/(double) TIME_FOR_COMPARE*rnd_records = 239608.70

best + record_count/(double) TIME_FOR_COMPARE*records= 182821.79

That is, cost formulas are such that ref access wins.

Comment by Sergei Petrunia [ 2015-06-16 ]

Interesting points:

best_time= tmp + records/(double) TIME_FOR_COMPARE;

If we add cost of WHERE condition, why is it multiplied by 'records'? Shouldn't it be multiplied by records*record_count ? .

97% of the cost of quick select plan (234532.19) comes from the cost of checking WHERE:

 record_count/(double) TIME_FOR_COMPARE*records

Two issues here: First, 1/TIME_FOR_COMPARE is very expensive as "cost of checking a condition". Second, does this mean that we should take possibility to use "hash join" into account?

Generated at Thu Feb 08 07:23:29 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.