[MDEV-37732] TPROC-H, no stats: Query 21 is much slower in 11.4 than in 10.11 - Jira

XML

Word

Printable

Details

Type: Bug
Status: Stalled (View Workflow)
Priority: Major
Resolution: Unresolved
Affects Version/s: 11.4
Fix Version/s: 11.4, 11.8
Component/s: Optimizer
Labels:
- performance

Sprint:
Q4/2025 Server Development

Description

Analyzing Q21 from MDEV-36861.

explain

select

  supplier.S_NAME AS s_name,

  count(0) AS numwait

from

  supplier join

  lineitem l1 join

  orders join

  nation

where

  supplier.S_SUPPKEY = l1.L_SUPPKEY and

  orders.O_ORDERKEY = l1.L_ORDERKEY and

  orders.O_ORDERSTATUS = 'F' and

  l1.L_RECEIPTDATE > l1.L_COMMITDATE and

  exists(select 1

         from lineitem l2

         where

           l2.L_ORDERKEY = l1.L_ORDERKEY and

           l2.L_SUPPKEY <> l1.L_SUPPKEY

           limit 1)

and

  not exists (select 1

              from lineitem l3

              where

                l3.L_ORDERKEY = l1.L_ORDERKEY and

                l3.L_SUPPKEY <> l1.L_SUPPKEY and

                l3.L_RECEIPTDATE > l3.L_COMMITDATE

                limit 1)

and

  supplier.S_NATIONKEY = nation.N_NATIONKEY and

  nation.N_NAME = 'IRAN'

group by

  supplier.S_NAME

order by

  count(0) desc, supplier.S_NAME

limit 100;

The plan in 10.11:

+------+--------------------+----------+--------+-------------------------------------+---------+---------+------------------------------+---------+----------------------------------------------+

| id   | select_type        | table    | type   | possible_keys                       | key     | key_len | ref                          | rows    | Extra                                        |

+------+--------------------+----------+--------+-------------------------------------+---------+---------+------------------------------+---------+----------------------------------------------+

|    1 | PRIMARY            | orders   | ALL    | PRIMARY                             | NULL    | NULL    | NULL                         | 1486929 | Using where; Using temporary; Using filesort |

|    1 | PRIMARY            | l1       | ref    | PRIMARY,IDX_LINEITEM_ORDERKEY_FKIDX | PRIMARY | 4       | dbt3sf1.orders.o_orderkey    | 1       | Using where                                  |

|    1 | PRIMARY            | supplier | eq_ref | PRIMARY,SUPPLIER_NATION_FKIDX       | PRIMARY | 4       | dbt3sf1.l1.l_suppkey         | 1       | Using where                                  |

|    1 | PRIMARY            | nation   | eq_ref | PRIMARY                             | PRIMARY | 4       | dbt3sf1.supplier.s_nationkey | 1       | Using where                                  |

|    1 | PRIMARY            | l2       | ref    | PRIMARY,IDX_LINEITEM_ORDERKEY_FKIDX | PRIMARY | 4       | dbt3sf1.orders.o_orderkey    | 1       | Using where; FirstMatch(nation)              |

|    3 | DEPENDENT SUBQUERY | l3       | ref    | PRIMARY,IDX_LINEITEM_ORDERKEY_FKIDX | PRIMARY | 4       | dbt3sf1.l1.l_orderkey        | 1       | Using where                                  |

+------+--------------------+----------+--------+-------------------------------------+---------+---------+------------------------------+---------+----------------------------------------------+

The plan in 11.4:

+------+--------------------+----------+--------+-------------------------------------+--------------------------+---------+------------------------------+---------+---------------------------------------------------------------+

| id   | select_type        | table    | type   | possible_keys                       | key                      | key_len | ref                          | rows    | Extra                                                         |

+------+--------------------+----------+--------+-------------------------------------+--------------------------+---------+------------------------------+---------+---------------------------------------------------------------+

|    1 | PRIMARY            | l2       | index  | PRIMARY,IDX_LINEITEM_ORDERKEY_FKIDX | LINEITEM_PART_SUPP_FKIDX | 10      | NULL                         | 5768377 | Using index; Start temporary; Using temporary; Using filesort |

|    1 | PRIMARY            | l1       | ref    | PRIMARY,IDX_LINEITEM_ORDERKEY_FKIDX | PRIMARY                  | 4       | dbt3sf1.l2.l_orderkey        | 3       | Using where; End temporary                                    |

|    1 | PRIMARY            | supplier | eq_ref | PRIMARY,SUPPLIER_NATION_FKIDX       | PRIMARY                  | 4       | dbt3sf1.l1.l_suppkey         | 1       | Using where                                                   |

|    1 | PRIMARY            | orders   | eq_ref | PRIMARY                             | PRIMARY                  | 4       | dbt3sf1.l2.l_orderkey        | 1       | Using where                                                   |

|    1 | PRIMARY            | nation   | eq_ref | PRIMARY                             | PRIMARY                  | 4       | dbt3sf1.supplier.s_nationkey | 1       | Using where                                                   |

|    3 | DEPENDENT SUBQUERY | l3       | ref    | PRIMARY,IDX_LINEITEM_ORDERKEY_FKIDX | PRIMARY                  | 4       | dbt3sf1.l1.l_orderkey        | 3       | Using where                                                   |

+------+--------------------+----------+--------+-------------------------------------+--------------------------+---------+------------------------------+---------+---------------------------------------------------------------+

Problem 1: Duplicate Weedout Produces Bad OutRows

In both servers, when considering DuplicateWeedout for the prefix:

idx=1, join->positions[] = {l2, l1}

We get this:
11.4: first_weedout_table_rec_count = 1
10.11: prefix_rec_count=1.

(gdb) print sj_outer_fanout

  $108 = 3

// 10.11, it's 1

(gdb) p sj_inner_fanout

  $109 = 5768377

optimize_semi_joins() ends up producing:
pos->prefix_record_count= 1 // in 10.11
pos->prefix_record_count= 3 // in 11.4

Both are incorrect...

The effect of this

In MDEV-36861, 11.4 uses join order of l2, l1, supplier. supplier has:

               "loops": 3,

              "r_loops": 202493,

loops=3 means the rest of the join's cost is multiplied by a small number which makes it looks very good.

How did 10.11 work?

But why does 10.11 NOT pick DuplicateWeedout then?

Looking at numbers in Duplicate_weedout_picker::check_qep In 11.4:

l2:

  read_time= 853.36323136500005

  records_out=5768377

l1:

  read_time= 7490.6316488899993

  records_out=3

DuplicateWeedout:

  dups_cost= 8343.9948802549989

  write_cost = 0.025482910000000001

  full_lookup_cost = 2785.6069370700002

Numbers in Duplicate_weedout_picker::check_qep in 10.11:

l2:

  read_time= 56320

  records_read= 5 782 432

  current_fanout / TIME_FOR_COMPARE = 1 156 486.4

l1:

  read_time= 5 802 760.9

  records_read= 1

  current_fanout / TIME_FOR_COMPARE = 1 156 486.4

DuplicateWeedout:

   write_cost= = 289 121.6

full_lookup_cost= 1 671 825 991 731.2 // !!!!!!!!

This comes from:

      double full_lookup_cost=

               COST_MULT(join->positions[first_tab].prefix_record_count,

                         COST_MULT(sj_outer_fanout,

                                   sj_inner_fanout * one_lookup_cost));

the multiplicands are:

(gdb) p join->positions[first_tab].prefix_record_count

  $58 = 5782432

(gdb) print sj_outer_fanout

  $59 = 1

(gdb) print sj_inner_fanout

  $60 = 5782432

(gdb) print one_lookup_cost

  $61 = 0.050000000000000003

So the number of rows read from the first table is multiplied twice...
Denote this as 10.11-Duplicate-Weedout-Temptable-Very-Expensive.

Conclusions so far

So, in 10.11 we had

Duplicate Weedout Produces Bad OutRows
10.11-Duplicate-Weedout-Temptable-Very-Expensive.

which seem to have canceled each other out.

11.4 doesn't have 10.11-Duplicate-Weedout-Temptable-Very-Expensive, so we hit the Duplicate Weedout Produces Bad OutRows.

Solution for Bad OutRows problem

Problem details

Current code just walks through the join prefix and computes

sj_inner_fanout : fanout produced by inner tables
sj_outer_fanout :fanout produced by outer tables

Consider a query

select * from customer where  user.name in (select orders.customer from  orders where  correlated_cond)

and a join order of

orders , full scan.  rows=1M

customer , ref access.  rows=1 as there's one customer per order.

The used approach will generate:

sj_inner_fanout=1M

sj_outer_fanout=1

and then the optimizer will think there's 1 row left after the subquery duplicates are removed.

Semi-joining with subquery tables may produce duplicate rows.
It may also reduce the number of rows due to non-matches.
Example: a subquery that is a non-match for most row combinations:

select * from customer

where

  user.name in (select orders.customer from orders where orders.amount > 1M )

we're interested in computing the ratio of duplicate rows.

Proposed solution

Analyze the subquery' select list and see how often it would produce duplicates.
For non-correlated subqueries, it is easy

 ... IN (SELECT col

             FROM ....

Compute the estimated number of records in the subquery output;
Estimate the number of distinct values of select list.
Divide the first by the second.
We've done something similar in ~~MDEV-30877~~ for GROUP BY clause.

For correlated subqueries, use a similar approach:
Estimate output cardinality of a subquery ignoring the correlations.
That is, for

 SELECT col ... FROM it1, it2 ... itN WHERE  subquery_cond and correlation_cond

build a join prefix of all subquery tables:

itX, itY  ... itZ

get its output cardinality.
Estimate the number of distinct values of select list.
Divide the first by the second.

Evaluation

Applying the above for the subquery in the example:

  exists(select 1

         from lineitem l2

         where

           l2.L_ORDERKEY = l1.L_ORDERKEY and

           l2.L_SUPPKEY <> l1.L_SUPPKEY

           limit 1)

Looking at the IN form

  l1.L_ORDERKEY IN (select l2.L_ORDERKEY

                    from lineitem l2

                    where l2.L_SUPPKEY <> l1.L_SUPPKEY)

gives rec_per_key(lineitem.L_ORDERKEY) as output fanout.
This ignores the l2.L_SUPPKEY <> l1.L_SUPPKEY condition...

Possible issues

Note that optimize_semijoin_nests() also has a piece of code compute "number of records in the output of the materialized subquery after de-duplication". Which uses a different logic from opt_group_by_cardinality.cc ...

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

psergey-fix-fanout-variant2-v2.diff
10 kB
2025-10-01 18:11
psergey-fix-fanout-variant2.diff
8 kB
2025-09-30 18:00

Issue Links

relates to

MDEV-36861 TPC-H Query 21 takes 10X longer on 11.4 compared to 10.11

In Progress

Activity

People

Assignee:: Sergei Petrunia

Reporter:: Sergei Petrunia

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 2025-09-24 15:44

Updated:: 2025-10-24 11:40

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.