[MDEV-25020] SELECT if there is IN clause with binary UUID in binary form is extremely slow since MariaDB 10.5.9 Created: 2021-03-01  Updated: 2023-10-24  Resolved: 2022-08-01

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.5.9, 10.5.13
Fix Version/s: 10.5.17, 10.6.9, 10.7.5, 10.8.4

Type: Bug Priority: Blocker
Reporter: Ján Forgáč Assignee: Sergei Petrunia
Resolution: Fixed Votes: 2
Labels: performance
Environment:

Linux


Attachments: Text File log_fast.log     Text File log_slow.log     PNG File screenshot-1.png    
Issue Links:
Problem/Incident
causes MDEV-29242 Assertion `computed_weight == weight'... Closed
is caused by MDEV-9750 Quick memory exhaustion with 'extende... Closed
Relates
relates to MDEV-27530 InnoDB - Performance issues after upg... Closed
relates to MDEV-28518 After update to 10.5 a lot of time is... Closed
relates to MDEV-28638 ANALYZE FORMAT=JSON should print time... Closed

 Description   

table with primery key binary(16)

`SELECT from table WHERE key IN (binary form ids, ....)` is extremly slow

while

`SELECT from table WHERE key IN ((0x11e6a2ad166fc342abde002590a2360a),
(0x11e6a2ad16750370aeaf002590a2360a),
(0x11e6a2ad167a186a9e19002590a2360a),....)`
is perfectly fast.

IDs in binary form I mean this: https://tinyurl.com/ycfyftby



 Comments   
Comment by Sergei Petrunia [ 2021-06-09 ]

I was unable to reproduce. Can you share more details:

  • What collation is used in your connection to the database? ( select collation('abc'); will show it)
  • EXPLAINs for both queries
  • ideally, do this:

    set optimizer_trace=1;
    {run the query}
    select * from information_schema.optimizer_trace;
    

    and share the output with us.

Comment by Sergei Petrunia [ 2021-06-09 ]

Closing as "Cannot reproduce" for now. Feel free to re-open when more data is available.

Comment by Ján Forgáč [ 2021-06-10 ]

Rapid performance drop is obvious if in IN (....) are thousands or ten of thousands items.

  • 'collation("abc")' => 'utf8mb4_general_ci'
  • fast trace log: log_fast.log – not many iems in IN query so log is shorter
  • slow trace log: log_slow.log – slightly different query, but uses UUIDS in IN clause + not many iems in IN query so log is shorter + binary UUIDs are not properly represented in the text log ...

Are these outputs enough for identifiing the problem?

I think I do not have privilege to reopen this issue. Do I?

Comment by Sergei Petrunia [ 2021-06-15 ]

Re-opening

Comment by Sergei Petrunia [ 2022-05-16 ]

Analyzing the data in the provided files.
10.1_test_result, 10.5_test_result:

        10.5-reg        10.5-binary     10.1-reg    10.1-binary
    10  0.396300086     0.165200131     2.3458       2.0851
   100  2.345800193     1.408499915     11.937       0.1236
  1000  49.52399843     11.65679959     134.73       0.9971
 10000  488.1753836     113.1179959     1209.7       9.2002
100000  771.0653742     179.5780646     1694         14.288

Query plans:

       10.5-regular  10.5-binary  10.1-regular  10.1-binary:
    10  range        range         range        range
   100  range        range         range        range
  1000  subquery     range         range        range
 10000  subquery     range         range        range
100000  subquery     range         range        range

Comment by Sergei Petrunia [ 2022-05-16 ]

Putting that on a chart:

That is, for lookups on binary field, 10.1 was faster than 10.2.
For non-binary (regular) field lookup, 10.5 is faster than 10.2. Note that 10.5 switches to using a subquery plan.

Comment by Wei Jiang [ 2022-05-27 ]

I have observed the slowness even in 10.5.8 but the symptoms wasn't there on 10.5.4.

Comment by Sergei Golubchik [ 2022-07-27 ]

explain takes much more time in 10.5 as compared to 10.1 (140 sec vs 0.1 sec, ymmv).

99.66% of the time 10.5 spends in SEL_ARG::update_weight_locally()

With

+  if (param->thd->variables.optimizer_max_sel_arg_weight)
    key1->update_weight_locally();

explain in 10.5 runs with the same speed as in 10.1

Comment by Sergei Petrunia [ 2022-07-27 ]

Indeed, the call to update_weight_locally() is the problem.

update_weight_locally() traverses the whole SEL_ARG tree ("locally" means it doesn't walk across next_key_part edges).

Lets consider a call

key_or(tree1, tree2) 

where tree1 has a lot of SEL_ARG nodes,
and tree2 is a tree that has just one SEL_ARG() node.

The code in key_or() will use RB-Tree descend to find the location in tree1 where tree2 should be put in (grep for key1->find_range(key2) call). This takes O(LOG(N)). Adding O(N) in update_weight_locally() will indeed cause a big slowdown.

Comment by Sergei Petrunia [ 2022-07-27 ]

Note that SEL_ARG::tree_delete() and SEL_ARG::insert() do update the value of SEL_ARG::weight.

key_or() also messes with SEL_ARG nodes directly, when it joins trees with SEL_ARG objects representing identical or overlapping intervals. I've went through the code and it looks like neither of these direct SEL_ARG modifications change the weight of the tree.

Comment by Sergei Petrunia [ 2022-07-27 ]

This patch passes the tests:

diff --git a/sql/opt_range.cc b/sql/opt_range.cc
index e3287e1bbea..e469070c8f6 100644
--- a/sql/opt_range.cc
+++ b/sql/opt_range.cc
@@ -10206,6 +10206,9 @@ SEL_ARG *key_and_with_limit(RANGE_OPT_PARAM *param, uint keyno,
    ( 1  <  kp1 <= 2 AND 1 < kp2 < 10 ) OR
    ( 2  <  kp1 < 10 AND 1 < kp2 < 20 ) OR
    ( 10 <= kp1 < 20 AND 4 < kp2 < 20 )
+
+   Note: There is no need to manually update key1->weight when joining it with
+   key2. Every operation we do on SEL_ARG tree key1 maintains its weight.
 */
 static SEL_ARG *
 key_or(RANGE_OPT_PARAM *param, SEL_ARG *key1,SEL_ARG *key2)
@@ -10828,9 +10831,6 @@ key_or(RANGE_OPT_PARAM *param, SEL_ARG *key1,SEL_ARG *key2)
   }
   key1->use_count++;
 
-  /* Re-compute the result tree's weight. */
-  key1->update_weight_locally();
-
   key1->max_part_no= max_part_no;
   return key1;
 }

The only remaining place that uses update_weight_locally() is and_all_keys(). One can't just remove the update_weight_locally() call from there.

Comment by Sergei Petrunia [ 2022-07-27 ]

https://github.com/MariaDB/server/commit/d17ed38893fa0213575813975212c8f4227c9972

Comment by Sergei Petrunia [ 2022-08-01 ]

Addressed review input by Monty and pushed into 10.5

Generated at Thu Feb 08 09:34:30 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.