[MDEV-32351] Significant slowdown for query with many outer joins Created: 2023-10-04  Updated: 2023-11-01  Resolved: 2023-10-28

Status: Closed
Project: MariaDB Server
Component/s: Optimizer, Storage Engine - InnoDB
Affects Version/s: 10.2.38, 10.3.29, 10.4.19, 10.5.10, 10.6.1
Fix Version/s: 10.4.32, 10.5.23, 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3

Type: Bug Priority: Blocker
Reporter: Cyprien Devillez Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: performance, regression
Environment:

Ubuntu 22.04, Docker 24.0.6


Attachments: Text File explain_10237.txt     Text File explain_10418.txt     Text File explain_10419.txt     File mdev-32351.test     File mdev-32351_v2.test     File perf_10.2.37.svg     File perf_10.2.38.svg     File perf_10.4.32.svg     File perf_full_10.4.18_8mn03.svg     File perf_full_10.4.19_3h48mn28.svg     File perf_full_10.4.19_reverted_7mn52s.svg     File trace_10.4.18.json     File trace_10.4.19.json    
Issue Links:
Problem/Incident
is caused by MDEV-21104 Wrong result (extra rows and wrong va... Closed

 Description   

Summary

With MariaDB 10.2.38,10.3.29,10.4.19,10.5.10,10.6.1 and later version, this request (https://github.com/cypx/mariadb-issue/blob/main/request.sql) launched on this database (https://github.com/cypx/mariadb-issue/blob/main/magentodb.sql anonymized tables from a partial export of a Magento 2 database) need 20x more time to finish.
I get consistent test results with my laptop and a cloud VM on many version (including latest) of the 10.2,10.3,10.4,10.5 and 10.6 branch. I don't run any test on 10.7 branch or later because they are not officially supported by the affected software (Magento 2).

Reproduce

On a Linux system with an up-to-date docker install you could clone the following repository https://github.com/cypx/mariadb-issue and launch the ./bench.sh script followed by the MariaDB you want to test (ex: ./bench.sh 10.2.37 )

Example on my laptop (CPU: Intel i7-1065G7, Memory: 16 GB)

$ git clone https://github.com/cypx/mariadb-issue.git
Cloning into 'mariadb-issue'...
remote: Enumerating objects: 15, done.
remote: Total 15 (delta 0), reused 0 (delta 0), pack-reused 15
Receiving objects: 100% (15/15), 59.30 MiB | 22.61 MiB/s, done.
Resolving deltas: 100% (3/3), done.
$ cd mariadb-issue
$ ./bench.sh 10.2.37
[+] Running 14/14
 ✔ mariadb 13 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿]      0B/0B      Pulled                                                                                                                                                                         9.7s
   ✔ 01bf7da0a88c Pull complete                                                                                                                                                                                                     1.1s
   ✔ f3b4a5f15c7a Pull complete                                                                                                                                                                                                     0.3s
   ✔ 57ffbe87baa1 Pull complete                                                                                                                                                                                                     0.3s
   ✔ 7ac3ea3a19e9 Pull complete                                                                                                                                                                                                     0.7s
   ✔ b89d9d23853d Pull complete                                                                                                                                                                                                     0.8s
   ✔ 6512bf94537d Pull complete                                                                                                                                                                                                     1.0s
   ✔ d75afd385b71 Pull complete                                                                                                                                                                                                     1.1s
   ✔ ab1935952eac Pull complete                                                                                                                                                                                                     1.4s
   ✔ 209bb6034896 Pull complete                                                                                                                                                                                                     1.4s
   ✔ a193656db97c Pull complete                                                                                                                                                                                                     1.4s
   ✔ 392397d32761 Pull complete                                                                                                                                                                                                     4.0s
   ✔ 48dab60bfc0f Pull complete                                                                                                                                                                                                     1.7s
   ✔ 76d69b7780e7 Pull complete                                                                                                                                                                                                     1.8s
[+] Running 2/2
 ✔ Network mariadb-issue_default      Created                                                                                                                                                                                       0.1s
 ✔ Container mariadb-issue-mariadb-1  Healthy                                                                                                                                                                                       1.4s
 # MariaDB 10.2.37 - request start: mer. 04 oct. 2023 16:34:10 CEST
 
real    4m7,279s
user    0m0,713s
sys     0m1,516s
 # MariaDB 10.2.37 - request end: mer. 04 oct. 2023 16:38:18 CEST
[+] Running 2/2
 ✔ Container mariadb-issue-mariadb-1  Removed                                                                                                                                                                                       2.6s
 ✔ Network mariadb-issue_default      Removed                                                                                                                                                                                       0.3s
$ ./bench.sh 10.2.38
[+] Running 12/12
 ✔ mariadb 11 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿]      0B/0B      Pulled                                                                                                                                                                          10.9s
   ✔ 25fa05cd42bd Pull complete                                                                                                                                                                                                     1.1s
   ✔ 09d7f527a85d Pull complete                                                                                                                                                                                                     0.4s
   ✔ 21ffc1a82bcc Pull complete                                                                                                                                                                                                     0.5s
   ✔ 499ae120a92c Pull complete                                                                                                                                                                                                     0.9s
   ✔ b1026105733a Pull complete                                                                                                                                                                                                     0.8s
   ✔ f93078b4478c Pull complete                                                                                                                                                                                                     1.3s
   ✔ dafd08f8c025 Pull complete                                                                                                                                                                                                     1.2s
   ✔ 25dd530577f8 Pull complete                                                                                                                                                                                                     1.6s
   ✔ 8ea3241729e6 Pull complete                                                                                                                                                                                                     3.6s
   ✔ f489e9c74b8e Pull complete                                                                                                                                                                                                     1.7s
   ✔ cd3a9c186264 Pull complete                                                                                                                                                                                                     1.9s
[+] Running 2/2
 ✔ Network mariadb-issue_default      Created                                                                                                                                                                                       0.1s
 ✔ Container mariadb-issue-mariadb-1  Healthy                                                                                                                                                                                       2.7s
 # MariaDB 10.2.38 - request start: mer. 04 oct. 2023 16:41:26 CEST
 
real    102m4,558s
user    0m0,977s
sys     0m1,410s
 # MariaDB 10.2.38 - request end: mer. 04 oct. 2023 18:23:30 CEST
[+] Running 2/2
 ✔ Container mariadb-issue-mariadb-1  Removed                                                                                                                                                                                       2.2s
 ✔ Network mariadb-issue_default      Removed                                                                                                                                                                                       0.4s



 Comments   
Comment by Marko Mäkelä [ 2023-10-04 ]

I think that to find the source of the regression, flame graphs produced from perf record samples would be most useful.

Would it be possible for you to upload the flame graphs of the benchmark on MariaDB 10.2.37 and 10.2.38?

Another option might be to compare MariaDB 10.4.18 and 10.4.19 (assuming that the same regression is observable between these releases) and to enable the optimizer trace (MDEV-6111). I would expect the problem to be a worse query execution plan to be chosen, and not something in the InnoDB storage engine.

Comment by Cyprien Devillez [ 2023-10-05 ]

I added FlameGraph to the ticket.
But perf do not resolved the symbols despite I install dbgsym package (and reinstall mariadb package from http://archive.mariadb.org/mariadb-$MARIADB_VERSION/repo/ubuntu/ ).
Do you have an advice to fix that?
I don't have many time today but I try to have a look to the optimizer trace tomorrow.

Comment by Marko Mäkelä [ 2023-10-05 ]

cypx, thank you for trying it. One more thing that could be helpful would be the EXPLAIN output for the SELECT statement in each environment.

danblack, can you help with the debug symbols?

Comment by Cyprien Devillez [ 2023-10-06 ]

Trace and explain added!

Comment by Daniel Black [ 2023-10-07 ]

https://github.com/cypx/mariadb-issue/pull/1

rough explaination.

Comment by Marko Mäkelä [ 2023-10-08 ]

cypx, thank you. I do not see a huge difference between the optimizer trace and the EXPLAIN SELECT output between 10.4.18 and 10.4.19. Did you reproduce the 20× performance difference between those 2 versions?

Comment by Cyprien Devillez [ 2023-10-08 ]

@danblack thank for the explanation, I think understand, I will retry tomorrow.

@marko Yes I reproduce, It took less than 5mn to get the 1.4.18 trace and more than one and half hour to get the 1.4.19 one.
At the end of the trace the attaching_conditions_to_tables have difference for some table (eao, eao2...), could it be significant?

Also, I realize I don't have say that the issue have been initially spotted on baremetal servers running Debian (so it's not related to docker and/or virtualization side effect)

Comment by Marko Mäkelä [ 2023-10-09 ]

cypx, the difference between the end of trace_10.4.18.json and trace_10.4.19.json seems to be that the 10.4.19 output uses the format database_name.table_name.column_name while the 10.4.18 output omits the database_name.

I am not an expert at reading this output, but the difference between the numbers of rows or the costs seems to be rather small. I hope that the flame graphs will shed some light on this. I can’t yet rule out InnoDB (my area of expertise) as the culprit of this regression.

Comment by Cyprien Devillez [ 2023-10-09 ]

I gave up trying to do perf report with symbols on docker because despite Daniel Black advice and some others documentation found on the web it's not working on my environment.
So I set up Ubuntu 20.04 VMs using http://archive.mariadb.org/mariadb-$MARIADB_VERSION/repo/ubuntu/ repository and I get flamegraph with symbols.
The issue has been reproduced, as the VMs is a bit slower it's take 8mn 03s with 10.4.18 and 3hours 48mn 28s with 10.4.18...
I added the new flamegraph to the the issue.

Comment by Daniel Black [ 2023-10-09 ]

__nss_database_lookup stands out, though this appears to be a [false lookup|
https://stackoverflow.com/questions/66762523/why-does-nss-database-lookup-take-most-of-the-time-in-my-numerical-c-program]. Being under JOIN_CACHE::read_all_record_fields, a very small function of functions it called appear in the flamegraph already. So I don't know what is all the time in this sample.

Only a little innodb shows up in 10.4.18 (index read). I failed to find a significant sample in 10.4.19.

Comment by Marko Mäkelä [ 2023-10-10 ]

The function JOIN_CACHE::join_matching_records() stands out in the call graphs. I do not see any significant time being spent in InnoDB. In perf_full_10.4.18_8mn03.svg, JOIN_TAB_SCAN::next() or rr_next() is spending 0.45% of the recorded samples, mostly executing inside InnoDB. In perf_full_10.4.18_8mn03.svg the function JOIN_CACHE::join_matching_records() seems to invoke something else to read the data. According to the search function in Mozilla Firefox, function names containing ha_innobase account for 0.4% of the samples. I could not visually find any such stack trace in the output.

A search

git log -SJOIN_CACHE mariadb-10.4.18..mariadb-10.4.19 sql

identified this change that was supposed to fix MDEV-21104.

Would reverting that change fix the regression?

Comment by Cyprien Devillez [ 2023-10-10 ]

@marko Good catch! I confirm reverting 90780bb5a949d5e27b39df4adaa1d7b1ad98948e on 10.4.19 fix the regression.

On the same VM, it's take 7mn52s to run the query with a reverted 10.4.19 build.
I added the flamegraph to the issue.

Comment by Daniel Black [ 2023-10-10 ]

Raising to blocker priority as the level of regression in the query time is of a very large magnitude and occurring in such a common application like Magento 2 should be given the highest priority possible, especially as the causing commit has been identified.

Comment by Igor Babaev [ 2023-10-10 ]

The patch 90780bb5a949d5e27b39df4adaa1d7b1ad98948e fixed an algorithmic bug in the implementation of BNL for outer joins. That bug led to wrong result set for a class of queries. The cause of the bug were missing actions needed for BNL to be valid for any dataset used by an affected query. It may happens that for other datasets that actions are not needed to get the right result set.

Comment by Igor Babaev [ 2023-10-11 ]

cypx, please could you confirm that the following piece of your query is as it was intended to be:

LEFT JOIN magentodb.eav_attribute_option AS eao3
        ON ea.attribute_id = eao.attribute_id AND ce_varchar_1.value = eao.option_id
        LEFT JOIN magentodb.eav_attribute_option_value eaov_1
        ON eaov_1.option_id = eao.option_id

If it's so then there is no conditions imposed on eao3. Is it what should be?
I'm asking because I have to force the optimizer to produce a plan similar to yours for my simulation test case.

Comment by Cyprien Devillez [ 2023-10-11 ]

@igor I'm not a Magento developper but as far as I understand it's seems intended and the result match the expectation.
Note also that the query is generated by ORM so it could be suboptimal.

Comment by Daniel Black [ 2023-10-13 ]

alice, can I trouble you for a minimised test case of https://github.com/cypx/mariadb-issue

Comment by Alice Sherepa [ 2023-10-16 ]

I was afraid to oversimplify the test and to loose the problem, so please also check the initial test case.
With the test below - the query takes around 0.2s on 10.4.18 vs 3.2s on 10.4.19. Explain/Analyze for the query show the same.
Also with less rows the test will be shorter, but I've left as it is now, because it is easier to see the difference. Attached as mdev-32351.test

| id   | select_type | table        | type   | possible_keys                   | key              | key_len | ref                                          | rows | r_rows  | filtered | r_filtered | Extra                                                        |
+------+-------------+--------------+--------+---------------------------------+------------------+---------+----------------------------------------------+------+---------+----------+------------+--------------------------------------------------------------+
|    1 | SIMPLE      | t1           | index  | NULL                            | attribute_set_id | 2       | NULL                                         | 53   | 53.00   |   100.00 |     100.00 | Using index; Using temporary                                 |
|    1 | SIMPLE      | t2           | ALL    | entity_type_id                  | NULL             | NULL    | NULL                                         | 240  | 240.00  |   100.00 |     100.00 | Using where; Using join buffer (flat, BNL join)              |
|    1 | SIMPLE      | ce_varchar_0 | eq_ref | entity_id,attribute_id,store_id | entity_id        | 8       | test.t1.entity_id,test.t2.attribute_id,const | 1    | 0.00    |   100.00 |     100.00 | Using where                                                  |
|    1 | SIMPLE      | ce_varchar_1 | eq_ref | entity_id,attribute_id,store_id | entity_id        | 8       | test.t1.entity_id,test.t2.attribute_id,const | 1    | 0.00    |   100.00 |     100.00 | Using where                                                  |
|    1 | SIMPLE      | t3           | eq_ref | PRIMARY,attribute_id            | PRIMARY          | 4       | test.ce_varchar_0.value                      | 1    | 0.00    |   100.00 |     100.00 | Using where                                                  |
|    1 | SIMPLE      | eao3         | index  | NULL                            | attribute_id     | 2       | NULL                                         | 2447 | 2447.00 |   100.00 |       0.04 | Using where; Using index; Using join buffer (flat, BNL join) |
+------+-------------+--------------+--------+---------------------------------+------------------+---------+----------------------------------------------+------+---------+----------+------------+--------------------------------------------------------------+
6 rows in set (0.227 sec)

on debug build:
1af855819300ba4e4dd3d582c5c521ff99321152

SHOW PROFILES;
Query_ID	Duration	Query
1	1.67817776	SELECT DISTINCT 1 ...

90780bb5a949d5e27b39df4adaa1d7b1ad98948e

SHOW PROFILES;
Query_ID	Duration	Query
1	20.51277929	SELECT DISTINCT 1 ...

10.4 0c7af6a2a19343cb9d4fedbd7165b8

SHOW PROFILES;
Query_ID	Duration	Query
1	32.21734665	SELECT DISTINCT 1 ..

Comment by Alice Sherepa [ 2023-10-17 ]

--source include/have_sequence.inc 
 
CREATE TABLE t1 (b int NOT NULL, PRIMARY KEY (b));
INSERT INTO t1 select seq from seq_1_to_10000;
 
CREATE TABLE t3 ( c int NOT NULL, PRIMARY KEY (c)) ;
INSERT INTO t3 select seq from seq_1_to_3000;
 
CREATE TABLE t4 (  b int NOT NULL, d varchar(255), KEY (b)) ;
insert into t4 VALUES (1,1),(2,2),(3,3);
 
set session profiling = ON;
 
SELECT  1
FROM t1 
LEFT JOIN t4 ON t1.b = t4.b 
LEFT JOIN t3 ON t4.d = t3.c
LEFT JOIN t3 AS eao3 ON t3.c=1;
 
SHOW PROFILES;
 
set session profiling = OFF;

Comment by Daniel Black [ 2023-10-18 ]

Folks, igor has done a first proof of concept fix. This hasn't been reviewed yet, but if you are able to test this on a non-production environment we welcome the feedback.

The container image: quay.io/mariadb-foundation/mariadb-devel:10.4-mdev-32351-pkgtest (arm64/amd64)

Various packages: https://ci.mariadb.org/39481/

Debian repos can use a repo line:

deb [trusted=yes] https://ci.mariadb.org/39481/amd64-ubuntu-2204-deb-autobake/debs ./

Adjust arch/distro/distro version as need.

RPM autobake directories have a repo file in them that can be used.

Feedback welcome.

Comment by Cyprien Devillez [ 2023-10-18 ]

With 10.4.32 build, the query takes 08mn 24s on the same VM used for previous testing (8mn 03s with 10.4.18 and 3hours 48mn 28s with 10.4.19).
So fhe fix look good to me.
I added 10.4.32 FlameGraph to the issue.

Comment by Daniel Black [ 2023-10-18 ]

Thanks for testing cypx. Good to know the positive case worked. As we arrived at this bug however its making sure all the other queries also perform correctly and fast as well.

Comment by Daniel Black [ 2023-10-20 ]

cypx, I should have also asked, the result was correct for the query?

Comment by Cyprien Devillez [ 2023-10-23 ]

Daniel Black, The result was correct and it was the same between 10.4.18 & 10.4.32 (diff & md5 match).

Comment by Sergei Petrunia [ 2023-10-26 ]

igor, it's a problem that there's no visible test coverage for "Don't unpack the buffer if we have MATCH_IMPOSSIBLE" functionality.

I've backported r_unpack_time_ms patch from 10.6: https://github.com/MariaDB/server/commit/a9cf22654e99b6b40d7eedd6e7443c1a2813da17

After that, one can do with a very small patch to add tests for this: https://github.com/MariaDB/server/commit/5745468d17e6f48d136935329a55cc97090024e1

Both patches are in bb-10.4-mdev32351-tests, on top of your fix. What do you think?

Comment by Sergei Petrunia [ 2023-10-26 ]

... The fix itself is ok, although it's difficult to understand.

Generated at Thu Feb 08 10:30:41 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.