[MDEV-29172] Performance degradation with sub queries in MariaDB 10.5 and later Created: 2022-07-26  Updated: 2023-04-27

Status: Open
Project: MariaDB Server
Component/s: Optimizer, Storage Engine - InnoDB
Affects Version/s: 10.5.16, 10.6.8, 10.7.4, 10.8.3
Fix Version/s: 10.5, 10.6

Type: Bug Priority: Major
Reporter: James Coleman Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: None
Environment:

CentOS Linux release 7.9.2009 (Core)
Linux james-test 3.10.0-1160.71.1.el7.x86_64 #1 SMP Tue Jun 28 15:37:28 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Virtual Machine


Attachments: File mariadb10_5_8_optimizer_trace.json     File query.sql     File test.csv    

 Description   

In investigating a report of an issue with Magento 2.4.2 Enterprise Edition having queries take upwards of 2 minutes to process that used to only take a few milliseconds in MariaDB 10.3, we have discovered a major performance degradation between MariaDB 10.4 and the later releases of MariaDB.

Looking at the process list for MariaDB, the queries appear to be in statistics state:

MariaDB [(none)]> show processlist;
+--------+---------+-----------------+---------+---------+------+--------------+------------------------------------------------------------------------------------------------------+----------+
| Id     | User    | Host            | db      | Command | Time | State        | Info                                                                                                 | Progress |
+--------+---------+-----------------+---------+---------+------+--------------+------------------------------------------------------------------------------------------------------+----------+
| 557804 | mage2ee | 127.0.0.1:48967 | mage2ee | Query   |   15 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 557805 | mage2ee | 127.0.0.1:48969 | mage2ee | Query   |   76 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 558123 | mage2ee | 127.0.0.1:52219 | mage2ee | Query   |    8 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 558148 | mage2ee | 127.0.0.1:52685 | mage2ee | Query   |  136 | Sending data | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 558265 | mage2ee | 127.0.0.1:54449 | mage2ee | Query   |   30 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 558524 | mage2ee | 127.0.0.1:57229 | mage2ee | Query   |  127 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 558559 | mage2ee | 127.0.0.1:57549 | mage2ee | Query   |   12 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 558746 | mage2ee | 127.0.0.1:59659 | mage2ee | Query   |  144 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 559094 | mage2ee | 127.0.0.1:63031 | mage2ee | Query   |   42 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 559732 | mage2ee | 127.0.0.1:26585 | mage2ee | Query   |   40 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 559799 | mage2ee | 127.0.0.1:27695 | mage2ee | Query   |   13 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 559828 | mage2ee | 127.0.0.1:28109 | mage2ee | Query   |   24 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 559900 | mage2ee | 127.0.0.1:28501 | mage2ee | Query   |   22 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 560829 | mage2ee | 127.0.0.1:35231 | mage2ee | Query   |   27 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 561263 | mage2ee | 127.0.0.1:38325 | mage2ee | Query   |   30 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 561522 | mage2ee | 127.0.0.1:39811 | mage2ee | Query   |   48 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 561727 | mage2ee | 127.0.0.1:40905 | mage2ee | Query   |   88 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 561780 | mage2ee | 127.0.0.1:41085 | mage2ee | Query   |   23 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 562695 | mage2ee | 127.0.0.1:46377 | mage2ee | Query   |   53 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 562885 | mage2ee | 127.0.0.1:47351 | mage2ee | Query   |   14 | Statistics   | SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `eav`.`entity_id`, `eav`.`value`, `so |    0.000 |
| 566453 | root    | localhost       | NULL    | Query   |    0 | starting     | show processlist                                                                                     |    0.000 |
+--------+---------+-----------------+---------+---------+------+--------------+------------------------------------------------------------------------------------------------------+----------+

In examining this issue, I have been able to reproduce consistently in a virtual machine with CentOS 7 installed and have tested MariaDB 10.5.16, 10.6.8, 10.7.4, and 10.8.3 to all suffer for this problem. To reproduce you just need a table with the InnoDB engine and some keys, adding more keys seems to cause the performance to worsen.

create_table.sql

DROP TABLE IF EXISTS test_table;
CREATE TABLE test_table (
    `entity_id` int(10) unsigned NOT NULL,
    `value` int(10) unsigned NOT NULL,
    PRIMARY KEY (`entity_id`,`value`),
    KEY `TEST_TABLE_VALUE` (`value`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The test query which reproduces the performance issues is similar to the following, I have attached the test query I am running in my testing as a file as it is large.

SELECT `sub`.`value`, COUNT(sub.value) AS `count` FROM (SELECT `test`.`entity_id`, `test`.`value` FROM `test_table` AS `test` WHERE test.entity_id IN (...)) AS `sub` GROUP BY `sub`.`value`;

Each test, I have used the following commands to reinstall MariaDB with the version I am testing with:

cat <<EOF > /etc/yum.repos.d/mariadb.repo
# MariaDB CentOS repository list
# http://downloads.mariadb.org/mariadb/repositories/
[mariadb]
name=MariaDB
baseurl=http://yum.mariadb.org/10.4/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
enabled=1
EOF
 
yum clean all
yum -y remove 'MariaDB-*'
yum -y remove 'galera-*'
rm -Rf /var/lib/mysql
yum -y --enablerepo=mariadb install MariaDB-server
 
systemctl start mariadb

MariaDB 10.4 results:

[root@james-test ~]# mysql test
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.4.25-MariaDB MariaDB Server
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [test]> source create_table.sql
Query OK, 0 rows affected, 1 warning (0.001 sec)
 
Query OK, 0 rows affected (0.013 sec)
 
MariaDB [test]> source query.sql
Empty set (0.061 sec)
 
MariaDB [test]> source query.sql
Empty set (0.057 sec)
 
MariaDB [test]>

MariaDB 10.5 results:

[root@james-test ~]# mysql test
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.5.16-MariaDB MariaDB Server
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [test]> source create_table.sql
Query OK, 0 rows affected, 1 warning (0.001 sec)
 
Query OK, 0 rows affected (0.010 sec)
 
MariaDB [test]> source query.sql
Empty set (7.881 sec)
 
MariaDB [test]> source query.sql
Empty set (7.628 sec)

In-case it isn't obvious, this is testing on an empty table in an fresh install of MariaDB without customization to the my.cnf. We have tried the following options in multiple states without affect on the issue.

optimizer_switch=rowid_filter=off
optimizer_use_condition_selectivity=1
optimizer_switch=optimize_join_buffer_size=off
use_stat_tables=NEVER
optimizer_search_depth=0



 Comments   
Comment by James Coleman [ 2022-07-30 ]

Hello,

I did some more research on this problem and found that it was introduced in MariaDB 10.5.9, and does not exist in version 10.5.8 or prior. I ran an optimizer trace between 10.5.8 and 10.5.9, but they show no difference:

root@bunt:~# mysql test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 16
Server version: 10.5.8-MariaDB Source distribution
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [test]> SET optimizer_trace='enabled=on';
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> source query.sql
Empty set (0.076 sec)
 
MariaDB [test]> SET optimizer_trace='enabled=off';
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE INTO OUTFILE 'mariadb1058.csv';
Query OK, 1 row affected, 1 warning (0.040 sec)

mariadb10_5_8_optimizer_trace.json

root@bunt:~# mysql test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 20
Server version: 10.5.9-MariaDB Source distribution
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [test]> SET optimizer_trace='enabled=on';
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> source query.sql
Empty set (7.367 sec)
 
MariaDB [test]> SET optimizer_trace='enabled=off';
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE INTO OUTFILE 'mariadb1059.csv';
Query OK, 1 row affected, 1 warning (0.043 sec)
 
MariaDB [test]> Bye
root@bunt:~# diff /var/lib/mysql/test/mariadb1058.csv /var/lib/mysql/test/mariadb1059.csv
root@bunt:~# md5sum /var/lib/mysql/test/mariadb1058.csv /var/lib/mysql/test/mariadb1059.csv
9130b0e27e29ce1ecb2c0c315dbb345e  /var/lib/mysql/test/mariadb1058.csv
9130b0e27e29ce1ecb2c0c315dbb345e  /var/lib/mysql/test/mariadb1059.csv

As they are the same output, I don't see how it would be useful, but I have uploaded it anyway in-case it is of some use.

Comment by James Coleman [ 2022-08-01 ]

Upon further review, a colleague found the responsible commit to be https://github.com/MariaDB/server/commit/c36720388d598ca3aa1c4d2dab2266656c528b50 and we found that setting the optimizer_max_sel_arg_weight from 32000 down to 20 to solve our problem with Magento.

Comment by James Coleman [ 2022-08-03 ]

Hello @Sergei, I did a bit of performance testing and we were wondering if you had any recommendations on a general setting for this option. From looking at the numbers, we've seen where lower weight values do degrade performance of some queries so I just performed a performance test with an average of 30 runs of the query we've had issues with against a clone of the database it was ran against. From a quick review of the numbers, it looks to start slowing down around a weight of 400 so I'm thinking an ok compromise may be a weight of 512.

I have attached the results of my testing with the duration column being in milliseconds (77630166 is 0.0776 seconds basically) averaged of 30 query runs.

Generated at Thu Feb 08 10:06:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.