[MDEV-29129] Performance regression starting in 10.6: unlimited "select order by limit" always using temporary, taking between 60x and 2000x longer in 10.6, 10.7, and 10.8 than in 10.5 Created: 2022-07-19  Updated: 2023-05-19  Resolved: 2022-12-03

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.6.8, 10.7.4, 10.8.3
Fix Version/s: 10.11.2, 10.6.12, 10.7.8, 10.8.7, 10.9.5, 10.10.3

Type: Bug Priority: Critical
Reporter: Juan Assignee: Sergei Petrunia
Resolution: Fixed Votes: 4
Labels: regression, regression-10.6, regression-10.7, regression-10.8
Environment:

Rocky 8.6
Intel i7-9750H (1) @ 2.591GHz 1G RAM
NVMe 493MiB/s 31.5k IOPS


Attachments: File cs0412805-20220718-simplified-issue-reproduction.sql    
Issue Links:
Blocks
Problem/Incident
is caused by MDEV-24089 support oracle syntax: rownum Closed

 Description   

Attached file reproduces: a simple select using no where clause but using limit, with order-by on an unindexed attribute, against a table with a little over 500,000 rows, takes about 0.150s in 10.5.15, while in 10.6.8, 10.7.4, and 10.8.3 the same query consistently takes about 10s. When an index is added, the difference is much more dramatic. 10.5 drops from 0.150s to 0.004s, while 10.6, 10.7, and 10.8 all remain unchanged at 10s.

The explain plan shows that 10.6-10.8 always create a temporary table in addition to using filesort (or an index), while 10.5 just uses filesort in the first case and index in the second w/o needing to create a temporary table in either case. Increasing sort_buffer_size up to 512M makes no difference. One significant change is that the metadata (system) character set in 10.5 is utf8, whereas starting with 10.6 it's utf8mb3.

MariaDB [test]> show create table TEST_Table\G
*************************** 1. row ***************************
       Table: TEST_Table
Create Table: CREATE TABLE `TEST_Table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `categoryLookupId` int(11) DEFAULT NULL,
  `date` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `aap_category` (`categoryLookupId`),
  KEY `TEST_Table_date_cat` (`date`,`categoryLookupId`)
) ENGINE=InnoDB AUTO_INCREMENT=599441 DEFAULT CHARSET=utf8mb3
1 row in set (0.001 sec)

select date, `TEST_RETURN_LOOKUP_VALUE`(`categoryLookupId`) AS `propertyCategory` from `TEST_Table` order by date desc limit 25;

Unindexed 10.5 explain:

+------+-------------+------------+------+---------------+------+---------+------+--------+----------------+
| id   | select_type | table      | type | possible_keys | key  | key_len | ref  | rows   | Extra          |
+------+-------------+------------+------+---------------+------+---------+------+--------+----------------+
|    1 | SIMPLE      | TEST_Table | ALL  | NULL          | NULL | NULL    | NULL | 598694 | Using filesort |
+------+-------------+------------+------+---------------+------+---------+------+--------+----------------+

Unindexed 10.6, 10.7, 10.8 explain

+------+-------------+------------+------+---------------+------+---------+------+------+---------------------------------+
| id   | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+------+-------------+------------+------+---------------+------+---------+------+------+---------------------------------+
|    1 | SIMPLE      | TEST_Table | ALL  | NULL          | NULL | NULL    | NULL | 1    | Using temporary; Using filesort |
+------+-------------+------------+------+---------------+------+---------+------+------+---------------------------------+

Typical indexed 10.5 performance

MariaDB [test]> select date, `TEST_RETURN_LOOKUP_VALUE`(`categoryLookupId`) AS `propertyCategory` from `TEST_Table` order by date desc limit 5;
+---------------------+----------------------------------+
| date                | propertyCategory                 |
+---------------------+----------------------------------+
| 2022-07-18 19:49:09 | 74968a188eb4f4dcc022669eff344f0f |
| 2022-07-18 19:49:09 | 1a110af0481e6221693ec1ac8f9ebee6 |
| 2022-07-18 19:49:04 | 74968a188eb4f4dcc022669eff344f0f |
| 2022-07-18 19:49:01 | b323588d8d57ba607870ccf1df312b5c |
| 2022-07-18 19:48:57 | 822b38cf00f26ad6cdc0c59f5b0f6016 |
+---------------------+----------------------------------+
5 rows in set (0.002 sec)

Typical unindexed 10.5 performance

MariaDB [test]> select date, `TEST_RETURN_LOOKUP_VALUE`(`categoryLookupId`) AS `propertyCategory` from `TEST_Table` order by date desc limit 5;
+---------------------+----------------------------------+
| date                | propertyCategory                 |
+---------------------+----------------------------------+
| 2022-07-18 19:49:09 | 1a110af0481e6221693ec1ac8f9ebee6 |
| 2022-07-18 19:49:09 | 74968a188eb4f4dcc022669eff344f0f |
| 2022-07-18 19:49:04 | 74968a188eb4f4dcc022669eff344f0f |
| 2022-07-18 19:49:01 | b323588d8d57ba607870ccf1df312b5c |
| 2022-07-18 19:48:57 | 822b38cf00f26ad6cdc0c59f5b0f6016 |
+---------------------+----------------------------------+
5 rows in set (0.104 sec)

Typical indexed 10.6 through 10.8 performance

MariaDB [test]> select date, `TEST_RETURN_LOOKUP_VALUE`(`categoryLookupId`) AS `propertyCategory` from `TEST_Table` order by date desc limit 5;
+---------------------+----------------------------------+
| date                | propertyCategory                 |
+---------------------+----------------------------------+
| 2022-07-18 19:49:05 | 47fddf54db605db008ba78df7ab6c02e |
| 2022-07-18 19:49:05 | 19736280721c57a798064fad686b684a |
| 2022-07-18 19:49:04 | 19736280721c57a798064fad686b684a |
| 2022-07-18 19:49:03 | 47fddf54db605db008ba78df7ab6c02e |
| 2022-07-18 19:49:00 | 5c884e67b699c62af82394735b4dfe1a |
+---------------------+----------------------------------+
5 rows in set (9.803 sec)

Typical unindexed 10.6 through 10.8 performance

MariaDB [test]> select date, `TEST_RETURN_LOOKUP_VALUE`(`categoryLookupId`) AS `propertyCategory` from `TEST_Table` order by date desc limit 5;
+---------------------+----------------------------------+
| date                | propertyCategory                 |
+---------------------+----------------------------------+
| 2022-07-18 19:49:05 | 47fddf54db605db008ba78df7ab6c02e |
| 2022-07-18 19:49:05 | 19736280721c57a798064fad686b684a |
| 2022-07-18 19:49:04 | 19736280721c57a798064fad686b684a |
| 2022-07-18 19:49:03 | 47fddf54db605db008ba78df7ab6c02e |
| 2022-07-18 19:49:00 | 5c884e67b699c62af82394735b4dfe1a |
+---------------------+----------------------------------+
5 rows in set (9.681 sec)



 Comments   
Comment by Sergei Petrunia [ 2022-09-14 ]

Ok, the query is using a stored routine call:

select 
  date, 
  `TEST_RETURN_LOOKUP_VALUE`(`categoryLookupId`) AS `propertyCategory` 
from `TEST_Table` 
  order by date desc limit 25;

I added

set @calls=@calls+1;

into the TEST_RETURN_LOOKUP_VALUE function code.

On 10.9, after running the query, I get this:

+--------+
| @calls |
+--------+
| 554293 |
+--------+

... while on 10.5, I get this:

MariaDB [test]> select @calls;
+--------+
| @calls |
+--------+
|      5 |
+--------+

Comment by Oleg Smirnov [ 2022-09-14 ]

The change was introduced with
commit be093c81a7955aa8309e650b0a0c44a5afe113fb
Author: Monty <monty@mariadb.org>
Date: Mon Jan 4 17:03:16 2021 +0200

MDEV-24089 support oracle syntax: rownum

The workaround for this issue is to declare the function as DETERMINISTIC:

DELIMITER //
CREATE FUNCTION `TEST_RETURN_LOOKUP_VALUE`(LOOKUPID_IN INT) RETURNS varchar(255) CHARSET utf8
DETERMINISTIC
BEGIN
DECLARE LOOKUP_VALUE VARCHAR(255);
 
SET LOOKUP_VALUE = (SELECT `value` FROM `TEST_Lookup` WHERE `lookupId` = LOOKUPID_IN);
 
RETURN LOOKUP_VALUE;
END;
//
DELIMITER ;

We are currently not sure if it's a bug or not, will discuss it with the team.

Comment by Sergei Petrunia [ 2022-09-15 ]

Takeaways from the discussion at the SQL Processor Call:

It looks like Stored Routines that are not declared as DETERMINISTIC have properties similar to RAND(). The optimizer has very little freedom in optimizing how to invoke them, they should be computed for every row combination that's checked.

If this is so, the observed behavior is expected. If one wants the optimizer to be able to reduce the number of times the function is evaluated they should either declare the routine as DETERMINISTIC, or rewrite the query so that the stored function is invoked after the LIMIT is applied:

select 
  date,
   `TEST_RETURN_LOOKUP_VALUE`(`categoryLookupId`) AS `propertyCategory` 
from 
(select  date,   categoryLookupId from `TEST_Table`  order by date desc limit 25) as TBL;

Comment by Oleg Smirnov [ 2022-09-28 ]

Closing this as not a bug since the customer is satisfied with the suggested solution.

Comment by Sergei Petrunia [ 2022-11-29 ]

Current code has two algorithms for ORDER BY:

1. Sort the first table.
2. Sort the temporary table.

When sorting the temporary table, the values of select list are computed when writing into the temporary table (This is the cause of the issues observed here)

Monty's patch added this:
if the select list has a function with RAND_TABLE_BIT, force use of temporary table.

Simple solution (#1)

Adjust Monty's patch: Set JOIN::rand_table_in_field_list only if the select list has
"truly random" functions.

Simple solution (#2)

Should a stored function call have RAND_TABLE_BIT?

Alternative solution (#3)

If the select list has EXPENSIVE_FUNC(t1.col1, t2.col2, ...)
then the temporary table's fields should include the arguments: (t1.col1, t2.col2, ...) but should not have the function call EXPENSIVE_FUNC(...). EXPENSIVE_FUNC(...) will be computed when reading from the temp. table.

Note that references to (t1.col1, t2.col2, ...) in EXPENSIVE_FUNC(t1.col1, t2.col2, ...) should be replaced with references to temp.table fields.

This all is implemented for GROUP BY already, consider

select EXPENSIVE_FUNC(col10, col11, ...) 
from t1
group by 
 col10, col11, ...

We just need to enable it here. This is however much more risky than solution #1 or #2.

Comment by Sergei Petrunia [ 2022-11-29 ]

It's easy implement solution #1: https://github.com/MariaDB/server/tree/bb-10.6-mdev29129 ...

(EDIT: wait it might be incorrect)

Comment by Sergei Petrunia [ 2022-12-03 ]

Fix pushed into 10.6.

Generated at Thu Feb 08 10:06:07 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.