[MDEV-9860] TokuDB ORDER BY DESC query is slower in 10.1 with index_condition_pushdown=on Created: 2016-04-01  Updated: 2016-04-03  Resolved: 2016-04-03

Status: Closed
Project: MariaDB Server
Component/s: Optimizer, Storage Engine - TokuDB
Affects Version/s: 10.1.13
Fix Version/s: 10.1.14

Type: Bug Priority: Major
Reporter: Geoff Montee (Inactive) Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: optimizer, tokudb

Attachments: File tokudb_icp_bug.sql.gz    

 Description   

Certain queries of TokuDB tables that use ORDER BY DESC have gotten slower in MariaDB 10.1 compared to 10.0 when index_condition_pushdown=on.

This may be related to this bug: https://tokutek.atlassian.net/browse/DB-534

Let's say we have the following table:

CREATE TABLE `tokudb_icp_bug` (
  `col1` int(11) NOT NULL DEFAULT '0',
  `col2` varchar(7) NOT NULL DEFAULT '',
  `col3` char(3) NOT NULL DEFAULT '',
  `col4` date NOT NULL DEFAULT '0000-00-00',
  `col5` varchar(6) NOT NULL DEFAULT '',
  `col6` char(1) DEFAULT NULL,
  PRIMARY KEY (`col4`,`col5`,`col1`),
  UNIQUE KEY `col1_idx` (`col1`),
  KEY `col2col3col4col5_idx` (`col2`,`col3`,`col4`,`col5`) `CLUSTERING`=YES
) ENGINE=TokuDB DEFAULT CHARSET=latin1;

And the following query:

SELECT SQL_NO_CACHE col4 , col5, col1 
FROM tokudb_icp_bug
WHERE col2 = "1999888" 
AND col3="099" 
AND col6 IS NOT NULL 
AND col4 BETWEEN STR_TO_DATE( "01/03/2016" , '%d/%m/%Y') 
AND STR_TO_DATE( "31/03/2016" , '%d/%m/%Y') 
ORDER BY col4 DESC, col5 DESC, col1 DESC;

On MariaDB 10.0 with index_condition_pushdown=on, we get the following query plan:

MariaDB [db1]> SELECT VERSION();
+-----------------+
| VERSION()       |
+-----------------+
| 10.0.24-MariaDB |
+-----------------+
1 row in set (0.00 sec)
 
MariaDB [db1]> SHOW GLOBAL VARIABLES LIKE 'optimizer_switch';
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name    | Value                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| optimizer_switch | index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on |
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [db1]> EXPLAIN SELECT SQL_NO_CACHE col4 , col5, col1
    -> FROM tokudb_icp_bug
    -> WHERE col2 = "1999888"
    -> AND col3="099"
    -> AND col6 IS NOT NULL
    -> AND col4 BETWEEN STR_TO_DATE( "01/03/2016" , '%d/%m/%Y')
    -> AND STR_TO_DATE( "31/03/2016" , '%d/%m/%Y')
    -> ORDER BY col4 DESC, col5 DESC, col1 DESC;
+------+-------------+----------------+-------+------------------------------+----------------------+---------+------+------+-----------------------------+
| id   | select_type | table          | type  | possible_keys                | key                  | key_len | ref  | rows | Extra                       |
+------+-------------+----------------+-------+------------------------------+----------------------+---------+------+------+-----------------------------+
|    1 | SIMPLE      | tokudb_icp_bug | range | PRIMARY,col2col3col4col5_idx | col2col3col4col5_idx | 15      | NULL |   24 | Using where; Using filesort |
+------+-------------+----------------+-------+------------------------------+----------------------+---------+------+------+-----------------------------+
1 row in set (0.00 sec)

On MariaDB 10.1 with index_condition_pushdown=on, we get the following query plan:

MariaDB [db1]> SELECT VERSION();
+-----------------+
| VERSION()       |
+-----------------+
| 10.1.13-MariaDB |
+-----------------+
1 row in set (0.00 sec)
 
MariaDB [db1]> SHOW GLOBAL VARIABLES LIKE 'optimizer_switch';
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name    | Value                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| optimizer_switch | index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on |
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [db1]> EXPLAIN SELECT SQL_NO_CACHE col4 , col5, col1
    -> FROM tokudb_icp_bug
    -> WHERE col2 = "1999888"
    -> AND col3="099"
    -> AND col6 IS NOT NULL
    -> AND col4 BETWEEN STR_TO_DATE( "01/03/2016" , '%d/%m/%Y')
    -> AND STR_TO_DATE( "31/03/2016" , '%d/%m/%Y')
    -> ORDER BY col4 DESC, col5 DESC, col1 DESC;
+------+-------------+----------------+-------+------------------------------+----------------------+---------+------+------+-------------+
| id   | select_type | table          | type  | possible_keys                | key                  | key_len | ref  | rows | Extra       |
+------+-------------+----------------+-------+------------------------------+----------------------+---------+------+------+-------------+
|    1 | SIMPLE      | tokudb_icp_bug | range | PRIMARY,col2col3col4col5_idx | col2col3col4col5_idx | 15      | NULL |   24 | Using where |
+------+-------------+----------------+-------+------------------------------+----------------------+---------+------+------+-------------+
1 row in set (0.00 sec)

MariaDB 10.0 completes the query in 0.0 seconds:

MariaDB [db1]> SELECT SQL_NO_CACHE col4 , col5, col1
    -> FROM tokudb_icp_bug
    -> WHERE col2 = "1999888"
    -> AND col3="099"
    -> AND col6 IS NOT NULL
    -> AND col4 BETWEEN STR_TO_DATE( "01/03/2016" , '%d/%m/%Y')
    -> AND STR_TO_DATE( "31/03/2016" , '%d/%m/%Y')
    -> ORDER BY col4 DESC, col5 DESC, col1 DESC;
+------------+--------+-----------+
| col4       | col5   | col1      |
+------------+--------+-----------+
| 2016-03-31 | 154732 | 617312062 |
| 2016-03-31 | 152220 | 617304573 |
| 2016-03-31 | 150846 | 617300219 |
| 2016-03-31 | 125003 | 617259519 |
| 2016-03-31 | 115240 | 617241909 |
| 2016-03-31 | 114357 | 617239150 |
| 2016-03-31 | 103555 | 617221171 |
| 2016-03-31 | 102942 | 617219162 |
| 2016-03-30 | 203301 | 617107392 |
| 2016-03-30 | 202449 | 617105258 |
| 2016-03-30 | 192830 | 617090437 |
| 2016-03-30 | 191350 | 617086349 |
| 2016-03-30 | 175306 | 617060614 |
| 2016-03-30 | 164238 | 617039550 |
| 2016-03-28 | 173032 | 616444951 |
| 2016-03-17 | 135427 | 613723098 |
| 2016-03-16 | 153202 | 613477432 |
| 2016-03-16 | 094315 | 613377198 |
| 2016-03-15 | 211529 | 613295731 |
| 2016-03-07 | 144453 | 610974835 |
| 2016-03-04 | 131400 | 610138965 |
| 2016-03-04 | 102825 | 610084403 |
| 2016-03-04 | 093225 | 610064611 |
| 2016-03-01 | 114944 | 609236733 |
+------------+--------+-----------+
24 rows in set (0.00 sec)

MariaDB 10.1 takes a little longer:

MariaDB [db1]> SELECT SQL_NO_CACHE col4 , col5, col1
    -> FROM tokudb_icp_bug
    -> WHERE col2 = "1999888"
    -> AND col3="099"
    -> AND col6 IS NOT NULL
    -> AND col4 BETWEEN STR_TO_DATE( "01/03/2016" , '%d/%m/%Y')
    -> AND STR_TO_DATE( "31/03/2016" , '%d/%m/%Y')
    -> ORDER BY col4 DESC, col5 DESC, col1 DESC;
+------------+--------+-----------+
| col4       | col5   | col1      |
+------------+--------+-----------+
| 2016-03-31 | 154732 | 617312062 |
| 2016-03-31 | 152220 | 617304573 |
| 2016-03-31 | 150846 | 617300219 |
| 2016-03-31 | 125003 | 617259519 |
| 2016-03-31 | 115240 | 617241909 |
| 2016-03-31 | 114357 | 617239150 |
| 2016-03-31 | 103555 | 617221171 |
| 2016-03-31 | 102942 | 617219162 |
| 2016-03-30 | 203301 | 617107392 |
| 2016-03-30 | 202449 | 617105258 |
| 2016-03-30 | 192830 | 617090437 |
| 2016-03-30 | 191350 | 617086349 |
| 2016-03-30 | 175306 | 617060614 |
| 2016-03-30 | 164238 | 617039550 |
| 2016-03-28 | 173032 | 616444951 |
| 2016-03-17 | 135427 | 613723098 |
| 2016-03-16 | 153202 | 613477432 |
| 2016-03-16 | 094315 | 613377198 |
| 2016-03-15 | 211529 | 613295731 |
| 2016-03-07 | 144453 | 610974835 |
| 2016-03-04 | 131400 | 610138965 |
| 2016-03-04 | 102825 | 610084403 |
| 2016-03-04 | 093225 | 610064611 |
| 2016-03-01 | 114944 | 609236733 |
+------------+--------+-----------+
24 rows in set (0.08 sec)

But this speed on 10.1 is much improved if index_condition_pushdown is disabled:

MariaDB [db1]> SET optimizer_switch='index_condition_pushdown=off';                                                                                          Query OK, 0 rows affected (0.00 sec)
 
MariaDB [db1]> SELECT SQL_NO_CACHE col4 , col5, col1
    -> FROM tokudb_icp_bug
    -> WHERE col2 = "1999888"
    -> AND col3="099"
    -> AND col6 IS NOT NULL
    -> AND col4 BETWEEN STR_TO_DATE( "01/03/2016" , '%d/%m/%Y')
    -> AND STR_TO_DATE( "31/03/2016" , '%d/%m/%Y')
    -> ORDER BY col4 DESC, col5 DESC, col1 DESC;
+------------+--------+-----------+
| col4       | col5   | col1      |
+------------+--------+-----------+
| 2016-03-31 | 154732 | 617312062 |
| 2016-03-31 | 152220 | 617304573 |
| 2016-03-31 | 150846 | 617300219 |
| 2016-03-31 | 125003 | 617259519 |
| 2016-03-31 | 115240 | 617241909 |
| 2016-03-31 | 114357 | 617239150 |
| 2016-03-31 | 103555 | 617221171 |
| 2016-03-31 | 102942 | 617219162 |
| 2016-03-30 | 203301 | 617107392 |
| 2016-03-30 | 202449 | 617105258 |
| 2016-03-30 | 192830 | 617090437 |
| 2016-03-30 | 191350 | 617086349 |
| 2016-03-30 | 175306 | 617060614 |
| 2016-03-30 | 164238 | 617039550 |
| 2016-03-28 | 173032 | 616444951 |
| 2016-03-17 | 135427 | 613723098 |
| 2016-03-16 | 153202 | 613477432 |
| 2016-03-16 | 094315 | 613377198 |
| 2016-03-15 | 211529 | 613295731 |
| 2016-03-07 | 144453 | 610974835 |
| 2016-03-04 | 131400 | 610138965 |
| 2016-03-04 | 102825 | 610084403 |
| 2016-03-04 | 093225 | 610064611 |
| 2016-03-01 | 114944 | 609236733 |
+------------+--------+-----------+
24 rows in set (0.00 sec)

The query is also faster on 10.1 if ORDER BY ASC is used instead of DESC:

MariaDB [db1]> SET optimizer_switch='index_condition_pushdown=on';
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [db1]> SELECT SQL_NO_CACHE col4 , col5, col1
    -> FROM tokudb_icp_bug
    -> WHERE col2 = "1999888"
    -> AND col3="099"
    -> AND col6 IS NOT NULL
    -> AND col4 BETWEEN STR_TO_DATE( "01/03/2016" , '%d/%m/%Y')
    -> AND STR_TO_DATE( "31/03/2016" , '%d/%m/%Y')
    -> ORDER BY col4 ASC, col5 ASC, col1 ASC;
+------------+--------+-----------+
| col4       | col5   | col1      |
+------------+--------+-----------+
| 2016-03-01 | 114944 | 609236733 |
| 2016-03-04 | 093225 | 610064611 |
| 2016-03-04 | 102825 | 610084403 |
| 2016-03-04 | 131400 | 610138965 |
| 2016-03-07 | 144453 | 610974835 |
| 2016-03-15 | 211529 | 613295731 |
| 2016-03-16 | 094315 | 613377198 |
| 2016-03-16 | 153202 | 613477432 |
| 2016-03-17 | 135427 | 613723098 |
| 2016-03-28 | 173032 | 616444951 |
| 2016-03-30 | 164238 | 617039550 |
| 2016-03-30 | 175306 | 617060614 |
| 2016-03-30 | 191350 | 617086349 |
| 2016-03-30 | 192830 | 617090437 |
| 2016-03-30 | 202449 | 617105258 |
| 2016-03-30 | 203301 | 617107392 |
| 2016-03-31 | 102942 | 617219162 |
| 2016-03-31 | 103555 | 617221171 |
| 2016-03-31 | 114357 | 617239150 |
| 2016-03-31 | 115240 | 617241909 |
| 2016-03-31 | 125003 | 617259519 |
| 2016-03-31 | 150846 | 617300219 |
| 2016-03-31 | 152220 | 617304573 |
| 2016-03-31 | 154732 | 617312062 |
+------------+--------+-----------+
24 rows in set (0.00 sec)



 Comments   
Comment by Sergei Petrunia [ 2016-04-03 ]

MySQL 5.6 has this fixed with this patch:
https://github.com/mysql/mysql-server/commit/da1d92fd46071cd86de61058b6ea39fd9affcd87

there was a "followup patch":

https://github.com/mysql/mysql-server/commit/d8e6db5de0b0a1497e9b07cecba34bb6cc1aaa32

and another one:

https://github.com/mysql/mysql-server/commit/b65ca959efd6ec5369165b1849407318b4886634

Comment by Sergei Petrunia [ 2016-04-03 ]

Comments at https://tokutek.atlassian.net/browse/DB-534 describe how the problem could be fixed on TokuDB side.

Besides that, we've also added this call for TokuDB:

```
result= file->prepare_range_scan((last_range->flag & NO_MIN_RANGE) ? NULL : &start_key,
(last_range->flag & NO_MAX_RANGE) ? NULL : &end_key);
```

so TokuDB has information about end_range.

Comment by Sergei Petrunia [ 2016-04-03 ]

Another related commit: we've tried to disable ICP for ORDER BY DESC queries:

https://github.com/MariaDB/server/commit/1d3ba8a791c50b7fe7036470d5f26e3f6525156d

The code on the upper layer works ok for this example: I can see that execution reaches the

           tab->table->file->cancel_pushed_idx_cond();

call.

The problem is that ha_tokudb doesn't implement it.

Comment by Sergei Petrunia [ 2016-04-03 ]

I've pushed a fix into 10.1 that disables ICP for reverse-ordered scans on TokuDB (like it is currently done for other engines):
https://github.com/MariaDB/server/commit/04737330bea89437d8069d8eeb4bc484213dfdf0

Comment by Sergei Petrunia [ 2016-04-03 ]

I was going to port MySQL's fix into 10.2, but then discovered MySQL's fix won't help for TokuDB. The reason is that the fix has this code:

    if (file->pushed_idx_cond && !eqrange_all_keyparts)
    {
      if (!eqrange_all_keyparts)
      {
        key_range min_range;
        last_range->make_min_endpoint(&min_range);
        if(min_range.length > 0)
          file->set_end_range(&min_range, handler::RANGE_SCAN_DESC);
        else
          file->set_end_range(NULL, handler::RANGE_SCAN_DESC);

and ha_tokudb objects never have a non-NULL in file->pushed_idx_cond.

Generated at Thu Feb 08 07:37:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.