[MDEV-17414] MyROCKS order desc limit 1 fails Created: 2018-10-09  Updated: 2018-10-29  Resolved: 2018-10-29

Status: Closed
Project: MariaDB Server
Component/s: Optimizer, Storage Engine - RocksDB
Affects Version/s: 10.3.10, 10.2, 10.3
Fix Version/s: 10.3.11, 10.2.19

Type: Bug Priority: Critical
Reporter: Henry Baragar Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None
Environment:

Ubuntu 18.04.1 LTS



 Description   

select * from days where date < '2018-10-09' order by date desc limit 1

fails if there is an index on date and a record for 2018-10-09 does not exist in the table.

To recreate ....

create table days (date date);
insert into days values ('2018-10-04'), ('2018-10-05');
select * from days where date < '2018-10-09' order by date desc limit 1; # Works as expected
alter table days add index date_index (date);
select * from days where date < '2018-10-05' order by date desc limit 1; # Works as expected
select * from days where date < '2018-10-09' order by date desc limit 1; # FAILS: Empty set (0.001 sec)



 Comments   
Comment by Elena Stepanova [ 2018-10-23 ]

Thanks for the report and test case. Reproducible as described.
Same workflow with InnoDB or MyISAM works all right.

Comment by Sergei Petrunia [ 2018-10-27 ]

Not reproducible on the upstream.
Well, the upstream produces a difference query plan:

MariaDB:

mysql> explain select * from days where date < '2018-10-09' order by date desc limit 1;
+------+-------------+-------+-------+---------------+------------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key        | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | days  | range | date_index    | date_index | 4       | NULL |    1 | Using where; Using index |
+------+-------------+-------+-------+---------------+------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

Upstream, current FB/mysql-5.6

mysql> explain select * from days force index(date_index) where date < '2018-10-09' order by date desc limit 1;
+----+-------------+-------+-------+---------------+------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key        | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | days  | index | date_index    | date_index | 4       | NULL |    2 | Using where; Using index |
+----+-------------+-------+-------+---------------+------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

... but if I add more rows to the table I can get it to also pick the range plan, and it will produce the correct result.

Comment by Sergei Petrunia [ 2018-10-27 ]

Investigation:

The range we are scanning is actually NULL < date < '2018-10-09'

Date column is 3 bytes + 1 NULL byte.
'2018-10-09' is encoded as

uint tmp= ltime->year*16*32 + ltime->month*32+ltime->day = 0x0f c5 45

Comment by Sergei Petrunia [ 2018-10-27 ]

Inside myrocks::ha_rocksdb::calc_eq_cond_len, myrocks::ha_rocksdb::index_read_map_impl, we have:

(gdb) p slice
  $302 = (const rocksdb::Slice &) @0x7fffd9934210: {data_ = 0x7fff783a7dd0 "", size_ = 8}
(gdb) x/8cx slice.data_
  0x7fff783a7dd0:	0x00	0x00	0x01	0x07	0x01	0x0f	0xc5	0x49
 
(gdb) p *end_key
  $307 = {key = 0x7fff78343c60 "", length = 4, keypart_map = 1, flag = HA_READ_BEFORE_KEY}
(gdb) x/4x end_key.key
  0x7fff78343c60:	0x00	0x49	0xc5	0x0f

That is, it thinks that both start and the end keys are '2018-10-09'.

Comment by Sergei Petrunia [ 2018-10-27 ]

Comparing with the upstream:

(gdb) p slice
  $79 = (const rocksdb::Slice &) @0x7ffff40b41f0: {data_ = 0x7fffcc024bf0 "", size_ = 8}
(gdb) x/8cx slice.data_
  0x7fffcc024bf0:	0x00	0x00	0x01	0x04	0x01	0x0f	0xc5	0x49
 
(gdb) p *end_key
  $80 = {key = 0x7fffcc0d3808 "\001", length = 4, keypart_map = 1, flag = HA_READ_AFTER_KEY}
(gdb) x/4cx end_key->key
  0x7fffcc0d3808:	0x01	0x00	0x00	0x00

Comment by Sergei Petrunia [ 2018-10-27 ]

The SE API doesn't have a direct equivalent of read_range_first/next for doing reverse scans.
Instead, the QUICK_SELECT_DESC does this:

1. Inform the SE about the other end of the range.

2. Call ha_rocksdb->ha_index_read_map(..., find_flag=HA_READ_BEFORE_KEY)

Step #2 is implemented in different ways in MariaDB and in the Upstream.
The upstream' QUICK_SELECT_DESC::get_next does it as follows:

  file->set_end_range(&min_range, handler::RANGE_SCAN_DESC);

min_range here has

{key= SQL "NULL", flag = HA_READ_AFTER_KEY}

MariaDB's QUICK_SELECT_DESC::get_next uses a different API call:

    result= file->prepare_range_scan((last_range->flag & NO_MIN_RANGE) ? NULL : &start_key,
                                     (last_range->flag & NO_MAX_RANGE) ? NULL : &end_key);

(gdb) x/4x start_key.key
  0x7fff78343f48:	0x01	0x00	0x00	0x00 // HA_READ_AFTER_KEY
(gdb) x/4x end_key.key
  0x7fff78343f50:	0x00	0x49	0xc5	0x0f  // HA_READ_BEFORE_KEY

Comment by Sergei Petrunia [ 2018-10-27 ]

index_read_map_impl() actually has the code to flip the bounds:

  const key_range *cur_end_key= end_key;
  if (find_flag == HA_READ_PREFIX_LAST_OR_PREV)
    cur_end_key= m_start_range;

added by

commit 4d51009a772fdfffc6cc7ae87422a3b0b19fa06a
Author:	Sergei Petrunia <psergey@askmonty.org>  Tue May  8 13:00:26 2018
Committer:	Sergei Petrunia <psergey@askmonty.org>  Tue May  8 13:00:26 2018

But it only handles closed ranges, not open ones

Generated at Thu Feb 08 08:36:17 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.