[MDEV-20967] MRR implementation for MyRocks: cost-based choice Created: 2019-11-04  Updated: 2020-01-23  Resolved: 2019-11-05

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - RocksDB
Fix Version/s: N/A

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-20433 MRR implementation for MyRocks Closed

 Description   

Investigating whether this is feasible



 Comments   
Comment by Sergei Petrunia [ 2019-11-04 ]

Test benchmark script: https://gist.github.com/spetrunia/f40e044652098be7f33c1becade03000

Observations from benchmarking MRR:

  • 50M rows table (10G size), lookup_list_size=10 - speedup is 1.17x
  • The same as above, but lookup_list_size=100 - speedup is 1.6x. (The lookup values are next to each other: n, n+1, n+2, ...)
  • The same as above but lookup list values are spread over the whole table: the speedup is 1.2x
  • A larger dataset (20G, 100M rows), adjacent lookup values: the speedup is 1.57x
  • Same as above, but lookup_list_size=500: speedup is 1.48x for adjacent keys and 1.13x for randomly distributed keys.
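
For illustration, the two key distributions compared above (adjacent values vs. values spread over the whole table) could be generated roughly as follows. This is only a sketch and is not taken from the benchmark script: the helper names, the table name t1, the column name pk, and the use of an IN list are all assumptions.

```python
import random

def adjacent_keys(start, size):
    """Lookup values next to each other: start, start+1, start+2, ..."""
    return list(range(start, start + size))

def spread_keys(table_rows, size, seed=0):
    """Distinct lookup values drawn uniformly from the whole key range 1..table_rows."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return rng.sample(range(1, table_rows + 1), size)

def lookup_query(keys, table="t1", column="pk"):
    """Build a multi-key lookup query (hypothetical table and column names)."""
    in_list = ", ".join(str(k) for k in keys)
    return f"SELECT * FROM {table} WHERE {column} IN ({in_list})"

# Example: lookup_list_size=100 on a 50M-row table, both distributions.
adjacent = lookup_query(adjacent_keys(1000, 100))
spread = lookup_query(spread_keys(50_000_000, 100))
```

Fixing the seed keeps the spread case repeatable between the MRR and non-MRR runs, so both runs look up the same keys.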
Comment by Sergei Petrunia [ 2019-11-04 ]
  • also tried a small table (1M rows, the datadir is 295M): the speedup is 1.9x
Comment by Sergei Petrunia [ 2019-11-04 ]

So far it seems MRR is between 1.1x and 2x faster, regardless of database size and/or key distribution?

An obvious case where MRR is slower is when the query doesn't read all of the SELECT output. (The most obvious way to do this is to use LIMIT n.) But this cannot be helped by computations in ha_rocksdb::multi_range_read_info*, because that call doesn't have enough information to detect such cases.
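
A toy model of why LIMIT hurts, assuming MRR fetches keys a whole batch at a time while the non-MRR path does one lookup per key and can stop early. The function names, the batch size, and the one-row-per-key assumption are illustrative only; this is not how ha_rocksdb computes costs.

```python
def rows_read_plain(num_keys, limit=None):
    """Non-MRR path: one lookup per key, stopping once `limit` rows are produced."""
    return num_keys if limit is None else min(num_keys, limit)

def rows_read_batched(num_keys, batch_size, limit=None):
    """Batched path: keys are fetched a whole batch at a time (assumed behavior),
    so the work is rounded up to the next batch boundary."""
    needed = num_keys if limit is None else min(num_keys, limit)
    if needed == 0:
        return 0
    batches = -(-needed // batch_size)  # ceiling division
    return min(num_keys, batches * batch_size)

# Without LIMIT both paths touch every key; with LIMIT 1 the batched path
# still pays for a full batch.
no_limit = (rows_read_plain(100), rows_read_batched(100, 100))
with_limit = (rows_read_plain(100, limit=1), rows_read_batched(100, 100, limit=1))
```

In this model the wasted work depends on the LIMIT value, which is the kind of information the comment above says the multi_range_read_info* call does not have.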
