Details
-
Task
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Done
-
None
-
Q1/2026 Server Development
Description
The idea is to check performance improvement of MDEV-37330.
Test Result Summary
The speedup depends on the query pattern, the number of partitions used, and dataset size.
We ran the tests on a relatively small dataset (~ 5G, so it was likely all in buffer pool). Testing on bigger dataset which is not necessarily in memory might bring more benefit.
The test used 48 partitions, which is average amount.
The results are:
| Query pattern | Speedup |
| MAX(parition_date_col) | 3.3x |
| ORDER BY partition_date_col DESC LIMIT 1 | 1.63x |
| ordered scan | 1.094x |
Details can be found below.
Details
Test 1
Take the query from its description, a mock dataset jan27-sales-12partitions.sql
select * from t1 where user_id=... ORDER BY date_created DESC LIMIT 1 |
and run it.
Result 1 : not much improvement
According to mariadb-pavithrapandith, there wasn't much improvement.
The only big difference in ANALYZE FORMAT=JSON output was:
On MDEV-37330 branch : "pages_accessed": 46
|
On main branch : "pages_accessed": 90
|
Note that the query is supposed to access 1 partition instead of 12. But there is still only 2x savings in pages_accessed.
Test1, experiment1: disable range optimization
Tried this:
analyze format=json
|
select /*+ NO_RANGE_OPTIMIZATION(sales) */ * from sales where user_id=5 ORDER BY sales_date desc limit 1; |
Got:
On MDEV-37330 branch: "pages_accessed": 4
|
On main branch: "pages_accessed": 48
|
This confirms the theory that Range Optimization was doing about half of the page accesses.
On main branch, we get 90 -> 48 page accesses.
On the MDEV-37330 code we get 45 - > 4 page accesses.
The query speeds are :
"r_total_time_ms": 0.087955664
|
"r_total_time_ms": 0.204567552,
|
about 2x speedup now.
Result 2: Larger artificial query
Take the same table with 12 partitions but construct a query that
- would not benefit from range optimizer pre-reading data in all partitions.
- would lookup "last date from some partition" repeatedly
This is jan30-big-artifical-query.sql
With subquery cache it makes about 100 lookups (the dataset has 100 users):
"r_engine_stats": { "pages_accessed": 400 },
|
"r_engine_stats": { "pages_accessed": 4804},
|
"r_total_time_ms": 6.166011166,
|
"r_total_time_ms": 9.675008024,
|
With disabled subquery cache, subquery will be executed 10K times and we will get:
"pages_accessed": 40,000
|
"pages_accessed": 480,419
|
and
"r_total_time_ms": 57.19026293,
|
"r_total_time_ms": 218.0788524,
|
No result for customer ticket query
A related customer ticket doesn't have enough info to reconstruct their case.
Result 3: Query with MAX
link
12 partitions: 1.5 or 1.8x speedup depending on data size
48 partitions: 2.2 or 3.3x speedup depending on data size (3.6x on another machine)
Result 4: ORDER BY ... LIMIT 1 again, on a larger dataset
link
48 partitions: 1.63x speedup.
Result 5: Long ordered scan
A query doing a long ordered scan: link
1.094x speedup
Attachments
Issue Links
- split from
-
MDEV-37330 Ordered scans over PARTITION BY RANGE should not use priority queue
-
- Closed
-