Details

    • Type: Task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

      The goal of this task is to use non-covering indexes for efficient ORDER BY ... LIMIT.
      Consider the query:

      select * from t1 order by c_1, ..., c_k, c_k+1, ..., c_m limit L;

      In certain cases there may be an index that has a prefix that coincides with a prefix
      of the ORDER BY clause:

      create index i1 on t1 (c_1, ..., c_k, [c_j, ...]);

      where the c_j are columns of t1 that follow c_k in the index but do not
      coincide with the rest of the ORDER BY clause (for example, for
      ORDER BY c1, c2, c3 an index on (c1, c2, c5) shares the prefix (c1, c2)).

      If the index is scanned, it will produce groups of records with the same
      prefix (c_1, ..., c_k). Given a limit L, the top L records sorted by the full
      ORDER BY clause are contained in the groups that fit entirely within the
      first L records, plus the whole last group, the one in which the limit L is
      reached. Let the total number of records of interest be L+G, where G is the
      number of records remaining in that last group beyond the limit L.
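
      As an illustration of how L+G is determined, here is a minimal Python sketch
      (the function name and the example data are made up for illustration and are
      not part of the server code; limit is assumed to be >= 1):

      def records_to_scan(prefixes, limit):
          """prefixes: the (c_1, ..., c_k) prefix of each index entry, in index order."""
          if limit >= len(prefixes):
              return len(prefixes)
          last = prefixes[limit - 1]      # prefix of the L-th entry
          total = limit
          # G = entries remaining in the group that the L-th entry belongs to
          while total < len(prefixes) and prefixes[total] == last:
              total += 1
          return total

      # L = 4; the 4th entry's group ('b') has 2 more entries, so L+G = 6
      print(records_to_scan(['a', 'a', 'b', 'b', 'b', 'b', 'c'], 4))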

      Algorithm (v2):
      ----------------
      Repeat until L+G records are scanned:

      • Scan index i1 and put its keys into a buffer until the buffer is full or
        the first L+G keys have been scanned.
      • Sort the buffered keys by rowid (this is the physical order of the
        corresponding records on disk).
      • Fetch the corresponding table records in physical order into a sort buffer SB.
      • Sort SB according to the ORDER BY fields (c_1, ..., c_m).
      • If there are more keys to scan, flush SB to disk for a subsequent sort-merge.
      • Once all L+G keys have been processed: if there is more than one sorted
        buffer on disk, merge them, then output the sorted rows (a standalone
        sketch of this loop is given below).
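
      A self-contained Python sketch of this loop follows. It is an illustration
      only: the table, the index and the buffer size are modelled in memory with
      made-up names, while the real implementation would live in the DS-MRR and
      filesort code described in the next section.

      import heapq
      from itertools import islice

      def order_by_limit_v2(rows, order_by, prefix_len, limit, buffer_size=4096):
          """rows: the table t1 as a list of dicts; order_by: (c_1, ..., c_m);
          prefix_len: k, the length of the prefix shared with the index."""
          order_key = lambda r: tuple(r[c] for c in order_by)
          prefix_key = lambda r: tuple(r[c] for c in order_by[:prefix_len])
          if not rows or limit <= 0:
              return []
          limit = min(limit, len(rows))

          # Stand-in for index i1: (prefix, rowid) pairs in index order,
          # where rowid is simply the position of the row in the table.
          index_entries = sorted((prefix_key(r), rid) for rid, r in enumerate(rows))

          # L+G: read L entries, then finish the current prefix group.
          needed = limit
          while (needed < len(index_entries)
                 and index_entries[needed][0] == index_entries[limit - 1][0]):
              needed += 1

          chunks = []                                    # sorted runs "flushed to disk"
          for start in range(0, needed, buffer_size):
              batch = index_entries[start:start + buffer_size]
              rowids = sorted(rid for _, rid in batch)   # physical (disk) order
              fetched = [rows[rid] for rid in rowids]    # fetch in physical order
              chunks.append(sorted(fetched, key=order_key))  # the sort buffer SB

          # Merge the sorted runs and return the first L rows.
          return list(islice(heapq.merge(*chunks, key=order_key), limit))

      # Example: ORDER BY a, b, c LIMIT 5 with an index prefix on (a)
      rows = [{'a': i % 3, 'b': (7 * i) % 5, 'c': i} for i in range(20)]
      print(order_by_limit_v2(rows, order_by=('a', 'b', 'c'), prefix_len=1, limit=5))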

      Implementation:
      ---------------
      = Execution
      The DS-MRR execution code already performs the steps of scanning an index,
      sorting the rowids of the keys, and fetching table data in physical order.
      In order to reuse DS-MRR for the above algorithm, the following is needed:

      • Create a range access method for the range (-inf, +inf) over the common prefix
        of the chosen index and the ORDER BY clause. If the WHERE clause allows
        a range to be constructed, reuse that range instead.
      • Modify DS-MRR so that it can stop scanning the index when a certain condition
        is met. This can be done by reusing the function pointer
        st_range_seq_if::skip_index_tuple, which is called by
        Mrr_simple_index_reader::get_next().
      • The stopping condition: after DS-MRR has read L keys, remember the L-th key,
        and stop reading once the next key differs from the L-th key. Note that L
        counts the records produced after applying the WHERE clause; such a counter
        already exists in the generic LIMIT execution code, so reuse it (a small
        sketch of this condition follows the list).
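
      The real hook would be the existing skip_index_tuple callback and the LIMIT
      counter mentioned above; purely to illustrate the stopping rule itself, here
      is a small Python sketch with made-up names:

      def keys_to_fetch(index_keys, prefix_len, limit, where=lambda key: True):
          """Yield the keys DS-MRR should fetch: the first L accepted keys plus
          the rest of the prefix group that the L-th accepted key belongs to."""
          accepted = 0            # counter of keys that passed the WHERE clause
          last_prefix = None      # common prefix of the L-th accepted key
          for key in index_keys:
              prefix = key[:prefix_len]
              if accepted >= limit and prefix != last_prefix:
                  return                        # next group started: stop the scan
              if not where(key):                # WHERE-clause filtering
                  continue
              accepted += 1
              if accepted == limit:
                  last_prefix = prefix          # remember the L-th key
              yield key

      # L = 3 on a prefix of length 1: the 3rd key starts group 'b',
      # so the 4th key (also 'b') is read too, and the scan stops at 'c'.
      print(list(keys_to_fetch([('a', 1), ('a', 2), ('b', 3), ('b', 4), ('c', 5)],
                               prefix_len=1, limit=3)))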

      The filesort function can use a range scan to supply its input for sorting.
      Therefore, once we make filesort use the modified DS-MRR scan, filesort will
      sort the fetched data and take care of spilling to disk and merging the
      sorted chunks of data that were flushed there.

      = Applicability test and optimization

      • there is an index that has a common prefix with the ORDER BY clause,
      • there is no suitable index that covers the whole ORDER BY clause,
      • the LIMIT is sufficiently small (a sketch of this test follows the list).
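
      A sketch of this test in Python (the names and the LIMIT threshold are made
      up for illustration; the real check would be part of the optimizer):

      MAX_LIMIT_FOR_THIS_PLAN = 1000      # made-up threshold for "sufficiently small"

      def common_prefix_len(index_cols, order_by_cols):
          n = 0
          for ic, oc in zip(index_cols, order_by_cols):
              if ic != oc:
                  break
              n += 1
          return n

      def plan_is_applicable(indexes, order_by_cols, limit):
          """indexes: list of column-name tuples, one tuple per candidate index."""
          if limit is None or limit > MAX_LIMIT_FOR_THIS_PLAN:
              return False
          # No index may already cover the whole ORDER BY clause ...
          if any(common_prefix_len(ix, order_by_cols) == len(order_by_cols)
                 for ix in indexes):
              return False
          # ... but at least one index must share a non-empty prefix with it.
          return any(common_prefix_len(ix, order_by_cols) > 0 for ix in indexes)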

      Attachments

        Activity

          botanicvelious Matthew Lagoe (Inactive) added a comment:

          That algorithm looks like it would work properly. My only concern is that
          we have to make sure that, if there are multiple equal values of X, we
          grab all of them, so that the resulting dataset is accurate.

          timour Timour Katchaounov (Inactive) added a comment:

          The previous two ideas for how to use an index that is a partial prefix
          of the ORDER BY clause are:

          Algorithm (v1):
          ----------------

          • create a bounded priority queue PQ with bound N
          • for each group of keys in IDX with prefix (c_1, ..., c_k), such that
            the ordinal number of the first key in the group is <= N:
          • fetch the record corresponding to each key and put it into PQ, using
            the non-indexed ORDER BY columns (c_k+1, ..., c_m) as the key in the queue
          • once all keys in the group have been scanned and their records inserted
            into PQ, output the records from PQ in order, until PQ is empty or a
            total of N records has been output (a Python sketch is given after the
            list of properties below).

          The above algorithm has the following properties:

          • PQ will get at most as big as the number of keys in the biggest group of keys
          • the first result can be output as soon as we reach the end of the first
            group (no need to wait until all data is sorted)
          • there is just one pass over the index
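
          Purely as an illustration of the bounded priority queue idea (the names
          are made up and the index scan is simulated in memory), a Python sketch:

          import heapq
          from itertools import groupby

          def order_by_limit_v1(rows, order_by, prefix_len, limit):
              """rows: the table as a list of dicts; order_by: (c_1, ..., c_m);
              prefix_len: k, the length of the prefix covered by the index."""
              prefix = lambda r: tuple(r[c] for c in order_by[:prefix_len])
              suffix = lambda r: tuple(r[c] for c in order_by[prefix_len:])

              # Stand-in for the index scan: rows arrive grouped by the prefix.
              scan = sorted(rows, key=prefix)

              result = []
              for _, group in groupby(scan, key=prefix):
                  # Bounded PQ: never keep more than the N rows still needed.
                  best = heapq.nsmallest(limit - len(result), group, key=suffix)
                  result.extend(best)            # already sorted by the suffix
                  if len(result) >= limit:
                      break
              return result

          Each group can be emitted as soon as it has been fully scanned, which is
          the early-output property listed above.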

          Original algorithm (v0):
          ------------------------

          • If an index exists for column X and the query has a LIMIT, use the index
            to sort the list, take the first G values, and continue past G until you
            hit a value of X different from the value that was at position G.
          • Use this smaller subset of data to order the data based on Y.
          • Order the data again based on Z.
          • Return the sorted data as usual.

          People

            Assignee: Unassigned
            Reporter: botanicvelious Matthew Lagoe (Inactive)
            Votes: 6
            Watchers: 6

