Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18079

Opportunistic optimization for ORDER BY LIMIT N queries

    XMLWordPrintable

Details

    Description

      Motivation

      When we are optimizing ORDER BY ... LIMIT N, there is a choice between:

      A. "Filesort plan": Read the rows of interest with the best access method, then
      sort them with filesort, then take the first N rows.

      B. "Index plan": find an index that matches the required ordering, start
      reading matching rows through that index (either by scanning the entire index,
      or by doing range access on it). Stop as soon as we have found N matches.

      Plan A has a predictable cost, but it can be very "inefficient" for small
      numbers in #LIMIT - we will enumerate many rows to get #LIMIT rows.

      For Plan B, execution cost may be difficult to predict. If there is a WHERE
      clause and a small LIMIT N, how soon we will find N matches?

      • Selectivity of the condition is not always known
      • Even if the selectivity of the condition is known (suppose it's 50%), how are
        matching rows distributed in the index? If we encounter the matches first,
        we will only enumerate N rows. If the matches will be encountered last,
        we will have to enumerate 50% + N rows before we find N matches.

      The idea to address this is to do both:

      Basic idea

      • Start with the Index Plan, and scan a certain number of records
      • If we get a sufficient number of matches, continue.
      • Otherwise, switch to the FileSort plan.

      Details

      Duplication of a part of the result

      If we start running the Index plan, it may produce a part of query output.

      Then, Filesort plan will produce the entire query output. We'll need to make
      sure the duplicate part of the query output is removed.

      Possible solutions:
      A. Don't pass the output of the Index Plan to the client. Accumulate it in a
      temporary file (or table), and discard it if we decide to switch to the
      Filesort plan.

      B. Pass the output rows of Index Plan to the client. If we emitted K rows, and
      then switched to Filesort plan, we will need to discard the first K rows when
      reading the filesort output.

      Solution #B is not workable

      The problem is "peer rows" - rows that compare as equal for ORDER BY, but are
      not identical for the user. Consider a query

      SELECT last_name,first_name FROM people WHERE ... ORDER BY last_name LIMIT 10;

      Suppose the table has rows:
      ('Ivanov', 'Ivan')
      ('Ivanov', 'Peter')

      and the Index Plan produces this as first output row:

      ('Ivanov', 'Ivan')

      Then we decide to stop and switch to Filesort plan. The filesort plan produces

      ('Ivanov', 'Peter')
      ('Ivanov', 'Ivan')
      ...

      This shows the technique of "skip the first #K output rows from the filesort plan
      output" is not workable.

      (Unless we extend the sort criteria with something like underlying table rowid,
      which would make all rows non-peers).

      Conditions for using this optimization

      The proposed execution strategy may cause double work to be done: the same rows
      will be first enumerated by Index Plan and then by the Filesort Plan.
      That is, the query will run with filesort plan but will have extra overhead of
      running with an Index Plan, too.

      Also, it doesn't make sense to take risks with an Index Plan if it will have to
      enumerate nearly as many rows as the filesort plan.

      First attempt at the optimization criteria:

      optimistic_index_cost = cost of Index Plan, assuming we will be "reasonably lucky".
      // "reasonably lucky" here may mean:
      //  - matching records will come first
      //  - matching records will be uniformly distributed across the enumerated set.
      

      filesort_cost = Cost of filesort plan;
      

      if (optimistic_index_cost < filesort_cost * 0.1)
        use opportinistic query plan;
      

      Attachments

        Issue Links

          Activity

            People

              psergei Sergei Petrunia
              psergei Sergei Petrunia
              Votes:
              8 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.