Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21136

InnoDB's records_in_range estimates can be way off

Details

    Description

      It is expected that the range estimates that InnoDB returns from records_in_range call are not exact. Returning about 2x more or 2x less than actual value is considered to be still ok.

      However, we are observing a 4.8x difference on a fairly large table. Unfortunately, the table is at the customer site and cannot be provided here.

      Attachments

        Issue Links

          Activity

            I filed MDEV-21895 for a task of refactoring ha_innobase::records_in_range() to address the problems that I am aware of. It might be possible to do something in earlier versions, especially if a repeatable test case can be provided.

            Note that there are also some strange things around handler::stats.records in handler::info(), as noted in MDEV-20169. That might also affect some query plans.

            marko Marko Mäkelä added a comment - I filed MDEV-21895 for a task of refactoring ha_innobase::records_in_range() to address the problems that I am aware of. It might be possible to do something in earlier versions, especially if a repeatable test case can be provided. Note that there are also some strange things around handler::stats.records in handler::info() , as noted in MDEV-20169 . That might also affect some query plans.

            Related to this, BTR_ESTIMATE and btr_cur_t::path_arr should be removed. They are only being used in the function btr_estimate_n_rows_in_range_low(). That function had better implement a specialized version of btr_cur_search_to_nth_level() to gather the path information.

            marko Marko Mäkelä added a comment - Related to this, BTR_ESTIMATE and btr_cur_t::path_arr should be removed. They are only being used in the function btr_estimate_n_rows_in_range_low() . That function had better implement a specialized version of btr_cur_search_to_nth_level() to gather the path information.

            origin/bb-10.6-MDEV-21136-records_in_range-experimental be4adf01a474a434ba3140f0449e45828d75cdb7 2022-07-01T17:56:47+03:00
            performed well in RQG testing. The crashes observed happen on the main trees too.
            

            mleich Matthias Leich added a comment - origin/bb-10.6-MDEV-21136-records_in_range-experimental be4adf01a474a434ba3140f0449e45828d75cdb7 2022-07-01T17:56:47+03:00 performed well in RQG testing. The crashes observed happen on the main trees too.

            See the slides to understand pages latching order during estimation.

            vlad.lesin Vladislav Lesin added a comment - See the slides to understand pages latching order during estimation.

            vlad.lesin, what are p1 and p2 there? Do I get it right that the slides describe locking for records_in_range(p1, p2) ?

            Looking at slides #18 and #19... so, for records_in_range('aaa', 'zzz') the code will obtain latches for all pages that hold values between 'aaa' and 'zzz' ? Even if that is thousands of pages?

            psergei Sergei Petrunia added a comment - vlad.lesin , what are p1 and p2 there? Do I get it right that the slides describe locking for records_in_range(p1, p2) ? Looking at slides #18 and #19... so, for records_in_range('aaa', 'zzz') the code will obtain latches for all pages that hold values between 'aaa' and 'zzz' ? Even if that is thousands of pages?
            vlad.lesin Vladislav Lesin added a comment - - edited

            psergei,

            > what are p1 and p2 there?
            p1 and p2 are cursors, which are used to find 'aaa' and 'zzz' leaves in b-tree.

            > Looking at slides #18 and #19... so, for records_in_range('aaa', 'zzz') the code will obtain latches for all pages that hold values between 'aaa' and 'zzz' ? Even if that is thousands of pages?

            No, there is some limit on pages count(9 pages) for per-level walking, if the right page is not reached and the the limit was exceeded, the number of rows is estimated approximately by some formula, otherwise the estimation is exact. See btr_estimate_n_rows_in_range_on_level(), it has quite detailed description.

            And, as you can see on the slides, the latches are released when we leave some page. The maximum amount of latched pages(red rectangles in the slides) at the same time is 4.

            vlad.lesin Vladislav Lesin added a comment - - edited psergei , > what are p1 and p2 there? p1 and p2 are cursors, which are used to find 'aaa' and 'zzz' leaves in b-tree. > Looking at slides #18 and #19... so, for records_in_range('aaa', 'zzz') the code will obtain latches for all pages that hold values between 'aaa' and 'zzz' ? Even if that is thousands of pages? No, there is some limit on pages count(9 pages) for per-level walking, if the right page is not reached and the the limit was exceeded, the number of rows is estimated approximately by some formula, otherwise the estimation is exact. See btr_estimate_n_rows_in_range_on_level(), it has quite detailed description. And, as you can see on the slides, the latches are released when we leave some page. The maximum amount of latched pages(red rectangles in the slides) at the same time is 4.

            People

              vlad.lesin Vladislav Lesin
              psergei Sergei Petrunia
              Votes:
              1 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.