Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5947

MySQL Bug#11757108 CHANGE IN EXECUTION PLAN FOR COUNT_DISTINCT_GROUP_ON_KEY CAUSES PEFORMANCE REGRESSION

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.9
    • Fix Version/s: 10.0.12
    • Component/s: None
    • Labels:
      None

      Description

      revno: 4151
      committer: Olav Sandstaa <olav.sandstaa@oracle.com>
      branch nick: loose-scan-perf-5.6
      timestamp: Thu 2012-08-23 09:45:33 +0200
      message:
        Fix for Bug#11757108 CHANGE IN EXECUTION PLAN FOR COUNT_DISTINCT_GROUP_ON_KEY
                             CAUSES PEFORMANCE REGRESSION
       
        The cause for the performance regression is that the access strategy for the
        GROUP BY query is changed form using "index scan" in mysql-5.1 to use "loose
        index scan" in mysql-5.5. The index used for group by is unique and thus each
        "loose scan" group will only contain one record. Since loose scan needs to
        re-position on each "loose scan" group this query will do a re-position for
        each index entry. Compared to just reading the next index entry as a normal
        index scan does, the use of loose scan for this query becomes more expensive.
       
        The cause for selecting to use loose scan for this query is that in the current
        code when the size of the "loose scan" group is one, the formula for
        calculting the cost estimates becomes almost identical to the cost of using
        normal index scan. Differences in use of integer versus floating point aritmetic
        can cause one or the other access strategy to be selected.
       
        The main issue with the formula for estimating the cost of using loose scan is
        that it does not take into account that it is more costly to do a re-position
        for each "loose scan" group compared to just reading the next index entry.
        Both index scan and loose scan estimates the cpu cost as:
       
          "number of entries needed too read/scan" * ROW_EVALUATE_COST
       
        The results from testing with the query in this bug indicates that the real
        cost for doing re-position four to eight times higher than just reading the
        next index entry. Thus, the cpu cost estimate for loose scan should be increased.
        To account for the extra work to re-position in the index we increase the
        cost for loose index scan to include the cost of navigating the index.
        This is modelled as a function of the height of the b-tree:
       
          navigation cost= ceil(log(records in table)/log(indexes per block))
                         * ROWID_COMPARE_COST;
       
        This will avoid loose index scan being used for indexes where the "loose scan"
        group contains very few index entries.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              igor Igor Babaev
              Reporter:
              svoj Sergey Vojtovich
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: