MariaDB Server / MDEV-21377

Eq_ref access not picked by query with optimizer_use_condition_selectivity > 1

Details

    Description

      I recently upgraded from 10.2 to 10.4 and have had massive performance problems since then: roughly 50% more CPU load, and many cron jobs now take twice as long. While I am still investigating most of the causes, I found one query in particular for which MariaDB does not use a primary index, which makes the query take extremely long (1 hour+) instead of about 110 ms. This is the query:

      SELECT
        p.product_id,
        SUM(op.quantity) AS quantity_sold,
        COUNT(DISTINCT o.order_id) AS orders_number
      FROM
        shop_aapo.orders o,
        shop_aapo.orders_products op,
        (
          (
            (
              shop_aapo.products p
                LEFT JOIN shop_aapo.admin_products_ausserhandel a ON (a.product_id = p.product_id AND a.admin_id = 1)
            )
              LEFT JOIN shop_aapo.products_availability_sources_voigt v ON (p.product_id = v.product_id)
          )
            LEFT JOIN shop_aapo.products_availability_sources_galexis_oldschool g ON (p.product_id = g.product_id AND g.last_checked > 1576505788)
        )
          LEFT JOIN shop_aapo.products_availability_sources_galexis_csv g2 ON (g2.product_id = p.product_id AND g2.outdated = 0)
      WHERE
        o.order_id = op.order_id
        AND op.product_id = p.product_id
        AND o.order_status = "active"
        AND (
          v.product_status IN ("C","K","U","Z","S")
          OR g.available_msg IN ("Ausser Handel","Fehlt beim Lieferanten")
          OR g2.status_np IN (5,7)
          OR g2.status_la IN (5,7)
        )
        AND o.create_date >= 1569161788
        AND (
          a.product_id IS NULL
          OR a.snooze_ts < 1576073788
        )
      GROUP BY
        p.product_id
      HAVING
        quantity_sold>5
        AND orders_number>2
      ORDER BY
        quantity_sold DESC
      LIMIT
        0, 51
      

      Many parts of the query are not important; what matters are the LEFT JOINs. Each left join uses the PRIMARY index of its table, matching exactly one entry (or none), yet for the very last LEFT JOIN (alias g2) MariaDB does not use the PRIMARY index, leading to the EXPLAIN output in mariadb_explain1.png attached to this issue.

      When I add "FORCE INDEX FOR JOIN (PRIMARY)" after "g2" in the query, MariaDB joins the table correctly; see the EXPLAIN in mariadb_explain2.png attached to this issue.

      I switched off all optimizer switches introduced since MariaDB 10.2, which did not solve the problem, so I am guessing this must be some other kind of bug. As you can see from the EXPLAIN, MariaDB knows it could use the PRIMARY index, yet decides against it in favor of "Using where; Using join buffer", which is unfortunately orders of magnitude slower.

      Attachments

      Issue Links

      Activity

            This can be closed as fixed in 11.0

            psergei Sergei Petrunia added a comment

            OK, I've run both examples on current 11.0 (as was also done earlier). Both produce good results.

            Looking at the code, I see that the concern from this comment: https://jira.mariadb.org/browse/MDEV-21377?focusedCommentId=245678&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-245678 has been addressed as well: prev_record_reads() will not return values greater than record_count().

            psergei Sergei Petrunia added a comment
            psergei Sergei Petrunia added a comment - edited

            One other interesting observation.

            In best_access_path(), if I look at the return value of prev_record_reads() and compare it with record_count, there are plenty of cases where prev_record_reads() > record_count. This doesn't look logical: how can the number of distinct key values for ref access be bigger than the total number of incoming record combinations?

            Capping the return value of prev_record_reads() to be no larger than record_count causes quite a few changes in the query plans.
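
            The proposed capping can be sketched as follows (a simplified stand-in with a hypothetical helper name; the real logic lives in best_access_path() in the server source):

            ```cpp
            #include <algorithm>

            // Simplified stand-in for the capping discussed above (hypothetical
            // helper, not the server's actual code).
            //
            // prev_record_reads estimates how many distinct key lookups a ref
            // access will perform; record_count is the number of incoming row
            // combinations. Logically the number of lookups cannot exceed the
            // number of incoming combinations, so the estimate is capped.
            double capped_prev_record_reads(double prev_record_reads,
                                            double record_count)
            {
              return std::min(prev_record_reads, record_count);
            }
            ```

            With such a cap, an inconsistent estimate like prev_record_reads = 120 against record_count = 50 would be reduced to 50, while consistent estimates pass through unchanged.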


            I was studying the available testcases on 11.0.

            The testcase on the "mock dataset" is actually not a very good example, as the last table (g2) has just one row, which means the cost of a full scan isn't much different from that of ref access (although our formulas make it slightly higher).

            The "small testcase" with t1,t2,t3 is better.

            Why does bb-11.0 pick eq_ref there?
            Two reasons (one intentional, one accidental):

            The accidental reason is that a full table scan is not considered due to this heuristic in best_access_path():

                /*
                  Don't test table scan if it can't be better.
                  Prefer key lookup if we would use the same key for scanning.
                  ...
                */
                if ((best.records_read >= s->found_records ||
                     best.cost > s->read_time) &&                                      // (1)
            
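
            The effect of condition (1) can be sketched as a standalone predicate (a simplified, hypothetical sketch; the field names mirror the snippet above):

            ```cpp
            // Hypothetical sketch of condition (1) above: a full table scan is
            // only evaluated when it could plausibly beat the best key-based
            // access found so far.
            struct BestSoFar {
              double records_read;  // rows the best access method reads
              double cost;          // its estimated cost
            };

            bool consider_table_scan(const BestSoFar &best,
                                     double found_records,  // rows a scan reads
                                     double read_time)      // cost of a scan
            {
              // Test the scan only if the best access so far reads at least as
              // many rows as the scan would, or already costs more than the scan.
              return best.records_read >= found_records || best.cost > read_time;
            }
            ```

            In the small testcase, a cheap and selective eq_ref keeps this predicate false, so the scan is never even costed — which is why this heuristic accidentally contributes to the good plan.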

            The intentional reason is the change added by this commit:

            commit 93ad4e1612744809a3ef9c5263636fc14bed1ddf
            Author: Monty <monty@mariadb.org>
            Date:   Fri Dec 2 17:18:50 2022 +0200
             
                In best_access_path() change record_count to 1.0 if its less than 1.0.
            

            How does it help?

            Background info: This bug requires the query plan to have these properties:

            t1 
            t2  very_selective_condition
            t3  [eq_]ref access on t3.key=t1.col
            

            When considering access methods for table t3, the following will happen:

            • ref access: the optimizer uses prev_record_reads() to estimate how many eq_ref lookups will be made for table t3. Note that the return value of prev_record_reads(t3.key=t1.col) is not affected by the very_selective_condition.
            • full table scan: this multiplies the access cost by record_count.

            very_selective_condition makes record_count very low. (For the original example this is incorrect, as we're guaranteed to have at least one incoming row combination due to the outer join.)

            The very low record_count value is multiplied by the cost of the full scan, and the result is cheaper than the cost of eq_ref multiplied by the prev_record_reads() return value.
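
            The comparison described above can be illustrated with a toy cost model (all numbers and function names are hypothetical; the real formulas in best_access_path() are more involved):

            ```cpp
            #include <algorithm>

            // Toy cost model for the two access methods considered for t3.
            // All values are hypothetical illustrations, not the server's formulas.

            // eq_ref: one index lookup per distinct key value; the estimate from
            // prev_record_reads() is NOT scaled down by very_selective_condition.
            double ref_access_cost(double prev_record_reads, double lookup_cost)
            {
              return prev_record_reads * lookup_cost;
            }

            // Full table scan: repeated once per incoming row combination, so a
            // tiny record_count makes it look artificially cheap.
            double table_scan_cost(double record_count, double full_scan_cost)
            {
              return record_count * full_scan_cost;
            }

            // The fix from commit 93ad4e: assume at least one row combination
            // arrives (guaranteed here by the outer join).
            double clamped_record_count(double record_count)
            {
              return std::max(record_count, 1.0);
            }
            ```

            With, say, record_count = 0.01, a 50-unit full scan costs 0.5 and beats a 1-unit eq_ref lookup; after clamping record_count to 1.0, the scan costs 50 and eq_ref wins.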

            The above-mentioned patch,

             93ad4e In best_access_path() change record_count to 1.0 if its less than 1.0.
            

            will apparently fix this.

            psergei Sergei Petrunia added a comment

            Tried with the 11.0 tree, both with this tip cset:

            commit d64cd6c004d6056ba2ab49bb0b1fa09babebe813 (HEAD -> bb-11.0, origin/bb-11.0)
            Author: Monty <monty@mariadb.org>
            Date:   Fri Dec 2 17:18:50 2022 +0200
             
                In best_access_path() change record_count to 1.0 if its less than 1.0.
                
                In essence this means that we expect the user query to have at least
                one matching row in the end.
            

            and this:

            commit 9595df9921dff28310652c9b65e6a5e4f0ba5bb4
            Author: Monty <monty@mariadb.org>
            Date:   Mon Nov 28 15:02:34 2022 +0200
             
                Changed some startup warnings to notes
                
                - Changed 'WARNING' of type "You need to use --log-bin to make ... work"
                  to 'Note'
            

            In both cases, the required join order is produced and the table g2 is using eq_ref access.

            psergei Sergei Petrunia added a comment

            People

              psergei Sergei Petrunia
              iquito Andreas Leathley
              Votes: 3
              Watchers: 11
