Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30514

Different plan and slower query with histogram type JSON_HB

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Won't Fix
    • 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11
    • N/A
    • Optimizer

    Description

      Set to critical, because in 11.0 JSON_HB is planned to become a default value, so it will become a regression. At least its extent needs to be analyzed.

      CREATE DATABASE dbt3;
      USE dbt3;
       
      set default_storage_engine=Aria;
       
      --source include/dbt3_s001.inc
       
      ANALYZE TABLE part, partsupp, supplier, lineitem, orders, customer PERSISTENT FOR ALL;
       
      --start_timer
      SELECT COUNT(*) AS cnt FROM part LEFT JOIN partsupp ON ( p_partkey = ps_partkey ) JOIN supplier ON (s_suppkey = ps_suppkey) LEFT JOIN lineitem ON ( ps_suppkey = l_suppkey ) JOIN orders ON ( l_orderkey = o_orderkey ) JOIN customer ON ( o_custkey = c_custkey ) WHERE ps_partkey < 130 AND ps_comment IS NOT NULL HAVING cnt < 0;
      --end_timer
       
      EXPLAIN EXTENDED
      SELECT COUNT(*) AS cnt FROM part LEFT JOIN partsupp ON ( p_partkey = ps_partkey ) JOIN supplier ON (s_suppkey = ps_suppkey) LEFT JOIN lineitem ON ( ps_suppkey = l_suppkey ) JOIN orders ON ( l_orderkey = o_orderkey ) JOIN customer ON ( o_custkey = c_custkey ) WHERE ps_partkey < 130 AND ps_comment IS NOT NULL HAVING cnt < 0;
       
      DROP DATABASE dbt3;
      

      With DOUBLE_PREC_HB, the query takes 0.14 sec on my machine, on 10.8.6 release build. With JSON_HB it is 1.8 sec.

      Plans:

      10.8.6 with DOUBLE_PREC_HB

      +------+-------------+----------+--------+--------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+--------------------------------------------------------------+
      | id   | select_type | table    | type   | possible_keys                                          | key         | key_len | ref                                         | rows | filtered | Extra                                                        |
      +------+-------------+----------+--------+--------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+--------------------------------------------------------------+
      |    1 | SIMPLE      | supplier | index  | PRIMARY                                                | PRIMARY     | 4       | NULL                                        | 10   |   100.00 | Using index                                                  |
      |    1 | SIMPLE      | lineitem | ref    | PRIMARY,i_l_suppkey,i_l_orderkey,i_l_orderkey_quantity | i_l_suppkey | 5       | test.supplier.s_suppkey                     | 600  |   100.00 |                                                              |
      |    1 | SIMPLE      | orders   | eq_ref | PRIMARY,i_o_custkey                                    | PRIMARY     | 4       | test.lineitem.l_orderkey                    | 1    |   100.00 | Using where                                                  |
      |    1 | SIMPLE      | customer | eq_ref | PRIMARY                                                | PRIMARY     | 4       | test.orders.o_custkey                       | 1    |   100.00 | Using index                                                  |
      |    1 | SIMPLE      | part     | range  | PRIMARY                                                | PRIMARY     | 4       | NULL                                        | 129  |   100.00 | Using where; Using index; Using join buffer (flat, BNL join) |
      |    1 | SIMPLE      | partsupp | eq_ref | PRIMARY,i_ps_partkey,i_ps_suppkey                      | PRIMARY     | 8       | test.part.p_partkey,test.supplier.s_suppkey | 1    |    99.22 | Using where                                                  |
      +------+-------------+----------+--------+--------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+--------------------------------------------------------------+
      | Note  | 1003 | select count(0) AS `cnt` from `test`.`part` join `test`.`partsupp` join `test`.`supplier` join `test`.`lineitem` join `test`.`orders` join `test`.`customer` where `test`.`customer`.`c_custkey` = `test`.`orders`.`o_custkey` and `test`.`orders`.`o_orderkey` = `test`.`lineitem`.`l_orderkey` and `test`.`lineitem`.`l_suppkey` = `test`.`supplier`.`s_suppkey` and `test`.`partsupp`.`ps_suppkey` = `test`.`supplier`.`s_suppkey` and `test`.`partsupp`.`ps_partkey` = `test`.`part`.`p_partkey` and `test`.`part`.`p_partkey` < 130 and `test`.`partsupp`.`ps_comment` is not null |
      

      10.8.6 with JSON_HB

      +------+-------------+----------+--------+--------------------------------------------------------+--------------+---------+--------------------------+------+----------+-------------+
      | id   | select_type | table    | type   | possible_keys                                          | key          | key_len | ref                      | rows | filtered | Extra       |
      +------+-------------+----------+--------+--------------------------------------------------------+--------------+---------+--------------------------+------+----------+-------------+
      |    1 | SIMPLE      | supplier | index  | PRIMARY                                                | PRIMARY      | 4       | NULL                     | 10   |   100.00 | Using index |
      |    1 | SIMPLE      | partsupp | ref    | PRIMARY,i_ps_partkey,i_ps_suppkey                      | i_ps_suppkey | 4       | test.supplier.s_suppkey  | 70   |    40.00 | Using where |
      |    1 | SIMPLE      | part     | eq_ref | PRIMARY                                                | PRIMARY      | 4       | test.partsupp.ps_partkey | 1    |   100.00 | Using index |
      |    1 | SIMPLE      | lineitem | ref    | PRIMARY,i_l_suppkey,i_l_orderkey,i_l_orderkey_quantity | i_l_suppkey  | 5       | test.supplier.s_suppkey  | 600  |   100.00 |             |
      |    1 | SIMPLE      | orders   | eq_ref | PRIMARY,i_o_custkey                                    | PRIMARY      | 4       | test.lineitem.l_orderkey | 1    |   100.00 | Using where |
      |    1 | SIMPLE      | customer | eq_ref | PRIMARY                                                | PRIMARY      | 4       | test.orders.o_custkey    | 1    |   100.00 | Using index |
      +------+-------------+----------+--------+--------------------------------------------------------+--------------+---------+--------------------------+------+----------+-------------+
      | Note  | 1003 | select count(0) AS `cnt` from `test`.`part` join `test`.`partsupp` join `test`.`supplier` join `test`.`lineitem` join `test`.`orders` join `test`.`customer` where `test`.`customer`.`c_custkey` = `test`.`orders`.`o_custkey` and `test`.`orders`.`o_orderkey` = `test`.`lineitem`.`l_orderkey` and `test`.`partsupp`.`ps_suppkey` = `test`.`supplier`.`s_suppkey` and `test`.`lineitem`.`l_suppkey` = `test`.`supplier`.`s_suppkey` and `test`.`part`.`p_partkey` = `test`.`partsupp`.`ps_partkey` and `test`.`partsupp`.`ps_partkey` < 130 and `test`.`partsupp`.`ps_comment` is not null |
      

      Attachments

        Activity

          On the current bb-11.0 tree, I get the same join order with JSON_HB and DOUBLE_PREC_HB.
          It matches the join order that the report says 10.8 produces with JSON_HB:

          +------+-------------+----------+--------+--------------------------------------------------------+--------------+---------+-----------------------------+------+----------+-------------+
          | id   | select_type | table    | type   | possible_keys                                          | key          | key_len | ref                         | rows | filtered | Extra       |
          +------+-------------+----------+--------+--------------------------------------------------------+--------------+---------+-----------------------------+------+----------+-------------+
          |    1 | SIMPLE      | supplier | index  | PRIMARY                                                | PRIMARY      | 4       | NULL                        | 10   |   100.00 | Using index |
          |    1 | SIMPLE      | partsupp | ref    | PRIMARY,i_ps_partkey,i_ps_suppkey                      | i_ps_suppkey | 4       | dbt3old.supplier.s_suppkey  | 70   |    39.69 | Using where |
          |    1 | SIMPLE      | part     | eq_ref | PRIMARY                                                | PRIMARY      | 4       | dbt3old.partsupp.ps_partkey | 1    |   100.00 | Using index |
          |    1 | SIMPLE      | lineitem | ref    | PRIMARY,i_l_suppkey,i_l_orderkey,i_l_orderkey_quantity | i_l_suppkey  | 5       | dbt3old.supplier.s_suppkey  | 600  |   100.00 |             |
          |    1 | SIMPLE      | orders   | eq_ref | PRIMARY,i_o_custkey                                    | PRIMARY      | 4       | dbt3old.lineitem.l_orderkey | 1    |   100.00 | Using where |
          |    1 | SIMPLE      | customer | eq_ref | PRIMARY                                                | PRIMARY      | 4       | dbt3old.orders.o_custkey    | 1    |   100.00 | Using index |
          +------+-------------+----------+--------+--------------------------------------------------------+--------------+---------+-----------------------------+------+----------+-------------+
          

          It takes about 10 seconds on my machine on debug build.

          psergei Sergei Petrunia added a comment - On the current bb-11.0 tree, I get the same join order with JSON_HB and DOUBLE_PREC_HB. It matches the join order that the report says 10.8 produces with JSON_HB: +------+-------------+----------+--------+--------------------------------------------------------+--------------+---------+-----------------------------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+----------+--------+--------------------------------------------------------+--------------+---------+-----------------------------+------+----------+-------------+ | 1 | SIMPLE | supplier | index | PRIMARY | PRIMARY | 4 | NULL | 10 | 100.00 | Using index | | 1 | SIMPLE | partsupp | ref | PRIMARY,i_ps_partkey,i_ps_suppkey | i_ps_suppkey | 4 | dbt3old.supplier.s_suppkey | 70 | 39.69 | Using where | | 1 | SIMPLE | part | eq_ref | PRIMARY | PRIMARY | 4 | dbt3old.partsupp.ps_partkey | 1 | 100.00 | Using index | | 1 | SIMPLE | lineitem | ref | PRIMARY,i_l_suppkey,i_l_orderkey,i_l_orderkey_quantity | i_l_suppkey | 5 | dbt3old.supplier.s_suppkey | 600 | 100.00 | | | 1 | SIMPLE | orders | eq_ref | PRIMARY,i_o_custkey | PRIMARY | 4 | dbt3old.lineitem.l_orderkey | 1 | 100.00 | Using where | | 1 | SIMPLE | customer | eq_ref | PRIMARY | PRIMARY | 4 | dbt3old.orders.o_custkey | 1 | 100.00 | Using index | +------+-------------+----------+--------+--------------------------------------------------------+--------------+---------+-----------------------------+------+----------+-------------+ It takes about 10 seconds on my machine on debug build.

          I could reproduce the issue on 10.8.

          psergei Sergei Petrunia added a comment - I could reproduce the issue on 10.8.

          Conditions that could be affected by histogram selectivity:

          • partsupp.ps_partkey < 130 – not using histograms, because there is an index. Range analyzer provides the estimate.
          • partsupp.ps_comment is not null – Does use histograms..

          DOUBLE_PREC_HB: selectivity_from_histogram": 0.9921875
          JSON_HB: selectivity_from_histogram": 1

          In reality, the table has 700 rows and 0 rows have "ps_comment is null".

          (in both cases, mysql.column_stats.nulls_ratio=0.000, I'm not sure where DOUBLE_PREC_HB gets 0.9921875 value from... 1/ (1- 0.9921875) = 128...

          In any case, if the estimate is off by 1/128th, this doesn't look like a meaningful reason for 10x increase in query speed..

          psergei Sergei Petrunia added a comment - Conditions that could be affected by histogram selectivity: partsupp.ps_partkey < 130 – not using histograms, because there is an index. Range analyzer provides the estimate. partsupp.ps_comment is not null – Does use histograms.. DOUBLE_PREC_HB: selectivity_from_histogram": 0.9921875 JSON_HB: selectivity_from_histogram": 1 In reality, the table has 700 rows and 0 rows have "ps_comment is null". (in both cases, mysql.column_stats.nulls_ratio=0.000, I'm not sure where DOUBLE_PREC_HB gets 0.9921875 value from... 1/ (1- 0.9921875) = 128... In any case, if the estimate is off by 1/128th, this doesn't look like a meaningful reason for 10x increase in query speed..

          Trying the good join order on bb-11.0: mdev30514-bb-11.0-try-good-join-order.txt

          It is also faster than what the optimizer chooses by default.
          However, table "part" uses the join buffer. The optimizer doesn't take the use of join buffer into account (either in 11.0 or in 10.8, it's a long known deficiency in best_access_path).

          If I set join_cache_level=0 to make the execution match what the optimizer estimates, the "good" join order becomes slower than the join order that the optimizer picks by default.

          psergei Sergei Petrunia added a comment - Trying the good join order on bb-11.0: mdev30514-bb-11.0-try-good-join-order.txt It is also faster than what the optimizer chooses by default. However, table "part" uses the join buffer. The optimizer doesn't take the use of join buffer into account (either in 11.0 or in 10.8, it's a long known deficiency in best_access_path). If I set join_cache_level=0 to make the execution match what the optimizer estimates, the "good" join order becomes slower than the join order that the optimizer picks by default.

          People

            psergei Sergei Petrunia
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.