Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-35280

Performance Issue on TPC-H Query 2

    XMLWordPrintable

Details

    Description

      For the query 2 in the TPC-H benchmark:

      SELECT s_acctbal,
             s_name,
             n_name,
             p_partkey,
             p_mfgr,
             s_address,
             s_phone,
             s_comment
      FROM PART,
           SUPPLIER,
           PARTSUPP,
           NATION,
           REGION
      WHERE p_partkey = ps_partkey
        AND s_suppkey = ps_suppkey
        AND p_size = 30
        AND p_type like '%STEEL'
        AND s_nationkey = n_nationkey
        AND n_regionkey = r_regionkey
        AND r_name = 'ASIA'
        AND ps_supplycost =
          (SELECT min(ps_supplycost)
           FROM PARTSUPP,
                SUPPLIER,
                NATION,
                REGION
           WHERE p_partkey = ps_partkey
             AND s_suppkey = ps_suppkey
             AND s_nationkey = n_nationkey
             AND n_regionkey = r_regionkey
             AND r_name = 'ASIA')
      ORDER BY s_acctbal DESC,
               n_name,
               s_name,
               p_partkey
      LIMIT 100;
      

      When executing with ANAYZE, its query plan and execution time is:

      $ time mysql -h 127.0.0.1 -uroot -proot tpch < 2.sql
      id      select_type     table   type    possible_keys   key     key_len ref     rows    r_rows  filtered  r_filtered      Extra
      1       PRIMARY NATION  ALL     PRIMARY,NATION_FK1      NULL    NULL    NULL    25      25.00   100.00    100.00  Using temporary; Using filesort
      1       PRIMARY REGION  eq_ref  PRIMARY PRIMARY 4       tpch.NATION.N_REGIONKEY 1       1.00    100.00    20.00   Using where
      1       PRIMARY SUPPLIER        ref     PRIMARY,SUPPLIER_FK1    SUPPLIER_FK1    4       tpch.NATION.N_NATIONKEY   399     400.60  100.00  100.00
      1       PRIMARY PARTSUPP        ref     PRIMARY,PARTSUPP_FK1    PARTSUPP_FK1    4       tpch.SUPPLIER.S_SUPPKEY   78      80.00   100.00  100.00
      1       PRIMARY PART    eq_ref  PRIMARY PRIMARY 4       tpch.PARTSUPP.PS_PARTKEY        1       1.00      100.00  0.30    Using where
      2       DEPENDENT SUBQUERY      PARTSUPP        ref     PRIMARY,PARTSUPP_FK1    PRIMARY 4       tpch.PART.P_PARTKEY       3       4.00    100.00  100.00
      2       DEPENDENT SUBQUERY      SUPPLIER        eq_ref  PRIMARY,SUPPLIER_FK1    PRIMARY 4       tpch.PARTSUPP.PS_SUPPKEY  1       1.00    100.00  100.00
      2       DEPENDENT SUBQUERY      NATION  eq_ref  PRIMARY,NATION_FK1      PRIMARY 4       tpch.SUPPLIER.S_NATIONKEY 1       1.00    100.00  100.00
      2       DEPENDENT SUBQUERY      REGION  eq_ref  PRIMARY PRIMARY 4       tpch.NATION.N_REGIONKEY 11.00     100.00  39.16   Using where
       
       0.00s user 0.00s system 0% cpu 6.641 total
      

      While if we disable the default optimizer cost:

      diff --git a/sql/keycaches.cc b/sql/keycaches.cc
      index b325f1558c5..f761e838f0f 100644
      --- a/sql/keycaches.cc
      +++ b/sql/keycaches.cc
      @@ -269,8 +269,6 @@ OPTIMIZER_COSTS heap_optimizer_costs, tmp_table_optimizer_costs;
       
       OPTIMIZER_COSTS *get_optimizer_costs(const LEX_CSTRING *cache_name)
       {
      -  if (!cache_name->length)
      -    return &default_optimizer_costs;
         return ((OPTIMIZER_COSTS*) find_named(&linked_optimizer_costs,
                                               cache_name->str, cache_name->length,
                                               0));
       
      The execution time is significantly reduced:
       
      id      select_type     table   type    possible_keys   key     key_len ref     rows    r_rows  filtered  r_filtered      Extra
      1       PRIMARY PART    ALL     PRIMARY NULL    NULL    NULL    198101  200000.00       100.00  0.41      Using where; Using temporary; Using filesort
      1       PRIMARY PARTSUPP        ref     PRIMARY,PARTSUPP_FK1    PRIMARY 4       tpch.PART.P_PARTKEY       3       4.00    100.00  14.68   Using where
      1       PRIMARY SUPPLIER        eq_ref  PRIMARY,SUPPLIER_FK1    PRIMARY 4       tpch.PARTSUPP.PS_SUPPKEY  1       1.00    100.00  100.00
      1       PRIMARY NATION  ALL     PRIMARY,NATION_FK1      NULL    NULL    NULL    25      25.00   4.00      4.00    Using where; Using join buffer (flat, BNL join)
      1       PRIMARY REGION  ALL     PRIMARY NULL    NULL    NULL    5       5.00    20.00   20.00   Using where; Using join buffer (incremental, BNL join)
      2       DEPENDENT SUBQUERY      PARTSUPP        ref     PRIMARY,PARTSUPP_FK1    PRIMARY 4       tpch.PART.P_PARTKEY       3       4.00    100.00  100.00
      2       DEPENDENT SUBQUERY      SUPPLIER        eq_ref  PRIMARY,SUPPLIER_FK1    PRIMARY 4       tpch.PARTSUPP.PS_SUPPKEY  1       1.00    100.00  100.00
      2       DEPENDENT SUBQUERY      NATION  eq_ref  PRIMARY,NATION_FK1      PRIMARY 4       tpch.SUPPLIER.S_NATIONKEY 1       1.00    100.00  100.00
      2       DEPENDENT SUBQUERY      REGION  eq_ref  PRIMARY PRIMARY 4       tpch.NATION.N_REGIONKEY 11.00     100.00  19.64   Using where
       
      0.00s user 0.01s system 9% cpu 0.065 total
      

      The performance seems unexpected. I am not proposing a patch to directly modify the code as the patch is incorrect overall. I am wondering for this query, whether we should not use the default optimizer cost to make the performance better?

      I used 1 GB data for TPC-H benchmark. Configuring TPC-H requires some effort, so I also attach my entire database for your reference: https://drive.google.com/file/d/1PQF7FOHu2VQYUC9aUG69M2u_7BgNCeN_/view?usp=sharing
      The username is root. and the password is root.

      Attachments

        Issue Links

          Activity

            People

              psergei Sergei Petrunia
              bajinsheng Jinsheng Ba
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.