Details

    Description

      For the query 2 in the TPC-H benchmark:

      SELECT s_acctbal,
             s_name,
             n_name,
             p_partkey,
             p_mfgr,
             s_address,
             s_phone,
             s_comment
      FROM PART,
           SUPPLIER,
           PARTSUPP,
           NATION,
           REGION
      WHERE p_partkey = ps_partkey
        AND s_suppkey = ps_suppkey
        AND p_size = 30
        AND p_type like '%STEEL'
        AND s_nationkey = n_nationkey
        AND n_regionkey = r_regionkey
        AND r_name = 'ASIA'
        AND ps_supplycost =
          (SELECT min(ps_supplycost)
           FROM PARTSUPP,
                SUPPLIER,
                NATION,
                REGION
           WHERE p_partkey = ps_partkey
             AND s_suppkey = ps_suppkey
             AND s_nationkey = n_nationkey
             AND n_regionkey = r_regionkey
             AND r_name = 'ASIA')
      ORDER BY s_acctbal DESC,
               n_name,
               s_name,
               p_partkey
      LIMIT 100;
      

      When executing with ANAYZE, its query plan and execution time is:

      $ time mysql -h 127.0.0.1 -uroot -proot tpch < 2.sql
      id      select_type     table   type    possible_keys   key     key_len ref     rows    r_rows  filtered  r_filtered      Extra
      1       PRIMARY NATION  ALL     PRIMARY,NATION_FK1      NULL    NULL    NULL    25      25.00   100.00    100.00  Using temporary; Using filesort
      1       PRIMARY REGION  eq_ref  PRIMARY PRIMARY 4       tpch.NATION.N_REGIONKEY 1       1.00    100.00    20.00   Using where
      1       PRIMARY SUPPLIER        ref     PRIMARY,SUPPLIER_FK1    SUPPLIER_FK1    4       tpch.NATION.N_NATIONKEY   399     400.60  100.00  100.00
      1       PRIMARY PARTSUPP        ref     PRIMARY,PARTSUPP_FK1    PARTSUPP_FK1    4       tpch.SUPPLIER.S_SUPPKEY   78      80.00   100.00  100.00
      1       PRIMARY PART    eq_ref  PRIMARY PRIMARY 4       tpch.PARTSUPP.PS_PARTKEY        1       1.00      100.00  0.30    Using where
      2       DEPENDENT SUBQUERY      PARTSUPP        ref     PRIMARY,PARTSUPP_FK1    PRIMARY 4       tpch.PART.P_PARTKEY       3       4.00    100.00  100.00
      2       DEPENDENT SUBQUERY      SUPPLIER        eq_ref  PRIMARY,SUPPLIER_FK1    PRIMARY 4       tpch.PARTSUPP.PS_SUPPKEY  1       1.00    100.00  100.00
      2       DEPENDENT SUBQUERY      NATION  eq_ref  PRIMARY,NATION_FK1      PRIMARY 4       tpch.SUPPLIER.S_NATIONKEY 1       1.00    100.00  100.00
      2       DEPENDENT SUBQUERY      REGION  eq_ref  PRIMARY PRIMARY 4       tpch.NATION.N_REGIONKEY 11.00     100.00  39.16   Using where
       
       0.00s user 0.00s system 0% cpu 6.641 total
      

      While if we disable the default optimizer cost:

      diff --git a/sql/keycaches.cc b/sql/keycaches.cc
      index b325f1558c5..f761e838f0f 100644
      --- a/sql/keycaches.cc
      +++ b/sql/keycaches.cc
      @@ -269,8 +269,6 @@ OPTIMIZER_COSTS heap_optimizer_costs, tmp_table_optimizer_costs;
       
       OPTIMIZER_COSTS *get_optimizer_costs(const LEX_CSTRING *cache_name)
       {
      -  if (!cache_name->length)
      -    return &default_optimizer_costs;
         return ((OPTIMIZER_COSTS*) find_named(&linked_optimizer_costs,
                                               cache_name->str, cache_name->length,
                                               0));
       
      The execution time is significantly reduced:
       
      id      select_type     table   type    possible_keys   key     key_len ref     rows    r_rows  filtered  r_filtered      Extra
      1       PRIMARY PART    ALL     PRIMARY NULL    NULL    NULL    198101  200000.00       100.00  0.41      Using where; Using temporary; Using filesort
      1       PRIMARY PARTSUPP        ref     PRIMARY,PARTSUPP_FK1    PRIMARY 4       tpch.PART.P_PARTKEY       3       4.00    100.00  14.68   Using where
      1       PRIMARY SUPPLIER        eq_ref  PRIMARY,SUPPLIER_FK1    PRIMARY 4       tpch.PARTSUPP.PS_SUPPKEY  1       1.00    100.00  100.00
      1       PRIMARY NATION  ALL     PRIMARY,NATION_FK1      NULL    NULL    NULL    25      25.00   4.00      4.00    Using where; Using join buffer (flat, BNL join)
      1       PRIMARY REGION  ALL     PRIMARY NULL    NULL    NULL    5       5.00    20.00   20.00   Using where; Using join buffer (incremental, BNL join)
      2       DEPENDENT SUBQUERY      PARTSUPP        ref     PRIMARY,PARTSUPP_FK1    PRIMARY 4       tpch.PART.P_PARTKEY       3       4.00    100.00  100.00
      2       DEPENDENT SUBQUERY      SUPPLIER        eq_ref  PRIMARY,SUPPLIER_FK1    PRIMARY 4       tpch.PARTSUPP.PS_SUPPKEY  1       1.00    100.00  100.00
      2       DEPENDENT SUBQUERY      NATION  eq_ref  PRIMARY,NATION_FK1      PRIMARY 4       tpch.SUPPLIER.S_NATIONKEY 1       1.00    100.00  100.00
      2       DEPENDENT SUBQUERY      REGION  eq_ref  PRIMARY PRIMARY 4       tpch.NATION.N_REGIONKEY 11.00     100.00  19.64   Using where
       
      0.00s user 0.01s system 9% cpu 0.065 total
      

      The performance seems unexpected. I am not proposing a patch to directly modify the code as the patch is incorrect overall. I am wondering for this query, whether we should not use the default optimizer cost to make the performance better?

      I used 1 GB data for TPC-H benchmark. Configuring TPC-H requires some effort, so I also attach my entire database for your reference: https://drive.google.com/file/d/1PQF7FOHu2VQYUC9aUG69M2u_7BgNCeN_/view?usp=sharing
      The username is root. and the password is root.

      Attachments

        Issue Links

          Activity

            bajinsheng Jinsheng Ba added a comment - - edited

            I also observed a similar performance improvement in Query 8:

            select
                o_year,
                sum(
                    case
                        when nation = 'INDIA' then volume
                        else 0
                    end
                ) / sum(volume) as mkt_share
            from
                (
                    select
                        extract(
                            year
                            from
                                o_orderdate
                        ) as o_year,
                        l_extendedprice * (1 - l_discount) as volume,
                        n2.n_name as nation
                    from
                        PART,
                        SUPPLIER,
                        LINEITEM,
                        ORDERS,
                        CUSTOMER,
                        NATION n1,
                        NATION n2,
                        REGION
                    where
                        p_partkey = l_partkey
                        and s_suppkey = l_suppkey
                        and l_orderkey = o_orderkey
                        and o_custkey = c_custkey
                        and c_nationkey = n1.n_nationkey
                        and n1.n_regionkey = r_regionkey
                        and r_name = 'ASIA'
                        and s_nationkey = n2.n_nationkey
                        and o_orderdate between date '1995-01-01'
                        and date '1996-12-31'
                        and p_type = 'SMALL PLATED COPPER'
                ) as all_nations
            group by
                o_year
            order by
                o_year;
            

            Before:

            id	select_type	table	type	possible_keys	key	key_len	ref	rows	r_rows	filtered	r_filtered	Extra
            1	SIMPLE	ORDERS	ALL	PRIMARY,ORDERS_FK1	NULL	NULL	NULL	1488600	1500000.00	100.00	30.48	Using where; Using temporary; Using filesort
            1	SIMPLE	CUSTOMER	eq_ref	PRIMARY,CUSTOMER_FK1	PRIMARY	4	tpch.ORDERS.O_CUSTKEY	1	1.00	100.00	100.00	
            1	SIMPLE	n1	eq_ref	PRIMARY,NATION_FK1	PRIMARY	4	tpch.CUSTOMER.C_NATIONKEY	1	1.00	100.00	100.00	
            1	SIMPLE	REGION	eq_ref	PRIMARY	PRIMARY	4	tpch.n1.N_REGIONKEY	1	1.00	100.00	20.06	Using where
            1	SIMPLE	LINEITEM	ref	PRIMARY,LINEITEM_FK2	PRIMARY	4	tpch.ORDERS.O_ORDERKEY	3	4.00	100.00	100.00	
            1	SIMPLE	PART	eq_ref	PRIMARY	PRIMARY	4	tpch.LINEITEM.L_PARTKEY	1	1.00	100.00	0.69	Using where
            1	SIMPLE	SUPPLIER	eq_ref	PRIMARY,SUPPLIER_FK1	PRIMARY	4	tpch.LINEITEM.L_SUPPKEY	1	1.00	100.00	100.00	
            1	SIMPLE	n2	eq_ref	PRIMARY	PRIMARY	4	tpch.SUPPLIER.S_NATIONKEY	1	1.00	100.00	100.00	
             
            0.00s user 0.01s system 0% cpu 4.807 total
            

            After applying the patch:

            id	select_type	table	type	possible_keys	key	key_len	ref	rows	r_rows	filtered	r_filtered	Extra
            1	SIMPLE	ORDERS	ALL	PRIMARY,ORDERS_FK1	NULL	NULL	NULL	1488600	1500000.00	100.00	30.48	Using where; Using temporary; Using filesort
            1	SIMPLE	CUSTOMER	eq_ref	PRIMARY,CUSTOMER_FK1	PRIMARY	4	tpch.ORDERS.O_CUSTKEY	1	1.00	100.00	100.00	
            1	SIMPLE	n1	eq_ref	PRIMARY,NATION_FK1	PRIMARY	4	tpch.CUSTOMER.C_NATIONKEY	1	1.00	100.00	100.00	
            1	SIMPLE	REGION	ALL	PRIMARY	NULL	NULL	NULL	5	5.00	20.00	4.01	Using where; Using join buffer (flat, BNL join)
            1	SIMPLE	LINEITEM	ref	PRIMARY,LINEITEM_FK2	PRIMARY	4	tpch.ORDERS.O_ORDERKEY	3	4.00	100.00	100.00	
            1	SIMPLE	PART	eq_ref	PRIMARY	PRIMARY	4	tpch.LINEITEM.L_PARTKEY	1	1.00	100.00	0.69	Using where
            1	SIMPLE	SUPPLIER	eq_ref	PRIMARY,SUPPLIER_FK1	PRIMARY	4	tpch.LINEITEM.L_SUPPKEY	1	1.00	100.00	100.00	
            1	SIMPLE	n2	ALL	PRIMARY	NULL	NULL	NULL	25	25.00	4.00	4.00	Using where; Using join buffer (flat, BNL join)
             
            0.01s user 0.00s system 0% cpu 3.885 total
            

            bajinsheng Jinsheng Ba added a comment - - edited I also observed a similar performance improvement in Query 8: select o_year, sum ( case when nation = 'INDIA' then volume else 0 end ) / sum (volume) as mkt_share from ( select extract( year from o_orderdate ) as o_year, l_extendedprice * (1 - l_discount) as volume, n2.n_name as nation from PART, SUPPLIER, LINEITEM, ORDERS, CUSTOMER, NATION n1, NATION n2, REGION where p_partkey = l_partkey and s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between date '1995-01-01' and date '1996-12-31' and p_type = 'SMALL PLATED COPPER' ) as all_nations group by o_year order by o_year; Before: id select_type table type possible_keys key key_len ref rows r_rows filtered r_filtered Extra 1 SIMPLE ORDERS ALL PRIMARY,ORDERS_FK1 NULL NULL NULL 1488600 1500000.00 100.00 30.48 Using where; Using temporary; Using filesort 1 SIMPLE CUSTOMER eq_ref PRIMARY,CUSTOMER_FK1 PRIMARY 4 tpch.ORDERS.O_CUSTKEY 1 1.00 100.00 100.00 1 SIMPLE n1 eq_ref PRIMARY,NATION_FK1 PRIMARY 4 tpch.CUSTOMER.C_NATIONKEY 1 1.00 100.00 100.00 1 SIMPLE REGION eq_ref PRIMARY PRIMARY 4 tpch.n1.N_REGIONKEY 1 1.00 100.00 20.06 Using where 1 SIMPLE LINEITEM ref PRIMARY,LINEITEM_FK2 PRIMARY 4 tpch.ORDERS.O_ORDERKEY 3 4.00 100.00 100.00 1 SIMPLE PART eq_ref PRIMARY PRIMARY 4 tpch.LINEITEM.L_PARTKEY 1 1.00 100.00 0.69 Using where 1 SIMPLE SUPPLIER eq_ref PRIMARY,SUPPLIER_FK1 PRIMARY 4 tpch.LINEITEM.L_SUPPKEY 1 1.00 100.00 100.00 1 SIMPLE n2 eq_ref PRIMARY PRIMARY 4 tpch.SUPPLIER.S_NATIONKEY 1 1.00 100.00 100.00   0.00s user 0.01s system 0% cpu 4.807 total After applying the patch: id select_type table type possible_keys key key_len ref rows r_rows filtered r_filtered Extra 1 SIMPLE ORDERS ALL PRIMARY,ORDERS_FK1 NULL NULL NULL 1488600 1500000.00 100.00 30.48 Using where; Using temporary; Using filesort 1 SIMPLE CUSTOMER eq_ref PRIMARY,CUSTOMER_FK1 PRIMARY 4 tpch.ORDERS.O_CUSTKEY 1 1.00 100.00 100.00 1 SIMPLE n1 eq_ref PRIMARY,NATION_FK1 PRIMARY 4 tpch.CUSTOMER.C_NATIONKEY 1 1.00 100.00 100.00 1 SIMPLE REGION ALL PRIMARY NULL NULL NULL 5 5.00 20.00 4.01 Using where; Using join buffer (flat, BNL join) 1 SIMPLE LINEITEM ref PRIMARY,LINEITEM_FK2 PRIMARY 4 tpch.ORDERS.O_ORDERKEY 3 4.00 100.00 100.00 1 SIMPLE PART eq_ref PRIMARY PRIMARY 4 tpch.LINEITEM.L_PARTKEY 1 1.00 100.00 0.69 Using where 1 SIMPLE SUPPLIER eq_ref PRIMARY,SUPPLIER_FK1 PRIMARY 4 tpch.LINEITEM.L_SUPPKEY 1 1.00 100.00 100.00 1 SIMPLE n2 ALL PRIMARY NULL NULL NULL 25 25.00 4.00 4.00 Using where; Using join buffer (flat, BNL join)   0.01s user 0.00s system 0% cpu 3.885 total

            People

              psergei Sergei Petrunia
              bajinsheng Jinsheng Ba
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.