[MDEV-6015] Measure impact of optimizer_switch='exists_to_in=on' in 10.0 Created: 2014-04-02  Updated: 2014-04-11  Resolved: 2014-04-11

Status: Closed
Project: MariaDB Server
Component/s: None
Fix Version/s: None

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Axel Schwenke
Resolution: Fixed Votes: 0
Labels: None

Attachments: Text File 04-opt1_explain.sql_1_results.txt     Text File 04-opt2_explain.sql_1_results.txt     Text File dbt3-q22.txt     PNG File q4-mrr-bka-real.png     PNG File q4-mrr-real-no-exists2in-io-only.png     PNG File q4-mrr-real-no-exists2in.png    

 Description   

See https://mariadb.atlassian.net/browse/MDEV-38?focusedCommentId=28609&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-28609.

I am looking at Q4.

ts ( select * from lineitem where l_orderkey = o_orderkey and l_commitdate < l_receiptdate ) group by o_orderpriority order by o_orderpriority;
+------+--------------------+----------+-------+--------------------------------------------+---------------+---------+---------------------------+--------+----------+---------------------------------------------------------------------+
| id   | select_type        | table    | type  | possible_keys                              | key           | key_len | ref                       | rows   | filtered | Extra                                                               |
+------+--------------------+----------+-------+--------------------------------------------+---------------+---------+---------------------------+--------+----------+---------------------------------------------------------------------+
|    1 | PRIMARY            | orders   | range | i_o_orderdate                              | i_o_orderdate | 4       | NULL                      | 137994 |   100.00 | Using index condition; Using where; Using temporary; Using filesort |
|    2 | DEPENDENT SUBQUERY | lineitem | ref   | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY       | 4       | dbt3sf1.orders.o_orderkey |      2 |   100.00 | Using where                                                         |
+------+--------------------+----------+-------+--------------------------------------------+---------------+---------+---------------------------+--------+----------+---------------------------------------------------------------------+
2 rows in set, 2 warnings (0.00 sec)
 
MariaDB [dbt3sf1]> set optimizer_switch='exists_to_in=on';
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [dbt3sf1]> explain extended select o_orderpriority, count(*) as order_count from orders where o_orderdate >= '1995-06-06' and o_orderdate < date_add('1995-06-06', interval 3 month) and exists ( select * from lineitem where l_orderkey = o_orderkey and l_commitdate < l_receiptdate ) group by o_orderpriority order by o_orderpriority;
+------+-------------+----------+-------+--------------------------------------------+---------------+---------+---------------------------+--------+----------+--------------------------------------------------------+
| id   | select_type | table    | type  | possible_keys                              | key           | key_len | ref                       | rows   | filtered | Extra                                                  |
+------+-------------+----------+-------+--------------------------------------------+---------------+---------+---------------------------+--------+----------+--------------------------------------------------------+
|    1 | PRIMARY     | orders   | range | PRIMARY,i_o_orderdate                      | i_o_orderdate | 4       | NULL                      | 137994 |   100.00 | Using index condition; Using temporary; Using filesort |
|    1 | PRIMARY     | lineitem | ref   | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY       | 4       | dbt3sf1.orders.o_orderkey |      2 |   100.00 | Using where; FirstMatch(orders)                        |
+------+-------------+----------+-------+--------------------------------------------+---------------+---------+---------------------------+--------+----------+--------------------------------------------------------+
2 rows in set, 2 warnings (0.00 sec)

The query plan is similar. However, the second query should be able to use BKA. Run the query with and without BKA and measure the impact.



 Comments   
Comment by Sergei Petrunia [ 2014-04-02 ]

The Q4 variant I am using:

select
	o_orderpriority,
	count(*) as order_count
from
	orders
where
	o_orderdate >= '1995-06-06'
	and o_orderdate < date_add('1995-06-06', interval 3 month)
	and exists (
		select
			*
		from
			lineitem
		where
			l_orderkey = o_orderkey
			and l_commitdate < l_receiptdate
	)
group by
	o_orderpriority
order by
	o_orderpriority;

Comment by Sergei Petrunia [ 2014-04-02 ]

Query Q22, EXPLAINs with exists_to_in=on|off.

The query uses WHERE .... AND NOT EXISTS(SELECT * FROM orders WHERE o_custkey = c_custkey).

It seems, that the best way run the subquery is to use index lookup on "o_custkey = c_custkey". This is used for both exists_to_in=ON and exists_to_in=OFF.
BKA is not possible in either case, because the subquery uses NOT EXISTS. It is not a semi-join.

Comment by Axel Schwenke [ 2014-04-03 ]

Results from running DBT3/Q4/SF30 on MariaDB-10.0.10 with exits-to-in either on or off. Execution times:

exists_to_in=off -> 284, 276, 280 (sec)
exists_to_in=on -> 261, 255, 257 (sec)

EXPLAIN plans attached.

Comment by Sergei Petrunia [ 2014-04-05 ]

Thanks.

Looking at the EXPLAINs. The query plans have the same structure. Exists-to-in conversion has enabled BKA for table lineitem. However, it didn't bring a lot of speedup.

Comment by Sergei Petrunia [ 2014-04-11 ]

Graphic trace of query's io when running with scale=1, cold buffer cache.

Comment by Sergei Petrunia [ 2014-04-11 ]

IO trace for exists_to_in=off. Things to note
1. there is no BKA (reads go back and forth)
2 the data seems to be highly correlated - lookups in lineitem are linear.

Generated at Thu Feb 08 07:08:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.