Details
-
Bug
-
Status: Open (View Workflow)
-
Minor
-
Resolution: Unresolved
-
10.1.24
-
None
-
GNU/Linux
Description
The following query pattern:
SELECT c1 FROM t WHERE i_eq_ref IN (c1....c1000) as R1 and i_ref IN (c1...c200) as R2
on an InnoDB table with around 200 million rows, i_eq_ref being the primary key and i_ref a secondary key with only barely more than 200 possible values (application-level access group identifiers, present on each row, so providing 200 values means the user is member of most groups, and they can see about the totality of the 200 million rows) makes query planner spend a lot of time, and ends up being using the largest amount of overall query execution time: EXPLAIN on a cold cache takes ~2 seconds, and actually running the query takes nearly the same amount of time (so query optimiser ends up picking the correct plan, it just takes too long to do so).
I would like to thank Stephane Varoqui for his huge help in getting me to understand this, and for providing me with a template on how this should be reported, as it is still quite a bit above my head. Hopefully it will all make sense to mariadb developers.
Cost R1 is cap by the cost of 1000 rows.
Cost of R2 have minimum constant cost of the size of the range 200(index dive cost) + SUM(index dives result).
For each index dive the reduction should be cancelled when it reach R1 cost if not the range can consume possibly in the worth case 200 IO on disk and end up picking the R1 anyway.
Back porting https://dev.mysql.com/worklog/task/?id=7170 could also help if index does not stay in memory the cost to add 200 index dive on disk is worth the cost of 1000 pk lookup.
How to reproduce (I stripped several columns which are not involved in the query, i_eq_ref is "uid" and i_ref is "security_uid"):
> show create table catalog\G
|
Create Table: CREATE TABLE `catalog` (
|
`uid` bigint(20) unsigned NOT NULL,
|
`security_uid` int(10) unsigned DEFAULT NULL,
|
`agent_security_uid` int(10) unsigned DEFAULT NULL,
|
`viewable_owner` varbinary(255) NOT NULL DEFAULT '',
|
`viewable_associate` varbinary(255) NOT NULL DEFAULT '',
|
`path` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT ''
|
PRIMARY KEY (`uid`),
|
KEY `security_uid` (`security_uid`),
|
KEY `viewable_owner` (`viewable_owner`),
|
KEY `viewable_associate` (`viewable_associate`),
|
KEY `Path` (`path`),
|
KEY `agent_security_uid` (`agent_security_uid`)
|
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
|
> show index from catalog\G
|
*************************** 1. row ***************************
|
Table: catalog
|
Non_unique: 0
|
Key_name: PRIMARY
|
Seq_in_index: 1
|
Column_name: uid
|
Collation: A
|
Cardinality: 212787844
|
Sub_part: NULL
|
Packed: NULL
|
Null:
|
Index_type: BTREE
|
Comment:
|
Index_comment:
|
*************************** 2. row ***************************
|
Table: catalog
|
Non_unique: 1
|
Key_name: security_uid
|
Seq_in_index: 1
|
Column_name: security_uid
|
Collation: A
|
Cardinality: 30148
|
Sub_part: NULL
|
Packed: NULL
|
Null: YES
|
Index_type: BTREE
|
Comment:
|
Index_comment:
|
*************************** 4. row ***************************
|
Table: catalog
|
Non_unique: 1
|
Key_name: viewable_owner
|
Seq_in_index: 1
|
Column_name: viewable_owner
|
Collation: A
|
Cardinality: 95980
|
Sub_part: NULL
|
Packed: NULL
|
Null:
|
Index_type: BTREE
|
Comment:
|
Index_comment:
|
*************************** 5. row ***************************
|
Table: catalog
|
Non_unique: 1
|
Key_name: viewable_associate
|
Seq_in_index: 1
|
Column_name: viewable_associate
|
Collation: A
|
Cardinality: 1364024
|
Sub_part: NULL
|
Packed: NULL
|
Null:
|
Index_type: BTREE
|
Comment:
|
Index_comment:
|
*************************** 7. row ***************************
|
Table: catalog
|
Non_unique: 1
|
Key_name: Path
|
Seq_in_index: 1
|
Column_name: path
|
Collation: A
|
Cardinality: 212787844
|
Sub_part: NULL
|
Packed: NULL
|
Null:
|
Index_type: BTREE
|
Comment:
|
Index_comment:
|
*************************** 25. row ***************************
|
Table: catalog
|
Non_unique: 1
|
Key_name: agent_security_uid
|
Seq_in_index: 1
|
Column_name: agent_security_uid
|
Collation: A
|
Cardinality: 2
|
Sub_part: NULL
|
Packed: NULL
|
Null: YES
|
Index_type: BTREE
|
Comment:
|
Index_comment:
|
All the timeings below are with a hot cache (to be fair to each measure), but in reality this query will virtually always run on cold cache: it is scanning a pre-determined set of uid values, so each query will pull a fresh set of rows. With around 30 such queries running concurrently, "explain" and execution time increase to around 20 seconds (io starvation ?).
Explain without "force index":
+------+-------------+-------------------------------------------------+-------+---------------------------------------------------------------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------------------------------------------------+-------+---------------------------------------------------------------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
|
| 1 | SIMPLE | catalog | range | PRIMARY,security_uid,viewable_owner,viewable_associate,agent_security_uid | PRIMARY | 8 | NULL | 1000 | Using where; Using temporary |
|
| 1 | SIMPLE | related_destination_section_role_uid_category | ref | PRIMARY,Membership | PRIMARY | 8 | foobarbazprod2.catalog.uid | 2 | Using where; Using index |
|
| 1 | SIMPLE | related_destination_section_role_uid_1_category | ref | PRIMARY | PRIMARY | 8 | foobarbazprod2.related_destination_section_role_uid_category.category_uid | 2 | Using where; Using index |
|
| 1 | SIMPLE | related_strict_specialise_uid_1_category | ref | PRIMARY | PRIMARY | 8 | foobarbazprod2.catalog.uid | 2 | Using where; Using index |
|
+------+-------------+-------------------------------------------------+-------+---------------------------------------------------------------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
|
4 rows in set (0.77 sec)
|
Select without "force index": 1.22 sec
Explain with "force index (PRIMARY)", which is the column the optimiser ended up picking above:
|
+------+-------------+-------------------------------------------------+-------+--------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------------------------------------------------+-------+--------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
|
| 1 | SIMPLE | catalog | range | PRIMARY | PRIMARY | 8 | NULL | 1000 | Using where; Using temporary |
|
| 1 | SIMPLE | related_destination_section_role_uid_category | ref | PRIMARY,Membership | PRIMARY | 8 | foobarbazprod2.catalog.uid | 2 | Using where; Using index |
|
| 1 | SIMPLE | related_destination_section_role_uid_1_category | ref | PRIMARY | PRIMARY | 8 | foobarbazprod2.related_destination_section_role_uid_category.category_uid | 2 | Using where; Using index |
|
| 1 | SIMPLE | related_strict_specialise_uid_1_category | ref | PRIMARY | PRIMARY | 8 | foobarbazprod2.catalog.uid | 2 | Using where; Using index |
|
+------+-------------+-------------------------------------------------+-------+--------------------+---------+---------+---------------------------------------------------------------------------+------+------------------------------+
|
4 rows in set (0.00 sec)
|
Select with "force index (PRIMARY)": 0.04 sec
[EDIT]: damn, formating is hard here
Attachments
Issue Links
- relates to
-
MDEV-16934 Query with very large IN clause lists runs slowly.
- Closed