Details
-
Bug
-
Status: Stalled (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.0.16, 10.0(EOL), 10.1(EOL)
-
Linux with MariaDB-like packages
-
10.1.11
Description
This is the summary of the bug reported here: https://phabricator.wikimedia.org/T113901
This is the exact CREATE syntax of revision and page:
CREATE TABLE `revision` ( |
`rev_id` int(8) unsigned NOT NULL AUTO_INCREMENT, |
`rev_page` int(8) unsigned NOT NULL DEFAULT '0', |
`rev_text_id` int(8) unsigned NOT NULL DEFAULT '0', |
`rev_comment` varbinary(255) DEFAULT NULL, |
`rev_user` int(5) unsigned NOT NULL DEFAULT '0', |
`rev_user_text` varbinary(255) NOT NULL DEFAULT '', |
`rev_timestamp` varbinary(14) NOT NULL DEFAULT '', |
`rev_minor_edit` tinyint(1) unsigned NOT NULL DEFAULT '0', |
`rev_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0', |
`rev_len` int(8) unsigned DEFAULT NULL, |
`rev_parent_id` int(8) unsigned DEFAULT NULL, |
`rev_sha1` varbinary(32) NOT NULL DEFAULT '', |
`rev_content_model` varbinary(32) DEFAULT NULL, |
`rev_content_format` varbinary(64) DEFAULT NULL, |
PRIMARY KEY (`rev_id`), |
KEY `rev_page_id` (`rev_page`,`rev_id`), |
KEY `rev_timestamp` (`rev_timestamp`,`rev_id`), |
KEY `page_timestamp` (`rev_page`,`rev_timestamp`,`rev_id`), |
KEY `user_timestamp` (`rev_user`,`rev_timestamp`,`rev_id`), |
KEY `usertext_timestamp` (`rev_user_text`,`rev_timestamp`,`rev_id`) |
) ENGINE=InnoDB AUTO_INCREMENT=683122218 DEFAULT CHARSET=binary |
CREATE TABLE `page` ( |
`page_id` int(8) unsigned NOT NULL AUTO_INCREMENT, |
`page_namespace` int(11) NOT NULL DEFAULT '0', |
`page_title` varbinary(255) NOT NULL DEFAULT '', |
`page_restrictions` tinyblob NOT NULL, |
`page_counter` bigint(20) unsigned NOT NULL DEFAULT '0', |
`page_is_redirect` tinyint(1) unsigned NOT NULL DEFAULT '0', |
`page_is_new` tinyint(1) unsigned NOT NULL DEFAULT '0', |
`page_random` double unsigned NOT NULL DEFAULT '0', |
`page_touched` varbinary(14) NOT NULL DEFAULT '', |
`page_links_updated` varbinary(14) DEFAULT NULL, |
`page_latest` int(8) unsigned NOT NULL DEFAULT '0', |
`page_len` int(8) unsigned NOT NULL DEFAULT '0', |
`page_content_model` varbinary(32) DEFAULT NULL, |
PRIMARY KEY (`page_id`), |
UNIQUE KEY `name_title` (`page_namespace`,`page_title`), |
KEY `page_random` (`page_random`), |
KEY `page_len` (`page_len`), |
KEY `page_redirect_namespace_len` (`page_is_redirect`,`page_namespace`,`page_len`) |
) ENGINE=InnoDB AUTO_INCREMENT=47992700 DEFAULT CHARSET=binary |
Regular explain:
Tables have approximately the numbers or rows you see on the autoinc.
mysql> EXPLAIN SELECT rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` ON ((rev_page = page_id)) ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 11\G |
*************************** 1. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: page |
type: index |
possible_keys: PRIMARY |
key: name_title |
key_len: 261
|
ref: NULL |
rows: 33836157 |
Extra: Using index; Using temporary; Using filesort |
*************************** 2. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: revision |
type: ref
|
possible_keys: rev_page_id,page_timestamp
|
key: page_timestamp |
key_len: 4
|
ref: enwiki.page.page_id
|
rows: 8 |
Extra: Using index |
Forcing the other plan:
mysql> EXPLAIN SELECT STRAIGHT_JOIN rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` ON ((rev_page = page_id)) ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 11\G |
*************************** 1. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: revision |
type: index |
possible_keys: rev_page_id,page_timestamp
|
key: rev_timestamp |
key_len: 20
|
ref: NULL |
rows: 11 |
Extra:
|
*************************** 2. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: page |
type: eq_ref
|
possible_keys: PRIMARY |
key: PRIMARY |
key_len: 4
|
ref: enwiki.revision.rev_page
|
rows: 1 |
Extra:
|
The actual query parsed has no issue:
select `enwiki`.`revision`.`rev_id` AS `rev_id`,`enwiki`.`revision`.`rev_timestamp` AS `rev_timestamp`,`enwiki`.`page`.`page_id` |
AS `page_id`,`enwiki`.`page`.`page_title` AS `page_title`,`enwiki`.`page`.`page_namespace` AS `page_namespace` |
from `enwiki`.`revision` join `enwiki`.`page` where (`enwiki`.`revision`.`rev_page` = `enwiki`.`page`.`page_id`) |
order by `enwiki`.`revision`.`rev_timestamp` desc,`enwiki`.`revision`.`rev_id` desc limit 11 |
Even ignoring the indexes does not work (it is not related to covering index):
mysql> EXPLAIN EXTENDED SELECT rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` IGNORE INDEX(name_title, PRIMARY) ON ((rev_page = page_id)) ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 11\G |
*************************** 1. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: page |
type: ALL |
possible_keys: NULL |
key: NULL |
key_len: NULL |
ref: NULL |
rows: 33836157 |
filtered: 100.00
|
Extra: Using temporary; Using filesort |
*************************** 2. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: revision |
type: ref
|
possible_keys: rev_page_id,page_timestamp
|
key: page_timestamp |
key_len: 4
|
ref: enwiki.page.page_id
|
rows: 8 |
filtered: 100.00
|
Extra: Using index |
2 rows in set, 1 warning (0.00 sec) |
I think this is a bug on the optimizer.
Running ANALYZE on both tables, even with use-stat-tables=PREFERABLY does not help (and it shouldn't, row counts are ok, it is that for some reason maria refuses to use the clearly better plan -that itself understands).
This could be a specific configuration we have, you can see most of it at:
https://git.wikimedia.org/blob/operations%2Fpuppet.git/1423255184712f7fde17dee338c7ac1519ee44ee/templates%2Fmariadb%2Fproduction.my.cnf.erb
Attachments
Issue Links
- relates to
-
MDEV-8306 Complete cost-based optimization for ORDER BY with LIMIT
-
- Stalled
-
Tried on x1000 smaller tables (600K and 50K rows, correspondingly). Same result
MariaDB [test]> EXPLAIN SELECT rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` ON ((rev_page = page_id)) ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 11;
+------+-------------+----------+-------+----------------------------+----------------+---------+-------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------+-------+----------------------------+----------------+---------+-------------------+-------+----------------------------------------------+
| 1 | SIMPLE | page | index | PRIMARY | name_title | 261 | NULL | 25272 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | revision | ref | rev_page_id,page_timestamp | page_timestamp | 4 | test.page.page_id | 5 | Using index |
+------+-------------+----------+-------+----------------------------+----------------+---------+-------------------+-------+----------------------------------------------+
2 rows in set (0.00 sec)
MariaDB [test]> EXPLAIN SELECT STRAIGHT_JOIN rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` ON ((rev_page = page_id)) ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 11;
+------+-------------+----------+--------+----------------------------+---------------+---------+------------------------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------+--------+----------------------------+---------------+---------+------------------------+------+-------+
| 1 | SIMPLE | revision | index | rev_page_id,page_timestamp | rev_timestamp | 20 | NULL | 11 | |
| 1 | SIMPLE | page | eq_ref | PRIMARY | PRIMARY | 4 | test.revision.rev_page | 1 | |
+------+-------------+----------+--------+----------------------------+---------------+---------+------------------------+------+-------+
2 rows in set (0.00 sec)
Execution time for the first query on my machine is ~1 sec, give or take; for the second one it is negligible.
11 rows in set (0.82 sec)
11 rows in set (0.00 sec)
I've uploaded my generated data dump (purely artificial, nothing confidential) to ftp://ftp.askmonty.org/public/mdev8880.dump.gz