Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.1.37, 10.1.38, 10.1.39
-
None
-
debian stretch bare metal
Description
We have 24 servers running SSDs (RAID10+BBU) and 512GB RAM and the following query:
SELECT /* ApiQueryRevisions::run */ rev_id, rev_page, rev_timestamp, rev_minor_edit, rev_deleted, rev_len, rev_parent_id, rev_sha1, comment_rev_comment.comment_text AS `rev_comment_text`, comment_rev_comment.comment_data AS `rev_comment_data`, comment_rev_comment.comment_id AS `rev_comment_cid`, actor_rev_user.actor_user AS `rev_user`, actor_rev_user.actor_name AS `rev_user_text`, temp_rev_user.revactor_actor AS `rev_actor`, page_namespace, page_title, page_id, page_latest, page_is_redirect, page_len, user_name FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) JOIN `revision_actor_temp` `temp_rev_user` ON ((temp_rev_user.revactor_rev = rev_id)) JOIN `actor` `actor_rev_user` ON ((actor_rev_user.actor_id = temp_rev_user.revactor_actor)) JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((actor_rev_user.actor_user != 0) AND (user_id = actor_rev_user.actor_user)) WHERE rev_page = '35159700' ORDER BY rev_timestamp DESC, rev_id DESC LIMIT 2;
|
That query seems to be choosing the wrong index rev_timestamp on some hosts, and the right index page_timestamp on some others.
The wrong index shows a terrible query plan (this is a show explain for)
+------+-------------+---------------------+--------+--------------------------------------------------------+---------------+---------+-----------------------------------------------+-----------+-----------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+---------------------+--------+--------------------------------------------------------+---------------+---------+-----------------------------------------------+-----------+-----------------------------+
|
| 1 | SIMPLE | revision | index | PRIMARY,page_timestamp,page_user_timestamp,rev_page_id | rev_timestamp | 16 | const | 830807594 | Using where; Using filesort |
|
| 1 | SIMPLE | page | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.revision.rev_page | 1 | |
|
| 1 | SIMPLE | temp_rev_comment | ref | PRIMARY,revcomment_rev | PRIMARY | 4 | enwiki.revision.rev_id | 1 | Using index |
|
| 1 | SIMPLE | comment_rev_comment | eq_ref | PRIMARY | PRIMARY | 8 | enwiki.temp_rev_comment.revcomment_comment_id | 1 | |
|
| 1 | SIMPLE | temp_rev_user | ref | PRIMARY,revactor_rev,actor_timestamp | PRIMARY | 4 | enwiki.revision.rev_id | 1 | Using index |
|
| 1 | SIMPLE | actor_rev_user | eq_ref | PRIMARY | PRIMARY | 8 | enwiki.temp_rev_user.revactor_actor | 1 | |
|
| 1 | SIMPLE | user | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.actor_rev_user.actor_user | 1 | Using where |
|
+------+-------------+---------------------+--------+--------------------------------------------------------+---------------+---------+-----------------------------------------------+-----------+-----------------------------+
|
|
And the query takes around 10 seconds.
On the other hand, for those hosts choosing the expected index page_timestamp the query runs almost instantly and it scans a lot less rows:
+------+-------------+---------------------+--------+--------------------------------------------------------+----------------+---------+-----------------------------------------------+--------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+---------------------+--------+--------------------------------------------------------+----------------+---------+-----------------------------------------------+--------+-------------+
|
| 1 | SIMPLE | revision | ref | PRIMARY,page_timestamp,page_user_timestamp,rev_page_id | page_timestamp | 4 | const | 153670 | Using where |
|
| 1 | SIMPLE | page | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.revision.rev_page | 1 | |
|
| 1 | SIMPLE | temp_rev_user | ref | PRIMARY,revactor_rev,actor_timestamp | PRIMARY | 4 | enwiki.revision.rev_id | 1 | Using index |
|
| 1 | SIMPLE | actor_rev_user | eq_ref | PRIMARY | PRIMARY | 8 | enwiki.temp_rev_user.revactor_actor | 1 | |
|
| 1 | SIMPLE | user | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.actor_rev_user.actor_user | 1 | Using where |
|
| 1 | SIMPLE | temp_rev_comment | ref | PRIMARY,revcomment_rev | PRIMARY | 4 | enwiki.revision.rev_id | 1 | Using index |
|
| 1 | SIMPLE | comment_rev_comment | eq_ref | PRIMARY | PRIMARY | 8 | enwiki.temp_rev_comment.revcomment_comment_id | 1 | |
|
+------+-------------+---------------------+--------+--------------------------------------------------------+----------------+---------+-----------------------------------------------+--------+-------------+
|
We ran the following on one of the hosts (10.1.39) that have the wrong query plan:
alter table revision engine=innodb, force
And the optimizer changed and started to use the right one.
However, doing the same on other hosts (with 10.1.37, 10.1.38 and 10.1.39) didn't make the optimizer to choose the right plan.
On that host, we also ran an analyze table and stopped+started mariadb, but it still uses the wrong plan.
This can also be seen with a lot simpler queries:
explain SELECT rev_id, rev_page, rev_timestamp, rev_minor_edit FROM `revision` WHERE rev_page = '35159700' ORDER BY rev_timestamp DESC, rev_id DESC LIMIT 2;
|
+------+-------------+----------+-------+------------------------------------------------+---------------+---------+------+-------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+----------+-------+------------------------------------------------+---------------+---------+------+-------+-------------+
|
| 1 | SIMPLE | revision | index | page_timestamp,page_user_timestamp,rev_page_id | rev_timestamp | 16 | NULL | 10359 | Using where |
|
+------+-------------+----------+-------+------------------------------------------------+---------------+---------+------+-------+-------------+
|
1 row in set (0.04 sec)
|
And then another host with the correct index:
explain SELECT rev_id, rev_page, rev_timestamp, rev_minor_edit FROM `revision` WHERE rev_page = '35159700' ORDER BY rev_timestamp DESC, rev_id DESC LIMIT 2;
|
+------+-------------+----------+------+------------------------------------------------+----------------+---------+-------+--------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+----------+------+------------------------------------------------+----------------+---------+-------+--------+-------------+
|
| 1 | SIMPLE | revision | ref | page_timestamp,page_user_timestamp,rev_page_id | page_timestamp | 4 | const | 153670 | Using where |
|
+------+-------------+----------+------+------------------------------------------------+----------------+---------+-------+--------+-------------+
|
1 row in set (0.05 sec)
|
This behaviour seems to be very random across all the hosts (and versions, either 10.1.38 and 10.1.39), some of them seem to be using the right index and some other don't, even if the run the same mariadb version and same config. And some of them seem to be changing the query plan after the table rebuilt but some others don't.
The table schema is exactly the same across all of them.
These are the table definitions:
*************************** 1. row ***************************
|
Table: revision
|
Create Table: CREATE TABLE `revision` (
|
`rev_id` int(8) unsigned NOT NULL AUTO_INCREMENT,
|
`rev_page` int(8) unsigned NOT NULL DEFAULT '0',
|
`rev_text_id` int(8) unsigned NOT NULL DEFAULT '0',
|
`rev_comment` varbinary(255) DEFAULT '',
|
`rev_user` int(5) unsigned NOT NULL DEFAULT '0',
|
`rev_user_text` varbinary(255) NOT NULL DEFAULT '',
|
`rev_timestamp` varbinary(14) NOT NULL DEFAULT '',
|
`rev_minor_edit` tinyint(1) unsigned NOT NULL DEFAULT '0',
|
`rev_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
|
`rev_len` int(8) unsigned DEFAULT NULL,
|
`rev_parent_id` int(8) unsigned DEFAULT NULL,
|
`rev_sha1` varbinary(32) NOT NULL DEFAULT '',
|
`rev_content_model` varbinary(32) DEFAULT NULL,
|
`rev_content_format` varbinary(64) DEFAULT NULL,
|
PRIMARY KEY (`rev_id`),
|
KEY `rev_timestamp` (`rev_timestamp`),
|
KEY `page_timestamp` (`rev_page`,`rev_timestamp`),
|
KEY `user_timestamp` (`rev_user`,`rev_timestamp`),
|
KEY `usertext_timestamp` (`rev_user_text`,`rev_timestamp`),
|
KEY `page_user_timestamp` (`rev_page`,`rev_user`,`rev_timestamp`),
|
KEY `rev_page_id` (`rev_page`,`rev_id`)
|
) ENGINE=InnoDB AUTO_INCREMENT=898392806 DEFAULT CHARSET=binary
|
*************************** 1. row ***************************
|
Table: revision_comment_temp
|
Create Table: CREATE TABLE `revision_comment_temp` (
|
`revcomment_rev` int(10) unsigned NOT NULL,
|
`revcomment_comment_id` bigint(20) unsigned NOT NULL,
|
PRIMARY KEY (`revcomment_rev`,`revcomment_comment_id`),
|
UNIQUE KEY `revcomment_rev` (`revcomment_rev`)
|
) ENGINE=InnoDB DEFAULT CHARSET=binary
|
*************************** 1. row ***************************
|
Table: comment
|
Create Table: CREATE TABLE `comment` (
|
`comment_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
|
`comment_hash` int(11) NOT NULL,
|
`comment_text` blob NOT NULL,
|
`comment_data` blob,
|
PRIMARY KEY (`comment_id`),
|
KEY `comment_hash` (`comment_hash`)
|
) ENGINE=InnoDB AUTO_INCREMENT=287235912 DEFAULT CHARSET=binary
|
*************************** 1. row ***************************
|
Table: revision_actor_temp
|
Create Table: CREATE TABLE `revision_actor_temp` (
|
`revactor_rev` int(10) unsigned NOT NULL,
|
`revactor_actor` bigint(20) unsigned NOT NULL,
|
`revactor_timestamp` binary(14) NOT NULL DEFAULT '\0\0\0\0\0\0\0\0\0\0\0\0\0\0',
|
`revactor_page` int(10) unsigned NOT NULL,
|
PRIMARY KEY (`revactor_rev`,`revactor_actor`),
|
UNIQUE KEY `revactor_rev` (`revactor_rev`),
|
KEY `actor_timestamp` (`revactor_actor`,`revactor_timestamp`),
|
KEY `page_actor_timestamp` (`revactor_page`,`revactor_actor`,`revactor_timestamp`)
|
) ENGINE=InnoDB DEFAULT CHARSET=binary
|
*************************** 1. row ***************************
|
Table: user
|
Create Table: CREATE TABLE `user` (
|
`user_id` int(5) unsigned NOT NULL AUTO_INCREMENT,
|
`user_name` varbinary(255) NOT NULL DEFAULT '',
|
`user_real_name` varbinary(255) NOT NULL DEFAULT '',
|
`user_password` tinyblob NOT NULL,
|
`user_newpassword` tinyblob NOT NULL,
|
`user_email` tinyblob NOT NULL,
|
`user_touched` varbinary(14) NOT NULL DEFAULT '',
|
`user_token` varbinary(32) NOT NULL DEFAULT '',
|
`user_email_authenticated` varbinary(14) DEFAULT NULL,
|
`user_email_token` varbinary(32) DEFAULT NULL,
|
`user_email_token_expires` varbinary(14) DEFAULT NULL,
|
`user_registration` varbinary(14) DEFAULT NULL,
|
`user_newpass_time` varbinary(14) DEFAULT NULL,
|
`user_editcount` int(11) DEFAULT NULL,
|
`user_password_expires` varbinary(14) DEFAULT NULL,
|
PRIMARY KEY (`user_id`),
|
UNIQUE KEY `user_name` (`user_name`),
|
KEY `user_email_token` (`user_email_token`),
|
KEY `user_email` (`user_email`(50))
|
) ENGINE=InnoDB AUTO_INCREMENT=36583698 DEFAULT CHARSET=binary
|
*************************** 1. row ***************************
|
Table: actor
|
Create Table: CREATE TABLE `actor` (
|
`actor_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
|
`actor_user` int(10) unsigned DEFAULT NULL,
|
`actor_name` varbinary(255) NOT NULL,
|
PRIMARY KEY (`actor_id`),
|
UNIQUE KEY `actor_name` (`actor_name`),
|
UNIQUE KEY `actor_user` (`actor_user`)
|
) ENGINE=InnoDB AUTO_INCREMENT=192760880 DEFAULT CHARSET=binary
|
*************************** 1. row ***************************
|
Table: page
|
Create Table: CREATE TABLE `page` (
|
`page_id` int(8) unsigned NOT NULL AUTO_INCREMENT,
|
`page_namespace` int(11) NOT NULL DEFAULT '0',
|
`page_title` varbinary(255) NOT NULL DEFAULT '',
|
`page_restrictions` tinyblob NOT NULL,
|
`page_is_redirect` tinyint(1) unsigned NOT NULL DEFAULT '0',
|
`page_is_new` tinyint(1) unsigned NOT NULL DEFAULT '0',
|
`page_random` double unsigned NOT NULL DEFAULT '0',
|
`page_touched` varbinary(14) NOT NULL DEFAULT '',
|
`page_links_updated` varbinary(14) DEFAULT NULL,
|
`page_latest` int(8) unsigned NOT NULL DEFAULT '0',
|
`page_len` int(8) unsigned NOT NULL DEFAULT '0',
|
`page_content_model` varbinary(32) DEFAULT NULL,
|
`page_lang` varbinary(35) DEFAULT NULL,
|
PRIMARY KEY (`page_id`),
|
UNIQUE KEY `name_title` (`page_namespace`,`page_title`),
|
KEY `page_random` (`page_random`),
|
KEY `page_len` (`page_len`),
|
KEY `page_redirect_namespace_len` (`page_is_redirect`,`page_namespace`,`page_len`)
|
) ENGINE=InnoDB AUTO_INCREMENT=60840578 DEFAULT CHARSET=binary
|
They are quite big on disk too:
-rw-rw---- 1 mysql mysql 242G May 23 08:17 revision.ibd
|
-rw-rw---- 1 mysql mysql 48G May 23 08:16 revision_comment_temp.ibd
|
-rw-rw---- 1 mysql mysql 48G May 23 08:16 comment.ibd
|
-rw-rw---- 1 mysql mysql 156G May 23 08:17 revision_actor_temp.ibd
|
-rw-rw---- 1 mysql mysql 13G May 23 08:17 user.ibd
|
-rw-rw---- 1 mysql mysql 9.5G May 23 08:17 actor.ibd
|
-rw-rw---- 1 mysql mysql 14G May 23 08:17 page.ibd
|
These are the optimizer variables
| optimizer_switch | index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=on,mrr_cost_based=on,mrr_sort_keys=on,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
|
|
| optimizer_prune_level | 1
|
| optimizer_search_depth | 62 |
|
| optimizer_selectivity_sampling_limit | 100
|
To sum up all the hosts that got the table rebuilt, their version and the result:
db1080 (10.1.38) plan remained on rev_timestamp (later upgraded to 10.1.39 and nothing changed)
db2048 (10.1.37) plan remained on rev_timestamp
db2055 (10.1.38) plan changed to page_timestamp
db2071 (10.1.38) plan changed to page_timestamp
db2085:3311 (10.1.38) plan changed to page_timestamp
db2092 (10.1.38) plan changed to page_timestamp
db2103 (10.1.39) plan changed to page_timestamp
db2112 (10.1.39) plan remained to rev_timestamp
This is a pt-config-diff of two hosts with the same version but with different behaviours:
Variable db2112 db2103
|
========================= ========================= =========================
|
general_log_file db2112.log db2103.log
|
gtid_binlog_pos 171974720-171974720-15... 171974720-171974720-15...
|
gtid_binlog_state 171974720-171974720-15... 171974720-171974720-15...
|
gtid_current_pos 0-171970637-5484646134... 0-171970637-5484646134...
|
gtid_domain_id 180363268 180355190
|
gtid_slave_pos 0-171970637-5484646134... 0-171970637-5484646134...
|
hostname db2112 db2103
|
log_bin_basename /srv/sqldata/db2112-bin /srv/sqldata/db2103-bin
|
log_bin_index /srv/sqldata/db2112-bi... /srv/sqldata/db2103-bi...
|
pid_file /srv/sqldata/db2112.pid /srv/sqldata/db2103.pid
|
server_id 180363268 180355190
|
slow_query_log_file db2112-slow.log db2103-slow.log
|
wsrep_node_name db2112 db2103
|
Index cardinality for two failing hosts
Host with the wrong index being used
+----------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
|
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
|
+----------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
|
| revision | 0 | PRIMARY | 1 | rev_id | A | 830807594 | NULL | NULL | | BTREE | | |
|
| revision | 1 | rev_timestamp | 1 | rev_timestamp | A | 371509902 | NULL | NULL | | BTREE | | |
|
| revision | 1 | page_timestamp | 1 | rev_page | A | 48061342 | NULL | NULL | | BTREE | | |
|
| revision | 1 | page_timestamp | 2 | rev_timestamp | A | 830724521 | NULL | NULL | | BTREE | | |
|
| revision | 1 | user_timestamp | 1 | rev_user | A | 9217389 | NULL | NULL | | BTREE | | |
|
| revision | 1 | user_timestamp | 2 | rev_timestamp | A | 782673192 | NULL | NULL | | BTREE | | |
|
| revision | 1 | usertext_timestamp | 1 | rev_user_text | A | 54411751 | NULL | NULL | | BTREE | | |
|
| revision | 1 | usertext_timestamp | 2 | rev_timestamp | A | 816037318 | NULL | NULL | | BTREE | | |
|
| revision | 1 | page_user_timestamp | 1 | rev_page | A | 48061342 | NULL | NULL | | BTREE | | |
|
| revision | 1 | page_user_timestamp | 2 | rev_user | A | 296272588 | NULL | NULL | | BTREE | | |
|
| revision | 1 | page_user_timestamp | 3 | rev_timestamp | A | 830724521 | NULL | NULL | | BTREE | | |
|
| revision | 1 | rev_page_id | 1 | rev_page | A | 48061342 | NULL | NULL | | BTREE | | |
|
| revision | 1 | rev_page_id | 2 | rev_id | A | 830807594 | NULL | NULL | | BTREE | | |
|
+----------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
|
Host with the right index being used
+----------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
|
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
|
+----------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
|
| revision | 0 | PRIMARY | 1 | rev_id | A | 830575333 | NULL | NULL | | BTREE | | |
|
| revision | 1 | rev_timestamp | 1 | rev_timestamp | A | 371389435 | NULL | NULL | | BTREE | | |
|
| revision | 1 | page_timestamp | 1 | rev_page | A | 48048462 | NULL | NULL | | BTREE | | |
|
| revision | 1 | page_timestamp | 2 | rev_timestamp | A | 830492283 | NULL | NULL | | BTREE | | |
|
| revision | 1 | user_timestamp | 1 | rev_user | A | 9214945 | NULL | NULL | | BTREE | | |
|
| revision | 1 | user_timestamp | 2 | rev_timestamp | A | 782454388 | NULL | NULL | | BTREE | | |
|
| revision | 1 | usertext_timestamp | 1 | rev_user_text | A | 54398321 | NULL | NULL | | BTREE | | |
|
| revision | 1 | usertext_timestamp | 2 | rev_timestamp | A | 815809186 | NULL | NULL | | BTREE | | |
|
| revision | 1 | page_user_timestamp | 1 | rev_page | A | 48048462 | NULL | NULL | | BTREE | | |
|
| revision | 1 | page_user_timestamp | 2 | rev_user | A | 296200325 | NULL | NULL | | BTREE | | |
|
| revision | 1 | page_user_timestamp | 3 | rev_timestamp | A | 830492283 | NULL | NULL | | BTREE | | |
|
| revision | 1 | rev_page_id | 1 | rev_page | A | 48048462 | NULL | NULL | | BTREE | | |
|
| revision | 1 | rev_page_id | 2 | rev_id | A | 830575333 | NULL | NULL | | BTREE | | |
|
+----------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
|
Please let me know which other mysql config flags you'd need.