Details
-
Bug
-
Status: Stalled (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.2(EOL), 10.3(EOL), 10.4(EOL)
-
Debian 10, RAID10 HW
Description
I am creating a database of Wikipedia images metadata to handle backups on my database of preference (MariaDB). The structure is a relatively common one "a single large table" plus some of its properties normalized to external, small tables. This is the structure of the relevant tables:
files.sql |
db1133.eqiad.wmnet[mediabackups]> show create table files\G
|
*************************** 1. row ***************************
|
Table: files
|
Create Table: CREATE TABLE `files` (
|
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
|
`wiki` int(10) unsigned NOT NULL,
|
`upload_name` varbinary(255) DEFAULT NULL,
|
`swift_container` int(10) unsigned DEFAULT NULL,
|
`swift_name` varbinary(270) DEFAULT NULL,
|
`file_type` tinyint(3) unsigned DEFAULT NULL,
|
`status` tinyint(3) unsigned DEFAULT NULL,
|
`sha1` varbinary(40) DEFAULT NULL,
|
`md5` varbinary(32) DEFAULT NULL,
|
`size` int(10) unsigned DEFAULT NULL,
|
`upload_timestamp` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
|
`archived_timestamp` timestamp NULL DEFAULT NULL,
|
`deleted_timestamp` timestamp NULL DEFAULT NULL,
|
`backup_status` tinyint(3) unsigned DEFAULT 1,
|
PRIMARY KEY (`id`),
|
KEY `sha1` (`sha1`),
|
KEY `file_type` (`file_type`),
|
KEY `status` (`status`),
|
KEY `backup_status` (`backup_status`),
|
KEY `wiki` (`wiki`),
|
KEY `swift_container` (`swift_container`),
|
KEY `upload_name` (`upload_name`,`status`),
|
KEY `upload_timestamp` (`upload_timestamp`),
|
CONSTRAINT `files_ibfk_1` FOREIGN KEY (`file_type`) REFERENCES `file_types` (`id`),
|
CONSTRAINT `files_ibfk_2` FOREIGN KEY (`status`) REFERENCES `file_status` (`id`),
|
CONSTRAINT `files_ibfk_3` FOREIGN KEY (`wiki`) REFERENCES `wikis` (`id`),
|
CONSTRAINT `files_ibfk_4` FOREIGN KEY (`backup_status`) REFERENCES `backup_status` (`id`),
|
CONSTRAINT `files_ibfk_5` FOREIGN KEY (`swift_container`) REFERENCES `swift_containers` (`id`)
|
) ENGINE=InnoDB AUTO_INCREMENT=4549852 DEFAULT CHARSET=binary
|
1 row in set (0.001 sec)
|
file_status.sql |
db1133.eqiad.wmnet[mediabackups]> show create table file_status\G
|
*************************** 1. row ***************************
|
Table: file_status
|
Create Table: CREATE TABLE `file_status` (
|
`id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
|
`status_name` varbinary(100) DEFAULT NULL,
|
PRIMARY KEY (`id`)
|
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=binary
|
1 row in set (0.001 sec)
|
 |
db1133.eqiad.wmnet[mediabackups]> select * FROM file_status;
|
+----+--------------+
|
| id | status_name |
|
+----+--------------+
|
| 1 | public |
|
| 2 | archived |
|
| 3 | deleted |
|
| 4 | hard-deleted |
|
+----+--------------+
|
4 rows in set (0.001 sec)
|
backup_status.sql |
db1133.eqiad.wmnet[mediabackups]> show create table backup_status\G
|
*************************** 1. row ***************************
|
Table: backup_status
|
Create Table: CREATE TABLE `backup_status` (
|
`id` tinyint(3) unsigned NOT NULL,
|
`backup_status_name` varbinary(100) DEFAULT NULL,
|
PRIMARY KEY (`id`)
|
) ENGINE=InnoDB DEFAULT CHARSET=binary
|
1 row in set (0.000 sec)
|
 |
db1133.eqiad.wmnet[mediabackups]> select * FROM backup_status;
|
+----+--------------------+
|
| id | backup_status_name |
|
+----+--------------------+
|
| 1 | pending |
|
| 2 | processing |
|
| 3 | backedup |
|
| 4 | error |
|
+----+--------------------+
|
4 rows in set (0.001 sec)
|
wikis.sql |
db1133.eqiad.wmnet[mediabackups]> show create table wikis\G
|
*************************** 1. row ***************************
|
Table: wikis
|
Create Table: CREATE TABLE `wikis` (
|
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
|
`wiki_name` varbinary(255) DEFAULT NULL,
|
`type` tinyint(3) unsigned DEFAULT NULL,
|
PRIMARY KEY (`id`),
|
UNIQUE KEY `wiki_name` (`wiki_name`)
|
) ENGINE=InnoDB AUTO_INCREMENT=1206 DEFAULT CHARSET=binary
|
1 row in set (0.001 sec)
|
 |
db1133.eqiad.wmnet[mediabackups]> select count(*) FROM wikis;
|
+----------+
|
| count(*) |
|
+----------+
|
| 999 |
|
+----------+
|
1 row in set (0.001 sec)
|
When I try to run a query with an ORDER BY, followed by a short limit, using the PRIMARY KEY, it preferes to scan the whole table and perform a filesort rather than using the index efectively to scan just ~10 rows:
query.sql |
-- simple query, just denormalize the data so it is in "human" readable form
|
db1133.eqiad.wmnet[mediabackups]> SELECT wiki_name, upload_name, size, status_name, sha1, backup_status_name
|
FROM files JOIN file_status ON file_status.id = files.status
|
JOIN backup_status ON backup_status.id = files.backup_status
|
JOIN wikis ON wikis.id = files.wiki
|
ORDER BY files.id
|
LIMIT 10;
|
|
<---- waiting for minutes...
|
 |
^CCtrl-C -- query killed. Continuing normally.
|
ERROR 1317 (70100): Query execution was interrupted
|
|
db1133.eqiad.wmnet[mediabackups]> EXPLAIN select wiki_name, upload_name, size, status_name, sha1, backup_status_name FROM files JOIN file_status ON file_status.id = files.status JOIN backup_status ON backup_status.id = files.backup_status JOIN wikis ON wikis.id = files.wiki ORDER BY files.id LIMIT 10\G
|
*************************** 1. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: files
|
type: ALL <--------------------------------------------------- ?????
|
possible_keys: status,backup_status,wiki
|
key: NULL
|
key_len: NULL
|
ref: NULL
|
rows: 4549851
|
Extra: Using where; Using temporary; Using filesort <------------------------- ??????
|
*************************** 2. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: wikis
|
type: eq_ref
|
possible_keys: PRIMARY
|
key: PRIMARY
|
key_len: 4
|
ref: mediabackups.files.wiki
|
rows: 1
|
Extra:
|
*************************** 3. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: backup_status
|
type: ALL
|
possible_keys: PRIMARY
|
key: NULL
|
key_len: NULL
|
ref: NULL
|
rows: 4
|
Extra: Using where; Using join buffer (flat, BNL join)
|
*************************** 4. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: file_status
|
type: eq_ref
|
possible_keys: PRIMARY
|
key: PRIMARY
|
key_len: 1
|
ref: mediabackups.files.status
|
rows: 1
|
Extra:
|
4 rows in set (0.001 sec)
|
This is a major regression because being able to select ranges in order is a top feature of a MySQL-type db, and has to be fast. Most of our web pages are just "list of items in order". I can see an issue with a secondary index among many, but refusing to use the primary key?
The table was analyzed and rebuilt, to the same results. Optimizer trace and more information is available at our web tracker: https://phabricator.wikimedia.org/P13595#74956
To check it was not something I was doing wrong, I checked the same query, after importing a copy of the database on Percona Server 8, and I got this:
percona_query.sql |
db2102.codfw.wmnet[mediabackups]> explain select wiki_name, upload_name, size, status_name, sha1, backup_status_name FROM files JOIN file_status ON file_status.id = files.status JOIN backup_status ON backup_status.id = files.backup_status JOIN wikis ON wikis.id = files.wiki ORDER BY files.id LIMIT 10\G
|
*************************** 1. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: files
|
partitions: NULL
|
type: index
|
possible_keys: status,backup_status,wiki
|
key: PRIMARY <------------------------------------- preferred index
|
key_len: 4
|
ref: NULL
|
rows: 10
|
filtered: 100.00
|
Extra: Using where <------------------------------------ no filesort
|
*************************** 2. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: file_status
|
partitions: NULL
|
type: eq_ref
|
possible_keys: PRIMARY
|
key: PRIMARY
|
key_len: 1
|
ref: mediabackups.files.status
|
rows: 1
|
filtered: 100.00
|
Extra: NULL
|
*************************** 3. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: backup_status
|
partitions: NULL
|
type: eq_ref
|
possible_keys: PRIMARY
|
key: PRIMARY
|
key_len: 1
|
ref: mediabackups.files.backup_status
|
rows: 1
|
filtered: 100.00
|
Extra: NULL
|
*************************** 4. row ***************************
|
id: 1
|
select_type: SIMPLE
|
table: wikis
|
partitions: NULL
|
type: eq_ref
|
possible_keys: PRIMARY
|
key: PRIMARY
|
key_len: 4
|
ref: mediabackups.files.wiki
|
rows: 1
|
filtered: 100.00
|
Extra: NULL
|
4 rows in set, 1 warning (0.035 sec)
|
 |
db2102.codfw.wmnet[mediabackups]> select wiki_name, upload_name, size, status_name, sha1, backup_status_name FROM files JOIN file_status ON file_status.id = files.status JOIN backup_status ON backup_status.id = files.backup_status JOIN wikis ON wikis.id = files.wiki ORDER BY files.id LIMIT 10;
|
+-----------+---------------------------------------------------------+--------+-------------+------------------------------------------+--------------------+
|
| wiki_name | upload_name | size | status_name | sha1 | backup_status_name |
|
+-----------+---------------------------------------------------------+--------+-------------+------------------------------------------+--------------------+
|
| enwiki | !!!_(Chk_Chk_Chk)_-_One_Girl_One_Boy_cover_art.jpg | 31850 | public | 2c5f4c5ff0e57ffcea85c1da92b4599336d75fb9 | backedup |
|
| enwiki | !!!_-_!!!_album_cover.jpg | 43672 | public | 25c046a856d14314cda3c741539f41fbc6c63fe2 | backedup |
|
| enwiki | !!!_-_Wallop.png | 118745 | public | 3fccf40bfd27ccfac1d057198fd4315afd50b42d | backedup |
|
| enwiki | !0_Trombones_Like_2_Pianos.jpg | 25319 | public | 3ec8b40e128ad8c677b869ef9280104c03bff2e5 | backedup |
|
| enwiki | !ClaudiaPascoal.png | 26203 | public | 66a8f68f1ac4708aaa600dcc5a5690836868a2c9 | backedup |
|
| enwiki | !Haunu.ogg | 13450 | public | 77577544869768e619829c4e9e6e0ddde94f9421 | backedup |
|
| enwiki | !Hero_(album).jpg | 38664 | public | c42574f1639e0bd96749f0ac3e76b69573ce2ba3 | backedup |
|
| enwiki | !Women_Art_Revolution_(documentary_film)_poster_art.jpg | 15446 | public | fd33557daf97103c72401e7a33723360cd0db9f4 | backedup |
|
| enwiki | !_(The_Song_Formely_Known_As)_by_Regurgitator.png | 146525 | public | c491daf7642bfa1d192b22bc83cc41c721d1092e | backedup |
|
| enwiki | ""Motor-Cycle"_LP_cover-Lotti_Golden.jpg | 17001 | public | dbc8dcac96e647d0dfcba702c88e86a0935e2d88 | backedup |
|
+-----------+---------------------------------------------------------+--------+-------------+------------------------------------------+--------------------+
|
10 rows in set (0.035 sec)
|
~10 rows read, using the index to avoid the sort + short circuit the scan.
I could not make the query on MariaDB behave as expected with index hints. The only way I could do it is by doing:
workaround-not-really.sql |
db1133.eqiad.wmnet[mediabackups]> select wiki_name, upload_name, size, status_name, sha1, backup_status_name FROM files JOIN file_status ON file_status.id = files.status JOIN backup_status ON backup_status.id = files.backup_status JOIN wikis ON wikis.id = files.wiki WHERE files.id BETWEEN 1 AND 10; -- note the only change at the end
|
+-----------+---------------------------------------------------------+--------+-----------
|
| wiki_name | upload_name | size | status_nam
|
+-----------+---------------------------------------------------------+--------+-----------
|
| enwiki | !!!_(Chk_Chk_Chk)_-_One_Girl_One_Boy_cover_art.jpg | 31850 | public
|
| enwiki | !!!_-_!!!_album_cover.jpg | 43672 | public
|
| enwiki | !!!_-_Wallop.png | 118745 | public
|
| enwiki | !0_Trombones_Like_2_Pianos.jpg | 25319 | public
|
| enwiki | !ClaudiaPascoal.png | 26203 | public
|
| enwiki | !Haunu.ogg | 13450 | public
|
| enwiki | !Hero_(album).jpg | 38664 | public
|
| enwiki | !Women_Art_Revolution_(documentary_film)_poster_art.jpg | 15446 | public
|
| enwiki | !_(The_Song_Formely_Known_As)_by_Regurgitator.png | 146525 | public
|
| enwiki | ""Motor-Cycle"_LP_cover-Lotti_Golden.jpg | 17001 | public
|
+-----------+---------------------------------------------------------+--------+-----------
|
10 rows in set (0.001 sec)
|
But the between usage is not really practical in a general case (gaps on PKs).
I can provide a partial dump of the table (the public part of it), if needed.