[MDEV-27629] Performance problems due to incorrect table statistics after database imports Created: 2022-01-26 Updated: 2022-06-15 Resolved: 2022-06-15 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.6.5 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Critical |
| Reporter: | Stijn | Assignee: | Marko Mäkelä |
| Resolution: | Incomplete | Votes: | 1 |
| Labels: | regression | ||
| Environment: |
CentOS 8 |
||
| Issue Links: |
|
||||||||
| Description |
|
We are seeing big performance problems for databases that were dumped (mysqldump) and then restored in an empty database. The query plans are sometimes completely inefficient and the 'ANALYZE FORMAT = JSON' list a low number of rows in some tables while there are for example 100k rows present ("rows": 1, "r_rows": 99924). This results in complete table scans and queries that normally take ~10ms, now take over 100s to complete. Doing an analyze table for all tables in the database (or the ones used in the query) fixes the problem. We are now implementing a cron job that does (mariadb-check --analyze --all-databases) every evening, but that is not an ideal situation. I think the problem is that the innodb_table_stats table statistics are not (correctly) updated during the bulk insert of sql data. We did not have this problem (or at least did not notice it) in MariaDB 10.4. |
| Comments |
| Comment by Marko Mäkelä [ 2022-01-26 ] |
|
stijn, thank you for the report? How exactly are you loading the data? Is that from an SQL dump? Can you provide a small example with some dummy data? If this is about the InnoDB persistent statistics, it could be possible that |
| Comment by Stijn [ 2022-01-26 ] |
|
The sql dump is created with 'mysqldump' (full database dump) and loaded into an empty database with 'mysql' command. I will try to recreate the issue with dummy data. I'm not sure which of these stats is the problem, but I'll check them while trying to reproduce the issue with dummy data. |
| Comment by Niels Hendriks [ 2022-01-31 ] |
|
Just to confirm we've noticed the same issue. Last week we updated ~200 VMs to MariaDB 10.6 (coming from a mix between 10.2 and 10.5) and we've had multiple people complaining they saw poor query performance for newly created databases after loading in a mysqldump file. |
| Comment by Jonathan Champ [ 2022-02-03 ] |
|
@Stijn Have you tried using mysqldump --skip-column-statistics? Newer versions of mysqldump try and export statistics by default which I assume are being then imported into the database as a "time saving" measure. However, if the column statistics are instead causing this problem, then that would be good to know. |
| Comment by Niels Hendriks [ 2022-02-07 ] |
|
In case it helps, an update from our side: We are seeing this on MariaDB 1:10.6.5+maria~buster - But NOT on 1:10.6.5+maria~bullseye with the same my.cnf So perhaps specific to certain OS versions...? In our case we don't see column statistics in the mysqldump file either, fwiw |
| Comment by Marko Mäkelä [ 2022-02-11 ] |
|
Could this be the same issue as |
| Comment by Stijn [ 2022-02-23 ] |
|
@Marko, that seems to be a very similar bug. Hopefully that fix will also fix this issue. |
| Comment by Niels Hendriks [ 2022-03-05 ] |
|
Is there anything known about when 10.6.8 will be released? We're still running analyze table on 10~ dbs/week due to load skyrocketing when this bug occurs. Would love to no longer have to do this. |
| Comment by Niels Hendriks [ 2022-03-17 ] |
|
Any update on this? Perhaps as a workaround we can downgrade all our servers to 10.6.4? I believe this is not affected, correct? |
| Comment by Elena Stepanova [ 2022-04-17 ] |
The release is currently planned for 2022-04-29, you can find the schedule at the JIRA dashboard. |