[MDEV-12610] MariaDB start is slow Created: 2017-04-27 Updated: 2020-08-25 Resolved: 2017-06-09 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Encryption, Storage Engine - InnoDB, Storage Engine - XtraDB |
| Affects Version/s: | 10.1.22, 10.1.23 |
| Fix Version/s: | 10.1.25 |
| Type: | Bug | Priority: | Major |
| Reporter: | Darshit Gavhane | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Fixed | Votes: | 2 |
| Labels: | mariadb, mariadb-restart, slow | ||
| Issue Links: |
|
| Description |
|
MariaDB takes about 10 minutes to start. The same issue was seen earlier in 10.1.18 and was fixed, but it has recurred in 10.1.22.
|
| Comments |
| Comment by Marko Mäkelä [ 2017-05-03 ] | ||
|
If this problem is reproducible, please run "perf top" during the hang, or use http://poormansprofiler.org/ or similar to find out what InnoDB is spending its time on. | ||
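For anyone collecting this data, the "poor man's profiler" idea boils down to dumping all thread backtraces with gdb a few times and counting which stacks repeat. Below is a minimal sketch of the counting step only, assuming the backtraces were captured with something like `gdb -p <pid> -batch -ex "thread apply all bt"`. The function name `aggregate_stacks` and the sample text (with placeholder function names) are made up for illustration; they are not part of any MariaDB or gdb tooling.

```python
import re
from collections import Counter

# Matches gdb frame lines such as
#   "#0  0x00007f... in func_a () from /lib/libx.so"  and
#   "#1  func_b (arg=1) at file.cc:10"
# and captures only the function name.
FRAME = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([^\s(]+)")


def aggregate_stacks(gdb_output: str, depth: int = 5) -> Counter:
    """Count identical call stacks (top `depth` frames) across all threads."""
    stacks = Counter()
    frames = []
    for raw in gdb_output.splitlines():
        line = raw.strip()
        if line.startswith("Thread "):
            # A new thread header closes the previous thread's stack.
            if frames:
                stacks[",".join(frames[:depth])] += 1
            frames = []
            continue
        match = FRAME.match(line)
        if match:
            frames.append(match.group(1))
    if frames:
        stacks[",".join(frames[:depth])] += 1
    return stacks


# Made-up sample input with placeholder function names (not real output).
SAMPLE = """
Thread 3 (Thread 0x7f01 (LWP 103)):
#0  0x00007f0000000001 in func_a () from /lib/libx.so
#1  func_b (arg=1) at file.cc:10
Thread 2 (Thread 0x7f02 (LWP 102)):
#0  0x00007f0000000001 in func_a () from /lib/libx.so
#1  func_b (arg=2) at file.cc:10
Thread 1 (Thread 0x7f03 (LWP 101)):
#0  func_c () at other.cc:5
"""

if __name__ == "__main__":
    for stack, count in aggregate_stacks(SAMPLE).most_common():
        print(count, stack)
```

Stacks that dominate the counts across several samples taken during the slow startup are the ones worth reporting here.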
| Comment by Valerii Kravchuk [ 2017-06-01 ] | ||
|
We have the same problem with 10.1.23 when there are many tables/.ibd files (24000+) | ||
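To check whether an installation is in a similar range, the number of .ibd files can be counted directly. This is a small sketch, assuming a file-per-table layout where each InnoDB table has its own .ibd file somewhere under the data directory; `count_ibd_files` is a hypothetical helper name and the path in the usage comment is only an example.

```python
from pathlib import Path


def count_ibd_files(datadir: str) -> int:
    """Count regular files ending in .ibd anywhere below datadir."""
    return sum(1 for path in Path(datadir).rglob("*.ibd") if path.is_file())


# Example usage (the path is an assumption; use your own datadir):
#   print(count_ibd_files("/var/lib/mysql"))
```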
| Comment by Alex Boag-Munroe [ 2017-06-03 ] | ||
|
I've got the same issue on a system with a few hundred thousand tables: startup takes 10 minutes or so after the log line
InnoDB: Highest supported file format is Barracuda.
I'm currently running a debug build while investigating another issue, and the log is full of entries like:
2017-06-03 14:33:44 140090905216896 [Note] InnoDB: Created tablespace for space 3981695 name SCHEMANAME/TABLENAME key_id 0 encryption 0.
According to perf top:
I'll run perf top again when I've got a normal startup on standard binaries. | ||
| Comment by Alex Boag-Munroe [ 2017-06-03 ] | ||
|
Starting up while not on a debug binary shows:
| ||
| Comment by Marko Mäkelä [ 2017-06-05 ] | ||
|
We can trivially avoid this call when the default page size is being used, and neither page_compression nor encryption is used. To avoid the start-up penalty in those cases, maybe we should check if the SYS_TABLESPACES.FLAGS are in the incompatible format that was introduced in MariaDB 10.1.0. | ||
| Comment by Marko Mäkelä [ 2017-06-05 ] | ||
|
Ninpo, thank you for the "perf top" results! While the root cause of this slowness appears to be that the function fsp_flags_try_adjust() is being unconditionally invoked on every .ibd file on startup, the top function fsp_header_get_crypt_offset() for the non-debug binary is noteworthy. It is known that the page checksum validation (ut_crc32_sse42) takes some resources. I think that the function fsp_header_get_crypt_offset() is unacceptably complex and slow. In MariaDB 10.2, I replaced the function with fsp_header_get_encryption_offset(), which uses a simple arithmetic formula that returns the same result as the loop. I think that we should port this change to 10.1. | ||
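To illustrate the kind of change described (replacing a search loop with arithmetic), here is a toy model. It does not reproduce the MariaDB source: the constants, the layout rule in `descriptor_page()`, and all function names are assumptions picked for illustration, and the model assumes the page size is a multiple of the extent size.

```python
# Placeholder constants for illustration only; these are NOT InnoDB's values.
ARR_OFFSET = 100   # hypothetical start of the descriptor array on page 0
ENTRY_SIZE = 40    # hypothetical size of one descriptor entry in bytes
EXTENT_SIZE = 64   # hypothetical number of pages per extent


def descriptor_page(page_size: int, page_no: int) -> int:
    """Toy rule: descriptors for page_no live on the nearest lower
    multiple of page_size."""
    return page_no - (page_no % page_size)


def offset_by_loop(page_size: int) -> int:
    """Walk page numbers until the descriptor is no longer on page 0,
    then derive the offset from the last page still described there."""
    i = 0
    while descriptor_page(page_size, i) == 0:
        i += 1
    last = i - 1
    return ARR_OFFSET + ENTRY_SIZE * (1 + last // EXTENT_SIZE)


def offset_by_formula(page_size: int) -> int:
    """Closed form of the same computation, with no loop."""
    return ARR_OFFSET + ENTRY_SIZE * (page_size // EXTENT_SIZE)


if __name__ == "__main__":
    for size in (1024, 2048, 4096, 8192, 16384):
        assert offset_by_loop(size) == offset_by_formula(size)
    print("loop and formula agree")
```

In this model the loop runs once per page number up to the page size on every call, while the formula is constant time, which matters when the function is called for every .ibd file at startup.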
| Comment by Alex Boag-Munroe [ 2017-06-05 ] | ||
|
It's probably worth noting that page compression has been in use on these systems at some point, though no tables are using it currently. innodb_compression_algorithm=zlib is set in my.cnf from when we were looking at it. That is, unless I'm misunderstanding what you said about page compression. At any rate, I'm glad I could help. Our 700k table schema seems destined to hit/find edge cases in Maria. | ||
| Comment by Jan Lindström (Inactive) [ 2017-06-07 ] | ||
|
https://github.com/MariaDB/server/commit/6fb92442e2e33d0cbe56395972aef50cc8a84f5d | ||
| Comment by Marko Mäkelä [ 2017-06-07 ] | ||
|
Please address my review comments. | ||
| Comment by Jan Lindström (Inactive) [ 2017-06-08 ] | ||
|
After review fixes: | ||
| Comment by Jan Lindström (Inactive) [ 2017-06-09 ] | ||
|
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Problem appears to be that the function fsp_flags_try_adjust()
Ported implementation of fsp_header_get_encryption_offset()
Introduced a new function fil_crypt_read_crypt_data()
fil_crypt_find_space_to_rotate(): Now that page 0 for every .ibd
fil_space_crypt_get_status(): Now that page 0 for every .ibd
fil_crypt_thread(): Add is_stopping condition for tablespace
fil_space_create: Remove page_0_crypt_read and extra
fil_open_single_table_tablespace(): We call fsp_flags_try_adjust
fil_space_t::page_0_crypt_read removed.
Added test case innodb-first-page-read to test startup when | ||
| Comment by Nathan Landis [ 2018-01-26 ] | ||
|
Anyone else still seeing this in recent 10.2 versions? I'm testing an upgrade from 10.0.13 > 10.1.30 > 10.2.12 and seem to be encountering this still (~12min startup delay on ~70k tables that wasn't there before) | ||
| Comment by Daniel Black [ 2018-01-27 ] | ||
|
Logs and configuration would help verify this. | ||
| Comment by Nathan Landis [ 2018-01-29 ] | ||
|
Apologies! Log and config below from the latest version I tried (10.2.7). I'm using MySQL Sandbox for intermediate upgrade testing. I know I'm several releases behind on that; upgrading and retrying (or trying an instance without sandbox) is on my todo list.
180126 16:26:03 mysqld_safe Starting mysqld daemon with databases from /redacted/dir
---config:
[mysql]
(\d) > '
[client]
[mysqld]
| ||
| Comment by Marko Mäkelä [ 2018-01-29 ] | ||
|
lu_nate, | ||
| Comment by Nathan Landis [ 2018-01-29 ] | ||
|
Ah, that is very helpful, Marko. I missed seeing that issue. I will stop banging my head against it now. | ||
| Comment by Marko Mäkelä [ 2018-02-16 ] | ||
|
As noted in |