In mdberror.log, the problems seem to start with the following:
2024-03-13 8:36:02 79392687 [ERROR] InnoDB: Crash recovery is broken due to insufficient innodb_log_file_size; last checkpoint LSN=13565944128412, current LSN=13581932190873. Shutdown is in progress.
2024-03-13 8:36:02 79392636 [Note] InnoDB: Crash recovery was broken between LSN=13581932190873 and checkpoint LSN=13565944139910.
2024-03-13 8:36:02 79392853 [ERROR] InnoDB: Crash recovery is broken due to insufficient innodb_log_file_size; last checkpoint LSN=13565944139910, current LSN=13582015684857. Shutdown is in progress.
2024-03-13 8:36:15 0 [Warning] InnoDB: A long semaphore wait:
--Thread 139904351422208 has waited at dict0dict.cc line 1094 for 247.00 seconds the semaphore:
Mutex at 0x5563990041c0, Mutex DICT_SYS created ./storage/innobase/dict/dict0dict.cc:1038, lock var 2
2024-03-13 8:36:15 0 [Warning] InnoDB: A long semaphore wait:
--Thread 139857005311744 has waited at dict0dict.cc line 1094 for 242.00 seconds the semaphore:
Mutex at 0x5563990041c0, Mutex DICT_SYS created ./storage/innobase/dict/dict0dict.cc:1038, lock var 2
...
2024-03-13 8:42:31 0 [Note] InnoDB: A semaphore wait:
--Thread 139856289801984 has waited at dict0dict.cc line 1094 for 372.00 seconds the semaphore:
Mutex at 0x5563990041c0, Mutex DICT_SYS created ./storage/innobase/dict/dict0dict.cc:1038, lock var 2
InnoDB: Pending reads 0, writes 0
2024-03-13 8:42:31 0 [ERROR] [FATAL] InnoDB: Semaphore wait has lasted > 600 seconds. We intentionally crash the server because it appears to be hung.
240313 8:42:31 [ERROR] mysqld got signal 6 ;
…
/usr/sbin/mariadbd(handle_fatal_signal+0x475)[0x556398227335]
2024-03-13 8:42:43 0 [ERROR] InnoDB: Crash recovery is broken due to insufficient innodb_log_file_size; last checkpoint LSN=13566179716303, current LSN=13584607376410. Shutdown is in progress.
2024-03-13 8:42:43 0 [Note] InnoDB: Crash recovery was broken between LSN=13584607376410 and checkpoint LSN=13566187792857.
2024-03-13 8:42:43 79393095 [ERROR] InnoDB: Crash recovery is broken due to insufficient innodb_log_file_size; last checkpoint LSN=13566187792857, current LSN=13584651283446. Shutdown is in progress.
2024-03-13 8:42:44 79392800 [Note] InnoDB: Crash recovery was broken between LSN=13584651283446 and checkpoint LSN=13566187793323.
2024-03-13 8:42:44 79388047 [ERROR] InnoDB: Crash recovery is broken due to insufficient innodb_log_file_size; last checkpoint LSN=13566187793323, current LSN=13584809292367. Shutdown is in progress.
/lib/x86_64-linux-gnu/libpthread.so.0(+0x13140)[0x7f8b022af140]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x141)[0x7f8b01de7ce1]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x123)[0x7f8b01dd1537]
/usr/sbin/mariadbd(+0x652edf)[0x556397ec9edf]
/usr/sbin/mariadbd(+0x6476c0)[0x556397ebe6c0]
/usr/sbin/mariadbd(_ZN5tpool19thread_pool_generic13timer_generic7executeEPv+0x38)[0x556398778b68]
/usr/sbin/mariadbd(_ZN5tpool4task7executeEv+0x32)[0x556398779862]
/usr/sbin/mariadbd(_ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE+0x4f)[0x55639877769f]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xceed0)[0x7f8b02197ed0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7ea7)[0x7f8b022a3ea7]
2024-03-13 8:43:13 0 [ERROR] InnoDB: Crash recovery is broken due to insufficient innodb_log_file_size; last checkpoint LSN=13566187793323, current LSN=13584918550202. Shutdown is in progress.
2024-03-13 8:43:13 0 [Note] InnoDB: Crash recovery was broken between LSN=13584809292367 and checkpoint LSN=13566206207179.
2024-03-13 8:43:13 0 [ERROR] InnoDB: Crash recovery is broken due to insufficient innodb_log_file_size; last checkpoint LSN=13566206207179, current LSN=13584919711302. Shutdown is in progress.
2024-03-13 8:43:13 0 [Note] InnoDB: Crash recovery was broken between LSN=13584919711302 and checkpoint LSN=13566206398343.
|
There are error messages indicating that InnoDB is or was not crash safe. These messages were improved in MDEV-29982; before that, there was no "was broken" indication.
According to the log messages, InnoDB may not have been crash safe at the time the crash (related to your other report, MDEV-35139) occurred. Interestingly, the built-in stack_trace reporter is taking a rather long time to execute (it could also hang; see MDEV-21010), and during this time the buf_flush_page_cleaner thread appears to be able to advance the log checkpoint.
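For scale, the checkpoint age implied by the very first [ERROR] line can be computed directly from the two LSNs it reports (the LSN is in effect a byte offset into the logical redo log stream):

```shell
# Checkpoint age = current LSN - last checkpoint LSN,
# using the values from the first [ERROR] line in the log above.
current_lsn=13581932190873
checkpoint_lsn=13565944128412
echo $((current_lsn - checkpoint_lsn))   # prints 15988062461
```

That is roughly 16 GB of redo log that had not yet been checkpointed, which evidently exceeded the configured innodb_log_file_size.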
It is a matter of luck how corrupted the database might be after a recovery attempt. In this case, we ran out of luck as follows:
2024-03-13 8:48:56 0 [ERROR] InnoDB: We scanned the log up to 13566206397952. A checkpoint was at 13566206398343 and the maximum LSN on a database page was 0. It is possible that the database is now corrupt!
2024-03-13 8:48:56 0 [ERROR] InnoDB: Page [page id: space=0, page number=2] log sequence number 13576669775087 is in the future! Current system log sequence number 13566207559443.
2024-03-13 8:48:56 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
2024-03-13 8:48:57 0 [ERROR] InnoDB: Page [page id: space=0, page number=6] log sequence number 13577636973585 is in the future! Current system log sequence number 13566207559443.
2024-03-13 8:48:57 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
2024-03-13 8:48:57 0 [ERROR] InnoDB: Page [page id: space=0, page number=45] log sequence number 13577662804184 is in the future! Current system log sequence number 13566207559443.
2024-03-13 8:48:57 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
2024-03-13 8:48:57 0 [ERROR] InnoDB: unsupported undo header state 104
2024-03-13 8:48:57 0 [Note] InnoDB: Retry with innodb_force_recovery=5
2024-03-13 8:48:57 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption
Because some changes to some pages were not recovered, some links to undo log pages got broken. With luck, you might be able to start up the server with innodb_force_recovery=5 and use something like mariadb-dump to extract the contents of the database, essentially at READ UNCOMMITTED isolation level. The contents of the database could be inconsistent in further ways, if also some B-tree or BLOB page links are broken.
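A minimal sketch of that salvage attempt follows; the innodb_force_recovery setting and mariadb-dump are standard, but the exact paths, credentials, and restore steps depend on your setup:

```shell
# Temporarily enable forced recovery (remove the setting once the dump is done):
#
#   [mysqld]
#   innodb_force_recovery = 5
#
# With innodb_force_recovery=5, InnoDB does not consult undo logs when reading
# records, so reads behave essentially like READ UNCOMMITTED.
# Then extract everything that is still readable:
mariadb-dump --all-databases --routines --events > salvage.sql
# Finally, re-initialize a fresh data directory and reload salvage.sql into it.
```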
As I have noted in MDEV-14462, I am afraid that this log overwrite problem can’t be prevented fully. There is a possible worst-case scenario with a large number of concurrent threads producing a lot of redo log. Yes, before each write operation, each thread would call log_free_check() and optionally wait for a log checkpoint to complete before proceeding. But after a thread has passed this check, it would acquire some buffer page latches, potentially also on pages that would have to be written out in order to advance the log checkpoint. Meanwhile, other threads could generate more log and write it out in mtr_t::commit(). If many threads produce a lot of log concurrently, it will simply not be possible to advance the log checkpoint.
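The dynamics of that worst case can be illustrated with a deliberately crude shell model (all numbers are made up; the "writers" stand for threads in mtr_t::commit(), and the slow "cleaner" stands for a page cleaner blocked by latched pages):

```shell
capacity=1000                # stand-in for innodb_log_file_size, in bytes
lsn=0; checkpoint=0
for step in $(seq 1 25); do
  lsn=$((lsn + 80))                # 8 writers append 10 bytes of redo each
  checkpoint=$((checkpoint + 30))  # the cleaner advances more slowly
done
echo "checkpoint age $((lsn - checkpoint)) exceeds capacity $capacity"
```

Every writer dutifully passed its admission check, yet the checkpoint age (here 1250) still outgrows the log capacity, because checkpointing cannot keep pace with aggregate log production.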
I can see two ways to make this safe.
- Allow only one thread to produce log at a time. This would totally ruin the write scalability and things like the group commit mechanism (MDEV-26789).
- Implement "shadow pages" for any modifications so that pending write operations would not block the log checkpoint from advancing. That is, the pending writes would not be applied to the page before mtr_t::commit(), so that other threads in mtr_t::commit() could have buf_flush_page_cleaner() write out the page as it was before the pending changes, and therefore advance the log checkpoint and prevent the log file overrun.
With MDEV-14425 and MDEV-27774, this issue got even harder to fix, because log_sys.mutex was replaced with log_sys.latch, which is normally held only in shared mode. That is, we allow multiple concurrent mtr_t::commit() calls to write to non-overlapping portions of log_sys.buf. Only during log checkpoints and DDL operations is log_sys.latch held in exclusive mode.
I have seen a rule of thumb that innodb_log_file_size should correspond to one hour’s worth of writes (the growth of the InnoDB LSN in 1 hour). In MDEV-19176 and MDEV-29911, crash recovery was overhauled in such a way that it should be practical to make innodb_log_file_size even larger than innodb_buffer_pool_size. A larger log may make crash recovery consume more time, depending on the checkpoint age at the time the server was killed. The checkpoint age is defined as the difference between the current LSN and the latest checkpoint LSN; this is the amount of log that needs to be parsed and applied during crash recovery.
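Assuming you sample the Innodb_lsn_current global status variable (via SHOW GLOBAL STATUS) one hour apart, the rule of thumb reduces to a subtraction; the sample values below are hypothetical:

```shell
# Difference between two LSN samples taken one hour apart = bytes of redo log
# generated per hour, i.e. the rule-of-thumb innodb_log_file_size.
lsn_growth() { echo $(( $2 - $1 )); }

# Hypothetical samples of Innodb_lsn_current, one hour apart:
lsn_growth 1000000000000 1017179869184   # prints 17179869184, i.e. 16 GiB/hour
```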