Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Not a Bug
Affects Version/s: 10.5.10, 10.6.3
Environment: Linux Ubuntu 20.04 LTS
Description
log_close() may call log_overwrite_warning(), which prints the text:
InnoDB: The age of the last checkpoint is <age>, which exceeds the log capacity <capacity>.
This is printed using sql_print_error, meaning it appears as an ERROR in the log.
However, log_close() appears to treat this condition simply as a warning that the redo log is full, and it returns PAGE_FLUSH_SYNC to indicate that "furious flushing is needed".
Therefore, this does not appear to be an error that would result in important redo log entries being overwritten; it is merely an indication that the redo log is full and that the server will pause to flush before continuing.
Incorrectly reporting this as an ERROR will cause users to worry that there is a data integrity problem, when in reality it's just an informational note.
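To make the described control flow concrete, here is a minimal hedged sketch. It is not the actual MariaDB source; only the names log_close(), log_overwrite_warning(), PAGE_FLUSH_SYNC and sql_print_error come from this report, everything else is assumed for illustration:

#include <cstdint>
#include <cstdio>

// Assumed return values; only PAGE_FLUSH_SYNC is named in the report.
enum page_flush_request { PAGE_FLUSH_NO, PAGE_FLUSH_SYNC };

// Stand-in for the server's sql_print_error(): this is why the message
// is tagged [ERROR] even though the condition is recoverable.
static void log_overwrite_warning(uint64_t age, uint64_t capacity)
{
  fprintf(stderr,
          "[ERROR] InnoDB: The age of the last checkpoint is %llu, "
          "which exceeds the log capacity %llu.\n",
          (unsigned long long) age, (unsigned long long) capacity);
}

// Sketch of the check attributed to log_close(): when the checkpoint age
// exceeds the log capacity, it prints the message and asks the caller to
// pause and flush ("furious flushing"), rather than signalling corruption.
static page_flush_request log_close_sketch(uint64_t checkpoint_age,
                                           uint64_t log_capacity)
{
  if (checkpoint_age > log_capacity)
  {
    log_overwrite_warning(checkpoint_age, log_capacity);
    return PAGE_FLUSH_SYNC;   // caller waits for page flushing; no data loss
  }
  return PAGE_FLUSH_NO;
}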
I am getting many of these ERRORs in my logs on a slave server when I do a shutdown with innodb_fast_shutdown = 0, on a database that regularly writes around 200MB per minute to the 1G redo log file and therefore fills it up every few minutes.
Here is an example of the error messages on shutdown:
2022-02-05 14:08:29 0 [Note] /usr/sbin/mariadbd (initiated by: root[root] @ localhost []): Normal shutdown
2022-02-05 14:08:29 0 [Note] InnoDB: FTS optimize thread exiting.
2022-02-05 14:08:29 0 [Note] InnoDB: to purge 9 transactions
2022-02-05 14:08:30 0 [Note] InnoDB: Starting shutdown...
2022-02-05 14:08:30 0 [Note] InnoDB: Dumping buffer pool(s) to /mariadb/ib_buffer_pool
2022-02-05 14:08:30 0 [Note] InnoDB: Restricted to 32388 pages due to innodb_buf_pool_dump_pct=25
2022-02-05 14:08:30 0 [Note] InnoDB: Buffer pool(s) dump completed at 220205 14:08:30
2022-02-05 14:11:05 0 [ERROR] InnoDB: The age of the last checkpoint is 966376793, which exceeds the log capacity 966365799.
2022-02-05 14:11:21 0 [ERROR] InnoDB: The age of the last checkpoint is 994921739, which exceeds the log capacity 966365799.
2022-02-05 14:11:37 0 [ERROR] InnoDB: The age of the last checkpoint is 1022257877, which exceeds the log capacity 966365799.
2022-02-05 14:11:53 0 [ERROR] InnoDB: The age of the last checkpoint is 1047226930, which exceeds the log capacity 966365799.
2022-02-05 14:12:09 0 [ERROR] InnoDB: The age of the last checkpoint is 1071228676, which exceeds the log capacity 966365799.
2022-02-05 14:12:25 0 [ERROR] InnoDB: The age of the last checkpoint is 1093753956, which exceeds the log capacity 966365799.
2022-02-05 14:12:41 0 [ERROR] InnoDB: The age of the last checkpoint is 1112481150, which exceeds the log capacity 966365799.
2022-02-05 14:12:54 0 [Note] InnoDB: Waiting for change buffer merge to complete number of bytes of change buffer just merged: 192206046
2022-02-05 14:13:00 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2022-02-05 14:13:00 0 [Note] InnoDB: Shutdown completed; log sequence number 236522592867138; transaction id 35631972102
2022-02-05 14:13:00 0 [Note] /usr/sbin/mariadbd: Shutdown complete
Issue Links

relates to:
- MDEV-14462 Confusing error message: ib_logfiles are too small for innodb_thread_concurrency=0 (Closed)
- MDEV-19895 Support "autoset" in SET GLOBAL for AUTO_SET system variables (Open)
- MDEV-29905 Change buffer operations fail to check for log file overflow (Closed)
- MDEV-29982 Improve the InnoDB log overwrite error message (Closed)
monty suggested an idea that goes roughly like this:
How large is ‘large enough’? There may be at most 8×innodb_page_size active transactions (524,288 when using innodb_page_size=64k). There may also be 32 purge threads, and up to 255 fil_crypt_thread. In the worst case, each thread may be splitting a B-tree to its maximum height of 31 or 32 nodes, that is, modifying roughly 64 pages and writing up to roughly 32×innodb_page_size bytes of log.
Rounding up (to also cover the 32 purge threads and 255 fil_crypt_thread), the ‘large enough’ log file should be something like 9×innodb_page_size×32×innodb_page_size = 288×innodb_page_size×innodb_page_size. That would range from 4.5 GiB to 1,152 GiB. At the default innodb_page_size=16k, this ‘large enough’ estimate would be 72 GiB.
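As a quick check of the arithmetic (the constants are taken straight from the estimate above; nothing else is assumed), a stand-alone calculation reproduces the quoted 4.5 GiB, 72 GiB and 1,152 GiB figures:

#include <cstdint>
#include <cstdio>

int main()
{
  // 8*page_size transactions + 32 purge threads + 255 fil_crypt_thread,
  // rounded up to 9*page_size threads, each writing up to 32*page_size
  // bytes of redo log => 288 * page_size^2 bytes in total.
  const uint64_t page_sizes[] = {4096, 16384, 65536};
  for (uint64_t ps : page_sizes)
  {
    const uint64_t log_bytes = 9 * ps * 32 * ps;   // = 288 * ps * ps
    printf("innodb_page_size=%2lluk: %7.1f GiB\n",
           (unsigned long long)(ps / 1024),
           (double) log_bytes / (1ULL << 30));
  }
  return 0;
}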
Even though this estimate might feel like an overestimate, it may in fact be an underestimate. We assumed that each of the concurrent threads would run only once. But it is possible that on such a heavily loaded system, some threads are starved of CPU for an extended period of time while other threads are scheduled to run multiple mtr_t::commit() calls each. One of those starved threads could already have acquired a latch on the oldest modified page, so the log_free_check() calls from other threads would not be able to advance the checkpoint. Eventually, all subsequent threads would end up waiting in log_free_check(), the starved threads would be allowed to proceed and release their page latches, and the checkpoint could be completed. During such a glitch the log could momentarily be overrun.
So, no ‘large enough’ innodb_log_file_size exists that is guaranteed to avoid the race condition between log_free_check() and mtr.commit(). We could only reduce the probability to very close to 0, which is already the case for well-configured installations. Besides, 72 GiB is already quite a large redo log. Even with innodb_buffer_pool_size=128G it could be perfectly reasonable to use a much smaller innodb_log_file_size, especially when page splits are rare.
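The race can be illustrated with a deliberately simplified, single-threaded model (the names and numbers below are assumptions chosen for illustration, not InnoDB code): a starved thread pins the checkpoint by holding a latch on the oldest modified page, while many other threads each pass log_free_check() before the log fills and only afterwards write their mini-transactions.

#include <cstdint>
#include <cstdio>

int main()
{
  const uint64_t capacity  = 1000;       // log capacity (arbitrary units)
  const uint64_t mtr_log   = 8;          // log written per mini-transaction
  const unsigned n_threads = 200;        // concurrent writer threads

  uint64_t lsn = 900;                    // current end of the redo log
  const uint64_t checkpoint_lsn = 0;     // pinned: a starved thread still
                                         // latches the oldest modified page

  // Every writer calls log_free_check() *before* writing; at that moment
  // the checkpoint age is still below the capacity, so none of them waits.
  const bool check_passed = (lsn - checkpoint_lsn) <= capacity;

  // Afterwards each of them runs mtr_t::commit(), and the checkpoint
  // cannot advance until the starved thread releases its latch.
  if (check_passed)
    for (unsigned i = 0; i < n_threads; i++)
      lsn += mtr_log;

  printf("checkpoint age %llu, capacity %llu -> log overrun by %llu\n",
         (unsigned long long)(lsn - checkpoint_lsn),
         (unsigned long long) capacity,
         (unsigned long long)(lsn - checkpoint_lsn - capacity));
  return 0;
}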
If we wanted to completely avoid the log overrun condition, we would practically have to implement some additional locking. Based on some experience with MDEV-25062, I expect that an innocent-looking std::atomic::fetch_add() call for updating a global variable will seriously degrade throughput. Even when using numactl to ensure that the server runs on a single NUMA node, the AMD64 LOCK XADD instruction for trx_sys.rseg_history_len++ consumed several per cent of all CPU cycles in my simple benchmark. If we added anything like that to log_free_check(), the impact should be much bigger, because every write transaction consists of multiple mini-transactions. What if we implemented the extra locking only for the cases where a mini-transaction is expected to split or merge pages? That would still leave a small probability of log overrun due to threads being scheduled ‘unfairly’ on a heavily loaded system.
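As a hedged sketch of what such ‘additional locking’ could look like (purely hypothetical names and structure, not a change to the actual log_free_check()): every mini-transaction would reserve its worst-case log volume in one global atomic counter, which is exactly the kind of per-mini-transaction LOCK XADD contention described above.

#include <atomic>
#include <cstdint>

static std::atomic<uint64_t> log_reserved{0};   // hypothetical global counter

// Hypothetical reservation done before a mini-transaction starts;
// 'worst_case_bytes' is its maximum possible log volume.
inline bool reserve_log_space(uint64_t worst_case_bytes, uint64_t capacity)
{
  // One LOCK XADD per mini-transaction, on a cache line shared by every
  // writer thread: this is the contention the comment above warns about.
  const uint64_t before = log_reserved.fetch_add(worst_case_bytes,
                                                 std::memory_order_relaxed);
  if (before + worst_case_bytes > capacity)
  {
    // Undo the reservation and make the caller wait for a checkpoint.
    log_reserved.fetch_sub(worst_case_bytes, std::memory_order_relaxed);
    return false;
  }
  return true;
}

// After mtr_t::commit(), the unused part of the reservation would have to
// be released again: a second atomic update on the same hot cache line.
inline void release_log_space(uint64_t unused_bytes)
{
  log_reserved.fetch_sub(unused_bytes, std::memory_order_relaxed);
}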