Details
New Feature
Status: In Progress
Critical
Resolution: Unresolved
Q1/2026 Server Development, Q1/2026 Server Maintenance, Q2/2026 Server Maintenance
Description
The InnoDB write-ahead log file (ib_logfile0) is pre-allocated to innodb_log_file_size and written as a ring buffer. This is good for write performance and space management, but unsuitable for arbitrary point-in-time recovery or for facilitating incremental backup.
As noted in MDEV-14992, it would be better if the server took care of saving a sufficient amount of InnoDB write-ahead log to support backup. The SQL interface with a backup program would be something like the following:
-- At the start of the backup, we check if log archiving is already running.
SELECT @@GLOBAL.innodb_log_archive, variable_value
FROM information_schema.global_status
WHERE variable_name = 'INNODB_LSN_ARCHIVED';
-- If BETWEEN 1 AND the LSN of the previous backup, skip copying InnoDB files (incremental backup)
-- Else if necessary, request the log to be archived from the current LSN onwards.
SET GLOBAL innodb_log_archive=ON;
-- In this case, at the end of the backup, we would disable the log archiving again:
SET GLOBAL innodb_log_archive=OFF;
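The decision that a backup program would take from the result of the query above could be sketched as follows. This is only an illustration of the logic in the SQL comments, not an existing tool: the names `BackupPlan` and `plan_backup` are hypothetical, and the sketch assumes that INNODB_LSN_ARCHIVED reads as 0 while archiving is disabled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BackupPlan:
    copy_innodb_files: bool  # False means an incremental, log-only backup
    toggle_archiving: bool   # True means SET ...=ON now and ...=OFF at the end


def plan_backup(archive_on: bool, lsn_archived: int,
                previous_backup_lsn: Optional[int]) -> BackupPlan:
    """Decide how to back up, given @@GLOBAL.innodb_log_archive,
    the INNODB_LSN_ARCHIVED status value and the end LSN of the
    previous backup (None if there was no previous backup)."""
    # "BETWEEN 1 AND the LSN of the previous backup": the archived log
    # already covers everything that changed since the previous backup.
    if previous_backup_lsn is not None and 1 <= lsn_archived <= previous_backup_lsn:
        return BackupPlan(copy_innodb_files=False, toggle_archiving=False)
    # Otherwise copy the InnoDB files, and enable archiving for the
    # duration of the backup only if it is not already running.
    return BackupPlan(copy_innodb_files=True, toggle_archiving=not archive_on)
```

With archiving running since LSN 12288 and a previous backup that ended at LSN 500000, `plan_backup(True, 12288, 500000)` selects the incremental path.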
Log archiving could also be enabled indefinitely, to facilitate arbitrary point-in-time recovery of anything that is covered by the InnoDB write-ahead log. When MDEV-34705 is implemented and enabled, this would include the binlog. However, point-in-time recovery of DDL operations would not work without limitations even for an InnoDB-only deployment, because part of the data dictionary is stored in .frm files, whose creation is covered by a separate log (see MDEV-17567).
The default value of innodb_log_archive is OFF, meaning that the log archiving is disabled by default.
When innodb_log_archive=ON, changes of the parameter innodb_log_file_size will take effect when the current log file is about to be filled up and a new file is being created and allocated. The log resizing logic (MDEV-27812) as well as the creation of a redundant log file ib_logfile101 will remain in use when innodb_log_archive=OFF.
When innodb_log_archive=ON, the server will write log to files like the following:
ib_0000000000003000.log
ib_0000000000400000.log
ib_00000000007fd000.log
ib_0000000000bfa000.log
ib_0000000000ff7000.log
ib_00000000013f4000.log
The above example uses the minimum innodb_log_file_size=4M (0x400000 bytes in hexadecimal). The file names refer to the log sequence number corresponding to offset 12288 (0x3000) in the file, so consecutive names differ by 0x3fd000, which is the file size minus the header. At the start of each file, there will be a 12288-byte header that contains 32-bit offsets into checkpoint mini-transactions that end in FILE_CHECKPOINT records that point to the checkpoints.
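The naming scheme can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming that every file has the same innodb_log_file_size and that the first file starts at LSN 12288 as in the example; the function names are made up for illustration and are not part of the server.

```python
import bisect
from typing import List, Tuple

HEADER_SIZE = 0x3000  # 12288-byte header at the start of every archived file


def archive_file_name(start_lsn: int) -> str:
    """Name of the archived log file whose payload starts at start_lsn."""
    return f"ib_{start_lsn:016x}.log"


def archive_start_lsns(file_size: int, count: int,
                       first_lsn: int = HEADER_SIZE) -> List[int]:
    """Start LSNs of `count` consecutive files of a constant size."""
    payload = file_size - HEADER_SIZE  # log bytes that fit after the header
    return [first_lsn + i * payload for i in range(count)]


def locate_lsn(start_lsns: List[int], lsn: int) -> Tuple[str, int]:
    """Return (file name, byte offset within that file) for an LSN.

    start_lsns must be sorted. No check is made that the LSN lies
    before the end of the last file.
    """
    i = bisect.bisect_right(start_lsns, lsn) - 1
    if i < 0:
        raise ValueError("LSN precedes the first archived file")
    return archive_file_name(start_lsns[i]), HEADER_SIZE + (lsn - start_lsns[i])


if __name__ == "__main__":
    for start in archive_start_lsns(0x400000, 6):
        print(archive_file_name(start))
```

For innodb_log_file_size=4M this prints the six file names listed above. A lookup such as `locate_lsn(starts, 0x400000)` shows that LSN 0x400000 is the first payload byte (offset 0x3000) of the second file.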
For innodb_log_archive=ON, we will impose a maximum innodb_log_file_size=4G, to keep the log file sizes manageable and to allow the format of 32-bit offsets to work.
For efficient recovery from archived log files instead of ib_logfile0, two further start-up parameters will be introduced:
| parameter | meaning |
|---|---|
| innodb_log_recovery_start | LSN to start recovery from (instead of the LSN of the mini-transaction that points to the latest available checkpoint); at this LSN we expect to find an optional sequence of FILE_MODIFY records and a FILE_CHECKPOINT record. |
| innodb_log_recovery_target | recovery point objective (end LSN of a backup) |
An implementation of backup may write these parameters into a .cnf file or pass them on the command line when invoking mariadbd. The purpose of these parameters is to limit the scope of recovery and to avoid replaying an archived log from the very beginning to the very end.
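As an illustration of the .cnf variant, a backup tool could emit a fragment like the one produced below. This is a hedged sketch: the helper name `recovery_cnf` is hypothetical, the LSNs are written in decimal as in the mtr invocations in this ticket, and the check that the start does not exceed the target is only a sanity check of this sketch, not documented server behaviour.

```python
def recovery_cnf(start_lsn: int, target_lsn: int, group: str = "mariadbd") -> str:
    """Render an option-file fragment that bounds InnoDB log recovery."""
    if start_lsn > target_lsn:
        raise ValueError("recovery start LSN is beyond the recovery target")
    return (
        f"[{group}]\n"
        f"innodb_log_recovery_start={start_lsn}\n"
        f"innodb_log_recovery_target={target_lsn}\n"
    )


if __name__ == "__main__":
    # Example LSNs only; a real tool would take them from the backup metadata.
    print(recovery_cnf(4194304, 8376320), end="")
```

The same two values could equally be passed as `--innodb-log-recovery-start` and `--innodb-log-recovery-target` when starting mariadbd.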
We must keep in mind that the creation and modification of some files, such as .frm files that form part of the data dictionary, are not covered by the InnoDB write-ahead log. Therefore, it is important to be able to stop the recovery at a specific LSN.
Testing considerations
mariadb-backup will assume innodb_log_archive=OFF. It will not attempt to read any other log files than ib_logfile0.
We must keep in mind that there are several log I/O implementations:
- on Linux /dev/shm or PMEM unless cmake -DWITH_INNODB_PMEM=OFF: memory-mapped reads and writes
- else:
  - parsing during recovery is either via pread or memory-mapped, according to the innodb_log_file_mmap setting
  - writes are via pwrite
All combinations must be covered in testing. I have made use of the regression test suite like this:
# data directory stored outside /dev/shm, or server compiled WITH_INNODB_PMEM=OFF
mysql-test/mtr --parallel=auto --big-test --force --mysqld=--loose-innodb-log-{archive,recovery-start=12288,file-size=4m,file-mmap=OFF} --skip-test=mariabackup
mysql-test/mtr --parallel=auto --big-test --force --mysqld=--loose-innodb-log-{archive,recovery-start=12288,file-size=4m,file-mmap=ON} --skip-test=mariabackup
# data directory in /dev/shm and server built WITH_INNODB_PMEM=ON
mysql-test/mtr --parallel=auto --big-test --force --mysqld=--loose-innodb-log-{archive,recovery-start=12288,file-size=4m}  --skip-test=mariabackup
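The commands above rely on shell brace expansion: each comma-separated item inside the braces becomes its own --mysqld= argument. The sketch below mimics that expansion for the simple case of a single, non-nested group so that the resulting option list can be inspected; `expand_braces` is an illustrative helper, not part of mtr.

```python
from typing import List


def expand_braces(arg: str) -> List[str]:
    """Expand one non-nested {a,b,c} group like bash brace expansion."""
    head, opening, rest = arg.partition("{")
    body, closing, tail = rest.partition("}")
    # Like bash, leave the word alone unless there is a complete group
    # that contains at least one comma.
    if not opening or not closing or "," not in body:
        return [arg]
    return [head + item + tail for item in body.split(",")]


if __name__ == "__main__":
    word = "--mysqld=--loose-innodb-log-{archive,recovery-start=12288,file-size=4m,file-mmap=OFF}"
    for option in expand_braces(word):
        print(option)
```

For the first command this yields four server options: innodb-log-archive, innodb-log-recovery-start=12288, innodb-log-file-size=4m and innodb-log-file-mmap=OFF, each with the --loose prefix.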
It would be beneficial to extend our stress tests as follows:
- Cover both innodb_encrypt_log=OFF and innodb_encrypt_log=ON. Note that this parameter cannot be changed (at server restart) while innodb_log_archive=ON is set.
- Kill the server and determine the final LSN. This can be done by attempting startup with an impossible innodb_log_recovery_target=12288 and checking the error message.
- Start the server with innodb_log_recovery_target set to the final LSN.
- Expect everything to work, except any writes to persistent tables. Some transactions, such as any reads at TRANSACTION ISOLATION LEVEL SERIALIZABLE, may be blocked by locks that are held by recovered incomplete transactions.
Issue Links
- blocks
  - MDEV-38304 Innodb Binlog to be Stored in Archived Redo Log (Open)
  - MDEV-39053 Implement an option for binlog inside innodb_log_archive (Closed)
  - MDEV-39054 InnoDB-only, DML-only incremental backup (Open)
  - MDEV-39055 Multi-threaded innodb_log_archive parsing (Open)
  - MDEV-39061 mariadb-backup compatible wrappers for BACKUP SERVER (Open)
- causes
  - MDEV-38914 Assertion `!mode || buf_size == std::min<uint64_t>(capacity(), buf_size_max)' failed (Closed)
- is blocked by
  - MDEV-38595 InnoDB doublewrite buffer creation generates unnecessary log (Closed)
  - MDEV-38807 Uninitialized return value from ibuf_upgrade_needed() (Closed)
- is part of
  - MDEV-14992 BACKUP SERVER to mounted file system (In Progress)
- relates to
  - MDEV-38730 innodb_log_file_mmap=ON does not work on AMD64, ARMv8, POWER (Closed)
  - MDEV-38748 recv_recovery_read_checkpoint() had better be inlined (Closed)
  - MDEV-38831 The maximum innodb_log_write_ahead_size is insufficient for some storage (Open)
  - MDEV-38833 Suboptimal or dead code at server startup (Closed)
  - MDEV-38850 Dormant corruption in log_t::clear_mmap() (Closed)
  - MDEV-38968 Redundant FILE_CHECKPOINT writes (Closed)
  - MDEV-39040 log_sys.latch performance lost to PERFORMANCE_SCHEMA (Stalled)
  - MDEV-14462 Confusing error message: ib_logfiles are too small for innodb_thread_concurrency=0 (Closed)
  - MDEV-37058 [Draft] Assertion `get_lsn() == get_flushed_lsn(std::memory_order_relaxed)' failed upon upgrade to 11.4 (Open)