[MDEV-18976] Implement a CHECKSUM redo log record for improved validation Created: 2019-03-20 Updated: 2023-01-26 Resolved: 2022-06-06 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | mariabackup, Storage Engine - InnoDB |
| Fix Version/s: | 10.6.9, 10.7.5, 10.8.4, 10.9.2 |
| Type: | Task | Priority: | Major |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 3 |
| Labels: | backup, corruption, recovery |
| Issue Links: |
|
| Description |
|
The InnoDB redo log mostly uses physical addressing (byte offsets within a page). If a page that is read during redo log apply is older than what the redo log expects (because some page writes were missed for any reason), then most redo log apply operations would happily corrupt the page further. The corruption might sometimes be caught when an insert operation is being applied. We should introduce an option to generate new CHECKSUM records. When ‘applying’ a CHECKSUM record, recovery (or mariabackup --prepare) would compute the corresponding checksum of the page and compare it to the one that is written to the log record. It would also compare the FIL_PAGE_LSN on the page to the one in the CHECKSUM record. |
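The check described above can be sketched roughly as follows. This is only an illustration, not the InnoDB code: `ChecksumRecord`, `CheckResult`, `read_page_lsn()` and `verify_page()` are hypothetical names, the record layout is invented, and the sketch assumes that the checksum is a CRC-32C over the whole page and that FIL_PAGE_LSN is an 8-byte big-endian field at byte offset 16 of the page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed byte offset of the 8-byte big-endian FIL_PAGE_LSN field.
constexpr std::size_t kFilPageLsn = 16;

// Hypothetical in-memory form of a CHECKSUM record (not the on-disk format).
struct ChecksumRecord {
  std::uint64_t lsn;  // FIL_PAGE_LSN expected on the page
  std::uint32_t crc;  // checksum expected for the page contents
};

enum class CheckResult { Ok, LsnMismatch, ChecksumMismatch };

// Plain bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
std::uint32_t crc32c(const std::uint8_t* data, std::size_t len) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return ~crc;
}

// Read the big-endian LSN stored in the page header.
std::uint64_t read_page_lsn(const std::vector<std::uint8_t>& page) {
  assert(page.size() >= kFilPageLsn + 8);
  std::uint64_t lsn = 0;
  for (std::size_t i = 0; i < 8; i++)
    lsn = (lsn << 8) | page[kFilPageLsn + i];
  return lsn;
}

// 'Apply' a CHECKSUM record: compare the LSN first, then the checksum.
CheckResult verify_page(const std::vector<std::uint8_t>& page,
                        const ChecksumRecord& rec) {
  if (read_page_lsn(page) != rec.lsn)
    return CheckResult::LsnMismatch;
  if (crc32c(page.data(), page.size()) != rec.crc)
    return CheckResult::ChecksumMismatch;
  return CheckResult::Ok;
}
```

Checking the LSN before the checksum is a choice made for this sketch; it distinguishes a page that is simply older or newer than expected from one whose contents differ at the expected LSN.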
| Comments |
| Comment by Vladislav Lesin [ 2020-12-31 ] |
|
FIL_PAGE_LSN is set in mtr_t::commit() when ReleaseBlocks::operator()(mtr_memo_slot_t* slot) is called. But in recv_recover_page() it's set when all hashed records are applied:
So we currently cannot compare FIL_PAGE_LSN when a CHECKSUM record is applied. |
| Comment by Vladislav Lesin [ 2021-01-01 ] |
|
Pushed the bb-10.5-MDEV-18976-redolog-crc branch. The FIL_PAGE_LSN check is not implemented, for the reason described above. There is no clear description of what must happen if the CRC does not match, so I implemented a version where a warning is issued in the error log. I did not find any way to test it other than injecting debug code. My initial intention was to change a page's FIL_PAGE_LSN so that recovery would start with the OPTION CHECKSUM record, in which case the page CRC would not match, but I did not find a way to pass or compute the LSN of the CHECKSUM record in an mtr test. |
| Comment by Marko Mäkelä [ 2022-05-31 ] |
|
The recursive page latches during page allocation and BLOB operations are making it challenging to implement this. I got an rr (https://rr-project.org) trace of a checksum mismatch. For the recovery of a mini-transaction that deletes a BLOB page by page during the ROLLBACK of an INSERT, there will be a checksum mismatch for an allocation bitmap page, because that page had already been modified in a ‘parent’ mini-transaction that did not write its log yet. The parent mini-transaction holds exclusive latches on the clustered index leaf page as well as the allocation bitmap page. I think that we must specially flag those sub-mini-transactions, so that checksum records will only be written in the parent mini-transaction. There are also some challenges in implementing this for ROW_FORMAT=COMPRESSED pages, because there are two copies of the page in the buffer pool. |
| Comment by Marko Mäkelä [ 2022-06-02 ] |
|
There is an anomaly where OPT_PAGE_CHECKSUM records could be emitted after a FREE_PAGE record. In this case, recovery would fail, because a checksum would be computed on a page that was freed during recovery. This can happen during DROP INDEX (or DROP TABLE or any table-rebuilding DDL in the system tablespace). To prevent the anomaly, we must revise the logic of |