[MDEV-22388] Corrupted undo log record leads to server crash Created: 2020-04-28  Updated: 2022-08-15  Resolved: 2022-06-22

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 5.5, 10.0, 10.1, 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.6.9, 10.7.5, 10.8.4, 10.9.2, 10.10.1

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 1
Labels: corruption

Issue Links:
Blocks
blocks MDEV-28349 Provide "crash safe" options for CHEC... Open
is blocked by MDEV-13542 Crashing on a corrupted page is unhel... Closed
Relates
relates to MDEV-15608 Crash during transaction rollback whe... Closed

 Description   

The function trx_undo_rec_copy() calculates the size of the undo log record by assuming that it has been passed a valid pointer to an undo log record in an undo page. This assumption does not always hold, for example when the undo page is corrupted. The resulting bogus size might be caught by an assertion (only in debug builds) or by a failure to allocate a large amount of memory; otherwise the server crashes.

Perhaps the function should also take a const buf_block_t& parameter to identify the buffer pool page, and perform some consistency checks? Its callers should check for an error value (a null pointer would seem to be appropriate) and handle it according to the context:

  • In purge, we should probably just skip the undo log record and move on.
  • On MVCC read, we should return an error that the undo log is corrupted.
  • On ROLLBACK, we should probably write a message to the error log and not release any locks. Fixing this would require manual intervention.
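
The proposal above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual InnoDB code: the names undo_rec_copy_checked(), on_corrupted_undo(), Caller, Action and the k-prefixed constants are hypothetical. It assumes that the first two bytes of an undo log record hold the page offset at which the record ends, and it uses illustrative values for the page size and for the header and trailer bounds.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Illustrative values only (assumptions, not taken from the ticket).
constexpr std::size_t kPageSize = 16384;    // assumed page size
constexpr std::size_t kUndoDataStart = 56;  // assumed first usable byte
constexpr std::size_t kPageTrailer = 8;     // assumed page trailer size

// Read a 2-byte big-endian integer.
inline std::size_t read_2(const unsigned char* p) {
  return (static_cast<std::size_t>(p[0]) << 8) | p[1];
}

// Hypothetical checked copy: returns nullptr instead of trusting the
// length derived from a possibly corrupted record.
std::unique_ptr<unsigned char[]> undo_rec_copy_checked(
    const unsigned char* page, std::size_t offset, std::size_t* len_out) {
  const std::size_t limit = kPageSize - kPageTrailer;
  // The record must start inside the undo data area of the page.
  if (offset < kUndoDataStart || offset > limit - 2) return nullptr;
  // Assumed layout: the first two bytes store the end offset of the record.
  const std::size_t end = read_2(page + offset);
  // The end must lie after the 2-byte field and before the page trailer.
  if (end <= offset + 2 || end > limit) return nullptr;
  const std::size_t len = end - offset;
  auto copy = std::make_unique<unsigned char[]>(len);
  std::memcpy(copy.get(), page + offset, len);
  *len_out = len;
  return copy;
}

// Hypothetical mapping of the three callers to the reactions listed above.
enum class Caller { Purge, MvccRead, Rollback };
enum class Action { SkipRecord, ReturnCorruptionError, LogAndKeepLocks };

Action on_corrupted_undo(Caller caller) {
  switch (caller) {
    case Caller::Purge:    return Action::SkipRecord;
    case Caller::MvccRead: return Action::ReturnCorruptionError;
    case Caller::Rollback: return Action::LogAndKeepLocks;
  }
  return Action::ReturnCorruptionError;  // conservative fallback
}
```

With a check of this kind, the copied length can never exceed one page, so a corrupted record cannot trigger a huge allocation, and each caller decides what to do with the null result.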

This ticket is motivated by some MySQL 5.7.30 code changes in this area:
Bug #29448406 TRX_UNDO_REC_COPY NEEDLESSLY RELIES ON BUFFER POOL PAGE ALIGNMENT



 Comments   
Comment by Marko Mäkelä [ 2022-06-07 ]

This was missed in MDEV-13542.
