Details
- Bug
- Status: Closed
- Critical
- Resolution: Fixed
- 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL)
Description
There is a potential problem if the server is killed while it is freeing undo log pages:
diff --git a/storage/innobase/trx/trx0purge.cc b/storage/innobase/trx/trx0purge.cc
index f273903ef93..e7f61162dcd 100644
--- a/storage/innobase/trx/trx0purge.cc
+++ b/storage/innobase/trx/trx0purge.cc
@@ -365,6 +365,8 @@ void trx_purge_free_segment(mtr_t &mtr, trx_rseg_t* rseg, fil_addr_t hdr_addr)
                         TRX_UNDO_SEG_HDR + TRX_UNDO_FSEG_HEADER
                         + block->frame, &mtr)) {
                 mtr.commit();
+                log_write_up_to(mtr.commit_lsn(), true);
+                abort();
                 mtr.start();
 
                 rseg_hdr = trx_rsegf_get(rseg->space, rseg->page_no, &mtr);
The following scenario would seem to be possible:
- InnoDB is killed between that point and the time when the mini-transaction of a subsequent trx_purge_remove_log_hdr() becomes durable.
- InnoDB is restarted, and the pages that were freed above are being allocated for something else (further undo log records, or data located in the system tablespace).
- Purge attempts to access an invalid page.
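The scenario above can be illustrated with a toy model of the durable state. This is only a sketch: `State`, `crash_after_first_free_step()` and `purge_may_touch_foreign_page()` are hypothetical names invented for illustration, not InnoDB code, and the page numbers are arbitrary. It assumes that each mini-transaction is applied atomically and that only committed and flushed mini-transactions survive the kill.

```cpp
#include <cassert>
#include <set>

// Toy model of the durable state; all names are hypothetical, not InnoDB APIs.
struct State {
    bool in_history_list = true;       // undo log header still linked from the history list
    std::set<int> undo_pages{10, 11};  // pages still owned by the undo log segment
    std::set<int> free_pages;          // pages already returned to the tablespace
};

// Current order: the first durable mini-transaction only frees a page.
// The later mini-transaction that would unlink the header is lost in the kill.
State crash_after_first_free_step() {
    State s;
    s.undo_pages.erase(10);   // mtr 1: one free step releases page 10 ...
    s.free_pages.insert(10);  // ... which becomes available for reuse
    return s;                 // killed here; in_history_list is still true
}

// After restart, a freed page may be reallocated for something else while
// purge still reaches the segment through the history list.
bool purge_may_touch_foreign_page(const State& s) {
    return s.in_history_list && !s.free_pages.empty();
}
```

In this model, `purge_may_touch_foreign_page(crash_after_first_free_step())` is true: the history list still refers to a segment whose page has already been given back.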
The function trx_purge_free_segment() is also missing calls to log_free_check(), which means that an overrun of the redo log is possible, and the database might become impossible to recover if the server is killed while the function is being executed.
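To make the redo log overrun concrete, here is a deliberately simplified model of a circular redo log. Everything in it (`ToyRedoLog`, `free_check()`, `free_segment()`, the byte counts) is a hypothetical stand-in; the real log_free_check() and checkpoint logic are considerably more involved. The sketch only shows why a loop of mini-transactions without any free-space check can write past the point up to which the log is still needed for recovery.

```cpp
#include <cassert>
#include <cstdint>

// Toy circular redo log; all names here are hypothetical, not InnoDB APIs.
struct ToyRedoLog {
    std::uint64_t capacity;            // bytes that fit in the circular log
    std::uint64_t checkpoint_lsn = 0;  // recovery would start reading here
    std::uint64_t lsn = 0;             // end of the log written so far

    explicit ToyRedoLog(std::uint64_t cap) : capacity(cap) {}

    // True if log that recovery still needs has been overwritten.
    bool overrun() const { return lsn - checkpoint_lsn > capacity; }

    // Append the redo records of one mini-transaction.
    void commit_mtr(std::uint64_t bytes) { lsn += bytes; }

    // Stand-in for a free-space check: if the next mini-transaction might
    // not fit, advance the checkpoint first. (A real server would have to
    // write out dirty pages before it can do that.)
    void free_check(std::uint64_t margin) {
        if (lsn - checkpoint_lsn + margin > capacity) checkpoint_lsn = lsn;
    }
};

// Free `steps` pages, one mini-transaction each, optionally checking for
// free log space before each step. Returns whether the log was ever overrun.
bool free_segment(ToyRedoLog log, int steps, std::uint64_t bytes_per_mtr,
                  bool check) {
    bool overrun = false;
    for (int i = 0; i < steps; i++) {
        if (check) log.free_check(bytes_per_mtr);
        log.commit_mtr(bytes_per_mtr);
        overrun = overrun || log.overrun();
    }
    return overrun;
}
```

With a capacity of 1000 bytes and 20 steps of 100 bytes each, the unchecked loop overruns the log, while the loop that calls `free_check()` before every step does not.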
There is a hint in the source code on how this could be fixed:
        /* We may free the undo log segment header page; it must be freed
        within the same mtr as the undo log header is removed from the
        history list: otherwise, in case of a database crash, the segment
        could become inaccessible garbage in the file space. */

        trx_purge_remove_log_hdr(rseg_hdr, block, hdr_addr.boffset, &mtr);

        do {

                /* Here we assume that a file segment with just the header
                page can be freed in a few steps, so that the buffer pool
                is not flooded with bufferfixed pages: see the note in
                fsp0fsp.cc. */

        } while (!fseg_free_step(TRX_UNDO_SEG_HDR + TRX_UNDO_FSEG_HEADER
                                 + block->frame, &mtr));
If we simply call trx_purge_remove_log_hdr() in the first mini-transaction, everything should be safe. Yes, the pages might not be easy to free afterwards, but that is not a problem for those who use multiple innodb_undo_tablespaces and innodb_undo_log_truncate=ON.
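The effect of that reordering can be sketched with a toy model of the durable state. The names (`State`, `crash_after_first_mtr_proposed_order()`, `pages_leaked()`) are hypothetical and the page numbers arbitrary; the model assumes, as one reading of the proposal, that the first mini-transaction both removes the log header from the history list and performs the first free step.

```cpp
#include <cassert>
#include <set>

// Toy model of the durable state; all names are hypothetical, not InnoDB APIs.
struct State {
    bool in_history_list = true;       // undo log header still linked from the history list
    std::set<int> undo_pages{10, 11};  // pages still owned by the undo log segment
    std::set<int> free_pages;          // pages already returned to the tablespace
};

// Proposed order: the first mini-transaction unlinks the header from the
// history list and frees the first page, atomically.
State crash_after_first_mtr_proposed_order() {
    State s;
    s.in_history_list = false;  // header removed from the history list
    s.undo_pages.erase(10);     // first free step releases page 10
    s.free_pages.insert(10);
    return s;                   // killed here; remaining free steps are lost
}

// Purge can only reach a reallocated page if the history list still
// refers to the segment.
bool purge_may_touch_foreign_page(const State& s) {
    return s.in_history_list && !s.free_pages.empty();
}

// Pages that are no longer reachable from the history list but were never
// freed: harmless garbage until the undo tablespace is truncated.
bool pages_leaked(const State& s) {
    return !s.in_history_list && !s.undo_pages.empty();
}
```

In this model a kill after the first mini-transaction leaves no dangling reference (`purge_may_touch_foreign_page()` is false), at the cost of a leaked page (`pages_leaked()` is true), which matches the trade-off described above.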
We could also try to free everything in a single mini-transaction, provided that there is sufficient capacity in the redo log and the buffer pool.
Issue Links
- causes
  - MDEV-32049 Deadlock due to log_free_check(), involving trx_purge_truncate_rseg_history() and trx_undo_assign_low() (Closed)
  - MDEV-32820 Race condition between trx_purge_free_segment() and trx_undo_create() (Closed)
- relates to
  - MDEV-30671 innodb_undo_log_truncate=ON fails to wait for purge of transaction history (Closed)