[MDEV-31373] innodb_undo_log_truncate=ON recovery results in a corrupted undo log - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL), 11.1(EOL)
Fix Version/s: 10.5.22, 10.6.15, 10.9.8, 10.10.6, 10.11.5, 11.0.3, 11.1.2
Component/s: Backup, Storage Engine - InnoDB
Labels:

Description

mleich produced rr replay traces of something that led to a corrupted database after mariadb-backup --prepare claimed to have completed successfully.

To reproduce the corruption with the attached copy of the backup data.tar.xz:

tar xJf data.tar.xz

mariabackup --prepare --target-dir "$(pwd)"/data

mariadbd --innodb-page-size=8k --datadir $(pwd)/data

The database will fail to start up as follows:

10.6 a6c0a2769663850c055e41a83b0cca9dd9ab89a3
2023-05-31 14:11:49 0 [Note] InnoDB: Opened 3 undo tablespaces
2023-05-31 14:11:49 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
2023-05-31 14:11:49 0 [ERROR] InnoDB: Trying to read 8192 bytes at 6545408 outside the bounds of the file: .//undo001
2023-05-31 14:11:49 0 [ERROR] InnoDB: File './/undo001' is corrupted
2023-05-31 14:11:49 0 [Note] InnoDB: Retry with innodb_force_recovery=5
2023-05-31 14:11:49 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1484] with error Data structure corruption
2023-05-31 14:11:49 0 [Note] InnoDB: Starting shutdown...
2023-05-31 14:11:49 0 [ERROR] Plugin 'InnoDB' init function returned error.
2023-05-31 14:11:49 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

With the following fix, the recovery will succeed:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc

index 2764cc79829..91e87643ab1 100644

--- a/storage/innobase/log/log0recv.cc

+++ b/storage/innobase/log/log0recv.cc

@@ -3645,10 +3645,12 @@ void recv_sys_t::apply(bool last_batch)

         if (fil_space_t *space = fil_space_get(id + srv_undo_space_id_start))

           ut_ad(UT_LIST_GET_LEN(space->chain) == 1);

+          ut_ad(space->recv_size >= t.pages);

           fil_node_t *file= UT_LIST_GET_FIRST(space->chain);

           ut_ad(file->is_open());

           os_file_truncate(file->name, file->handle,

-                           os_offset_t{t.pages} << srv_page_size_shift, true);

+                           os_offset_t{space->recv_size} <<

+                           srv_page_size_shift, true);

An analysis of the rr replay trace of mariadb-backup --prepare showed the following:

space->size will be recovered as 1536 by reading the first page of undo001, at FIL_PAGE_LSN=579965, in fil_node_t::read_page0()
space->recv_size will be recovered as 640, 768, …, 1536 by recv_sys_t::parse()
The file undo001 will be incorrectly truncated to t.pages (640) pages by the above code (without the patch).
There will be subsequent calls to write pages after the truncated size. InnoDB is always supposed to extend data files before any writes take place. The server error message is referring to page 799, on which buf_page_t::write_complete() was invoked in the trace.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

data.tar.xz
3.09 MB
2023-05-31 11:16

Issue Links

blocks

MDEV-14795 InnoDB system tablespace cannot be shrunk

Closed

relates to

MDEV-28840 innodb_undo_log_truncate is not crash-safe

Closed

MDEV-29999 innodb_undo_log_truncate=ON is not crash safe

Closed

Activity

People

Assignee:: Marko Mäkelä

Reporter:: Marko Mäkelä

Votes:: 1 Vote for this issue

Watchers:: 5 Start watching this issue

Dates

Created:: 2023-05-31 11:27

Updated:: 2023-07-05 09:12

Resolved:: 2023-06-01 09:49

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration