Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.5, 10.6, 10.11, 11.1(EOL), 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 11.0(EOL)
Description
mleich produced rr replay traces of something that led to a corrupted database after mariadb-backup --prepare claimed to have completed successfully.
To reproduce the corruption with the attached copy of the backup data.tar.xz:
tar xJf data.tar.xz
|
mariabackup --prepare --target-dir "$(pwd)"/data
|
mariadbd --innodb-page-size=8k --datadir $(pwd)/data
|
The database will fail to start up as follows:
10.6 a6c0a2769663850c055e41a83b0cca9dd9ab89a3 |
2023-05-31 14:11:49 0 [Note] InnoDB: Opened 3 undo tablespaces
|
2023-05-31 14:11:49 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
|
2023-05-31 14:11:49 0 [ERROR] InnoDB: Trying to read 8192 bytes at 6545408 outside the bounds of the file: .//undo001
|
2023-05-31 14:11:49 0 [ERROR] InnoDB: File './/undo001' is corrupted
|
2023-05-31 14:11:49 0 [Note] InnoDB: Retry with innodb_force_recovery=5
|
2023-05-31 14:11:49 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1484] with error Data structure corruption
|
2023-05-31 14:11:49 0 [Note] InnoDB: Starting shutdown...
|
2023-05-31 14:11:49 0 [ERROR] Plugin 'InnoDB' init function returned error.
|
2023-05-31 14:11:49 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
|
With the following fix, the recovery will succeed:
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
|
index 2764cc79829..91e87643ab1 100644
|
--- a/storage/innobase/log/log0recv.cc
|
+++ b/storage/innobase/log/log0recv.cc
|
@@ -3645,10 +3645,12 @@ void recv_sys_t::apply(bool last_batch)
|
if (fil_space_t *space = fil_space_get(id + srv_undo_space_id_start))
|
{
|
ut_ad(UT_LIST_GET_LEN(space->chain) == 1);
|
+ ut_ad(space->recv_size >= t.pages);
|
fil_node_t *file= UT_LIST_GET_FIRST(space->chain);
|
ut_ad(file->is_open());
|
os_file_truncate(file->name, file->handle,
|
- os_offset_t{t.pages} << srv_page_size_shift, true);
|
+ os_offset_t{space->recv_size} <<
|
+ srv_page_size_shift, true);
|
}
|
}
|
} |
An analysis of the rr replay trace of mariadb-backup --prepare showed the following:
- space->size will be recovered as 1536 by reading the first page of undo001, at FIL_PAGE_LSN=579965, in fil_node_t::read_page0()
- space->recv_size will be recovered as 640, 768, …, 1536 by recv_sys_t::parse()
- The file undo001 will be incorrectly truncated to t.pages (640) pages by the above code (without the patch).
- There will be subsequent calls to write pages after the truncated size. InnoDB is always supposed to extend data files before any writes take place. The server error message is referring to page 799, on which buf_page_t::write_complete() was invoked in the trace.
Attachments
Issue Links
- blocks
-
MDEV-14795 InnoDB system tablespace cannot be shrunk
- Closed
- relates to
-
MDEV-28840 innodb_undo_log_truncate is not crash-safe
- Closed
-
MDEV-29999 innodb_undo_log_truncate=ON is not crash safe
- Closed