[MDEV-33341] innodb.undo_space_dblwr test case fails with Unknown Storage Engine InnoDB
Created: 2024-01-31  Updated: 2024-02-07  Resolved: 2024-01-31

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5, 10.6, 10.11, 11.1, 11.2, 11.3, 11.4
Fix Version/s: 11.3.2, 10.5.25, 10.6.18, 10.11.8, 11.0.6, 11.1.5, 11.2.4

Type: Bug Priority: Major
Reporter: Thirunarayanan Balathandayuthapani Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-31851 After crash recovery, Checksum mismat... Closed

Description

The undo_space_dblwr test case fails if the first page of the undo
tablespace is not flushed before the server is restarted. While
restarting, InnoDB fails to detect the first page of the undo
tablespace in the doublewrite buffer.

Steps to reproduce: comment out the sleep in the test so that the
restart can happen before the page is flushed:

diff --git a/mysql-test/suite/innodb/t/undo_space_dblwr.test b/mysql-test/suite/innodb/t/undo_space_dblwr.test
index b6fd6738a1c..c93c45dd7f6 100644
--- a/mysql-test/suite/innodb/t/undo_space_dblwr.test
+++ b/mysql-test/suite/innodb/t/undo_space_dblwr.test
@@ -21,7 +21,7 @@ set global innodb_fil_make_page_dirty_debug = 1;
 SET GLOBAL innodb_max_dirty_pages_pct_lwm=0.0;
 SET GLOBAL innodb_max_dirty_pages_pct=0.0;
 
-sleep 1;
+#sleep 1;
 --let CLEANUP_IF_CHECKPOINT=drop table t1;
 --source ../include/no_checkpoint_end.inc
 

Error log file contains:

2024-01-31 14:40:46 0 [ERROR] InnoDB: Checksum mismatch in the first page of file .//undo001
2024-01-31 14:40:46 0 [ERROR] InnoDB: Unable to read first page of file .//undo001
2024-01-31 14:40:46 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1386] with error Data structure corruption
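
The error sequence above can be illustrated with a toy model of startup recovery. This is a minimal sketch and not InnoDB's implementation: the page layout, the CRC-32 checksum, and the names make_page, is_valid and read_first_page are all hypothetical and chosen only for illustration. It shows the expected behavior (a damaged first page is replaced by a valid doublewrite copy) and the failure mode reported here (no usable copy is found, so startup aborts).

```python
from __future__ import annotations

import zlib

PAGE_SIZE = 64  # toy page size (assumption); real InnoDB pages are much larger


def make_page(payload: bytes) -> bytes:
    """Build a toy page: a 4-byte CRC-32 followed by the zero-padded payload."""
    body = payload.ljust(PAGE_SIZE - 4, b"\0")
    return zlib.crc32(body).to_bytes(4, "big") + body


def is_valid(page: bytes) -> bool:
    """Return True if the stored checksum matches the page body."""
    return (
        len(page) == PAGE_SIZE
        and int.from_bytes(page[:4], "big") == zlib.crc32(page[4:])
    )


def read_first_page(
    space_id: int,
    on_disk: bytes,
    doublewrite: dict[tuple[int, int], bytes],
) -> bytes:
    """Return a usable page 0 of a tablespace, or raise if none exists.

    `doublewrite` maps (space_id, page_no) to a saved page copy. This is a
    hypothetical stand-in for the doublewrite buffer, not its real format.
    """
    if is_valid(on_disk):
        return on_disk
    copy = doublewrite.get((space_id, 0))
    if copy is not None and is_valid(copy):
        return copy  # recover the damaged page from the doublewrite copy
    raise RuntimeError(
        f"Checksum mismatch in the first page of tablespace {space_id}"
    )


if __name__ == "__main__":
    good = make_page(b"undo001 first page")
    damaged = good[:-1] + bytes([good[-1] ^ 0x01])  # flip one bit

    # Expected behavior: the doublewrite copy replaces the damaged page.
    assert read_first_page(1, damaged, {(1, 0): good}) == good

    # Failure mode from this report: no copy is found, so startup aborts.
    try:
        read_first_page(1, damaged, {})
    except RuntimeError as err:
        print(err)
```

In this model the lookup is keyed by tablespace ID and page number, so a missing entry for the undo tablespace leads directly to the abort, which mirrors the symptom in the log.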



Comments
Comment by Thirunarayanan Balathandayuthapani [ 2024-01-31 ]

https://github.com/MariaDB/server/pull/3036

Comment by Marko Mäkelä [ 2024-01-31 ]

thiru, thank you. For the non-debug test innodb.doublewrite in 10.11 and later, I concluded that we have to live with occasional unexpected checkpoints and there is nothing that we can really do about it. We could increase the sleep time, but even then there is no guarantee that the very first wakeup of buf_flush_page_cleaner() will complete and that thread will be empty.

However, for a debug-instrumented test like this one, there is a solution that you implemented: In the debug instrumentation function buf_flush_list_now_set() treat not only the system tablespace but also undo log tablespaces in a special way.
