[MDEV-33363] CI failure: innodb.import_corrupted: Assertion failed: oldest_lsn > log_sys.last_checkpoint_lsn - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.8.1, 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL), 11.4
Fix Version/s: 10.11.8, 11.0.6, 11.1.5, 11.2.4, 11.4.2
Component/s: Storage Engine - InnoDB
Labels:
- debug

Description

The failure is observed in st-10.11-~~MDEV-33361~~. Since the assert looks unrelated to the change in the branch, the issue must be from the base branch.

The assert is introduced in ~~MDEV-14425~~ and likely to have introduced the issue. I am setting the affected version as 10.8.1 when ~~MDEV-14425~~ was merged.

https://buildbot.mariadb.org/#/builders/234/builds/26196/steps/9/logs/stdio
Test: innodb.import_corrupted
Branch: st-10.11-~~MDEV-33361~~
Commit: 93189df44
Platform: amd64-windows

innodb.import_corrupted 'innodb'         w6 [ fail ]

        Test ended at 2024-02-02 10:01:21

CURRENT_TEST: innodb.import_corrupted

mysqltest: At line 65: query 'ALTER TABLE t2 IMPORT TABLESPACE' failed with wrong errno <Unknown> (2013): 'Lost connection to server during query', instead of ER_NOT_KEYFILE (1034)...

Server [mysqld.1 - pid: 47680, winpid: 47680, exit: 0] failed during test run

Server log from this test:

----------SERVER LOG START-----------

2024-02-02 10:01:20 124 [Note] InnoDB: Sync to disk of `test`.`t1` started.

2024-02-02 10:01:20 124 [Note] InnoDB: Stopping purge

2024-02-02 10:01:21 124 [Note] InnoDB: Writing table metadata to './test/t1.cfg'

2024-02-02 10:01:21 124 [Note] InnoDB: Table `test`.`t1` flushed to disk

2024-02-02 10:01:21 124 [Note] InnoDB: Deleting the meta-data file './test/t1.cfg'

2024-02-02 10:01:21 124 [Note] InnoDB: Resuming purge

2024-02-02 10:01:21 124 [Note] InnoDB: Importing tablespace for table 'test/t1' that was exported from host 'WIN-5PHL1LR4M8G'

2024-02-02 10:01:21 124 [Note] InnoDB: Phase I - Update all pages

2024-02-02 10:01:21 124 [Note] InnoDB: Sync to disk

2024-02-02 10:01:21 124 [Note] InnoDB: Sync to disk - done!

2024-02-02 10:01:21 124 [Note] InnoDB: Phase II - Purge records from index `PRIMARY`

Assertion failed: oldest_lsn > log_sys.last_checkpoint_lsn, file C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc, line 1982

server.dll!my_sigabrt_handler()[my_thr_init.c:465]

ucrtbase.dll!raise()

ucrtbase.dll!abort()

ucrtbase.dll!_get_wpgmptr()

ucrtbase.dll!_wassert()

server.dll!log_checkpoint_low()[buf0flu.cc:1982]

server.dll!log_checkpoint()[buf0flu.cc:2042]

server.dll!buf_flush_page_cleaner()[buf0flu.cc:2509]

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

import_checkpoint_race.test
3 kB
2024-02-16 04:42

Issue Links

relates to

MDEV-22027 Assertion oldest_lsn >= log_sys.last_checkpoint_lsn failed in log_checkpoint()

Closed

MDEV-14425 Change the InnoDB redo log format to reduce write amplification

Closed

MDEV-32269 InnoDB after ALTER TABLE…IMPORT TABLESPACE may not be crash safe

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2024-02-02 11:38

Here is some code up to the failing assertion in log_checkpoint_low():

10.11 93189df44eeb431110bf9ca759bafa856ae90c82
if (oldest_lsn == log_sys.last_checkpoint_lsn /* \|\| … */) {
log_sys.latch.wr_unlock();
return true;
}

ut_ad(!recv_no_log_write);
ut_ad(oldest_lsn > log_sys.last_checkpoint_lsn);

This would seem to guarantee that the equality cannot hold. So, we must have had an oldest_lsn that is smaller than the log_sys.last_checkpoint_lsn. That would seem to be a sign of a severe error in the checkpoint logic.

The parameter oldest_lsn was computed by the following in log_checkpoint():

10.11 93189df44eeb431110bf9ca759bafa856ae90c82
log_sys.latch.wr_lock(SRW_LOCK_CALL);
const lsn_t end_lsn= log_sys.get_lsn();
mysql_mutex_lock(&buf_pool.flush_list_mutex);
const lsn_t oldest_lsn= buf_pool.get_oldest_modification(end_lsn);
mysql_mutex_unlock(&buf_pool.flush_list_mutex);
return log_checkpoint_low(oldest_lsn, end_lsn);

The field log_sys.last_checkpoint_lsn is protected by log_sys.latch (and write_lock and flush_lock) when log_t::write_checkpoint() is modifying it. Also the end_lsn=log_sys.lsn is protected by the exclusive log_sys.latch. Here are some local variables from the stack trace:

10.11 93189df44eeb431110bf9ca759bafa856ae90c82
ucrtbase!wassert
Unable to enumerate locals, Win32 error 0n87
Private symbols (symbols.pri) are required for locals.
Type ".hh dbgerr005" for details.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
server!log_checkpoint_low [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 1982]
unsigned int64 oldest_lsn = 0xdbbfe9
unsigned int64 end_lsn = 0xdc0a71
unsigned int64 flush_lsn = 0x00000216`d35ab7b0
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
server!log_checkpoint [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 2042]
unsigned int64 oldest_lsn = 0xdbbfe9
unsigned int64 end_lsn = 0xdc0a71
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
server!buf_flush_page_cleaner [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 2509]
unsigned int64 n = 0
unsigned int64 dirty_blocks = 0x1b5
unsigned int64 n_flushed = 0
unsigned int64 oldest_lsn = 0xdbbfe9
double pct_lwm = 0
unsigned int64 soft_lsn_limit = 0
double dirty_pct = 87.2255489021956123
unsigned int64 lsn_limit = 0
struct timespec abstime = struct timespec
unsigned int64 last_pages = 0x71a
unsigned int64 last_activity_count = 0
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

The variable flush_lsn, whose scope had not started before the failing assertion was evaluated, is garbage, but the rest looks reasonable.

The oldest modified block in the buf_pool.flush_list must have an buf_page_t::oldest_modification_ of 0xdbbfe9. For some reason, the latest checkpoint LSN (log_sys.last_checkpoint_lsn) is larger than this. This means that the log checkpoint had been advanced prematurely on a previous invocation of log_checkpoint_low(), while some older page modifications had not yet been written back from the buffer pool.

Marko Mäkelä added a comment - 2024-02-02 11:38 Here is some code up to the failing assertion in log_checkpoint_low() : 10.11 93189df44eeb431110bf9ca759bafa856ae90c82 if (oldest_lsn == log_sys.last_checkpoint_lsn /* || … */ ) { log_sys.latch.wr_unlock(); return true ; } ut_ad(!recv_no_log_write); ut_ad(oldest_lsn > log_sys.last_checkpoint_lsn); This would seem to guarantee that the equality cannot hold. So, we must have had an oldest_lsn that is smaller than the log_sys.last_checkpoint_lsn . That would seem to be a sign of a severe error in the checkpoint logic. The parameter oldest_lsn was computed by the following in log_checkpoint() : 10.11 93189df44eeb431110bf9ca759bafa856ae90c82 log_sys.latch.wr_lock(SRW_LOCK_CALL); const lsn_t end_lsn= log_sys.get_lsn(); mysql_mutex_lock(&buf_pool.flush_list_mutex); const lsn_t oldest_lsn= buf_pool.get_oldest_modification(end_lsn); mysql_mutex_unlock(&buf_pool.flush_list_mutex); return log_checkpoint_low(oldest_lsn, end_lsn); The field log_sys.last_checkpoint_lsn is protected by log_sys.latch (and write_lock and flush_lock ) when log_t::write_checkpoint() is modifying it. Also the end_lsn=log_sys.lsn is protected by the exclusive log_sys.latch . Here are some local variables from the stack trace: 10.11 93189df44eeb431110bf9ca759bafa856ae90c82 ucrtbase!wassert Unable to enumerate locals, Win32 error 0n87 Private symbols (symbols.pri) are required for locals. Type ".hh dbgerr005" for details. _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ server!log_checkpoint_low [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 1982] unsigned int64 oldest_lsn = 0xdbbfe9 unsigned int64 end_lsn = 0xdc0a71 unsigned int64 flush_lsn = 0x00000216`d35ab7b0 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ server!log_checkpoint [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 2042] unsigned int64 oldest_lsn = 0xdbbfe9 unsigned int64 end_lsn = 0xdc0a71 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ server!buf_flush_page_cleaner [C:\Buildbot\amd64-windows\build\storage\innobase\buf\buf0flu.cc @ 2509] unsigned int64 n = 0 unsigned int64 dirty_blocks = 0x1b5 unsigned int64 n_flushed = 0 unsigned int64 oldest_lsn = 0xdbbfe9 double pct_lwm = 0 unsigned int64 soft_lsn_limit = 0 double dirty_pct = 87.2255489021956123 unsigned int64 lsn_limit = 0 struct timespec abstime = struct timespec unsigned int64 last_pages = 0x71a unsigned int64 last_activity_count = 0 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ The variable flush_lsn , whose scope had not started before the failing assertion was evaluated, is garbage, but the rest looks reasonable. The oldest modified block in the buf_pool.flush_list must have an buf_page_t::oldest_modification_ of 0xdbbfe9 . For some reason, the latest checkpoint LSN ( log_sys.last_checkpoint_lsn ) is larger than this. This means that the log checkpoint had been advanced prematurely on a previous invocation of log_checkpoint_low() , while some older page modifications had not yet been written back from the buffer pool.

Marko Mäkelä added a comment - 2024-02-07 13:05

~~MDEV-22027~~ was a bug like this.

Marko Mäkelä added a comment - 2024-02-07 13:05 MDEV-22027 was a bug like this.

Michael Widenius added a comment - 2024-02-12 15:42

I just got this failure in a 10.11 non debug build on Centos:

innodb.import_corrupted
Server log from this test:
---------~~SERVER LOG START~~----------
2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk of `test`.`t1` started.
2024-02-12 15:03:37 129 [Note] InnoDB: Stopping purge
2024-02-12 15:03:37 129 [Note] InnoDB: Writing table metadata to './test/t1.cfg'
2024-02-12 15:03:37 129 [Note] InnoDB: Table `test`.`t1` flushed to disk
2024-02-12 15:03:37 129 [Note] InnoDB: Deleting the meta-data file './test/t1.cf
g'
2024-02-12 15:03:37 129 [Note] InnoDB: Resuming purge
2024-02-12 15:03:37 129 [Note] InnoDB: Importing tablespace for table 'test/t1'
that was exported from host 'localhost.localdomain'
2024-02-12 15:03:37 129 [Note] InnoDB: Phase I - Update all pages
2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk
2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk - done!
2024-02-12 15:03:37 129 [Note] InnoDB: Phase II - Purge records from index `PRIM
ARY`
mariadbd: /my/maria-10.11/storage/innobase/buf/buf0flu.cc:1982: bool log_checkpo
int_low(lsn_t, lsn_t): Assertion `oldest_lsn > log_sys.last_checkpoint_lsn' fail
ed.
240212 15:03:37 [ERROR] mysqld got signal 6 ;
Sorry, we probably made a mistake, and this is a bug.
mysys/stacktrace.c:215(my_print_stacktrace)[0x113e1e5]
sql/signal_handler.cc:238(handle_fatal_signal)[0xa2abe8]
sigaction.c:0(__restore_rt)[0x7f54d630d5d0]
/lib64/libc.so.6(gsignal+0x37)[0x7f54d4326207]
/lib64/libc.so.6(abort+0x148)[0x7f54d43278f8]
/lib64/libc.so.6(+0x2f026)[0x7f54d431f026]
/lib64/libc.so.6(+0x2f0d2)[0x7f54d431f0d2]
buf/buf0flu.cc:1994(log_checkpoint_low(unsigned long, unsigned long))[0x102a018]
buf/buf0flu.cc:2042(log_checkpoint())[0x102a2f2]
buf/buf0flu.cc:2511(buf_flush_page_cleaner())[0x102cdb9]
4.8.2/thread:115(std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run())
[0x102e0f3]

Michael Widenius added a comment - 2024-02-12 15:42 I just got this failure in a 10.11 non debug build on Centos: innodb.import_corrupted Server log from this test: --------- SERVER LOG START ---------- 2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk of `test`.`t1` started. 2024-02-12 15:03:37 129 [Note] InnoDB: Stopping purge 2024-02-12 15:03:37 129 [Note] InnoDB: Writing table metadata to './test/t1.cfg' 2024-02-12 15:03:37 129 [Note] InnoDB: Table `test`.`t1` flushed to disk 2024-02-12 15:03:37 129 [Note] InnoDB: Deleting the meta-data file './test/t1.cf g' 2024-02-12 15:03:37 129 [Note] InnoDB: Resuming purge 2024-02-12 15:03:37 129 [Note] InnoDB: Importing tablespace for table 'test/t1' that was exported from host 'localhost.localdomain' 2024-02-12 15:03:37 129 [Note] InnoDB: Phase I - Update all pages 2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk 2024-02-12 15:03:37 129 [Note] InnoDB: Sync to disk - done! 2024-02-12 15:03:37 129 [Note] InnoDB: Phase II - Purge records from index `PRIM ARY` mariadbd: /my/maria-10.11/storage/innobase/buf/buf0flu.cc:1982: bool log_checkpo int_low(lsn_t, lsn_t): Assertion `oldest_lsn > log_sys.last_checkpoint_lsn' fail ed. 240212 15:03:37 [ERROR] mysqld got signal 6 ; Sorry, we probably made a mistake, and this is a bug. mysys/stacktrace.c:215(my_print_stacktrace) [0x113e1e5] sql/signal_handler.cc:238(handle_fatal_signal) [0xa2abe8] sigaction.c:0(__restore_rt) [0x7f54d630d5d0] /lib64/libc.so.6(gsignal+0x37) [0x7f54d4326207] /lib64/libc.so.6(abort+0x148) [0x7f54d43278f8] /lib64/libc.so.6(+0x2f026) [0x7f54d431f026] /lib64/libc.so.6(+0x2f0d2) [0x7f54d431f0d2] buf/buf0flu.cc:1994(log_checkpoint_low(unsigned long, unsigned long)) [0x102a018] buf/buf0flu.cc:2042(log_checkpoint()) [0x102a2f2] buf/buf0flu.cc:2511(buf_flush_page_cleaner()) [0x102cdb9] 4.8.2/thread:115(std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run()) [0x102e0f3]

Debarun Banerjee added a comment - 2024-02-13 15:51

In absence of any repeatable test, I started my analysis from the design and code. It was quite informative going through the design changes done to improve the concurrency for mini-transaction commit and flush list ordering. Finally, I could find the exact reason and precise testcase to repeat the issue precisely. I will upload a possible solution for review and discuss within Innodb team. The testcase took sometime as it needs specific sequence of operations between background and session threads. The current test fails > 50% of the time with the assert consistently which should be good for verification.

Some details of the analysis.

1. The checkpoint path: The code looks good and there don't look to be any race across multiple concurrent checkpoint.

2. The mtr commit: There is one vulnerable piece of code in log_t::append_prepare_wait which is somewhat hard to reach generally but should be fixed. This turned out to be different from the current issue and is not feasible to reach by the test. I will file a separate bug for this one with more details and a repeatable test.

3. The import path: The issue is found here. In "~~MDEV-28708~~: increased congestion on buf_pool.flush_list_mutex", along with other concurrency enhancement, the LSN assignment for non-logging pages are changed from system LSN to last checkpoint LSN. This apparently harmless change has caused this issue. During checkpoint we release the log sys latch, while writing the checkpoint block log_t::write_checkpoint(). The state is set to "checkpoint_pending" which effectively synchronizing with other concurrent checkpoint calls. In this interval the IMPORT pages are added to the flush list with previous checkpoint LSN causing the next checkpoint LSN to get a lower value.

One way is to use the system LSN for the non-redo logged pages. It seems right and is inline with previous design. If we do want to add the pages in flush list from other direction, it is possible to check the pending state and use the next checkpoint LSN.

Debarun Banerjee added a comment - 2024-02-13 15:51 In absence of any repeatable test, I started my analysis from the design and code. It was quite informative going through the design changes done to improve the concurrency for mini-transaction commit and flush list ordering. Finally, I could find the exact reason and precise testcase to repeat the issue precisely. I will upload a possible solution for review and discuss within Innodb team. The testcase took sometime as it needs specific sequence of operations between background and session threads. The current test fails > 50% of the time with the assert consistently which should be good for verification. Some details of the analysis. 1. The checkpoint path: The code looks good and there don't look to be any race across multiple concurrent checkpoint. 2. The mtr commit: There is one vulnerable piece of code in log_t::append_prepare_wait which is somewhat hard to reach generally but should be fixed. This turned out to be different from the current issue and is not feasible to reach by the test. I will file a separate bug for this one with more details and a repeatable test. 3. The import path: The issue is found here. In " MDEV-28708 : increased congestion on buf_pool.flush_list_mutex", along with other concurrency enhancement, the LSN assignment for non-logging pages are changed from system LSN to last checkpoint LSN. This apparently harmless change has caused this issue. During checkpoint we release the log sys latch, while writing the checkpoint block log_t::write_checkpoint(). The state is set to "checkpoint_pending" which effectively synchronizing with other concurrent checkpoint calls. In this interval the IMPORT pages are added to the flush list with previous checkpoint LSN causing the next checkpoint LSN to get a lower value. One way is to use the system LSN for the non-redo logged pages. It seems right and is inline with previous design. If we do want to add the pages in flush list from other direction, it is possible to check the pending state and use the next checkpoint LSN.

Debarun Banerjee added a comment - 2024-02-13 16:19

Hi marko

Please review the analysis and patch.
https://github.com/MariaDB/server/pull/3061

Debarun Banerjee added a comment - 2024-02-13 16:19 Hi marko Please review the analysis and patch. https://github.com/MariaDB/server/pull/3061

Marko Mäkelä added a comment - 2024-02-14 13:56

Great work! It would be nice if you can make the DEBUG_SYNC instrumented test more stable so that it can be included in the regression test suite.

Marko Mäkelä added a comment - 2024-02-14 13:56 Great work! It would be nice if you can make the DEBUG_SYNC instrumented test more stable so that it can be included in the regression test suite.

Debarun Banerjee added a comment - 2024-02-16 04:47

Attached the testcase that can repeat the issue fairly accurately. I am excluding the test from final patch as in some cases it results is debug sync timeout due to concurrent background operations. It could be hard to maintain such tests with stable output in CI.

Debarun Banerjee added a comment - 2024-02-16 04:47 Attached the testcase that can repeat the issue fairly accurately. I am excluding the test from final patch as in some cases it results is debug sync timeout due to concurrent background operations. It could be hard to maintain such tests with stable output in CI.

People

Assignee:: Debarun Banerjee

Reporter:: Debarun Banerjee

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 2024-02-02 10:33

Updated:: 2024-02-20 09:31

Resolved:: 2024-02-16 08:28

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration