[MDEV-33137] Assertion `end_lsn == page_lsn' failed in recv_recover_page Created: 2023-12-28 Updated: 2024-01-11 Resolved: 2024-01-10 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.6, 10.11, 11.0, 11.1, 11.2 |
| Fix Version/s: | 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Elena Stepanova | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | regression, rr-profile | ||
| Issue Links: |
|
||||||||||||||||||||||||||||
| Description |
|
It looks like the failure appeared after this merge in 10.11
However I couldn't reproduce it on 10.6. A couple of test runs which reproduce the failure currently fairly well:
|
| Comments |
| Comment by Marko Mäkelä [ 2024-01-03 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The assertion failure is directly connected with something that we have been observing on many non-debug builders, related to
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-03 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Here is some data from the recovery run:
That is, we have a FIL_PAGE_LSN that is somewhere between the recovery start LSN (the checkpoint LSN) and the end of the recovered log. These do correspond to the log output:
The last checkpoint had been completed rather shortly before the server had been killed:
At that point the oldest_modification of the page 0x100000005 was 167115222, which is exactly the same as the local debug variable recv_start_lsn in the recovery trace. The write of the page (with FIL_PAGE_LSN=159994270) had taken place when log_sys.lsn and log_sys.flushed_to_disk_lsn were both 161771086, at when 1120317 of the server process. At the time of the write completion, the buf_flush_page_cleaner was in a timed wait for the start of its next 1-second slice. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-03 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
After the last write of the undo page 0x100000005 and the server kill, we got a number of mini-transactions that had modified the page:
The purge_truncation_callback is something that was refactored into a separate task in But, we need to answer the question: Where did the LSN 159993940 come from, and what is causing that mismatch? This should be our mini-transaction of interest:
The 73 bytes of log records of this mini-transaction are as follows (one per line):
Because we have innodb_encrypt_log=ON, the log record payload (after the record type and page identifier) is encrypted. That is why the OPTION records do not have the expected subtype byte 0 after the page number. We can observe that some WRITE logging for page 1:5 could be optimized to make use of the These log records were copied to the memory-mapped ib_logfile0 at log_sys.buf + 59342846. The circular log file had wrapped around exactly 1 time. We have log_sys.file_size = 100663296 and log_sys.first_lsn = 12288. Therefore, the start offset of the mini-transaction 59342846 should correspond to the LSN 100663296-12288+59342846 = 159993854. If this calculation is correct, the 159993940 that we saw during recovery should correspond to byte 86, which would be 13 bytes after the end of the above records. In the log_sys.buf the ‘extra’ bytes include the mini-transaction end marker. Because we have innodb_encrypt_log=ON, after the first byte (in this case 0 because the log has wrapped around an odd number of times) we have an 8-byte cryptographic nonce that is roughly the LSN at that time. Finally there is a CRC-32C checksum of the mini-transaction: So, the end_lsn=159993940 during recovery seems to correspond to the end LSN of this mini-transaction in the log. It is also equal to the m_commit_lsn. After this we have further mini-transactions that modify the page:
At recovery, the interesting LSN are 159993940 and 159994270. This only identifies the place that I need to investigate deeper, tomorrow. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
A peculiarity about the mini-transaction that ends at LSN 159994006 is that the only log record for the page 1:5 is the OPTION record written by mtr_t::page_checksum() (
The | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I was misled by a data watchpoint. The rollback segment header is actually falsely claimed as modified in a later mini-transaction that reuses the same memory buffer:
At the time of the next mtr_t::commit() call, the mtr_t::m_memo refers to 5 buffer pool pages, all claimed to be modified. In fact, there are log records for only 3 of the pages: 2, 0, 0x3a. The pages 5 and 0x36 were not modified. They were registered as such at the start of the mini-transaction, at the end of the first loop body of trx_purge_free_segment(). I checked for other calls of mtr_t::memo_push() that would set the MODIFY flag. In buf_page_free() it is correct, because the page will actually be modified (or marked as free in the file). All other calls look incorrect to me:
These calls had been added to MariaDB Server 10.6 in | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
elenst, can you please test the above patch on a branch where you can easily reproduce this? | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2024-01-05 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I couldn't reproduce the failure on 10.11 or 11.2 with the patch above in a reasonable amount of attempts, so it apparently fixes at least the reported failure. For the record, this is the test I used this time, both for 10.11 and 11.2:
Comparing to the tests in the description, it enforces InnoDB storage engine and is somewhat simpler (less DDL), to avoid irrelevant failures. This way, without the patch I got the failure in 1-2 attempts on each branch, and with the patch got a clean run of 15 attempts on each branch. However, it only checks that there are no crashes, assertion failures or corruption-like errors upon/after crash recovery. It doesn't verify the functionality or performance. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-08 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thank you, elenst. I think that the | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Debarun Banerjee [ 2024-01-10 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
1. Setting MTR_MEMO_PAGE_X_MODIFY without actually modifying the page could result in the problem. So, I agree here but we should work on a mtr test too.
2. There is a second issue with setting MTR_MEMO_PAGE_X_MODIFY directly without the page being modified later.
3. In purge code perhaps better/cleaner to reacquire the pages as opposed to pinning it. Usually pinning page to keep it alive is a special case and is better to be used when absolutely required. I suspect it is not going to cause any harm during purge if we get the required pages again. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-10 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
debarun, I mostly agree with you. For test coverage, we occasionally had some crash recovery tests failing due to this:
I do not have any ideas how to improve the regression test suite with respect to this. I did think of adding an assertion to mtr_t::memo_push() that would disallow parameters that include the MODIFY flag, but I decided against it, because then buf_page_free() would have had to be changed to less optimal code. I prefer to buffer-fix some blocks across mini-transaction boundaries in operations that need to be split into multiple small mini-transactions. The buffer-fix will not block page writes or log checkpoints, but it will prevent page eviction. I agree that there are some dangers around buffer-fixing, like The benefit of the buffer-fix trick is twofold:
Note: In | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2024-01-10 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
marko, debarun, thank you for such informative discussion. After all, I think that page fixing is reasonable solution for this case. As for testing, we could use the same technique for setting debug sync points in purge threads as it was used in |