[MDEV-26631] Race condition in the test innodb.doublewrite Created: 2021-09-17 Updated: 2021-11-10 Resolved: 2021-09-27 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB, Tests |
| Affects Version/s: | 10.6 |
| Fix Version/s: | 10.6.5 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Marko Mäkelä | Assignee: | Thirunarayanan Balathandayuthapani |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | debug, race, recovery, rr-profile | ||
| Issue Links: |
|
||||||||||||||||
| Description |
|
If I revert the
At first, I was suspecting that the debug variable innodb_buf_flush_list_now is not working reliably, but the function buf_flush_sync() does appear to write out all pages from the buffer pool before the server was killed and restarted. The failure has not occured on any CI system yet, and it does not fail reliably for me when not using rr. Initially, it did fail without rr for the strict_full_crc32 combinations, for the 32-bit build. |
| Comments |
| Comment by Marko Mäkelä [ 2021-09-17 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
During a run of
I reliably see all strict_full_crc32 combinations failing on my system (with nproc 40), using a 32-bit build, all in that same test line 192:
If I only run ./mtr --parallel=auto innodb.doublewrite (all 2×5 combinations in parallel), they will pass. It definitely is some kind of a race condition. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-09-17 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The recovery fails here:
In my initial debugging, I failed to notice that the recovered log is not for page number 0 (the page that is being corrupted by the test) but for page number 5. This code and data structure is part of the That log record was written by the following before the server was killed:
Before the first kill, the test did try to ensure that purge will not run until that point:
Adding a wait for purge before the second kill did not fix this failure after another kill:
If I omit that change and run the test with -
That would be an all-zero frame of the page 1 of table t1 that was corrupted in a subsequent test step. The write of that log record originated from debug instrumentation, as expected:
Before that page was written, there was a doublewrite batch that covered this page. A copy of the page was written to offset 0x104000 in the system tablespace as well as the final location. There were no other data file writes before the server was killed. We did recover a copy of the page in buf_dblwr_t::init_or_load_pages(), but recovery will not find the tablespace for t1 (tablespace ID 5) at this point:
There was an earlier call to fil_ibd_load() that returned FIL_LOAD_DEFER, but that could be due to an ‘old sin’ that caused us to fail at line 192 already (if I didn’t disable purge): we would apparently fail to restore a copy of page 0 from the doublewrite buffer. My following work-around attempt did not avoid an assertion failure for applying log to page 1 (with page number mismatch on the page):
These failures are possibly related to | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-09-17 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Removing the following fault injection makes the test pass, both with ./mtr --rr and when running ./mtr --big-test --parallel=60 --force:
The root cause of all failures could be that | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Thirunarayanan Balathandayuthapani [ 2021-09-21 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Patch is in bb-10.6- | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-09-24 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I think that this is a step to the right direction. But, a bit more work is needed in order to ensure that we are not giving up recovery in the case that the first page had been partially written without using the doublewrite buffer. (Doublewrite buffer would be skipped when the page was initialized by redo log records.) I suspect that the proposed fix could break the deferred recovery in this case. |