[MDEV-32820] Race condition between trx_purge_free_segment() and trx_undo_create()
| Created: | 2023-11-16 | Updated: | 2024-01-10 | Resolved: | 2023-11-21 |
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 11.1.1, 10.11.3, 11.0.2, 10.5.20, 10.6.13, 10.8.8, 10.9.6, 10.10.4, 11.2, 11.2.1 |
| Fix Version/s: | 10.5.24, 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | corruption, race, regression | ||
| Description |
|
The test encryption.create_or_replace failed on an IA-32 debug build like this:
Concurrently, another thread is trying to free pages from this undo log tablespace:
This task was added in
Yet another thread is trying to write out pages of this undo tablespace:
So far, I have failed to reproduce this in 2,800 repetitions of the test on a local 32-bit build. I started another campaign of 56,000 repetitions; it is over ¼ through, with no failures. |
| Comments |
| Comment by Marko Mäkelä [ 2023-11-16 ] | |||||||||||||||
|
Next, I will try to reproduce this with a lower concurrency:
| |||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-16 ] | |||||||||||||||
|
A lower-concurrency test did not work out either:
| |||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-16 ] | |||||||||||||||
|
Further low-concurrency attempts to reproduce this failed, as did attempts to reproduce this with a 64-bit executable.
The function trx_undo_create() acquires an exclusive latch on block = rseg->get(mtr, err) that it should hold continuously until its grand-caller trx_undo_report_row_operation() invokes mtr_t::commit(). The critical section trx_purge_truncate_rseg_history() is intended to be protected by both an exclusive latch on rseg_hdr (which corresponds to the above block) and an exclusive rseg.latch that was acquired in purge_sys_t::iterator::free_history().
At the time of the crash, trx_purge_free_segment() is expected to be holding these latches as well as an exclusive latch on the segment header page that is known as b in trx_purge_truncate_rseg_history(). It is waiting for an exclusive latch on the undo tablespace. I think that it is plausible to assume that trx_purge_free_segment() has already successfully invoked fseg_free_step_not_header() at least once.
It turns out that when trx_purge_free_segment() enters its first while loop body, it fails to re-latch rseg_hdr between mini-transactions. To fix this, we might recursively acquire another exclusive rseg_hdr latch:
Before the fix of
A better alternative would be to make trx_purge_free_segment() acquire the exclusive rseg_hdr page latch between each call to fseg_free_step_not_header(). It corresponds to what we did before
It would seem that the one-time CI failure in the Description must have involved at least 2 invocations of fseg_free_step_not_header() in trx_purge_free_segment() before a further invocation was blocked. A previous invocation of fseg_free_step_not_header() (after the first one) would have corrupted things for the concurrently running trx_undo_create(). Presumably, in the failed CI run, some undo log had accumulated from previous tests, causing the need for at least 3 calls to fseg_free_step_not_header() within trx_purge_free_segment(). When the test is being run by itself, the undo segment can always be freed in fewer steps. | |||||||||||||||
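The re-latching problem described in the comment above can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions: ToyLatch, ToyMtr, ToySegment and both free_segment_* functions are hypothetical names invented for this illustration and are not InnoDB code. The model only counts how many freeing steps run while the header latch is not held; it does not reproduce the actual race or the committed fix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an exclusive page latch; it only records whether it is held.
struct ToyLatch {
  bool held = false;
  void lock() { assert(!held); held = true; }
  void unlock() { assert(held); held = false; }
};

// Toy stand-in for a mini-transaction: every latch acquired through it is
// released when commit() is called (or when the object is destroyed).
class ToyMtr {
  std::vector<ToyLatch*> latches_;
public:
  void x_latch(ToyLatch& l) { l.lock(); latches_.push_back(&l); }
  void commit() {
    for (ToyLatch* l : latches_) l->unlock();
    latches_.clear();
  }
  ~ToyMtr() { commit(); }
};

struct ToySegment {
  std::size_t pages_left;             // pages still to be freed
  ToyLatch rseg_hdr;                  // models the rollback segment header latch
  std::size_t unprotected_steps = 0;  // steps executed without rseg_hdr held

  explicit ToySegment(std::size_t n) : pages_left(n) {}

  // Models one freeing step; returns true once nothing is left to free.
  bool free_step() {
    if (!rseg_hdr.held) ++unprotected_steps;
    if (pages_left > 0) --pages_left;
    return pages_left == 0;
  }
};

// Assumed faulty pattern: rseg_hdr is latched only in the first
// mini-transaction, so every later step runs without it.
std::size_t free_segment_without_relatch(std::size_t pages) {
  ToySegment seg(pages);
  ToyMtr mtr;
  mtr.x_latch(seg.rseg_hdr);
  while (!seg.free_step()) {
    mtr.commit();  // releases rseg_hdr; nothing re-acquires it
  }
  return seg.unprotected_steps;
}

// Assumed corrected pattern: each step runs in its own mini-transaction
// that acquires rseg_hdr before doing any work.
std::size_t free_segment_with_relatch(std::size_t pages) {
  ToySegment seg(pages);
  for (;;) {
    ToyMtr mtr;
    mtr.x_latch(seg.rseg_hdr);
    const bool done = seg.free_step();
    mtr.commit();
    if (done) break;
  }
  return seg.unprotected_steps;
}
```

In this model a segment that is freed in a single step never runs unprotected, which is consistent with the observation that the failure needs a segment large enough to require several steps.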
| Comment by Vladislav Lesin [ 2023-11-21 ] | |||||||||||||||
|
The fix looks good to me. | |||||||||||||||
| Comment by Marko Mäkelä [ 2023-12-13 ] | |||||||||||||||
|
There was a very low-probability race condition between freeing and allocating undo log pages that can corrupt InnoDB transaction metadata. Data could be retrieved by producing a logical dump after starting the server with innodb_force_recovery=3. |