[MDEV-33004] innodb.cursor-restore-locking test fails Created: 2023-12-12 Updated: 2024-02-07 Resolved: 2024-02-07 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB, Tests |
| Affects Version/s: | 10.5 |
| Fix Version/s: | 10.5.25 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Vladislav Lesin | Assignee: | Vladislav Lesin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Description |
|
https://buildbot.mariadb.org/#/builders/572/builds/4845/steps/8/logs/stdio:
|
| Comments |
| Comment by Vladislav Lesin [ 2024-01-24 ] |
|
The bug is easily reproducible and can be recorded with rr. The initial state before the bug:

The bug scenario is the following:

1. trx_purge() is invoked by the purge coordinator thread. trx_purge_fetch_next_rec() returns NULL because "START TRANSACTION WITH CONSISTENT SNAPSHOT" is still active and the condition (purge_sys.tail.trx_no >= purge_sys.low_limit_no()) is true. trx_purge_attach_undo_recs() and trx_purge() return 0 to srv_do_purge(); see the following call stack:

2. Consider the following code in srv_do_purge():

So, trx_purge() purges nothing because there is a transaction with a read view that can see the changes made by trx_1 and trx2, i.e. the delete-marked and inserted records. trx_sys.rseg_history_len is equal to 2. srv_do_purge() returns 2 even though it purged nothing.

3. Consider the following code in purge_coordinator_callback_low():

After the above code is executed, purge_state.m_history_length is equal to 2, the timer is set to 10 milliseconds, and purge_coordinator_callback() returns.

4. When the 10 milliseconds have expired, purge_coordinator_timer_callback() is invoked. Take a look at the following code in purge_coordinator_timer_callback():

So, purge_state.m_history_length was set to 2 in step 3, i.e. to the value of trx_sys.rseg_history_len. The condition (purge_state.m_history_length == trx_sys.rseg_history_len) is true, which is why purge_coordinator_timer_callback() does not wake up the purge coordinator thread. No subsequent purge_coordinator_timer_callback() invocation will wake it up either, because the condition stays true.

The overall logic is the following: save the history length before suspending, and if it has not changed while the purge coordinator thread was suspended, do nothing. But this does not take into account that the history may not have been purged at all before the coordinator thread was suspended. |
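The wake-up logic described in this comment can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the InnoDB source: PurgeModel, do_purge(), coordinator_callback(), timer_callback() and count_wakeups() are hypothetical names invented for illustration, and the model keeps only the two counters and the comparison mentioned above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the state named in the comment above.
struct PurgeModel {
  std::size_t rseg_history_len;   // models trx_sys.rseg_history_len
  std::size_t saved_history_len;  // models purge_state.m_history_length
  bool view_blocks_purge;         // models the still-open consistent snapshot
};

// Models steps 1-2: returns the remaining history length,
// even when the open read view prevented purging anything.
std::size_t do_purge(PurgeModel &m) {
  if (!m.view_blocks_purge) {
    m.rseg_history_len = 0;  // assumption: everything is purgeable
  }
  return m.rseg_history_len;
}

// Models step 3: remember the history length before suspending.
void coordinator_callback(PurgeModel &m) {
  m.saved_history_len = do_purge(m);
}

// Models step 4: the coordinator is woken only if the history
// length differs from the value saved before suspending.
bool timer_callback(const PurgeModel &m) {
  return m.saved_history_len != m.rseg_history_len;
}

// Runs the timer `ticks` times and counts how often the coordinator is woken.
int count_wakeups(PurgeModel &m, int ticks) {
  assert(ticks >= 0);
  int wakeups = 0;
  for (int i = 0; i < ticks; ++i) {
    if (timer_callback(m)) {
      ++wakeups;
      coordinator_callback(m);
    }
  }
  return wakeups;
}
```

In this model, starting from a history length of 2 with the snapshot open, one call to coordinator_callback() saves 2, and count_wakeups() then returns 0 for any number of ticks. A wake-up happens only if rseg_history_len changes in the meantime, which matches the stall described in step 4.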
| Comment by Vladislav Lesin [ 2024-01-25 ] |
|
Some useful notes from Marko: |
| Comment by Marko Mäkelä [ 2024-02-07 ] |
|
This was fixed by removing the test innodb.cursor-restore-locking from 10.5 only. The test will remain in 10.6 and later releases, where it is stable. |