[MDEV-22543] Galera SST donation fails, FLUSH TABLES WITH READ LOCK times out Created: 2020-05-13 Updated: 2021-04-19 Resolved: 2020-08-11 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, Tests |
| Affects Version/s: | 10.1, 10.2, 10.3, 10.4 |
| Fix Version/s: | 10.2.35, 10.3.26, 10.4.16, 10.5.7 |
| Type: | Bug | Priority: | Major |
| Reporter: | Teemu Ollakka | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Fixed | Votes: | 3 |
| Labels: | None | ||
| Issue Links: |
|
| Description |
|
SST donation occasionally fails under heavy load because FLUSH TABLES WITH READ LOCK times out after waiting for one second.
An MTR test demonstrates the issue by issuing an UPDATE on the donor node and stopping its execution at a sync point after some MDL locks have been taken. When node_2 then tries to join via SST, the lock wait times out.
The early timeout apparently originates in MDL_context::acquire_lock: the call to thd_is_connected() there always returns false for the SST donor THD, so if the lock wait lasts more than one second, it bails out with a timeout. |
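To make the failure mode described above concrete, here is a minimal, self-contained C++ model of a lock wait that is sliced into one-second intervals and checks the client connection after each slice. It is a sketch under stated assumptions, not the MariaDB source: `acquire_lock_model`, `WaitStatus`, `WaitResult` and the simulated clock are hypothetical names introduced only for illustration, and the exact structure of the real wait loop in MDL_context::acquire_lock may differ.

```cpp
#include <cassert>
#include <functional>

// Hypothetical outcome type for this model (not a MariaDB type).
enum class WaitStatus { Granted, Timeout };

struct WaitResult {
    WaitStatus status;
    int seconds_waited;  // simulated seconds, no real sleeping
};

// Simplified model of a sliced lock wait:
//   lock_free_after_s : simulated second at which the conflicting lock is released
//   timeout_s         : configured lock wait timeout in seconds
//   is_connected      : stand-in for a thd_is_connected()-style check
WaitResult acquire_lock_model(int lock_free_after_s, int timeout_s,
                              const std::function<bool()>& is_connected)
{
    assert(lock_free_after_s >= 0 && timeout_s >= 0);

    for (int waited = 1; waited <= timeout_s; ++waited) {
        // One simulated one-second wait slice has elapsed.
        if (lock_free_after_s <= waited)
            return {WaitStatus::Granted, waited};

        // Lock still busy: give up early if the client looks disconnected.
        // For a session with no client connection this check fails on the
        // first pass, turning a long timeout into a one-second timeout.
        if (!is_connected())
            return {WaitStatus::Timeout, waited};
    }
    return {WaitStatus::Timeout, timeout_s};
}
```

In this model, a session whose connection check always returns false (as reported for the SST donor THD) times out after the first slice whenever the conflicting lock is held longer than one second, regardless of the configured timeout. A session with a live connection keeps waiting until the lock is released or the full timeout expires.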