[MDEV-17565] Sporadic Galera failures when testing MariaDB with mtr Created: 2018-10-30 Updated: 2021-06-24 Resolved: 2021-06-24 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, Galera SST |
| Affects Version/s: | 10.1.36, 10.2.18, 10.3 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Critical |
| Reporter: | Julius Goryavsky | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | Galera, galera | ||
| Description |
|
Several bugs in Galera cause sporadic failures when testing MariaDB Server with mtr, due to false error messages (or warnings) that are not related to the new fixes:
1) Some mtr tests sometimes fail due to warnings.
2) Some mtr tests sometimes fail due to warnings.
3) Some mtr tests sometimes fail when a node is evicted from the cluster in the middle of SST.
4) If SST fails due to a network error, the node that acted as a donor sometimes does not return to its original state, so the test cannot continue and fails with a timeout. |
| Comments |
| Comment by Julius Goryavsky [ 2018-10-30 ] |
|
https://github.com/MariaDB/galera/pull/4
The patch includes some changes taken from the latest
1) Some mtr tests sometimes fail due to
Currently the gcs_.caused() function works only when the group
Instead of failing immediately, this patch changes gcs_.caused()
2) Some mtr tests sometimes fail due to
This is because when processing cluster configuration changes,
To correct this error, I added an additional call to the
3) Some mtr tests sometimes fail when a node is evicted from
Even when a node is evicted, the SST script may complete normally.
To fix this, we should avoid joining the cluster through
4) If SST fails due to a network error, the node that acted
If sst_sent() fails, the node should restore itself back to joined |
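For illustration only, the idea behind item 4 (a donor that returns to the joined state when the transfer fails) can be sketched as a toy model. This is not Galera's code: `NodeState`, `DonorNode`, `serve_sst` and the `transfer` callback are hypothetical names, and the real state machine has more states and transitions.

```python
from enum import Enum


class NodeState(Enum):
    JOINED = "joined"
    DONOR = "donor"


class DonorNode:
    """Toy model of a donor node (hypothetical, not Galera's state machine)."""

    def __init__(self):
        self.state = NodeState.JOINED

    def serve_sst(self, transfer):
        """Run `transfer` as donor; return True on success, False on failure."""
        self.state = NodeState.DONOR
        try:
            transfer()
        except OSError:
            # Network-level failure: restore the joined state so the node
            # does not stay stuck as a donor until a test timeout.
            self.state = NodeState.JOINED
            return False
        # Normal completion: in this toy model the donor also returns to JOINED.
        self.state = NodeState.JOINED
        return True


def failing_transfer():
    # Simulates a network error in the middle of SST.
    raise ConnectionError("simulated network error during SST")


node = DonorNode()
ok = node.serve_sst(failing_transfer)
print(ok, node.state)
```

The point of the sketch is that the failure path explicitly resets the state rather than leaving it as whatever the transfer left behind.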
| Comment by Jan Lindström (Inactive) [ 2019-12-05 ] |
|
Yurchenko, can you also review the changes, please? |
| Comment by Julius Goryavsky [ 2020-01-21 ] |
|
julien.fritsch, I have transferred these changes to the current versions (after review). I am now checking for regressions and will then commit the changes on GitHub. |
| Comment by Jan Lindström (Inactive) [ 2021-06-24 ] |
|
SST issues were fixed in a major script cleanup. |