[MDEV-15918] Configure buildbot so that Galera suites are run with lower concurrency Created: 2018-04-18 Updated: 2023-09-28 Resolved: 2023-09-28 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, Tests |
| Fix Version/s: | N/A |
| Type: | Task | Priority: | Major |
| Reporter: | Jan Lindström (Inactive) | Assignee: | Unassigned |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Description |
|
Affected suites:
These should be run so that no other suite is run in parallel. |
| Comments |
| Comment by Elena Stepanova [ 2018-04-27 ] |
|
I have no problem doing it technically, but we need to have justification for it first. Running tests without parallelism is an expensive exercise and it shouldn't be done just because it seems easier than fixing badly written tests. For example, if it's just a hope that the tests will run faster and thus will hit fewer timeouts, that's not a good enough reason. Also, please note that before we do it, you need to remove them from the list of default suites. If they can't run in a normal fashion, they can't be a part of the default set. |
| Comment by Seppo Jaakola [ 2018-05-17 ] |
|
With the galera and galera_3nodes suites, mtr deploys a synchronous MariaDB cluster, where nodes need to maintain consensus. In a highly loaded test environment, nodes may not get enough CPU cycles to keep up communicating with each other, and this may lead to a cluster split. Some sporadic test failure logs suggest that this scenario has happened in buildbot testing. |
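To make the failure mode above concrete, here is a minimal toy sketch of timeout-based failure suspicion. It is not Galera's actual group communication protocol; the function name `suspected_nodes`, the node names, and all timestamps and timeout values are hypothetical and chosen only for illustration. It shows how a node that is merely starved of CPU, and therefore late with its heartbeat, looks the same as a failed node to its peers unless the timeout is generous enough.

```python
def suspected_nodes(last_seen, now, suspect_timeout):
    """Return the nodes whose last heartbeat is older than suspect_timeout.

    last_seen maps a node name to the time (in seconds) its most recent
    heartbeat was received. This is a toy model, not Galera's algorithm.
    """
    return sorted(
        node for node, seen_at in last_seen.items()
        if now - seen_at > suspect_timeout
    )


# Hypothetical snapshot on a heavily loaded builder: node3 is alive but
# has not been scheduled for 7 seconds, so its heartbeat is late.
last_seen = {"node1": 9.8, "node2": 9.5, "node3": 3.0}
now = 10.0

# With a short timeout the slow node is suspected, which in a real
# cluster could lead to a split.
print(suspected_nodes(last_seen, now, suspect_timeout=5.0))

# With a longer timeout the same delay is tolerated.
print(suspected_nodes(last_seen, now, suspect_timeout=15.0))
```

In this model a longer timeout tolerates a slow machine, at the cost of detecting a genuinely failed node later.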
| Comment by Elena Stepanova [ 2018-05-17 ] |
|
It doesn't make sense to me. If test cases are designed so that they fail just because the machine is slow, reducing parallelism is not a solution to anything. There will still be slow builders, there might still be delays of various sorts, failures will still be happening. That said, if dbart and serg both agree to it, I can make it happen, but it absolutely means that all affected Galera tests must be excluded from the default test set before we reduce parallelism for them. |
| Comment by Sergei Golubchik [ 2018-05-17 ] |
|
I agree with elenst and I'd rather increase timeouts. You cannot know how slow the builder is. Even with no parallelism in the builder, buildslave aidi still runs 30 builders in parallel. A test has no control over that. |
| Comment by Seppo Jaakola [ 2018-05-30 ] |
|
Ok then, I will extend the Galera timeouts in the galera and galera_3nodes suites, and create a pull request with that. It is probable that more tests will fail after this, but we can fix them one by one. |
| Comment by Alexey Bychko (Inactive) [ 2019-07-02 ] |
|
I'll check if we have this on Azure |
| Comment by Elena Stepanova [ 2023-09-28 ] |
|
We have already reduced parallelism for galera tests significantly (sadly). |