[MDEV-25288] Create a list of tests for distributions to run upon building/packaging MariaDB server Created: 2021-03-29 Updated: 2021-06-11 Resolved: 2021-04-22 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Tests |
| Fix Version/s: | 10.2.38, 10.3.29, 10.4.19, 10.5.10, 10.6.0 |
| Type: | Task | Priority: | Blocker |
| Reporter: | Elena Stepanova | Assignee: | Elena Stepanova |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||
| Description |
|
Replace the black list of unstable tests introduced in Since the goal of running MTR tests during 3-rd party build process is not to test MariaDB code but only to ensure that the build is functional, there is no need to run thousands of MariaDB tests and combinations. So, the new collection will be much shorter and more static than the previous unstable-tests list. Criteria:
(*) Once the list is created, this condition won't be explicitly re-checked before the release, as it was done for unstable-tests. Instead, we should have an buildbot step which would run these tests, and its failure should be "fatal" and require an immediate action.
On my machine it takes ~3 min on a non-debug 10.2 build, with MTR running in shm with --parallel=4.
In the current edition, the list is implemented as a shell script under mysql-test/collections/ One disadvantage of having it this way (and same would be if it was a plain text list) is that i out-of-source builds it will be a bit awkward to call, because collections remain in the sourcedir, together with MTR tests.
|
| Comments |
| Comment by Elena Stepanova [ 2021-04-11 ] | ||||||||||
|
otto, We are about to replace the old unstable-tests list with a "white" list to be used by distributions to test the builds, and for similar purposes where thorough code testing is not required. Since you are the main (and possibly the only) user of unstable-tests, we need your confirmation that the new solution works for you. The first edition is a list of ~150 tests in a form of a shell script. It initializes the list and runs MTR with it. Providing extra parameters, if needed, is up to the caller. The script sits in mysql-test/collections/ folder, which was meant by upstream for a similar purpose, but hasn't been used much in MariaDB. So, for example in this run instead of
it would be something like
(the paths can probably be dealt with in a more elegant way, but that's the general idea) Examples of the script per version: | ||||||||||
| Comment by Otto Kekäläinen [ 2021-04-15 ] | ||||||||||
|
In the past 1-2 years there have not been many false positive new tests. The failures I remember lately from upstream MariaDB.org CI systems or downstream distro CI systems were real failures. This is just to establish that current method has served well. Currently the main suite is run and it has 700+ tests after the unstable tests are subtracted. Wouldn't the easiest solution be to permanently clean up the suite=main so it has only good tests? I don't have any strong opinions on this, I'll just adapt the debian/rules file to whatever new system you come up with. | ||||||||||
| Comment by Elena Stepanova [ 2021-04-15 ] | ||||||||||
We can totally do it this way. In fact, it was the initial idea – there would be a text file with the same tests to run, and you would call MTR as
The only reason I switched to the script at the end is that such a list can't have any comments (or anyone who uses it would have to grep/sed them out), and I don't think it will survive this way. Sooner or later somebody will add a comment "for readability", or will comment out a test temporarily; and we can't even put a comment saying "don't do that", because it can't have comments.
As was previously discussed, if the foundation or distributions want to take over the maintenance of the unstable-tests list, they are very welcome to do so. So far nobody volunteered, the offer still stands.
I assume by "good" you mean "stable" (as in fact many of unstable tests are good for their main purpose, much better than many of the stable ones.)
It would mean running and maintaining duplicate tests for no good reason.
Right. That's another reason for the change. | ||||||||||
| Comment by Elena Stepanova [ 2021-04-20 ] | ||||||||||
|
I will assume for now that the text file is preferred and will change to it. We can get back to a script later if needed. | ||||||||||
| Comment by Elena Stepanova [ 2021-04-21 ] | ||||||||||
|
mysql-test/collections/smoke_test is pushed into 10.2.
For a merge 10.3=>10.4 the file should be taken as is from 10.3, and these lines should be added at the end:
For a merge 10.4=>10.5 the file should be taken as is from 10.4, and these lines should be added at the end:
For a merge 10.5=>10.6 the file should be taken as is from 10.5 | ||||||||||
| Comment by Marko Mäkelä [ 2021-04-22 ] | ||||||||||
|
I merged this from 10.2 up to 10.6 according to the given instructions. | ||||||||||
| Comment by Otto Kekäläinen [ 2021-05-09 ] | ||||||||||
|
elenst WIP branch to start using your new smoke_test is available at https://github.com/ottok/mariadb/commit/bd3f3a91a0d1a437936c6d6a04fee5091073ffe5 Is this how you envisioned it? These changes alone are not yet enough. There should also be changes to the CMakeLists.txt so that they install the smoke_test to /usr/share/mysql-test/collections/ so that the smoke_test can be run on systems that have the mariadb-test package installed (otherwise it will fail like https://salsa.debian.org/mariadb-team/mariadb-server/-/jobs/1628544 did). | ||||||||||
| Comment by Otto Kekäläinen [ 2021-05-09 ] | ||||||||||
|
Looked more into this and discovered that mtr has option skip-test-list=FILE but there is no run-test-list=FILE. So this is the reason why you initially suggested a script as a wrapper to mtr. Now I understand. However I don't understand why you then didn't go with it, the second version you did (a list that is intended to cat to bash) seems quite unpractical and clearly inferior. Anyway, the commit above shows the lines that need an update to use the smoke_test, so it is easy for you to finalize the implementation however you like is best. | ||||||||||
| Comment by Elena Stepanova [ 2021-05-09 ] | ||||||||||
I didn't go with it because you explicitly objected the script and didn't object the alternative which was described. Given that Debian packaging is the only active user of the list known to me, it was a decisive factor.
A white list (at least the text form of it) cannot be used universally in installations, so there was no point installing it. It is only meant for source builds, which correlates with the original reason from several years ago for having a list, as I remember it – you couldn't build the MariaDB server on Debian, because the test step during the build would sometimes fail, and thus the whole build process would fail. It wasn't about installation tests then. That said, a script (rather than a text list) could have been extended to check for the presence of optional plugins and thus made potentially usable for package installations, but since you objected the script, there was no point installing the file. Anyway, if you now want it converted to a script, it can be done for the next release. I expected that much. | ||||||||||
| Comment by Otto Kekäläinen [ 2021-05-09 ] | ||||||||||
No I didn't, I merely asked you 3 questions about your design as I didn't understand it. Please don't read questions as advice. In the demo commit I have shown you where your changes have an effect. It is not a suggestion of a change. Use that information to your best. If my original comment had any opinion, then at most it was that I questioned the need for the change. For the record, I have still some doubts. If you think the change is needed, then it is your call. If you continue to work on this, please strive for the best possible code quality in your view, don't introduce dead code into the code base, pick the designs you think are the best ones, and take ownership of your changes. Thanks! | ||||||||||
| Comment by Elena Stepanova [ 2021-05-09 ] | ||||||||||
|
otto, I very much agree with you on the "dead code" point. To clarify this part:
The best design (at least the user-facing part of it) is a subjective matter. When we have only one known user of the "feature", or one who represents the small user group better than any of us, it makes no sense to use our taste for the "best design", we need to get it from the user, and the decision will be based on it. With the lack of response, we have to guess what you meant, as it happened in this case. Think of it as user/customer requirements. If you disagree with a suggested solution, please just say so, then there will be less back-and-forth. | ||||||||||
| Comment by Otto Kekäläinen [ 2021-05-10 ] | ||||||||||
Please put yourself as the subject then, and pick the design you think is the best one. Debian and other distros are not "users" of the MySQL/MariaDB main test suite in particular. The root problem here is that a piece of software has a test suite called "main" but there isn't the culture or mechanisms to keep the main test suite from regressing. Whatever QA a distro is doing is only the last line of defense. Ideally the developer who breaks a test suite should not be able to commit such code to the mainline at all (hopefully we get protected branches some day) or if a regression is found, the developer should be in a place that they feel ownership to fix their tests. Because of this, I have my doubts is a separate test list only for distro use makes much sense, but if you feel that is the right solution, and you probably know all tests in and out, then you can implement it. Please note that this is not a request nor an objection, only a comment. | ||||||||||
| Comment by Elena Stepanova [ 2021-05-10 ] | ||||||||||
Okay. If I am to decide, it will be the solution I described in the email, I'll outline it here now, for clarity. Recently the new buildbot (new MariaDB CI at buildbot.mariadb.org maintained by MariaDB Foundation) launched so-called "branch protection". It means that before a developer can push something into a main tree, the development tree with this commit must undergo a build and a test run on a certain collection of platforms, and all tests must pass. The test part is implemented via running a subset of MTR tests. Unlike the whole MTR test universe, this limited subset is expected to be kept passable by whatever means necessary, because otherwise the development work will completely stall due to not being able to push any code. Thus, in my opinion, the same subset of tests should be used wherever MTR is used outside the MariaDB development. How exactly it is implemented, should be decided by the Foundation as they are maintaining the branch protection test runs. I'll create a separate task for the Foundation to provide the means to do identical test runs outside the buildbot.
Whatever "QA" a distro is doing shouldn't be a line of defense from MariaDB code problems at all; it should care for the quality of the build, which has been my point from the start. And for the quality of the build, it is the only line of defense, because MariaDB cannot check the builds which it doesn't produce.
No, they are not, and they shouldn't be. They are "users" of a solution which is required to provide "nearly perfect" test runs which ensure the quality of the build, which is what the task was about.
I agree with you that the regression test suite must be stable and any other way to avoid instability is nothing but a workaround. If we want to get somewhere instead of having hypothetical debates, we have to stay realistic. The experience shows that so far MariaDB development hasn't had resources to polish the MTR test suite to perfection, and it's very unlikely to change in near future. I can't provide you with stable MTR tests either. I don't write them, don't own them, don't maintain them and don't even use them; they are owned by MariaDB development. There are two practical approaches to mitigate instability of tests: either ignore certain failures, annoying and inconvenient as it is – that's what happens in the regular MariaDB CI test runs and local test runs performed by developers, but it was said to be unacceptable for distributions; or filter the existing test suite and try to run a selection of tests to achieve the best with what we have; that's what the distributions have been doing all way long, be it with "unstable-tests" list or something else, and what the branch protection is doing now. | ||||||||||
| Comment by Otto Kekäläinen [ 2021-05-24 ] | ||||||||||
|
For the record, in MariaDB 10.5.10 the unstable-tests list is still used in both Debian/Ubuntu and Fedora:
I suspect the list will be used until it is removed upstream. As stated previously and in https://jira.mariadb.org/browse/MDEV-10604?focusedCommentId=168540&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-168540, the unstable-tests currently has worked well and does its job. |