[MDEV-25288] Create a list of tests for distributions to run upon building/packaging MariaDB server Created: 2021-03-29  Updated: 2021-06-11  Resolved: 2021-04-22

Status: Closed
Project: MariaDB Server
Component/s: Tests
Fix Version/s: 10.2.38, 10.3.29, 10.4.19, 10.5.10, 10.6.0

Type: Task Priority: Blocker
Reporter: Elena Stepanova Assignee: Elena Stepanova
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-7069 Fix buildbot failures in main server ... Stalled
relates to MDEV-10605 main.create_delayed failed in buildbo... Open

 Description   

Replace the black list of unstable tests introduced in MDEV-10604 and maintained since then with a list of tests which distributions can run in their build/packaging process.

Since the goal of running MTR tests during 3-rd party build process is not to test MariaDB code but only to ensure that the build is functional, there is no need to run thousands of MariaDB tests and combinations. So, the new collection will be much shorter and more static than the previous unstable-tests list.

Criteria:

  • separate components (plugins, clients, scripts) should be represented by at least one test when possible;
  • every suite should be represented by at least one test, except for those which aren't run regularly and thus aren't maintained properly;
  • the tests shouldn't have any sporadic failures on main maintenance branches in the last half of a year (*);
  • the tests can't require debug build, windows, or a special configuration not normally present on build machines;
  • out of the remaining candidates, the priority is given to those with higher line coverage and/or more generic ones (judging by name) and/or more those which closer relate to the system/environment (judging by name).

(*) Once the list is created, this condition won't be explicitly re-checked before the release, as it was done for unstable-tests. Instead, we should have an buildbot step which would run these tests, and its failure should be "fatal" and require an immediate action.

Draft list for 10.2 (first edition)

archive.archive
archive.archive_gis
archive.partition_archive
binlog.binlog_base64_flag
binlog.binlog_database
binlog.binlog_innodb
binlog.binlog_parallel_replication_marks_stm_mix
binlog.binlog_row_mix_innodb_myisam
binlog.flashback
binlog_encryption.encrypted_slave
connect.bin
connect.bson
connect.dbf
connect.dir
connect.endian
connect.general
connect.json
connect.mysql
connect.secure_file_priv
connect.tbl
csv.csv
disks.disks
encryption.encryption_force
encryption.innodb_encryption_tables
encryption.tempfiles_encrypted
federated.federated
federated.federatedx
gcol.gcol_select_innodb
handler.interface
heap.heap
innodb.innodb
innodb.autoinc_persist
innodb.innodb_defrag_binlog
innodb.innodb_mysql
innodb.monitor
innodb.purge
innodb.table_flags
innodb.xa_recovery
innodb_fts.fulltext
innodb_gis.geometry
innodb_gis.rtree
innodb_zip.innodb-zip
innodb_zip.page_size
json.json_no_table
main-test_sql_discovery.create
main.blackhole
main.bootstrap
main.compress
main.connect
main.ctype_collate
main.ctype_utf8
main.default
main.dyncol
main.fulltext
main.function_defaults
main.gis
main.grant
main.handlersocket
main.information_schema
main.innodb_ext_key
main.log_tables
main.lowercase_fs_off
main.myisam
main.mysql_client_test
main.mysql_protocols
main.mysql_upgrade
main.mysqladmin
main.mysqlbinlog
main.mysqlcheck
main.mysqld--defaults-file
main.mysqldump
main.mysqlhotcopy_myisam
main.mysqlshow
main.mysqlslap
main.mysqltest
main.parser
main.partition
main.perror
main.plugin
main.plugin_auth
main.pool_of_threads
main.ps
main.repair
main.shutdown
main.sp
main.ssl
main.ssl_compress
main.stat_tables
main.statistics
main.subselect
main.symlink
main.temp_table
main.timezone
main.type_timestamp_hires
main.user_var
main.userstat
main.variables
main.view
main.win
main.xa
maria.maria
mariabackup.full_backup
metadata_lock_info.table_metadata_lock
mroonga/storage.variable_version
mroonga/wrapper.count_star
multi_source.multisource
oqgraph.general-innodb
parts.rpl_partition
perfschema.selects
plugins.auth_ed25519
plugins.cracklib_password_check
plugins.dialog
plugins.fulltext_plugin
plugins.locales
plugins.pam_cleartext
plugins.processlist
plugins.qc_info
plugins.server_audit
plugins.simple_password_check
plugins.sql_error_log
plugins.two_password_validations
plugins.unix_socket
query_response_time.basic
rocksdb.rocksdb
roles.definer
rpl.rpl_gtid_basic
rpl.rpl_relayrotate
rpl.rpl_row_blob_innodb
rpl.rpl_semi_sync_event
rpl.rpl_sp
rpl.rpl_stm_binlog_max_cache_size
rpl.rpl_switch_stm_row_mixed
sequence.simple
spider.basic_sql
spider.ha
sql_discovery.simple
sys_vars.sysvars_aria
sys_vars.sysvars_server_notembedded
vcol.vcol_syntax
wsrep.variables

On my machine it takes ~3 min on a non-debug 10.2 build, with MTR running in shm with --parallel=4.

10.3 change over 10.2 list

--- a/mysql-test/collections/smoke_test
+++ b/mysql-test/collections/smoke_test
@@ -48,7 +48,8 @@ innodb_gis.rtree
 innodb_zip.innodb-zip
 innodb_zip.page_size
 json.json_no_table
-main-test_sql_discovery.create
+# Doesn't work in 10.3+ due to MDEV-25384
+# main-test_sql_discovery.create
 main.blackhole
 main.bootstrap
 main.compress
@@ -145,13 +146,22 @@ sys_vars.sysvars_server_notembedded
 wsrep.variables
 "
 
+tests_103="
+compat/oracle.binlog_ptr_mysqlbinlog
+compat/oracle.sp-package
+compat/maxdb.rpl_mariadb_timestamp
+sql_sequence.mysqldump
+versioning.simple
+versioning.trx_id
+"
+
 tests_to_run=""
 
 while read -r test
 do
   test=`echo $test | sed -e s/\#.*//`
   tests_to_run="$tests_to_run $test"
-done <<< $(echo "$tests_102")
+done <<< $(echo "$tests_102 $tests_103")
 
 echo $tests_to_run
 

10.4 additions to 10.3 list

diff --git a/mysql-test/collections/smoke_test b/mysql-test/collections/smoke_test
index 445e0f6d529..986d9856ac2 100755
--- a/mysql-test/collections/smoke_test
+++ b/mysql-test/collections/smoke_test
@@ -154,6 +154,10 @@ sql_sequence.mysqldump
 versioning.simple
 versioning.trx_id
 "
+tests_104="
+period.versioning
+plugins.multiauth
+"
 
 tests_to_run=""
 
@@ -161,7 +165,7 @@ while read -r test
 do
   test=`echo $test | sed -e s/\#.*//`
   tests_to_run="$tests_to_run $test"
-done <<< $(echo "$tests_102 $tests_103")
+done <<< $(echo "$tests_102 $tests_103 $tests_104")
 
 echo $tests_to_run

10.5 additions to 10.4 list

diff --git a/mysql-test/collections/smoke_test b/mysql-test/collections/smoke_test
index 986d9856ac2..2391f4f6057 100755
--- a/mysql-test/collections/smoke_test
+++ b/mysql-test/collections/smoke_test
@@ -159,13 +159,18 @@ period.versioning
 plugins.multiauth
 "
 
+tests_105="
+client.mariadb-conv
+innodb_i_s.innodb_sys_tables
+"
+
 tests_to_run=""
 
 while read -r test
 do
   test=`echo $test | sed -e s/\#.*//`
   tests_to_run="$tests_to_run $test"
-done <<< $(echo "$tests_102 $tests_103 $tests_104")
+done <<< $(echo "$tests_102 $tests_103 $tests_104 $tests_105")
 
 echo $tests_to_run

In the current edition, the list is implemented as a shell script under mysql-test/collections/
It runs MTR with the pre-defined list of tests and optionally with other MTR parameters passed from the command line.

One disadvantage of having it this way (and same would be if it was a plain text list) is that i out-of-source builds it will be a bit awkward to call, because collections remain in the sourcedir, together with MTR tests.
For example, in the old buildbot deb package builds, it would be something like

cd builddir/mysql-test
../../mysql-test/collections/smoke_test <other options>



 Comments   
Comment by Elena Stepanova [ 2021-04-11 ]

otto,

We are about to replace the old unstable-tests list with a "white" list to be used by distributions to test the builds, and for similar purposes where thorough code testing is not required. Since you are the main (and possibly the only) user of unstable-tests, we need your confirmation that the new solution works for you.

The first edition is a list of ~150 tests in a form of a shell script. It initializes the list and runs MTR with it. Providing extra parameters, if needed, is up to the caller.

The script sits in mysql-test/collections/ folder, which was meant by upstream for a similar purpose, but hasn't been used much in MariaDB. So, for example in this run instead of

cd builddir/mysql-test && ./mtr --force --mem --parallel=8 --skip-rpl --suite=main --skip-test-list=unstable-tests  || exit 1 ;

it would be something like

cd builddir/mysql-test && ../../mysql-test/collections/smoke_test --force --mem --parallel=8 || exit 1 ;

(the paths can probably be dealt with in a more elegant way, but that's the general idea)

Examples of the script per version:
for 10.2
for 10.3
for 10.4
for 10.5

Comment by Otto Kekäläinen [ 2021-04-15 ]

In the past 1-2 years there have not been many false positive new tests. The failures I remember lately from upstream MariaDB.org CI systems or downstream distro CI systems were real failures. This is just to establish that current method has served well.

Currently the main suite is run and it has 700+ tests after the unstable tests are subtracted.

Wouldn't the easiest solution be to permanently clean up the suite=main so it has only good tests?
Or create a new suite called 'core' or 'base' or something? Why create another abstraction layer in the form of a new script that calls mtr instead of calling mtr directly?

I don't have any strong opinions on this, I'll just adapt the debian/rules file to whatever new system you come up with.

Comment by Elena Stepanova [ 2021-04-15 ]

Why create another abstraction layer in the form of a new script that calls mtr instead of calling mtr directly?

We can totally do it this way. In fact, it was the initial idea – there would be a text file with the same tests to run, and you would call MTR as

perl ./mtr `cat <the list>` <other options>

The only reason I switched to the script at the end is that such a list can't have any comments (or anyone who uses it would have to grep/sed them out), and I don't think it will survive this way. Sooner or later somebody will add a comment "for readability", or will comment out a test temporarily; and we can't even put a comment saying "don't do that", because it can't have comments.
Still, it's not a big deal, we'll probably catch it quickly if somebody breaks it this way, so it's up to you.

This is just to establish that current method has served well.

As was previously discussed, if the foundation or distributions want to take over the maintenance of the unstable-tests list, they are very welcome to do so. So far nobody volunteered, the offer still stands.

Wouldn't the easiest solution be to permanently clean up the suite=main so it has only good tests?

I assume by "good" you mean "stable" (as in fact many of unstable tests are good for their main purpose, much better than many of the stable ones.)
The history shows that not only isn't it easy, but so far it's been unachievable, otherwise it would have been done long time ago.
Or, if by "cleaning" you mean simply moving out from "main" all tests which have shown signs of instability, it's technically is possible as a snapshot solution, but it is by no means permanent. In the next release, many tests will be modified, new tests will be added, and some previously failing ones will be fixed. Some of the changes are bound to be unstable. Basically it is the exact same activity as the unstable-tests list – going through all sporadic failures, through all test changes and all JIRA items concerning tests since the last release – only instead of modifying the text file somebody would have to physically move test files back and forth, constantly shuffling the suites. It makes no sense.

Or create a new suite called 'core' or 'base' or something?

It would mean running and maintaining duplicate tests for no good reason.

Currently the main suite is run and it has 700+ tests after the unstable tests are subtracted.

Right. That's another reason for the change.
The main suite for the most part represents the core functionality of the server (plus client programs and tools). It touches a lot of code and thus is very important in terms of testing the code for regressions.
But that's not the goal when you (or anyone who builds the server for purposes other than development) run the tests. You don't need to test the code, you need to test the build for viability. Running 95% of those 700 tests is a waste of time for this purpose; it does you no good to run every permutation of SELECT with every variation of optimizer_switch to make sure that there have been no changes in the plans anywhere. On the other hand, you don't run tests you may need – e.g. those that at least load the plugins you built and make sure they are usable. I don't claim that the new collection is perfect, but it should be more relevant than the main suite.

Comment by Elena Stepanova [ 2021-04-20 ]

I will assume for now that the text file is preferred and will change to it. We can get back to a script later if needed.

Comment by Elena Stepanova [ 2021-04-21 ]

mysql-test/collections/smoke_test is pushed into 10.2.
For a merge 10.2=>10.3 the file should be taken as is from 10.2, and these lines should be added at the end:

compat/oracle.binlog_ptr_mysqlbinlog
compat/oracle.sp-package
compat/maxdb.rpl_mariadb_timestamp
sql_sequence.mysqldump
versioning.simple
versioning.trx_id

For a merge 10.3=>10.4 the file should be taken as is from 10.3, and these lines should be added at the end:

period.versioning
plugins.multiauth

For a merge 10.4=>10.5 the file should be taken as is from 10.4, and these lines should be added at the end:

client.mariadb-conv
innodb_i_s.innodb_sys_tables

For a merge 10.5=>10.6 the file should be taken as is from 10.5

Comment by Marko Mäkelä [ 2021-04-22 ]

I merged this from 10.2 up to 10.6 according to the given instructions.

Comment by Otto Kekäläinen [ 2021-05-09 ]

elenst WIP branch to start using your new smoke_test is available at https://github.com/ottok/mariadb/commit/bd3f3a91a0d1a437936c6d6a04fee5091073ffe5

Is this how you envisioned it?

These changes alone are not yet enough. There should also be changes to the CMakeLists.txt so that they install the smoke_test to /usr/share/mysql-test/collections/ so that the smoke_test can be run on systems that have the mariadb-test package installed (otherwise it will fail like https://salsa.debian.org/mariadb-team/mariadb-server/-/jobs/1628544 did).

Comment by Otto Kekäläinen [ 2021-05-09 ]

Looked more into this and discovered that mtr has option skip-test-list=FILE but there is no run-test-list=FILE. So this is the reason why you initially suggested a script as a wrapper to mtr. Now I understand. However I don't understand why you then didn't go with it, the second version you did (a list that is intended to cat to bash) seems quite unpractical and clearly inferior.

Anyway, the commit above shows the lines that need an update to use the smoke_test, so it is easy for you to finalize the implementation however you like is best.

Comment by Elena Stepanova [ 2021-05-09 ]

I don't understand why you then didn't go with it, the second version you did (a list that is intended to cat to bash) seems quite unpractical and clearly inferior

I didn't go with it because you explicitly objected the script and didn't object the alternative which was described. Given that Debian packaging is the only active user of the list known to me, it was a decisive factor.

There should also be changes to the CMakeLists.txt so that they install the smoke_test

A white list (at least the text form of it) cannot be used universally in installations, so there was no point installing it. It is only meant for source builds, which correlates with the original reason from several years ago for having a list, as I remember it – you couldn't build the MariaDB server on Debian, because the test step during the build would sometimes fail, and thus the whole build process would fail. It wasn't about installation tests then.

That said, a script (rather than a text list) could have been extended to check for the presence of optional plugins and thus made potentially usable for package installations, but since you objected the script, there was no point installing the file.

Anyway, if you now want it converted to a script, it can be done for the next release. I expected that much.

Comment by Otto Kekäläinen [ 2021-05-09 ]

I didn't go with it because you explicitly objected the script and didn't object the alternative which was described.

No I didn't, I merely asked you 3 questions about your design as I didn't understand it. Please don't read questions as advice.

In the demo commit I have shown you where your changes have an effect. It is not a suggestion of a change. Use that information to your best.

If my original comment had any opinion, then at most it was that I questioned the need for the change. For the record, I have still some doubts. If you think the change is needed, then it is your call.

If you continue to work on this, please strive for the best possible code quality in your view, don't introduce dead code into the code base, pick the designs you think are the best ones, and take ownership of your changes.

Thanks!

Comment by Elena Stepanova [ 2021-05-09 ]

otto,

I very much agree with you on the "dead code" point.
I've sent an email for what I see as a better solution now when new circumstances (new buildbot) came to existence. It has little to no relevance to the one described here in JIRA, so I won't flood with it here.

To clarify this part:

pick the designs you think are the best ones

The best design (at least the user-facing part of it) is a subjective matter. When we have only one known user of the "feature", or one who represents the small user group better than any of us, it makes no sense to use our taste for the "best design", we need to get it from the user, and the decision will be based on it. With the lack of response, we have to guess what you meant, as it happened in this case. Think of it as user/customer requirements. If you disagree with a suggested solution, please just say so, then there will be less back-and-forth.

Comment by Otto Kekäläinen [ 2021-05-10 ]

The best design (at least the user-facing part of it) is a subjective matter. When we have only one known user of the "feature"

Please put yourself as the subject then, and pick the design you think is the best one.

Debian and other distros are not "users" of the MySQL/MariaDB main test suite in particular. The root problem here is that a piece of software has a test suite called "main" but there isn't the culture or mechanisms to keep the main test suite from regressing.

Whatever QA a distro is doing is only the last line of defense. Ideally the developer who breaks a test suite should not be able to commit such code to the mainline at all (hopefully we get protected branches some day) or if a regression is found, the developer should be in a place that they feel ownership to fix their tests. Because of this, I have my doubts is a separate test list only for distro use makes much sense, but if you feel that is the right solution, and you probably know all tests in and out, then you can implement it. Please note that this is not a request nor an objection, only a comment.

Comment by Elena Stepanova [ 2021-05-10 ]

Please put yourself as the subject then, and pick the design you think is the best one.

Okay. If I am to decide, it will be the solution I described in the email, I'll outline it here now, for clarity.

Recently the new buildbot (new MariaDB CI at buildbot.mariadb.org maintained by MariaDB Foundation) launched so-called "branch protection". It means that before a developer can push something into a main tree, the development tree with this commit must undergo a build and a test run on a certain collection of platforms, and all tests must pass.

The test part is implemented via running a subset of MTR tests. Unlike the whole MTR test universe, this limited subset is expected to be kept passable by whatever means necessary, because otherwise the development work will completely stall due to not being able to push any code.

Thus, in my opinion, the same subset of tests should be used wherever MTR is used outside the MariaDB development. How exactly it is implemented, should be decided by the Foundation as they are maintaining the branch protection test runs.

I'll create a separate task for the Foundation to provide the means to do identical test runs outside the buildbot.
The smoke_test solution implemented in this task will be removed from the code in the next release.
The old "unstable-tests" list will remain in the code for a while, as it has been there for a long time and somebody may be using it, but it won't be maintained any longer.

Whatever QA a distro is doing is only the last line of defense

Whatever "QA" a distro is doing shouldn't be a line of defense from MariaDB code problems at all; it should care for the quality of the build, which has been my point from the start. And for the quality of the build, it is the only line of defense, because MariaDB cannot check the builds which it doesn't produce.

Debian and other distros are not "users" of the MySQL/MariaDB main test suite in particular.

No, they are not, and they shouldn't be. They are "users" of a solution which is required to provide "nearly perfect" test runs which ensure the quality of the build, which is what the task was about.

The root problem here is that a piece of software has a test suite called "main" but there isn't the culture or mechanisms to keep the main test suite from regressing.

I agree with you that the regression test suite must be stable and any other way to avoid instability is nothing but a workaround.
But this task was not about solving the "root problem" at all.

If we want to get somewhere instead of having hypothetical debates, we have to stay realistic. The experience shows that so far MariaDB development hasn't had resources to polish the MTR test suite to perfection, and it's very unlikely to change in near future. I can't provide you with stable MTR tests either. I don't write them, don't own them, don't maintain them and don't even use them; they are owned by MariaDB development.

There are two practical approaches to mitigate instability of tests: either ignore certain failures, annoying and inconvenient as it is – that's what happens in the regular MariaDB CI test runs and local test runs performed by developers, but it was said to be unacceptable for distributions; or filter the existing test suite and try to run a selection of tests to achieve the best with what we have; that's what the distributions have been doing all way long, be it with "unstable-tests" list or something else, and what the branch protection is doing now.

Comment by Otto Kekäläinen [ 2021-05-24 ]

For the record, in MariaDB 10.5.10 the unstable-tests list is still used in both Debian/Ubuntu and Fedora:

I suspect the list will be used until it is removed upstream. As stated previously and in https://jira.mariadb.org/browse/MDEV-10604?focusedCommentId=168540&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-168540, the unstable-tests currently has worked well and does its job.

Generated at Thu Feb 08 09:36:33 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.