[MDEV-15409] make sure every sst script is tested in buildbot Created: 2018-02-24  Updated: 2018-06-11  Resolved: 2018-03-23

Status: Closed
Project: MariaDB Server
Component/s: Galera SST
Fix Version/s: 10.1.32, 10.2.14

Type: Task Priority: Blocker
Reporter: Sergei Golubchik Assignee: Sergei Golubchik
Resolution: Fixed Votes: 2
Labels: None

Issue Links:
Blocks
is blocked by MDEV-14069 galera_sst_mysqldump.test fails with:... Closed
is blocked by MDEV-15541 Backport MDEV-13968 to 10.1: SST mysq... Closed
PartOf
includes MDEV-14305 Add smoke sst test Closed
Relates
relates to MDEV-14305 Add smoke sst test Closed
Sprint: 10.2.14

 Description   

make sure every sst script is tested in buildbot



 Comments   
Comment by Elena Stepanova [ 2018-03-12 ]

I've set up non-MTR tests in buildbot, for now on deb packages with 10.1, 10.2, 10.3 main branches.

The tests are run as part of the installation/upgrade bunch. They use the VM image produced by the install test (which installs MariaDB server and whatever dependencies it pulls in).
The logic is very basic:

  • stop the "default" server (the one that's run by the service),
  • create a minimal wsrep config and 3 custom configs for 3 nodes,
  • copy the data directory created by the install test (which is basically empty, but fully bootstrapped and has a user for SST auth) for each of 3 nodes,
  • start one node with wsrep-new-cluster,
  • create a table, insert a value,
  • start two other nodes, one after another, using the configured SST method,
  • make sure they joined the cluster and picked up the previously created table (SST worked),
  • do something on one of the nodes and make sure other nodes picked up the change (runtime replication works).
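
As a rough illustration of the config-creation steps above, here is a sketch in Python. It is not the buildbot code (which lives in maria-master.cfg). The provider path, ports, directory layout and SST credentials are all hypothetical. Whether exactly these settings are enough to run three nodes on one host is an assumption.

```python
from pathlib import Path

# Hypothetical values: none of these paths, ports or credentials
# come from the actual buildbot setup.
GALERA_PROVIDER = "/usr/lib/galera/libgalera_smm.so"
NODE_COUNT = 3


def gcomm_port(node: int) -> int:
    """Group communication port for a node (made-up numbering scheme)."""
    return 16000 + node * 10


def wsrep_config(sst_method: str) -> str:
    """Minimal wsrep settings shared by all nodes of the test cluster."""
    members = ",".join(
        f"127.0.0.1:{gcomm_port(n)}" for n in range(1, NODE_COUNT + 1)
    )
    return "\n".join([
        "[mysqld]",
        "wsrep_on=ON",
        f"wsrep_provider={GALERA_PROVIDER}",
        f"wsrep_cluster_address=gcomm://{members}",
        f"wsrep_sst_method={sst_method}",
        "wsrep_sst_auth=sst_user:sst_password",
        "binlog_format=ROW",
        "innodb_autoinc_lock_mode=2",
    ]) + "\n"


def node_config(node: int, base_dir: Path) -> str:
    """Per-node settings: own datadir, socket and ports, plus the shared file."""
    node_dir = base_dir / f"node{node}"
    port = gcomm_port(node)
    return "\n".join([
        f"!include {base_dir / 'wsrep.cnf'}",
        "[mysqld]",
        f"datadir={node_dir / 'data'}",
        f"socket={node_dir / 'mysql.sock'}",
        f"port={8000 + node}",
        f"wsrep_node_address=127.0.0.1:{port}",
        f'wsrep_provider_options="gmcast.listen_addr=tcp://127.0.0.1:{port}"',
        # Keep the SST port away from the group communication port.
        f"wsrep_sst_receive_address=127.0.0.1:{port + 2}",
    ]) + "\n"


def write_configs(base_dir: Path, sst_method: str) -> list:
    """Write wsrep.cnf and node1.cnf..node3.cnf; return the node config paths."""
    base_dir.mkdir(parents=True, exist_ok=True)
    (base_dir / "wsrep.cnf").write_text(wsrep_config(sst_method))
    paths = []
    for node in range(1, NODE_COUNT + 1):
        path = base_dir / f"node{node}.cnf"
        path.write_text(node_config(node, base_dir))
        paths.append(path)
    return paths
```

Note that datadir lives in the config file here rather than on the command line.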

The test is run 4 times, once for each SST method: mariabackup, xtrabackup-v2, mysqldump, and rsync.
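
In each of these runs, the "joined the cluster" check could plausibly come down to looking at a few wsrep status variables on the joiner. The sketch below assumes the status was fetched with something like `mysql --batch --skip-column-names -e "SHOW STATUS LIKE 'wsrep_%'"`, which prints tab-separated name/value pairs. The function names and the sample output are made up for illustration, and the real buildbot steps may check something different. It does not cover verifying that the previously created table arrived.

```python
from typing import Dict


def parse_status(batch_output: str) -> Dict[str, str]:
    """Turn tab-separated "name<TAB>value" lines into a dict."""
    status = {}
    for line in batch_output.splitlines():
        if "\t" in line:
            name, value = line.split("\t", 1)
            status[name.strip()] = value.strip()
    return status


def node_joined(status: Dict[str, str], expected_size: int) -> bool:
    """True if the node reports itself as a synced, ready member
    of a cluster with the expected number of nodes."""
    return (
        status.get("wsrep_cluster_size") == str(expected_size)
        and status.get("wsrep_local_state_comment") == "Synced"
        and status.get("wsrep_ready") == "ON"
    )


# Made-up sample, as the second node might report it after a successful SST.
sample = (
    "wsrep_cluster_size\t2\n"
    "wsrep_local_state_comment\tSynced\n"
    "wsrep_ready\tON\n"
)
print(node_joined(parse_status(sample), expected_size=2))
```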

Tests are shown in buildbot as galera-mariabackup, galera-rsync etc.

Error logs from all nodes and syslog are stored.

Notes:

  • mysqldump has been disabled for 10.1 due to MDEV-15541
  • to start nodes, mysqld_safe is used. The initial idea was to use mysqld_multi, but it didn't work out, because mariabackup/xtrabackup SST methods don't work when the datadir is provided on the command line (which is what mysqld_multi would do). It can, however, work with defaults-extra-file, which is how it has been set up.
  • xtrabackup-v2 on Power is disabled, because xtrabackup itself is only available for amd64/i386.
  • xtrabackup-v2 on artful/i386 fails because the xtrabackup package is missing from the Percona site; it may need to be disabled.
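
To make the mysqld_safe note concrete, here is a small sketch of how the start commands could be assembled. The helper name and the config paths are hypothetical. The point is only that the datadir comes from the file given with --defaults-extra-file (which has to be the first option) and is never put on the command line, and that only the first node gets --wsrep-new-cluster.

```python
import shlex
from typing import List


def mysqld_safe_command(config_file: str, bootstrap: bool = False) -> List[str]:
    """Build the argv for starting one node via mysqld_safe.

    The datadir is deliberately not passed here; it is expected to be
    set inside config_file.
    """
    # --defaults-extra-file must come first on the command line.
    cmd = ["mysqld_safe", f"--defaults-extra-file={config_file}"]
    if bootstrap:
        # Only the first node bootstraps a new cluster.
        cmd.append("--wsrep-new-cluster")
    return cmd


# Hypothetical config paths, one per node.
configs = ["/tmp/galera/node1.cnf", "/tmp/galera/node2.cnf", "/tmp/galera/node3.cnf"]
for index, config in enumerate(configs):
    print(shlex.join(mysqld_safe_command(config, bootstrap=(index == 0))))
```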

Tests seem to behave more or less as expected at first glance, but some intermittent failures will probably happen and will need to be fixed.

Examples:
current 10.1 (passed, except for disabled mysqldump): http://buildbot.askmonty.org/buildbot/builders/kvm-deb-stretch-amd64/builds/2932
10.1, revision prior to MDEV-15254 bugfix (xtrabackup-v2 failed): https://internal.askmonty.org/buildbot/builders/kvm-deb-artful-amd64/builds/698
current 10.2 (passed): https://internal.askmonty.org/buildbot/builders/kvm-deb-stretch-amd64/builds/2931
current 10.3 (xtrabackup-v2 fails, says 10.3.6 format is not supported): http://buildbot.askmonty.org/buildbot/builders/kvm-deb-artful-amd64/builds/697 – to be looked at, possibly it's not supposed to work and needs to be disabled

Comment by Daniel Black [ 2018-03-12 ]

elenst thanks for setting up all these tests.

xtrabackup-v2 and mariabackup SST failures with mysqld_multi:

As xtrabackup and mariabackup can take --datadir as an argument, and the datadir is passed to the SST scripts, there shouldn't be a reason for these scripts to rely on configuration file settings. wlad commented here https://github.com/MariaDB/server/pull/554#issuecomment-359403975 that settings come directly from the server.

In theory it should also work with --defaults-group-suffix= too.

on 10.3 xtrabackup-v2 failure:

It's hitting the error:
https://github.com/percona/percona-xtrabackup/blob/e861671ab35dea0eeb6e7d96a1683c65f7445060/storage/innobase/log/log0recv.cc#L1391

However, there are a lot more formats in MariaDB:
https://github.com/MariaDB/server/blob/fe0e263e6d9c3330dbc8de4608fc62e8f4700a95/storage/innobase/log/log0recv.cc#L946..L952

I was working on some docker related tests. Can I get a link to the buildbot test source please?

Comment by Elena Stepanova [ 2018-03-12 ]

As xtrabackup and mariabackup can take --datadir as an argument, and the datadir is passed to the SST scripts, there shouldn't be a reason for these scripts to rely on configuration file settings

The problem is not with xtrabackup and mariabackup, it's with the SST methods xtrabackup-v2 and mariabackup. At some point (at the last move-back step in particular) they don't pass datadir over to innobackupex, so it attempts to find it in the default config files, and things get messed up.

I was working on some docker related tests. Can I get a link to the buildbot test source please?

There isn't much of a source, the text above basically describes it all, but anyway, everything is in buildbot's maria-master.cfg, def getDebGaleraStep (it will be in https://github.com/MariaDB/mariadb.org-tools/blob/master/buildbot/maria-master.cfg after it auto-commits next night). Please don't modify it directly even if it looks ugly to you.
If you don't want to wait till it auto-commits, you can naturally see all steps in the buildbot logs.

Please also note that the addition of these tests doesn't remove the need for MTR tests. While it is of course ultimately MariaDB's responsibility to test the final code, the rule that all contributions must come with their own tests has been largely forgotten, and there have been many patches to SST scripts without any tests whatsoever. It would benefit quality if patches were tested more thoroughly before submission and followed the guidelines upon submission, even if that caused some decrease in the number of patches.

Comment by Daniel Black [ 2018-03-13 ]

Thanks for the details elenst. I've no intention of modifying the buildbot config, just gaining ideas for testing. I apologize for my broken galera SST changes. Quick remedies especially after releases aren't good enough. They weren't tested properly, I'm really sorry, I lost patience with the state of SST MTR tests and hoped component based testing was sufficient. It wasn't. I'll try to fix some of the MTR tests for galera and I promise to do better next time. So sorry.

Comment by Sergei Golubchik [ 2018-03-15 ]

  1. Could you please check why galera_sst_mysqldump fails?
  2. Do we have any tests for wsrep_sst_mariabackup? If not, perhaps they can be created based on the galera_sst_xtrabackup* tests?

Comment by Sachin Setiya (Inactive) [ 2018-03-19 ]

It is failing because it gets error 1205 (ER_LOCK_WAIT_TIMEOUT) on each call of "show status" in wait_until_connected_again.inc.

Comment by Aurélien LEQUOY [ 2018-03-20 ]

For me, the problem is in the client.

Have a look: https://jira.mariadb.org/browse/MDEV-15383?focusedCommentId=108624&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-108624

Comment by Aurélien LEQUOY [ 2018-03-20 ]

The problem is not in the MariaDB server and not in Galera, but in the stupid client: libmariadbclient18 10.2.13+maria~stretch

https://jira.mariadb.org/browse/MDEV-15383?focusedCommentId=108624&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-108624

You need to add a full test that tests a Galera cluster WITH SST, and don't push a new version without trying SST. I have lost so much time with this. I remember that on 10.0.x I spent one month finding a tricky problem in xtrabackup (that time) and solving 95% of the bugs open at Percona.

Generated at Thu Feb 08 08:21:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.