  1. MariaDB Server
  2. MDEV-15409

make sure every sst script is tested in buildbot

Details

    • 10.2.14

    Description

      make sure every sst script is tested in buildbot

          Activity

            serg Sergei Golubchik created issue -
            serg Sergei Golubchik made changes -
            Field Original Value New Value
            serg Sergei Golubchik made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ] Sergei Golubchik [ serg ]
            serg Sergei Golubchik made changes -
            Sprint 10.2.14 [ 229 ]
            elenst Elena Stepanova made changes -
            elenst Elena Stepanova added a comment - - edited

            I've set up non-MTR tests in buildbot, for now on deb packages with 10.1, 10.2, 10.3 main branches.

            The tests run as part of the installation/upgrade test group. They use the VM image produced by the install test (which installs MariaDB server and whatever dependencies it pulls in).
            The logic is very basic:

            • stop the "default" server (the one that's run by the service),
            • create a minimal wsrep config and 3 custom configs for 3 nodes,
            • copy the data directory created by the install test (which is basically empty, but fully bootstrapped and has a user for SST auth) for each of 3 nodes,
            • start one node with wsrep-new-cluster,
            • create a table, insert a value,
            • start two other nodes, one after another, using the configured SST method,
            • make sure they joined the cluster and picked up the previously created table (SST worked),
            • do something on one of the nodes and make sure other nodes picked up the change (runtime replication works).
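
            The config step in the list above can be sketched roughly as follows. This is only an illustration, not the code from maria-master.cfg: the scratch directory, port numbering, Galera library path and SST credentials are all assumptions made up for the example, and the functions only render option-file text without starting anything.

```python
# Sketch only: every path, port and credential below is an assumption.
BASE_DIR = "/tmp/galera-test"                  # hypothetical scratch directory
PROVIDER = "/usr/lib/galera/libgalera_smm.so"  # assumed location of the Galera library
NODES = (1, 2, 3)


def galera_port(node):
    """Group communication port of a node (arbitrary non-conflicting numbering)."""
    return 4567 + 10 * node


def common_config(sst_method):
    """Minimal wsrep settings shared by all three nodes."""
    members = ",".join("127.0.0.1:%d" % galera_port(n) for n in NODES)
    return "\n".join([
        "[mysqld]",
        "wsrep_on=ON",
        "wsrep_provider=" + PROVIDER,
        "wsrep_cluster_address=gcomm://" + members,
        "wsrep_sst_method=" + sst_method,
        "wsrep_sst_auth=sst_user:sst_password",  # placeholder credentials
        "binlog_format=ROW",
        "innodb_autoinc_lock_mode=2",
        "",
    ])


def node_config(node):
    """Per-node settings; the datadir lives here, not on the command line."""
    node_dir = "%s/node%d" % (BASE_DIR, node)
    port = galera_port(node)
    return "\n".join([
        "!include %s/wsrep.cnf" % BASE_DIR,      # pulls in the shared settings
        "[mysqld]",
        "datadir=%s/data" % node_dir,
        "socket=%s/mysql.sock" % node_dir,
        "log_error=%s/error.log" % node_dir,
        "port=%d" % (3306 + node),
        'wsrep_provider_options="gmcast.listen_addr=tcp://127.0.0.1:%d;'
        'ist.recv_addr=127.0.0.1:%d"' % (port, port + 1),
        "wsrep_sst_receive_address=127.0.0.1:%d" % (port + 2),
        "",
    ])
```

            Writing common_config("rsync") to wsrep.cnf and node_config(n) to one file per node would give each node its own datadir, socket and ports while sharing the cluster address and SST method.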

            The test is run 4 times, once for each of mariabackup, xtrabackup-v2, mysqldump, rsync.

            Tests are shown in buildbot as galera-mariabackup, galera-rsync etc.

            Error logs from all nodes and syslog are stored.

            Notes:

            • mysqldump has been disabled for 10.1 due to MDEV-15541.
            • To start nodes, mysqld_safe is used. The initial idea was to use mysqld_multi, but it didn't work out, because the mariabackup/xtrabackup-v2 SST methods don't work when the datadir is provided on the command line (which is what mysqld_multi would do). They can, however, work with defaults-extra-file, which is how it has been set up.
            • xtrabackup-v2 on Power is disabled, because xtrabackup itself is only available for amd64/i386.
            • xtrabackup-v2 on artful/i386 fails, because the xtrabackup package is missing from the Percona site; it may need to be disabled.
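
            As a rough illustration of these notes, the sketch below picks the SST methods to run for a given branch and architecture and builds a mysqld_safe command line that passes the node's settings through defaults-extra-file. It is not the buildbot code: the function names and the "ppc64" prefix used to recognise Power builders are assumptions for the example.

```python
SST_METHODS = ("mariabackup", "xtrabackup-v2", "mysqldump", "rsync")


def enabled_methods(branch, arch):
    """SST methods to run for a branch/architecture, following the notes above."""
    skipped = set()
    if branch == "10.1":
        skipped.add("mysqldump")       # disabled due to MDEV-15541
    if arch.startswith("ppc64"):       # assumed naming of Power builders
        skipped.add("xtrabackup-v2")   # xtrabackup exists only for amd64/i386
    return [m for m in SST_METHODS if m not in skipped]


def start_command(config_file, bootstrap=False):
    """mysqld_safe argv for one node.

    The datadir is deliberately not passed on the command line; it is
    expected to be set inside config_file.
    """
    argv = ["mysqld_safe", "--defaults-extra-file=" + config_file]
    if bootstrap:
        argv.append("--wsrep-new-cluster")  # only for the first node
    return argv
```

            The first node would be started with bootstrap=True and the other two without it, one after another.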

            At first glance the tests seem to behave more or less as expected, but some intermittent failures will probably happen and will need to be fixed.

            Examples:
            current 10.1 (passed, except for disabled mysqldump): http://buildbot.askmonty.org/buildbot/builders/kvm-deb-stretch-amd64/builds/2932
            10.1, revision prior to MDEV-15254 bugfix (xtrabackup-v2 failed): https://internal.askmonty.org/buildbot/builders/kvm-deb-artful-amd64/builds/698
            current 10.2 (passed): https://internal.askmonty.org/buildbot/builders/kvm-deb-stretch-amd64/builds/2931
            current 10.3 (xtrabackup-v2 fails, says 10.3.6 format is not supported): http://buildbot.askmonty.org/buildbot/builders/kvm-deb-artful-amd64/builds/697 – to be looked at, possibly it's not supposed to work and needs to be disabled

            danblack Daniel Black added a comment -

            elenst thanks for setting up all these tests.

            xtrabackup-v2 and mariabackup SST failures with mysqld_multi:

            As xtrabackup and mariabackup can take --datadir as an argument and the datadir is passed to the SST scripts, there shouldn't be a reason for these scripts to rely on configuration file settings. wlad commented here https://github.com/MariaDB/server/pull/554#issuecomment-359403975 that settings come directly from the server.

            In theory it should work with --defaults-group-suffix= too.

            On the 10.3 xtrabackup-v2 failure:

            It's hitting the error:
            https://github.com/percona/percona-xtrabackup/blob/e861671ab35dea0eeb6e7d96a1683c65f7445060/storage/innobase/log/log0recv.cc#L1391

            however, there are a lot more formats in MariaDB:
            https://github.com/MariaDB/server/blob/fe0e263e6d9c3330dbc8de4608fc62e8f4700a95/storage/innobase/log/log0recv.cc#L946..L952

            I was working on some docker related tests. Can I get a link to the buildbot test source please?


            elenst Elena Stepanova added a comment -

            "As xtrabackup and mariabackup can take --datadir as an argument and the datadir is passed to the sst scripts there shouldn't be a reason these scripts are relying on a configuration file settings"

            The problem is not with xtrabackup and mariabackup themselves; it is with the SST methods xtrabackup-v2 and mariabackup. At some point (in particular at the last, move-back step) they don't pass the datadir on to innobackupex, so it tries to find it in the default config files, and things get messed up.

            "I was working on some docker related tests. Can I get a link to the buildbot test source please?"

            There isn't much of a source; the text above basically describes it all. Anyway, everything is in buildbot's maria-master.cfg, def getDebGaleraStep (it will be in https://github.com/MariaDB/mariadb.org-tools/blob/master/buildbot/maria-master.cfg after it auto-commits next night). Please don't modify it directly even if it looks ugly to you.
            If you don't want to wait till it auto-commits, you can of course see all steps in the buildbot logs.

            Please also note that the addition of these tests doesn't remove the need for MTR tests. While of course it is ultimately MariaDB's responsibility to test the final code, the rule that all contributions must come with their own tests has somehow been largely forgotten: there have been many patches to SST scripts without any tests whatsoever. It would benefit quality if patches were tested more thoroughly before submission and followed the guidelines upon submission, even if that caused some decrease in the number of patches.
            danblack Daniel Black added a comment -

            Thanks for the details elenst. I've no intention of modifying the buildbot config, just gaining ideas for testing. I apologize for my broken galera SST changes. Quick remedies especially after releases aren't good enough. They weren't tested properly, I'm really sorry, I lost patience with the state of SST MTR tests and hoped component based testing was sufficient. It wasn't. I'll try to fix some of the MTR tests for galera and I promise to do better next time. So sorry.

            serg Sergei Golubchik added a comment -

            1. Could you please check why galera_sst_mysqldump fails?
            2. Do we have any tests for wsrep_sst_mariabackup? If not, perhaps they can be created based on the galera_sst_xtrabackup* tests?
            serg Sergei Golubchik made changes -
            Assignee Sergei Golubchik [ serg ] Sachin Setiya [ sachin.setiya.007 ]

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            It fails because every call of "show status" in wait_until_connected_again.inc returns error 1205 (ER_LOCK_WAIT_TIMEOUT).
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            Aurelien_LEQUOY Aurélien LEQUOY added a comment -

            The problem is not in MariaDB Server and not in Galera, but in the stupid client: libmariadbclient18 10.2.13+maria~stretch

            https://jira.mariadb.org/browse/MDEV-15383?focusedCommentId=108624&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-108624

            You need to add a full test of a Galera cluster WITH SST, and not push a new version without trying SST. I have lost so much time on this; I remember that on 10.0.x I spent one month tracking down a tricky problem in xtrabackup (that time) and solving 95% of the bugs open at Percona.
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            Fix Version/s 10.1.32 [ 22908 ]
            Fix Version/s 10.2.14 [ 22911 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ] Sergei Golubchik [ serg ]
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 85740 ] MariaDB v4 [ 133476 ]

            People

              serg Sergei Golubchik
              serg Sergei Golubchik
              Votes: 2
              Watchers: 9
