Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23769

MTR can abort before it prints the test result summary

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
    • 10.2.37, 10.3.28, 10.4.18, 10.5.9
    • Tests
    • None

    Description

      perl ./mtr main.1st main.1st --force --mysqld=--xx
      

      10.2 80075ba0

      $ perl ./mtr main.1st main.1st --force --mysqld=--xx
      Logging: ./mtr  main.1st main.1st --force --mysqld=--xx
      vardir: /data/bld/10.2-debug/mysql-test/var
      Checking leftover processes...
      Removing old var directory...
       - WARNING: Using the 'mysql-test/var' symlink
      Creating var directory '/data/bld/10.2-debug/mysql-test/var'...
      Checking supported features...
      MariaDB Version 10.2.34-MariaDB-debug
       - SSL connections supported
       - binaries are debug compiled
       - binaries built with wsrep patch
      Collecting tests...
      Installing system database...
       
      ==============================================================================
       
      TEST                                      RESULT   TIME (ms) or COMMENT
      --------------------------------------------------------------------------
       
      worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
      worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 142300, winpid: 142300, exit: 512] died after mysql-test-run waited 0.2 seconds for /data/bld/10.2-debug/mysql-test/var/run/mysqld.1.pid to be created.
      main.1st                                 [ fail ]
              Test ended at 2020-09-20 15:43:11
       
      CURRENT_TEST: main.1st
       
       
      Failed to start mysqld.1
      mysqltest failed but provided no output
       
       
       - saving '/data/bld/10.2-debug/mysql-test/var/log/main.1st/' to '/data/bld/10.2-debug/mysql-test/var/log/main.1st/'
      ***Warnings generated in error logs during shutdown after running tests: main.1st
       
      2020-09-20 15:43:11 139634223757120 [ERROR] /data/bld/10.2-debug/bin/mysqld: unknown option '--xx'
      2020-09-20 15:43:11 139634223757120 [ERROR] Aborting
       
      worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 142306, winpid: 142306, exit: 512] died after mysql-test-run waited 0.2 seconds for /data/bld/10.2-debug/mysql-test/var/run/mysqld.1.pid to be created.
      main.1st                                 [ fail ]
              Test ended at 2020-09-20 15:43:11
       
      CURRENT_TEST: main.1st
      CURRENT_TEST: main.1st
       
       
      Failed to start mysqld.1
       
       
       - saving '/data/bld/10.2-debug/mysql-test/var/log/main.1st/' to '/data/bld/10.2-debug/mysql-test/var/log/main.1st/'
      ***Warnings generated in error logs during shutdown after running tests: main.1st
       
      2020-09-20 15:43:11 140377120438080 [ERROR] /data/bld/10.2-debug/bin/mysqld: unknown option '--xx'
      2020-09-20 15:43:11 140377120438080 [ERROR] Aborting
      

      Note that at the end there is no normal output saying X out of Y tests failed etc.

      Neither the specific test nor the specific option are important for reproducing the problem. The common factor seems to be that there should be at least two tests which fail and also have warnings/errors in the server error log which are caught and not ignored by the warning check; and one of such tests should be the last one executed.

      for example, if I create a test t.test like this

      t.test

      set a= 1;
      

      (a simple test which will fail because a is an unknown identifier), and then run MTR like this:

      perl ./mtr main.t main.t --force --mysqld=--lock-wait-timeout=-1
      

      I get the problem described above:

      ==============================================================================
       
      TEST                                      RESULT   TIME (ms) or COMMENT
      --------------------------------------------------------------------------
       
      worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
      set a= 1;
      main.t                                   [ fail ]
              Test ended at 2020-09-20 15:50:58
       
      CURRENT_TEST: main.t
      mysqltest: At line 1: query 'set a= 1' failed: 1193: Unknown system variable 'a'
       
       - saving '/data/bld/10.2-debug/mysql-test/var/log/main.t/' to '/data/bld/10.2-debug/mysql-test/var/log/main.t/'
      ***Warnings generated in error logs during shutdown after running tests: main.t
       
      2020-09-20 15:50:58 140006505084736 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
       
      set a= 1;
      main.t                                   [ fail ]
              Test ended at 2020-09-20 15:50:58
       
      CURRENT_TEST: main.t
      mysqltest: At line 1: query 'set a= 1' failed: 1193: Unknown system variable 'a'
       
       - saving '/data/bld/10.2-debug/mysql-test/var/log/main.t/' to '/data/bld/10.2-debug/mysql-test/var/log/main.t/'
      ***Warnings generated in error logs during shutdown after running tests: main.t
       
      2020-09-20 15:50:58 139746030774080 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
      

      Same happens even if there is a "good" test in between, so the failing ones don't have to be consequent:

      perl ./mtr main.t main.1st main.t --force --mysqld=--lock-wait-timeout=-1 --noreorder
      

      But if we run it in a different order, as

      perl ./mtr main.t main.t main.1st --force --mysqld=--lock-wait-timeout=-1 --noreorder
      

      then the summary is printed:

      ...
      main.1st                                 [ fail ]  Found warnings/errors in server log file!
              Test ended at 2020-09-20 15:53:34
      line
      2020-09-20 15:53:34 140296160270144 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
      ^ Found warnings in /data/bld/10.2-debug/mysql-test/var/log/mysqld.1.err
      ok
       
       - saving '/data/bld/10.2-debug/mysql-test/var/log/main.1st/' to '/data/bld/10.2-debug/mysql-test/var/log/main.1st/'
      --------------------------------------------------------------------------
      The servers were restarted 2 times
      Spent 0.000 of 4 seconds executing testcases
      mysql-test-run: WARNING: Got errors/warnings while running tests, please examine '/data/bld/10.2-debug/mysql-test/var/log/warnings' for details.
       
      Completed: Failed 3/3 tests, 0.00% were successful.
       
      Failing test(s): main.t main.1st
       
      The log files in var/log may give you some hint of what went wrong.
       
      If you want to report this error, please read first the documentation
      at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
       
      Errors/warnings were found in logfiles during server shutdown after running the
      following sequence(s) of tests:
          main.t
          main.t
      mysql-test-run: *** ERROR: there were failing test cases
      

      Same if we run the tests in the same "bad" order, but without inducing warnings in the log:

      perl ./mtr main.t main.1st main.t --force  --noreorder
      

      ...
              Test ended at 2020-09-20 15:54:48
       
      CURRENT_TEST: main.t
      mysqltest: At line 1: query 'set a= 1' failed: 1193: Unknown system variable 'a'
       
       - saving '/data/bld/10.2-debug/mysql-test/var/log/main.t/' to '/data/bld/10.2-debug/mysql-test/var/log/main.t/'
      --------------------------------------------------------------------------
      The servers were restarted 1 times
      Spent 0.002 of 4 seconds executing testcases
       
      Completed: Failed 2/3 tests, 33.33% were successful.
       
      Failing test(s): main.t
       
      The log files in var/log may give you some hint of what went wrong.
       
      If you want to report this error, please read first the documentation
      at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
       
      mysql-test-run: *** ERROR: there were failing test cases
      

      etc.

      According to my preliminary look, there is something wrong with the main loop in mysql-test-run.pl, in the subroutine run_test_server, in case it encounters line of the type WARNINGS. The following one-liner helped me to avoid the problem on my side:

      diff --git a/mysql-test/mysql-test-run.pl b/mysql-test/mysql-test-run.pl
      index f28034e77c1..ce32a97ec75 100755
      --- a/mysql-test/mysql-test-run.pl
      +++ b/mysql-test/mysql-test-run.pl
      @@ -778,6 +778,7 @@ sub run_test_server ($$$) {
                   # Test failure due to warnings, force is off
                   return ("Warnings in log", 1, $completed, $extra_warnings);
                 }
      +        redo;
               }
              elsif ($line =~ /^SPENT/) {
                add_total_times($line);
      

      I don't believe it can be the whole fix, and not at all sure it's even a right change; it is just to give the idea where to start looking from. The logic is quite messy and seems error-prone, so it needs to be fully analyzed before making any changes.

      Attachments

        Issue Links

          Activity

            Hi serg, yes indeed above line is sufficient meaning the server got the WARNINGS command and said to worker to finish its work (doesn't supply any other command), my patch was focusing on anything that can happen after BYE command (in this bug other worker didn't finish the work), but your patch is better.
            So I guess this should be your patch I understood how things work .

            anel Anel Husakovic added a comment - Hi serg , yes indeed above line is sufficient meaning the server got the WARNINGS command and said to worker to finish its work (doesn't supply any other command), my patch was focusing on anything that can happen after BYE command (in this bug other worker didn't finish the work), but your patch is better. So I guess this should be your patch I understood how things work .

            Good, then go ahead and push it.
            Thanks for helping me understand what's going on there!

            serg Sergei Golubchik added a comment - Good, then go ahead and push it. Thanks for helping me understand what's going on there!

            Ok serg I prepared it for 10.2, just one time check.
            I was also wondering when ending, when the socket is closed?
            Based on example when remove is called, close is also needed and we don't use it.

            So do we need something like this:

            diff --git a/mysql-test/mysql-test-run.pl b/mysql-test/mysql-test-run.pl
            index a493c813fea..5c2dc4bb5d0 100755
            --- a/mysql-test/mysql-test-run.pl
            +++ b/mysql-test/mysql-test-run.pl
            @@ -802,6 +802,7 @@ sub run_test_server ($$$) {
                      # Client disconnected
                      mtr_verbose("Child closed socket");
                      $s->remove($sock);
            +        $sock->close;
                      if (--$childs == 0){
                        return ("Completed", $test_failure, $completed, $extra_warnings);
                      }
            

            I tested on some examples and it works.

            anel Anel Husakovic added a comment - Ok serg I prepared it for 10.2 , just one time check. I was also wondering when ending, when the socket is closed? Based on example when remove is called, close is also needed and we don't use it. So do we need something like this: diff --git a/mysql-test/mysql-test-run.pl b/mysql-test/mysql-test-run.pl index a493c813fea..5c2dc4bb5d0 100755 --- a/mysql-test/mysql-test-run.pl +++ b/mysql-test/mysql-test-run.pl @@ -802,6 +802,7 @@ sub run_test_server ($$$) { # Client disconnected mtr_verbose("Child closed socket"); $s->remove($sock); + $sock->close; if (--$childs == 0){ return ("Completed", $test_failure, $completed, $extra_warnings); } I tested on some examples and it works.

            Looks good too.

            I think it's fine either way, clients don't reconnect — there's a small number of clients (<100) they connect once and stay connected up to the end. And at the end perl will close everything automatically. So whatever you prefer — you can close sockets there or keep the code as is.

            serg Sergei Golubchik added a comment - Looks good too. I think it's fine either way, clients don't reconnect — there's a small number of clients (<100) they connect once and stay connected up to the end. And at the end perl will close everything automatically. So whatever you prefer — you can close sockets there or keep the code as is.

            Thanks Serg.
            I decided to add line also .
            Closing the MDEV with patch cd927dd34555a34e7 for 10.2.
            Thanks for the review.

            anel Anel Husakovic added a comment - Thanks Serg. I decided to add line also . Closing the MDEV with patch cd927dd34555a34e7 for 10.2 . Thanks for the review.

            People

              anel Anel Husakovic
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.