[MDEV-23769] MTR can abort before it prints the test result summary Created: 2020-09-20  Updated: 2020-11-11  Resolved: 2020-11-10

Status: Closed
Project: MariaDB Server
Component/s: Tests
Affects Version/s: 10.1, 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.2.37, 10.3.28, 10.4.18, 10.5.9

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Anel Husakovic
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-22631 some test causes MTR interruption wit... Closed
relates to MDEV-23650 test S3 in buildbot Closed

 Description   

perl ./mtr main.1st main.1st --force --mysqld=--xx

10.2 80075ba0

$ perl ./mtr main.1st main.1st --force --mysqld=--xx
Logging: ./mtr  main.1st main.1st --force --mysqld=--xx
vardir: /data/bld/10.2-debug/mysql-test/var
Checking leftover processes...
Removing old var directory...
 - WARNING: Using the 'mysql-test/var' symlink
Creating var directory '/data/bld/10.2-debug/mysql-test/var'...
Checking supported features...
MariaDB Version 10.2.34-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
 - binaries built with wsrep patch
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 142300, winpid: 142300, exit: 512] died after mysql-test-run waited 0.2 seconds for /data/bld/10.2-debug/mysql-test/var/run/mysqld.1.pid to be created.
main.1st                                 [ fail ]
        Test ended at 2020-09-20 15:43:11
 
CURRENT_TEST: main.1st
 
 
Failed to start mysqld.1
mysqltest failed but provided no output
 
 
 - saving '/data/bld/10.2-debug/mysql-test/var/log/main.1st/' to '/data/bld/10.2-debug/mysql-test/var/log/main.1st/'
***Warnings generated in error logs during shutdown after running tests: main.1st
 
2020-09-20 15:43:11 139634223757120 [ERROR] /data/bld/10.2-debug/bin/mysqld: unknown option '--xx'
2020-09-20 15:43:11 139634223757120 [ERROR] Aborting
 
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 142306, winpid: 142306, exit: 512] died after mysql-test-run waited 0.2 seconds for /data/bld/10.2-debug/mysql-test/var/run/mysqld.1.pid to be created.
main.1st                                 [ fail ]
        Test ended at 2020-09-20 15:43:11
 
CURRENT_TEST: main.1st
CURRENT_TEST: main.1st
 
 
Failed to start mysqld.1
 
 
 - saving '/data/bld/10.2-debug/mysql-test/var/log/main.1st/' to '/data/bld/10.2-debug/mysql-test/var/log/main.1st/'
***Warnings generated in error logs during shutdown after running tests: main.1st
 
2020-09-20 15:43:11 140377120438080 [ERROR] /data/bld/10.2-debug/bin/mysqld: unknown option '--xx'
2020-09-20 15:43:11 140377120438080 [ERROR] Aborting

Note that at the end there is no normal output saying X out of Y tests failed etc.

Neither the specific test nor the specific option are important for reproducing the problem. The common factor seems to be that there should be at least two tests which fail and also have warnings/errors in the server error log which are caught and not ignored by the warning check; and one of such tests should be the last one executed.

for example, if I create a test t.test like this

t.test

set a= 1;

(a simple test which will fail because a is an unknown identifier), and then run MTR like this:

perl ./mtr main.t main.t --force --mysqld=--lock-wait-timeout=-1

I get the problem described above:

==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
set a= 1;
main.t                                   [ fail ]
        Test ended at 2020-09-20 15:50:58
 
CURRENT_TEST: main.t
mysqltest: At line 1: query 'set a= 1' failed: 1193: Unknown system variable 'a'
 
 - saving '/data/bld/10.2-debug/mysql-test/var/log/main.t/' to '/data/bld/10.2-debug/mysql-test/var/log/main.t/'
***Warnings generated in error logs during shutdown after running tests: main.t
 
2020-09-20 15:50:58 140006505084736 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
 
set a= 1;
main.t                                   [ fail ]
        Test ended at 2020-09-20 15:50:58
 
CURRENT_TEST: main.t
mysqltest: At line 1: query 'set a= 1' failed: 1193: Unknown system variable 'a'
 
 - saving '/data/bld/10.2-debug/mysql-test/var/log/main.t/' to '/data/bld/10.2-debug/mysql-test/var/log/main.t/'
***Warnings generated in error logs during shutdown after running tests: main.t
 
2020-09-20 15:50:58 139746030774080 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000

Same happens even if there is a "good" test in between, so the failing ones don't have to be consequent:

perl ./mtr main.t main.1st main.t --force --mysqld=--lock-wait-timeout=-1 --noreorder

But if we run it in a different order, as

perl ./mtr main.t main.t main.1st --force --mysqld=--lock-wait-timeout=-1 --noreorder

then the summary is printed:

...
main.1st                                 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2020-09-20 15:53:34
line
2020-09-20 15:53:34 140296160270144 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
^ Found warnings in /data/bld/10.2-debug/mysql-test/var/log/mysqld.1.err
ok
 
 - saving '/data/bld/10.2-debug/mysql-test/var/log/main.1st/' to '/data/bld/10.2-debug/mysql-test/var/log/main.1st/'
--------------------------------------------------------------------------
The servers were restarted 2 times
Spent 0.000 of 4 seconds executing testcases
mysql-test-run: WARNING: Got errors/warnings while running tests, please examine '/data/bld/10.2-debug/mysql-test/var/log/warnings' for details.
 
Completed: Failed 3/3 tests, 0.00% were successful.
 
Failing test(s): main.t main.1st
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
Errors/warnings were found in logfiles during server shutdown after running the
following sequence(s) of tests:
    main.t
    main.t
mysql-test-run: *** ERROR: there were failing test cases

Same if we run the tests in the same "bad" order, but without inducing warnings in the log:

perl ./mtr main.t main.1st main.t --force  --noreorder

...
        Test ended at 2020-09-20 15:54:48
 
CURRENT_TEST: main.t
mysqltest: At line 1: query 'set a= 1' failed: 1193: Unknown system variable 'a'
 
 - saving '/data/bld/10.2-debug/mysql-test/var/log/main.t/' to '/data/bld/10.2-debug/mysql-test/var/log/main.t/'
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 0.002 of 4 seconds executing testcases
 
Completed: Failed 2/3 tests, 33.33% were successful.
 
Failing test(s): main.t
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
mysql-test-run: *** ERROR: there were failing test cases

etc.

According to my preliminary look, there is something wrong with the main loop in mysql-test-run.pl, in the subroutine run_test_server, in case it encounters line of the type WARNINGS. The following one-liner helped me to avoid the problem on my side:

diff --git a/mysql-test/mysql-test-run.pl b/mysql-test/mysql-test-run.pl
index f28034e77c1..ce32a97ec75 100755
--- a/mysql-test/mysql-test-run.pl
+++ b/mysql-test/mysql-test-run.pl
@@ -778,6 +778,7 @@ sub run_test_server ($$$) {
             # Test failure due to warnings, force is off
             return ("Warnings in log", 1, $completed, $extra_warnings);
           }
+        redo;
         }
        elsif ($line =~ /^SPENT/) {
          add_total_times($line);

I don't believe it can be the whole fix, and not at all sure it's even a right change; it is just to give the idea where to start looking from. The logic is quite messy and seems error-prone, so it needs to be fully analyzed before making any changes.



 Comments   
Comment by Anel Husakovic [ 2020-10-03 ]

Hi elenst,
I played a bit with this MDEV and here is a patch, hope I covered all test cases mentioned above.

  • Patch is solving generating report on warning
  • Patch is doing cosmetic fix of `current_test` log file which holds the old log value of test `CURRENT TEST:..` in `mark_log()` in case of `unknown option` and as such
    the logic which is using it's content doesn't output valid log content and doesn't generate valid `$test-> {'comment'}

    ` message.

Here are the test cases observed. Note that only in 6. scenario there is `print $child "HELLO\n";` problem in printing (to delete?).

1. Valid test case, but warning in server log

Single valid test + warning -> result OK

Show all

$ ./mtr MDEV-23769 --force
Logging: ./mtr  MDEV-23769 --force
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.47-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Sphinx 'indexer' binary not found, sphinx suite will be skipped
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
#
# MDEV-23769: MTR can abort before it prints the test result summary
#
show databases;
Database
information_schema
mtr
mysql
performance_schema
test
main.MDEV-23769                          [ fail ]  Found warnings/errors in server log file!
        Test ended at 2020-10-03 02:13:00
line
2020-10-03  2:13:00 140309996242496 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
^ Found warnings in /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/mysqld.1.err
ok
 
 - saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 4 seconds executing testcases
mysql-test-run: WARNING: Got errors/warnings while running tests, please examine '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/warnings' for details.
 
Completed: Failed 1/1 tests, 0.00% were successful.
 
Failing test(s): main.MDEV-23769
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
mysql-test-run: *** ERROR: there were failing test cases

2. Valid test cases (same), but warning in server log (Note reported single test case)

Multiple valid tests (same) + warning -> result OK

Show all

$ ./mtr MDEV-23769 MDEV-23769 --force
Logging: ./mtr  MDEV-23769 MDEV-23769 --force
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.47-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Sphinx 'indexer' binary not found, sphinx suite will be skipped
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
#
# MDEV-23769: MTR can abort before it prints the test result summary
#
show databases;
Database
information_schema
mtr
mysql
performance_schema
test
main.MDEV-23769                          [ fail ]  Found warnings/errors in server log file!
        Test ended at 2020-10-03 02:15:05
line
2020-10-03  2:15:04 139868136609344 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
^ Found warnings in /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/mysqld.1.err
ok
 
 - saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
#
# MDEV-23769: MTR can abort before it prints the test result summary
#
show databases;
Database
information_schema
mtr
mysql
performance_schema
test
main.MDEV-23769                          [ fail ]  Found warnings/errors in server log file!
        Test ended at 2020-10-03 02:15:05
line
2020-10-03  2:15:05 140513782085184 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
^ Found warnings in /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/mysqld.1.err
ok
 
 - saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 0.000 of 5 seconds executing testcases
mysql-test-run: WARNING: Got errors/warnings while running tests, please examine '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/warnings' for details.
 
Completed: Failed 2/2 tests, 0.00% were successful.
 
Failing test(s): main.MDEV-23769
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
mysql-test-run: *** ERROR: there were failing test cases

3. Valid test cases (different test cases), but warning in server log (Note one passed one failed)

Multiple valid tests (different) + warning -> result OK

Show all

$ ./mtr MDEV-23769 1st --force
Logging: ./mtr  MDEV-23769 1st --force
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.47-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Sphinx 'indexer' binary not found, sphinx suite will be skipped
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
#
# MDEV-23769: MTR can abort before it prints the test result summary
#
show databases;
Database
information_schema
mtr
mysql
performance_schema
test
main.MDEV-23769                          [ fail ]  Found warnings/errors in server log file!
        Test ended at 2020-10-03 02:17:09
line
2020-10-03  2:17:08 140658805515840 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
^ Found warnings in /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/mysqld.1.err
ok
 
 - saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
main.1st                                 [ pass ]      1
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 0.001 of 4 seconds executing testcases
mysql-test-run: WARNING: Got errors/warnings while running tests, please examine '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/warnings' for details.
 
Completed: Failed 1/2 tests, 50.00% were successful.
 
Failing test(s): main.MDEV-23769
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
mysql-test-run: *** ERROR: there were failing test cases

4. Now for example 3. create artificial failure with warnings in server log (note, both test failed)

Single failing test + warning -> result OK

Show all

 
$ ./mtr MDEV-23769 1st --force
Logging: ./mtr  MDEV-23769 1st --force
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.47-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Sphinx 'indexer' binary not found, sphinx suite will be skipped
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
#
# MDEV-23769: MTR can abort before it prints the test result summary
#
show databases;
Database
information_schema
mtr
mysql
performance_schema
test
main.MDEV-23769                          [ fail ]  Found warnings/errors in server log file!
        Test ended at 2020-10-03 02:17:52
line
2020-10-03  2:17:51 140260493149760 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
^ Found warnings in /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/mysqld.1.err
ok
 
 - saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
main.1st                                 [ fail ]
        Test ended at 2020-10-03 02:17:52
 
CURRENT_TEST: main.1st
mysqltest: At line 5: End of line junk detected: "show tables in mysql
"
 
The result from queries just before the failure was:
show databases;
Database
information_schema
mtr
mysql
performance_schema
test
 
 - saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.1st/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.1st/'
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 0.000 of 4 seconds executing testcases
mysql-test-run: WARNING: Got errors/warnings while running tests, please examine '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/warnings' for details.
 
Completed: Failed 2/2 tests, 0.00% were successful.
 
Failing test(s): main.MDEV-23769 main.1st
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
mysql-test-run: *** ERROR: there were failing test cases

5. Generate unknown option, for the same example as 3. (note: both failed)

Double failing tests + unknown option-> result OK

Show all

 
$ ./mtr MDEV-23769 1st --mysqld=--xx --force
Logging: ./mtr  MDEV-23769 1st --mysqld=--xx --force
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.47-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Sphinx 'indexer' binary not found, sphinx suite will be skipped
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 52531, winpid: 52531, exit: 512] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/run/mysqld.1.pid to be created.
main.MDEV-23769                          [ fail ]
        Test ended at 2020-10-03 02:19:01
 
CURRENT_TEST: main.MDEV-23769
 
 
Failed to start mysqld.1
mysqltest failed but provided no output
 
 
 - saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
***Warnings generated in error logs during shutdown after running tests: main.MDEV-23769
 
2020-10-03  2:19:01 140238937720384 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
2020-10-03  2:19:01 140238937720384 [ERROR] /home/anel/mariadb/10.1-mtr-in-source/sql/mysqld: unknown option '--xx'
2020-10-03  2:19:01 140238937720384 [ERROR] Aborting
 
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 52537, winpid: 52537, exit: 512] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/run/mysqld.1.pid to be created.
main.1st                                 [ fail ]
        Test ended at 2020-10-03 02:19:02
 
CURRENT_TEST: main.1st
 
 
Failed to start mysqld.1
mysqltest failed but provided no output
 
 
 - saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.1st/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.1st/'
***Warnings generated in error logs during shutdown after running tests: main.1st
 
2020-10-03  2:19:02 140048201700928 [ERROR] /home/anel/mariadb/10.1-mtr-in-source/sql/mysqld: unknown option '--xx'
2020-10-03  2:19:02 140048201700928 [ERROR] Aborting
 
Can't use an undefined value as a symbol reference at ./mtr line 657.
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 0.000 of 5 seconds executing testcases
 
Completed: Failed 2/2 tests, 0.00% were successful.
 
Failing test(s): main.MDEV-23769 main.1st
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
Errors/warnings were found in logfiles during server shutdown after running the
following sequence(s) of tests:
    main.MDEV-23769
    main.1st
mysql-test-run: *** ERROR: there were failing test cases

6. Again the same option, but with 2 same tests -> not perl error: `Can't use an undefined value as a symbol reference at ./mtr line 657.
` ` print $child "HELLO\n";` (solution delete the line).

Single failing test called 2 times + unknown option-> result OK

Show all

$ ./mtr MDEV-23769 MDEV-23769 --mysqld=--xx --force
Logging: ./mtr  MDEV-23769 MDEV-23769 --mysqld=--xx --force
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.47-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Sphinx 'indexer' binary not found, sphinx suite will be skipped
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 52595, winpid: 52595, exit: 512] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/run/mysqld.1.pid to be created.
main.MDEV-23769                          [ fail ]
        Test ended at 2020-10-03 02:20:12
 
CURRENT_TEST: main.MDEV-23769
 
 
Failed to start mysqld.1
mysqltest failed but provided no output
 
 
 - saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
***Warnings generated in error logs during shutdown after running tests: main.MDEV-23769
 
2020-10-03  2:20:12 139815724283456 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
2020-10-03  2:20:12 139815724283456 [ERROR] /home/anel/mariadb/10.1-mtr-in-source/sql/mysqld: unknown option '--xx'
2020-10-03  2:20:12 139815724283456 [ERROR] Aborting
 
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 52601, winpid: 52601, exit: 512] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/run/mysqld.1.pid to be created.
main.MDEV-23769                          [ fail ]
        Test ended at 2020-10-03 02:20:12
 
CURRENT_TEST: main.MDEV-23769
 
 
Failed to start mysqld.1
mysqltest failed but provided no output
 
 
 - saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
***Warnings generated in error logs during shutdown after running tests: main.MDEV-23769
 
2020-10-03  2:20:12 140650797745728 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
2020-10-03  2:20:12 140650797745728 [ERROR] /home/anel/mariadb/10.1-mtr-in-source/sql/mysqld: unknown option '--xx'
2020-10-03  2:20:12 140650797745728 [ERROR] Aborting
 
Can't use an undefined value as a symbol reference at ./mtr line 657.
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 0.000 of 5 seconds executing testcases
 
Completed: Failed 2/2 tests, 0.00% were successful.
 
Failing test(s): main.MDEV-23769
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
Errors/warnings were found in logfiles during server shutdown after running the
following sequence(s) of tests:
    main.MDEV-23769
    main.MDEV-23769
mysql-test-run: *** ERROR: there were failing test cases

Comment by Anel Husakovic [ 2020-10-22 ]

Having warning/error is a necessary condition.
Sufficient condition is to call at least 2 tests.
Incorrect behaviour can be seen if:

  • Both tests are correct but there is error generated (incorrect option passed):
    So this

    ./mysql-test/mtr --mysqld=--lock-wait-timeout=-xx 1st 

    will generate and write statistic at the end,
    but

    ./mysql-test/mtr --mysqld=--lock-wait-timeout=-xx 1st 1st --force

    will not generate the statistic.
    Note: warning is generated

    (--mysqld=--lock-wait-timeout=-1)

    statistic will be generated

Example of stdout in this case:

Failed to start mysqld.1
 
 - saving '/home/anel/mariadb/builds/10.1/mysql-test/var/log/main.1st/' to '/home/anel/mariadb/builds/10.1/mysql-test/var/log/main.1st/'
***Warnings generated in error logs during shutdown after running tests: main.1st
 
2020-10-22 15:23:22 140610533295296 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
2020-10-22 15:23:22 140610533295296 [ERROR] /home/anel/mariadb/builds/10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
2020-10-22 15:23:22 140610533295296 [ERROR] Aborting

  • Having failing test and call it twice with a warning/error:
    So this

    $ ./mysql-test/mtr --mysqld=--lock-wait-timeout=-1 fail fail --force


    or this

    $ ./mysql-test/mtr --mysqld=--lock-wait-timeout=-xx moj moj --force

    will not generate and write statistic at the end

Example of stdout:

mysqltest: At line 1: query 'set a=0' failed: 1193: Unknown system variable 'a'
 
 - saving '/home/anel/mariadb/builds/10.1/mysql-test/var/log/main.moj/' to '/home/anel/mariadb/builds/10.1/mysql-test/var/log/main.moj/'
***Warnings generated in error logs during shutdown after running tests: main.moj
 
2020-10-22 15:20:36 139669439161536 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000

  • However, if we run the above test with the patch, statistic will be generated, meaning it will write something like this for above 2 scenarios:

    Errors/warnings were found in logfiles during server shutdown after running the
    following sequence(s) of tests:
        main.1st
        main.1st
    mysql-test-run: *** ERROR: there were failing test cases
    

Completed: Failed 2/2 tests, 0.00% were successful.
 
Failing test(s): main.fail
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
Errors/warnings were found in logfiles during server shutdown after running the
following sequence(s) of tests:
    main.fail
    main.fail
mysql-test-run: *** ERROR: there were failing test cases

Comment by Anel Husakovic [ 2020-10-26 ]

Hi serg,
regarding this MDEV I understood the problem.
Here I will write how mtr works - for future reference and also to people starting using it in depth:

  • mtr is creating server socket my $server = new IO::Socket::INET
  • Creating the workers using fork my $child_pid= My::SafeProcess::Base::_safe_fork();
  • For each worker run_worker() function is called. Execution:
  • Creating a new socket my $server = new IO::Socket::INET, to connect to server_port from master.
  • Initiate communication with master using START command - > print $server "START\n"; <- master sends 1. test and send command TESTCASE $next->write_test($sock, 'TESTCASE');)
  • Worker gets command TESTCASE processes test case (by calling run_testcase()) which start/(restart if needed) the server and sends TESTRESULT - $test->write_test($server, 'TESTRESULT'); -> master accept this command, run mtr_report_test, check does test fail and also generate new command TESTCASE if new test case exist or BYE.
  • Patch introduced check_warnings_after_shoutdown is 8193c32743f5. Note that this function is reading error_logs although name suggest different extract_warning_lines($mysqld->value('#log-error').

Applying above on current problem:
To verify problem one can run single worker with 2 tests and force option or N workers and at least N+1 tests to get the error:

$ ./mtr --mysqld=--lock-wait-timeout=-xx 1st 1st --force --parallel 1 --verbose 
$ ./mtr --mysqld=--lock-wait-timeout=-xx 1st 1st 1st  --force --parallel 2 --verbose

Note: for N workers and N tests we have working example.
So what happens is that N=1, tests=2, wrong option:

  • Worker start communication calling START, master sends first test and TESTCASE command.
  • Worker runs run_worker() and test fails - Note: no warnings are generated. Sends TESTRESULT to master.
  • Master processes TESTRESULT and sends new test with TESTCASE
  • Worker runs TESTCASE but before that restarts the server from run_worker() and there invoke check_warnings_post_shutdown().
  • This function calls WARNING command to master and master execute it.
  • In parallel to this worker runs test which fails and send TESTRESULT to master
  • Master proceed execution plan after WARNING and sends to worker BYE command, but note that worker didn't finish it's execution.
  • Worker eventually finishes TESTRESULT, master processed it and again calls BYE command. In parallel worker execute BYE by calling again check_warnings_post_shutdown() which prints new warning. But still last BYE command was sent and there is no worker that can process it.
  • Still is unclear for me why the return values of run_testcase are not checked and logic created based on that?

Debugger problem during testing: I would appreciate if someone can let me know how to properly debug perl scripts for parallel >1 as well as good practice using debugger.
1. How the result of debugger could be this (one line executed from run_worker and after that one line of run_test_server ) :S ???? :

Break points:
  DB<3> L
./mtr:
 610:	  my ($server, $tests, $childs) = @_;
    break if (1)
 930:	  my ($server_port, $thread_num)= @_;
    break if (1)
 
main::run_worker(./mtr:930):	  my ($server_port, $thread_num)= @_;
[pid=2470->2529]  DB<3> n
main::run_test_server(./mtr:612):	  my $num_saved_cores= 0;  # Number of core files saved in vardir/log/ so far.
  DB<3> T
@ = DB::DB called from file './mtr' line 930
. = main::run_worker(48547, 1) called from file './mtr' line 519
. = main::main() called from file './mtr' line 397
[pid=2470->2529]  DB<3> n
main::run_test_server(./mtr:613):	  my $num_saved_datadir= 0;  # Number of datadirs saved in vardir/log/ so far.
  DB<3> n
main::run_worker(./mtr:932):	  $SIG{INT}= sub { exit(1); };
[pid=2470->2529]  DB<3> n
main::run_test_server(./mtr:614):	  my $num_failed_test= 0; # Number of tests failed so far
  DB<3> n
main::run_worker(./mtr:933):	  $SIG{HUP}= sub { exit(1); };

2. Also tried using Trepan and following is obtained:

Use of uninitialized value in subroutine entry at /home/anel/perl5/lib/perl5/Devel/Trepan/DB/../../../Devel/Trepan/DB/Sub.pm line 290.
main.1st                                 [ fail ]
        Test ended at 2020-10-26 19:29:06
 
CURRENT_TEST: main.1st
CURRENT_TEST: main.1st
 
 
Failed to start mysqld.1
 
 
Debugged program terminated.  Use 'q' to quit or 'R' to restart.
(trepanpl):  - saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/'                                                                  
***Warnings generated in error logs during shutdown after running tests: main.1st
 
2020-10-26 19:29:06 139830264824896 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
2020-10-26 19:29:06 139830264824896 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
2020-10-26 19:29:06 139830264824896 [ERROR] Aborting
 
 
trepan.pl: Program received signal PIPE.
!! main::(./mtr:641 @0x55b607d9ddf0)
      if ($sock == $server) {
 
(trepanpl): c                                                                                                                                                                                                               
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 0.000 of 59 seconds executing testcases
 
Completed: Failed 2/2 tests, 0.00% were successful.
 
Failing test(s): main.1st
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
Errors/warnings were found in logfiles during server shutdown after running the
following sequence(s) of tests:
    main.1st
    main.1st
mysql-test-run: *** ERROR: there were failing test cases
Debugged program terminated.  Use 'q' to quit or 'R' to restart.

I tried a lot variations but don't have solution to this.
Can you please suggest some architecture for this?

Comment by Anel Husakovic [ 2020-10-27 ]

Hi Serg,
can you please review https://github.com/MariaDB/server/commit/911655c36cc6
This is the result:

Single worker

Show all

 
$ ./mtr --mysqld=--lock-wait-timeout=-xx 1st grant5 --force --parallel 1
Logging: ./mtr  --mysqld=--lock-wait-timeout=-xx 1st grant5 --force --parallel 1
vardir: /home/anel/mariadb/in-source-10.1/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/anel/mariadb/in-source-10.1/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.48-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Sphinx 'indexer' binary not found, sphinx suite will be skipped
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 6070, winpid: 6070, exit: 256] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/in-source-10.1/mysql-test/var/run/mysqld.1.pid to be created.
main.1st                                 [ fail ]
        Test ended at 2020-10-27 02:32:41
 
CURRENT_TEST: main.1st
 
 
Failed to start mysqld.1
mysqltest failed but provided no output
 
 
 - saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/'
***Warnings generated in error logs during shutdown after running tests: main.1st
 
2020-10-27  2:32:41 139724427658304 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
2020-10-27  2:32:41 139724427658304 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
2020-10-27  2:32:41 139724427658304 [ERROR] Aborting
 
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 6074, winpid: 6074, exit: 256] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/in-source-10.1/mysql-test/var/run/mysqld.1.pid to be created.
main.grant5                              [ fail ]
        Test ended at 2020-10-27 02:32:41
 
CURRENT_TEST: main.grant5
 
 
Failed to start mysqld.1
mysqltest failed but provided no output
 
 
 - saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.grant5/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.grant5/'
***Warnings generated in error logs during shutdown after running tests: main.grant5
 
2020-10-27  2:32:41 140098250875968 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
2020-10-27  2:32:41 140098250875968 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
2020-10-27  2:32:41 140098250875968 [ERROR] Aborting
 
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 0.000 of 4 seconds executing testcases
 
Completed: Failed 2/2 tests, 0.00% were successful.
 
Failing test(s): main.1st main.grant5
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
Errors/warnings were found in logfiles during server shutdown after running the
following sequence(s) of tests:
    main.1st
    main.grant5
mysql-test-run: *** ERROR: there were failing test cases

Multiple workers

Show all

$ ./mtr --mysqld=--lock-wait-timeout=-xx 1st grant5 1st --force --parallel 2
Logging: ./mtr  --mysqld=--lock-wait-timeout=-xx 1st grant5 1st --force --parallel 2
vardir: /home/anel/mariadb/in-source-10.1/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/anel/mariadb/in-source-10.1/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.48-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Sphinx 'indexer' binary not found, sphinx suite will be skipped
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                  WORKER RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 301, with reserved ports 16020..16039
worker[2] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[2] mysql-test-run: WARNING: Process [mysqld.1 - pid: 6124, winpid: 6124, exit: 256] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/in-source-10.1/mysql-test/var/2/run/mysqld.1.pid to be created.
main.1st                                 w2 [ fail ]
        Test ended at 2020-10-27 02:34:10
 
CURRENT_TEST: main.1st
 
 
Failed to start mysqld.1
mysqltest failed but provided no output
 
 
 - saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/2/log/main.1st/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/'
***Warnings generated in error logs during shutdown after running tests: main.1st
 
2020-10-27  2:34:10 139893202371648 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
2020-10-27  2:34:10 139893202371648 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
2020-10-27  2:34:10 139893202371648 [ERROR] Aborting
 
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 6126, winpid: 6126, exit: 256] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/in-source-10.1/mysql-test/var/1/run/mysqld.1.pid to be created.
main.1st                                 w1 [ fail ]
        Test ended at 2020-10-27 02:34:10
 
CURRENT_TEST: main.1st
 
 
Failed to start mysqld.1
mysqltest failed but provided no output
 
 
 - saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/1/log/main.1st/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/'
***Warnings generated in error logs during shutdown after running tests: main.1st
 
2020-10-27  2:34:10 140193610740800 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
2020-10-27  2:34:10 140193610740800 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
2020-10-27  2:34:10 140193610740800 [ERROR] Aborting
 
worker[2] mysql-test-run: WARNING: Process [mysqld.1 - pid: 6132, winpid: 6132, exit: 256] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/in-source-10.1/mysql-test/var/2/run/mysqld.1.pid to be created.
main.grant5                              w2 [ fail ]
        Test ended at 2020-10-27 02:34:10
 
CURRENT_TEST: main.grant5
 
 
Failed to start mysqld.1
mysqltest failed but provided no output
 
 
 - saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/2/log/main.grant5/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.grant5/'
***Warnings generated in error logs during shutdown after running tests: main.grant5
 
2020-10-27  2:34:10 140128548090944 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
2020-10-27  2:34:10 140128548090944 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
2020-10-27  2:34:10 140128548090944 [ERROR] Aborting
 
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 0.000 of 4 seconds executing testcases
 
Completed: Failed 3/3 tests, 0.00% were successful.
 
Failing test(s): main.1st main.grant5
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
Errors/warnings were found in logfiles during server shutdown after running the
following sequence(s) of tests:
    main.1st
    main.1st
    main.grant5
mysql-test-run: *** ERROR: there were failing test cases

Comment by Sergei Golubchik [ 2020-10-30 ]

Thanks! May be this description about how mtr server communicates with workers should be in the KB? Might be difficult to find it here in the future.

Anyway, it seems the bug is that the server sends a new TESTCASE command after WARNINGS. WARNINGS is used only for warnings-on-shutdown, so whenever the worker sends WARNINGS it should already know what to do and doesn't need a new test case.

Your new fix does exactly that — avoids sending a TESTCASE command as a reply to WARNINGS.

But why do you need $wait_on_result ? In what case a worker doesn't know what to do and needs a command (like BYE) after WARNINGS? I tried just

             return ("Warnings in log", 1, $completed, $extra_warnings);
           }
+          next;
         }

and it seemed to work fine.

Comment by Anel Husakovic [ 2020-10-30 ]

Thanks for the review Serg.
Sure I can write it i KB also .
Regarding the wait_on_result my guess was to wait if another worker is in phase of executing the test case, so this variable is just signal to guard that another worker has finished the test case, is my logic correct ?

I haven't dispatched behavior without variable, but I can take a look and give some conclusion about tomorrow.
Thanks.

Comment by Anel Husakovic [ 2020-11-03 ]

Hi serg, yes indeed above line is sufficient meaning the server got the WARNINGS command and said to worker to finish its work (doesn't supply any other command), my patch was focusing on anything that can happen after BYE command (in this bug other worker didn't finish the work), but your patch is better.
So I guess this should be your patch I understood how things work .

Comment by Sergei Golubchik [ 2020-11-08 ]

Good, then go ahead and push it.
Thanks for helping me understand what's going on there!

Comment by Anel Husakovic [ 2020-11-09 ]

Ok serg I prepared it for 10.2, just one time check.
I was also wondering when ending, when the socket is closed?
Based on example when remove is called, close is also needed and we don't use it.

So do we need something like this:

diff --git a/mysql-test/mysql-test-run.pl b/mysql-test/mysql-test-run.pl
index a493c813fea..5c2dc4bb5d0 100755
--- a/mysql-test/mysql-test-run.pl
+++ b/mysql-test/mysql-test-run.pl
@@ -802,6 +802,7 @@ sub run_test_server ($$$) {
          # Client disconnected
          mtr_verbose("Child closed socket");
          $s->remove($sock);
+        $sock->close;
          if (--$childs == 0){
            return ("Completed", $test_failure, $completed, $extra_warnings);
          }

I tested on some examples and it works.

Comment by Sergei Golubchik [ 2020-11-10 ]

Looks good too.

I think it's fine either way, clients don't reconnect — there's a small number of clients (<100) they connect once and stay connected up to the end. And at the end perl will close everything automatically. So whatever you prefer — you can close sockets there or keep the code as is.

Comment by Anel Husakovic [ 2020-11-10 ]

Thanks Serg.
I decided to add line also .
Closing the MDEV with patch cd927dd34555a34e7 for 10.2.
Thanks for the review.

Generated at Thu Feb 08 09:24:54 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.