|
Hi elenst,
I played a bit with this MDEV and here is a patch, hope I covered all test cases mentioned above.
- Patch is solving generating report on warning
- Patch is doing cosmetic fix of `current_test` log file which holds the old log value of test `CURRENT TEST:..` in `mark_log()` in case of `unknown option` and as such
the logic which is using it's content doesn't output valid log content and doesn't generate valid `$test->
{'comment'}
` message.
Here are the test cases observed. Note that only in 6. scenario there is `print $child "HELLO\n";` problem in printing (to delete?).
1. Valid test case, but warning in server log
|
Single valid test + warning -> result OK
|
|
Show all
|
$ ./mtr MDEV-23769 --force
|
Logging: ./mtr MDEV-23769 --force
|
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
|
Checking leftover processes...
|
Removing old var directory...
|
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
|
Checking supported features...
|
MariaDB Version 10.1.47-MariaDB-debug
|
- SSL connections supported
|
- binaries are debug compiled
|
Sphinx 'indexer' binary not found, sphinx suite will be skipped
|
Collecting tests...
|
Installing system database...
|
|
==============================================================================
|
|
TEST RESULT TIME (ms) or COMMENT
|
--------------------------------------------------------------------------
|
|
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
|
#
|
# MDEV-23769: MTR can abort before it prints the test result summary
|
#
|
show databases;
|
Database
|
information_schema
|
mtr
|
mysql
|
performance_schema
|
test
|
main.MDEV-23769 [ fail ] Found warnings/errors in server log file!
|
Test ended at 2020-10-03 02:13:00
|
line
|
2020-10-03 2:13:00 140309996242496 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
|
^ Found warnings in /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/mysqld.1.err
|
ok
|
|
- saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
|
--------------------------------------------------------------------------
|
The servers were restarted 0 times
|
Spent 0.000 of 4 seconds executing testcases
|
mysql-test-run: WARNING: Got errors/warnings while running tests, please examine '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/warnings' for details.
|
|
Completed: Failed 1/1 tests, 0.00% were successful.
|
|
Failing test(s): main.MDEV-23769
|
|
The log files in var/log may give you some hint of what went wrong.
|
|
If you want to report this error, please read first the documentation
|
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
|
|
mysql-test-run: *** ERROR: there were failing test cases
|
2. Valid test cases (same), but warning in server log (Note reported single test case)
|
Multiple valid tests (same) + warning -> result OK
|
|
Show all
|
$ ./mtr MDEV-23769 MDEV-23769 --force
|
Logging: ./mtr MDEV-23769 MDEV-23769 --force
|
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
|
Checking leftover processes...
|
Removing old var directory...
|
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
|
Checking supported features...
|
MariaDB Version 10.1.47-MariaDB-debug
|
- SSL connections supported
|
- binaries are debug compiled
|
Sphinx 'indexer' binary not found, sphinx suite will be skipped
|
Collecting tests...
|
Installing system database...
|
|
==============================================================================
|
|
TEST RESULT TIME (ms) or COMMENT
|
--------------------------------------------------------------------------
|
|
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
|
#
|
# MDEV-23769: MTR can abort before it prints the test result summary
|
#
|
show databases;
|
Database
|
information_schema
|
mtr
|
mysql
|
performance_schema
|
test
|
main.MDEV-23769 [ fail ] Found warnings/errors in server log file!
|
Test ended at 2020-10-03 02:15:05
|
line
|
2020-10-03 2:15:04 139868136609344 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
|
^ Found warnings in /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/mysqld.1.err
|
ok
|
|
- saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
|
#
|
# MDEV-23769: MTR can abort before it prints the test result summary
|
#
|
show databases;
|
Database
|
information_schema
|
mtr
|
mysql
|
performance_schema
|
test
|
main.MDEV-23769 [ fail ] Found warnings/errors in server log file!
|
Test ended at 2020-10-03 02:15:05
|
line
|
2020-10-03 2:15:05 140513782085184 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
|
^ Found warnings in /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/mysqld.1.err
|
ok
|
|
- saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
|
--------------------------------------------------------------------------
|
The servers were restarted 1 times
|
Spent 0.000 of 5 seconds executing testcases
|
mysql-test-run: WARNING: Got errors/warnings while running tests, please examine '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/warnings' for details.
|
|
Completed: Failed 2/2 tests, 0.00% were successful.
|
|
Failing test(s): main.MDEV-23769
|
|
The log files in var/log may give you some hint of what went wrong.
|
|
If you want to report this error, please read first the documentation
|
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
|
|
mysql-test-run: *** ERROR: there were failing test cases
|
3. Valid test cases (different test cases), but warning in server log (Note one passed one failed)
|
Multiple valid tests (different) + warning -> result OK
|
|
Show all
|
$ ./mtr MDEV-23769 1st --force
|
Logging: ./mtr MDEV-23769 1st --force
|
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
|
Checking leftover processes...
|
Removing old var directory...
|
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
|
Checking supported features...
|
MariaDB Version 10.1.47-MariaDB-debug
|
- SSL connections supported
|
- binaries are debug compiled
|
Sphinx 'indexer' binary not found, sphinx suite will be skipped
|
Collecting tests...
|
Installing system database...
|
|
==============================================================================
|
|
TEST RESULT TIME (ms) or COMMENT
|
--------------------------------------------------------------------------
|
|
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
|
#
|
# MDEV-23769: MTR can abort before it prints the test result summary
|
#
|
show databases;
|
Database
|
information_schema
|
mtr
|
mysql
|
performance_schema
|
test
|
main.MDEV-23769 [ fail ] Found warnings/errors in server log file!
|
Test ended at 2020-10-03 02:17:09
|
line
|
2020-10-03 2:17:08 140658805515840 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
|
^ Found warnings in /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/mysqld.1.err
|
ok
|
|
- saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
|
main.1st [ pass ] 1
|
--------------------------------------------------------------------------
|
The servers were restarted 1 times
|
Spent 0.001 of 4 seconds executing testcases
|
mysql-test-run: WARNING: Got errors/warnings while running tests, please examine '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/warnings' for details.
|
|
Completed: Failed 1/2 tests, 50.00% were successful.
|
|
Failing test(s): main.MDEV-23769
|
|
The log files in var/log may give you some hint of what went wrong.
|
|
If you want to report this error, please read first the documentation
|
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
|
|
mysql-test-run: *** ERROR: there were failing test cases
|
|
4. Now for example 3. create artificial failure with warnings in server log (note, both test failed)
|
Single failing test + warning -> result OK
|
|
Show all
|
|
$ ./mtr MDEV-23769 1st --force
|
Logging: ./mtr MDEV-23769 1st --force
|
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
|
Checking leftover processes...
|
Removing old var directory...
|
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
|
Checking supported features...
|
MariaDB Version 10.1.47-MariaDB-debug
|
- SSL connections supported
|
- binaries are debug compiled
|
Sphinx 'indexer' binary not found, sphinx suite will be skipped
|
Collecting tests...
|
Installing system database...
|
|
==============================================================================
|
|
TEST RESULT TIME (ms) or COMMENT
|
--------------------------------------------------------------------------
|
|
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
|
#
|
# MDEV-23769: MTR can abort before it prints the test result summary
|
#
|
show databases;
|
Database
|
information_schema
|
mtr
|
mysql
|
performance_schema
|
test
|
main.MDEV-23769 [ fail ] Found warnings/errors in server log file!
|
Test ended at 2020-10-03 02:17:52
|
line
|
2020-10-03 2:17:51 140260493149760 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
|
^ Found warnings in /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/mysqld.1.err
|
ok
|
|
- saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
|
main.1st [ fail ]
|
Test ended at 2020-10-03 02:17:52
|
|
CURRENT_TEST: main.1st
|
mysqltest: At line 5: End of line junk detected: "show tables in mysql
|
"
|
|
The result from queries just before the failure was:
|
show databases;
|
Database
|
information_schema
|
mtr
|
mysql
|
performance_schema
|
test
|
|
- saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.1st/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.1st/'
|
--------------------------------------------------------------------------
|
The servers were restarted 1 times
|
Spent 0.000 of 4 seconds executing testcases
|
mysql-test-run: WARNING: Got errors/warnings while running tests, please examine '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/warnings' for details.
|
|
Completed: Failed 2/2 tests, 0.00% were successful.
|
|
Failing test(s): main.MDEV-23769 main.1st
|
|
The log files in var/log may give you some hint of what went wrong.
|
|
If you want to report this error, please read first the documentation
|
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
|
|
mysql-test-run: *** ERROR: there were failing test cases
|
|
5. Generate unknown option, for the same example as 3. (note: both failed)
|
Double failing tests + unknown option-> result OK
|
|
Show all
|
|
$ ./mtr MDEV-23769 1st --mysqld=--xx --force
|
Logging: ./mtr MDEV-23769 1st --mysqld=--xx --force
|
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
|
Checking leftover processes...
|
Removing old var directory...
|
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
|
Checking supported features...
|
MariaDB Version 10.1.47-MariaDB-debug
|
- SSL connections supported
|
- binaries are debug compiled
|
Sphinx 'indexer' binary not found, sphinx suite will be skipped
|
Collecting tests...
|
Installing system database...
|
|
==============================================================================
|
|
TEST RESULT TIME (ms) or COMMENT
|
--------------------------------------------------------------------------
|
|
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
|
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 52531, winpid: 52531, exit: 512] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/run/mysqld.1.pid to be created.
|
main.MDEV-23769 [ fail ]
|
Test ended at 2020-10-03 02:19:01
|
|
CURRENT_TEST: main.MDEV-23769
|
|
|
Failed to start mysqld.1
|
mysqltest failed but provided no output
|
|
|
- saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
|
***Warnings generated in error logs during shutdown after running tests: main.MDEV-23769
|
|
2020-10-03 2:19:01 140238937720384 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
|
2020-10-03 2:19:01 140238937720384 [ERROR] /home/anel/mariadb/10.1-mtr-in-source/sql/mysqld: unknown option '--xx'
|
2020-10-03 2:19:01 140238937720384 [ERROR] Aborting
|
|
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 52537, winpid: 52537, exit: 512] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/run/mysqld.1.pid to be created.
|
main.1st [ fail ]
|
Test ended at 2020-10-03 02:19:02
|
|
CURRENT_TEST: main.1st
|
|
|
Failed to start mysqld.1
|
mysqltest failed but provided no output
|
|
|
- saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.1st/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.1st/'
|
***Warnings generated in error logs during shutdown after running tests: main.1st
|
|
2020-10-03 2:19:02 140048201700928 [ERROR] /home/anel/mariadb/10.1-mtr-in-source/sql/mysqld: unknown option '--xx'
|
2020-10-03 2:19:02 140048201700928 [ERROR] Aborting
|
|
Can't use an undefined value as a symbol reference at ./mtr line 657.
|
--------------------------------------------------------------------------
|
The servers were restarted 1 times
|
Spent 0.000 of 5 seconds executing testcases
|
|
Completed: Failed 2/2 tests, 0.00% were successful.
|
|
Failing test(s): main.MDEV-23769 main.1st
|
|
The log files in var/log may give you some hint of what went wrong.
|
|
If you want to report this error, please read first the documentation
|
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
|
|
Errors/warnings were found in logfiles during server shutdown after running the
|
following sequence(s) of tests:
|
main.MDEV-23769
|
main.1st
|
mysql-test-run: *** ERROR: there were failing test cases
|
|
6. Again the same option, but with 2 same tests -> not perl error: `Can't use an undefined value as a symbol reference at ./mtr line 657.
` ` print $child "HELLO\n";` (solution delete the line).
|
Single failing test called 2 times + unknown option-> result OK
|
|
Show all
|
$ ./mtr MDEV-23769 MDEV-23769 --mysqld=--xx --force
|
Logging: ./mtr MDEV-23769 MDEV-23769 --mysqld=--xx --force
|
vardir: /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var
|
Checking leftover processes...
|
Removing old var directory...
|
Creating var directory '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var'...
|
Checking supported features...
|
MariaDB Version 10.1.47-MariaDB-debug
|
- SSL connections supported
|
- binaries are debug compiled
|
Sphinx 'indexer' binary not found, sphinx suite will be skipped
|
Collecting tests...
|
Installing system database...
|
|
==============================================================================
|
|
TEST RESULT TIME (ms) or COMMENT
|
--------------------------------------------------------------------------
|
|
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
|
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 52595, winpid: 52595, exit: 512] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/run/mysqld.1.pid to be created.
|
main.MDEV-23769 [ fail ]
|
Test ended at 2020-10-03 02:20:12
|
|
CURRENT_TEST: main.MDEV-23769
|
|
|
Failed to start mysqld.1
|
mysqltest failed but provided no output
|
|
|
- saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
|
***Warnings generated in error logs during shutdown after running tests: main.MDEV-23769
|
|
2020-10-03 2:20:12 139815724283456 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
|
2020-10-03 2:20:12 139815724283456 [ERROR] /home/anel/mariadb/10.1-mtr-in-source/sql/mysqld: unknown option '--xx'
|
2020-10-03 2:20:12 139815724283456 [ERROR] Aborting
|
|
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 52601, winpid: 52601, exit: 512] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/run/mysqld.1.pid to be created.
|
main.MDEV-23769 [ fail ]
|
Test ended at 2020-10-03 02:20:12
|
|
CURRENT_TEST: main.MDEV-23769
|
|
|
Failed to start mysqld.1
|
mysqltest failed but provided no output
|
|
|
- saving '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/' to '/home/anel/mariadb/10.1-mtr-in-source/mysql-test/var/log/main.MDEV-23769/'
|
***Warnings generated in error logs during shutdown after running tests: main.MDEV-23769
|
|
2020-10-03 2:20:12 140650797745728 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
|
2020-10-03 2:20:12 140650797745728 [ERROR] /home/anel/mariadb/10.1-mtr-in-source/sql/mysqld: unknown option '--xx'
|
2020-10-03 2:20:12 140650797745728 [ERROR] Aborting
|
|
Can't use an undefined value as a symbol reference at ./mtr line 657.
|
--------------------------------------------------------------------------
|
The servers were restarted 1 times
|
Spent 0.000 of 5 seconds executing testcases
|
|
Completed: Failed 2/2 tests, 0.00% were successful.
|
|
Failing test(s): main.MDEV-23769
|
|
The log files in var/log may give you some hint of what went wrong.
|
|
If you want to report this error, please read first the documentation
|
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
|
|
Errors/warnings were found in logfiles during server shutdown after running the
|
following sequence(s) of tests:
|
main.MDEV-23769
|
main.MDEV-23769
|
mysql-test-run: *** ERROR: there were failing test cases
|
|
|
|
Having warning/error is a necessary condition.
Sufficient condition is to call at least 2 tests.
Incorrect behaviour can be seen if:
- Both tests are correct but there is error generated (incorrect option passed):
So this
./mysql-test/mtr --mysqld=--lock-wait-timeout=-xx 1st
|
will generate and write statistic at the end,
but
./mysql-test/mtr --mysqld=--lock-wait-timeout=-xx 1st 1st --force
|
will not generate the statistic.
Note: warning is generated
(--mysqld=--lock-wait-timeout=-1)
|
statistic will be generated
Example of stdout in this case:
Failed to start mysqld.1
|
|
- saving '/home/anel/mariadb/builds/10.1/mysql-test/var/log/main.1st/' to '/home/anel/mariadb/builds/10.1/mysql-test/var/log/main.1st/'
|
***Warnings generated in error logs during shutdown after running tests: main.1st
|
|
2020-10-22 15:23:22 140610533295296 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
|
2020-10-22 15:23:22 140610533295296 [ERROR] /home/anel/mariadb/builds/10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
|
2020-10-22 15:23:22 140610533295296 [ERROR] Aborting
|
Example of stdout:
mysqltest: At line 1: query 'set a=0' failed: 1193: Unknown system variable 'a'
|
|
- saving '/home/anel/mariadb/builds/10.1/mysql-test/var/log/main.moj/' to '/home/anel/mariadb/builds/10.1/mysql-test/var/log/main.moj/'
|
***Warnings generated in error logs during shutdown after running tests: main.moj
|
|
2020-10-22 15:20:36 139669439161536 [Warning] option 'lock_wait_timeout': unsigned value 18446744073709551615 adjusted to 31536000
|
- However, if we run the above test with the patch, statistic will be generated, meaning it will write something like this for above 2 scenarios:
Errors/warnings were found in logfiles during server shutdown after running the
|
following sequence(s) of tests:
|
main.1st
|
main.1st
|
mysql-test-run: *** ERROR: there were failing test cases
|
Completed: Failed 2/2 tests, 0.00% were successful.
|
|
Failing test(s): main.fail
|
|
The log files in var/log may give you some hint of what went wrong.
|
|
If you want to report this error, please read first the documentation
|
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
|
|
Errors/warnings were found in logfiles during server shutdown after running the
|
following sequence(s) of tests:
|
main.fail
|
main.fail
|
mysql-test-run: *** ERROR: there were failing test cases
|
|
|
Hi serg,
regarding this MDEV I understood the problem.
Here I will write how mtr works - for future reference and also to people starting using it in depth:
- mtr is creating server socket my $server = new IO::Socket::INET
- Creating the workers using fork my $child_pid= My::SafeProcess::Base::_safe_fork();
- For each worker run_worker() function is called. Execution:
- Creating a new socket my $server = new IO::Socket::INET, to connect to server_port from master.
- Initiate communication with master using START command - > print $server "START\n"; <- master sends 1. test and send command TESTCASE $next->write_test($sock, 'TESTCASE');)
- Worker gets command TESTCASE processes test case (by calling run_testcase()) which start/(restart if needed) the server and sends TESTRESULT - $test->write_test($server, 'TESTRESULT'); -> master accept this command, run mtr_report_test, check does test fail and also generate new command TESTCASE if new test case exist or BYE.
- Patch introduced check_warnings_after_shoutdown is 8193c32743f5. Note that this function is reading error_logs although name suggest different extract_warning_lines($mysqld->value('#log-error').
Applying above on current problem:
To verify problem one can run single worker with 2 tests and force option or N workers and at least N+1 tests to get the error:
$ ./mtr --mysqld=--lock-wait-timeout=-xx 1st 1st --force --parallel 1 --verbose
|
$ ./mtr --mysqld=--lock-wait-timeout=-xx 1st 1st 1st --force --parallel 2 --verbose
|
Note: for N workers and N tests we have working example.
So what happens is that N=1, tests=2, wrong option:
- Worker start communication calling START, master sends first test and TESTCASE command.
- Worker runs run_worker() and test fails - Note: no warnings are generated. Sends TESTRESULT to master.
- Master processes TESTRESULT and sends new test with TESTCASE
- Worker runs TESTCASE but before that restarts the server from run_worker() and there invoke check_warnings_post_shutdown().
- This function calls WARNING command to master and master execute it.
- In parallel to this worker runs test which fails and send TESTRESULT to master
- Master proceed execution plan after WARNING and sends to worker BYE command, but note that worker didn't finish it's execution.
- Worker eventually finishes TESTRESULT, master processed it and again calls BYE command. In parallel worker execute BYE by calling again check_warnings_post_shutdown() which prints new warning. But still last BYE command was sent and there is no worker that can process it.
- Still is unclear for me why the return values of run_testcase are not checked and logic created based on that?
Debugger problem during testing: I would appreciate if someone can let me know how to properly debug perl scripts for parallel >1 as well as good practice using debugger.
1. How the result of debugger could be this (one line executed from run_worker and after that one line of run_test_server ) :S ???? :
Break points:
|
DB<3> L
|
./mtr:
|
610: my ($server, $tests, $childs) = @_;
|
break if (1)
|
930: my ($server_port, $thread_num)= @_;
|
break if (1)
|
|
main::run_worker(./mtr:930): my ($server_port, $thread_num)= @_;
|
[pid=2470->2529] DB<3> n
|
main::run_test_server(./mtr:612): my $num_saved_cores= 0; # Number of core files saved in vardir/log/ so far.
|
DB<3> T
|
@ = DB::DB called from file './mtr' line 930
|
. = main::run_worker(48547, 1) called from file './mtr' line 519
|
. = main::main() called from file './mtr' line 397
|
[pid=2470->2529] DB<3> n
|
main::run_test_server(./mtr:613): my $num_saved_datadir= 0; # Number of datadirs saved in vardir/log/ so far.
|
DB<3> n
|
main::run_worker(./mtr:932): $SIG{INT}= sub { exit(1); };
|
[pid=2470->2529] DB<3> n
|
main::run_test_server(./mtr:614): my $num_failed_test= 0; # Number of tests failed so far
|
DB<3> n
|
main::run_worker(./mtr:933): $SIG{HUP}= sub { exit(1); };
|
2. Also tried using Trepan and following is obtained:
Use of uninitialized value in subroutine entry at /home/anel/perl5/lib/perl5/Devel/Trepan/DB/../../../Devel/Trepan/DB/Sub.pm line 290.
|
main.1st [ fail ]
|
Test ended at 2020-10-26 19:29:06
|
|
CURRENT_TEST: main.1st
|
CURRENT_TEST: main.1st
|
|
|
Failed to start mysqld.1
|
|
|
Debugged program terminated. Use 'q' to quit or 'R' to restart.
|
(trepanpl): - saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/'
|
***Warnings generated in error logs during shutdown after running tests: main.1st
|
|
2020-10-26 19:29:06 139830264824896 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
|
2020-10-26 19:29:06 139830264824896 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
|
2020-10-26 19:29:06 139830264824896 [ERROR] Aborting
|
|
|
trepan.pl: Program received signal PIPE.
|
!! main::(./mtr:641 @0x55b607d9ddf0)
|
if ($sock == $server) {
|
|
(trepanpl): c
|
--------------------------------------------------------------------------
|
The servers were restarted 1 times
|
Spent 0.000 of 59 seconds executing testcases
|
|
Completed: Failed 2/2 tests, 0.00% were successful.
|
|
Failing test(s): main.1st
|
|
The log files in var/log may give you some hint of what went wrong.
|
|
If you want to report this error, please read first the documentation
|
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
|
|
Errors/warnings were found in logfiles during server shutdown after running the
|
following sequence(s) of tests:
|
main.1st
|
main.1st
|
mysql-test-run: *** ERROR: there were failing test cases
|
Debugged program terminated. Use 'q' to quit or 'R' to restart.
|
I tried a lot variations but don't have solution to this.
Can you please suggest some architecture for this?
|
|
Hi Serg,
can you please review https://github.com/MariaDB/server/commit/911655c36cc6
This is the result:
|
Single worker
|
|
Show all
|
|
$ ./mtr --mysqld=--lock-wait-timeout=-xx 1st grant5 --force --parallel 1
|
Logging: ./mtr --mysqld=--lock-wait-timeout=-xx 1st grant5 --force --parallel 1
|
vardir: /home/anel/mariadb/in-source-10.1/mysql-test/var
|
Checking leftover processes...
|
Removing old var directory...
|
Creating var directory '/home/anel/mariadb/in-source-10.1/mysql-test/var'...
|
Checking supported features...
|
MariaDB Version 10.1.48-MariaDB-debug
|
- SSL connections supported
|
- binaries are debug compiled
|
Sphinx 'indexer' binary not found, sphinx suite will be skipped
|
Collecting tests...
|
Installing system database...
|
|
==============================================================================
|
|
TEST RESULT TIME (ms) or COMMENT
|
--------------------------------------------------------------------------
|
|
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
|
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 6070, winpid: 6070, exit: 256] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/in-source-10.1/mysql-test/var/run/mysqld.1.pid to be created.
|
main.1st [ fail ]
|
Test ended at 2020-10-27 02:32:41
|
|
CURRENT_TEST: main.1st
|
|
|
Failed to start mysqld.1
|
mysqltest failed but provided no output
|
|
|
- saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/'
|
***Warnings generated in error logs during shutdown after running tests: main.1st
|
|
2020-10-27 2:32:41 139724427658304 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
|
2020-10-27 2:32:41 139724427658304 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
|
2020-10-27 2:32:41 139724427658304 [ERROR] Aborting
|
|
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 6074, winpid: 6074, exit: 256] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/in-source-10.1/mysql-test/var/run/mysqld.1.pid to be created.
|
main.grant5 [ fail ]
|
Test ended at 2020-10-27 02:32:41
|
|
CURRENT_TEST: main.grant5
|
|
|
Failed to start mysqld.1
|
mysqltest failed but provided no output
|
|
|
- saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.grant5/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.grant5/'
|
***Warnings generated in error logs during shutdown after running tests: main.grant5
|
|
2020-10-27 2:32:41 140098250875968 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
|
2020-10-27 2:32:41 140098250875968 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
|
2020-10-27 2:32:41 140098250875968 [ERROR] Aborting
|
|
--------------------------------------------------------------------------
|
The servers were restarted 1 times
|
Spent 0.000 of 4 seconds executing testcases
|
|
Completed: Failed 2/2 tests, 0.00% were successful.
|
|
Failing test(s): main.1st main.grant5
|
|
The log files in var/log may give you some hint of what went wrong.
|
|
If you want to report this error, please read first the documentation
|
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
|
|
Errors/warnings were found in logfiles during server shutdown after running the
|
following sequence(s) of tests:
|
main.1st
|
main.grant5
|
mysql-test-run: *** ERROR: there were failing test cases
|
|
Multiple workers
|
|
Show all
|
$ ./mtr --mysqld=--lock-wait-timeout=-xx 1st grant5 1st --force --parallel 2
|
Logging: ./mtr --mysqld=--lock-wait-timeout=-xx 1st grant5 1st --force --parallel 2
|
vardir: /home/anel/mariadb/in-source-10.1/mysql-test/var
|
Checking leftover processes...
|
Removing old var directory...
|
Creating var directory '/home/anel/mariadb/in-source-10.1/mysql-test/var'...
|
Checking supported features...
|
MariaDB Version 10.1.48-MariaDB-debug
|
- SSL connections supported
|
- binaries are debug compiled
|
Sphinx 'indexer' binary not found, sphinx suite will be skipped
|
Collecting tests...
|
Installing system database...
|
|
==============================================================================
|
|
TEST WORKER RESULT TIME (ms) or COMMENT
|
--------------------------------------------------------------------------
|
|
worker[1] Using MTR_BUILD_THREAD 301, with reserved ports 16020..16039
|
worker[2] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
|
worker[2] mysql-test-run: WARNING: Process [mysqld.1 - pid: 6124, winpid: 6124, exit: 256] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/in-source-10.1/mysql-test/var/2/run/mysqld.1.pid to be created.
|
main.1st w2 [ fail ]
|
Test ended at 2020-10-27 02:34:10
|
|
CURRENT_TEST: main.1st
|
|
|
Failed to start mysqld.1
|
mysqltest failed but provided no output
|
|
|
- saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/2/log/main.1st/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/'
|
***Warnings generated in error logs during shutdown after running tests: main.1st
|
|
2020-10-27 2:34:10 139893202371648 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
|
2020-10-27 2:34:10 139893202371648 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
|
2020-10-27 2:34:10 139893202371648 [ERROR] Aborting
|
|
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 6126, winpid: 6126, exit: 256] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/in-source-10.1/mysql-test/var/1/run/mysqld.1.pid to be created.
|
main.1st w1 [ fail ]
|
Test ended at 2020-10-27 02:34:10
|
|
CURRENT_TEST: main.1st
|
|
|
Failed to start mysqld.1
|
mysqltest failed but provided no output
|
|
|
- saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/1/log/main.1st/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.1st/'
|
***Warnings generated in error logs during shutdown after running tests: main.1st
|
|
2020-10-27 2:34:10 140193610740800 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
|
2020-10-27 2:34:10 140193610740800 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
|
2020-10-27 2:34:10 140193610740800 [ERROR] Aborting
|
|
worker[2] mysql-test-run: WARNING: Process [mysqld.1 - pid: 6132, winpid: 6132, exit: 256] died after mysql-test-run waited 0.3 seconds for /home/anel/mariadb/in-source-10.1/mysql-test/var/2/run/mysqld.1.pid to be created.
|
main.grant5 w2 [ fail ]
|
Test ended at 2020-10-27 02:34:10
|
|
CURRENT_TEST: main.grant5
|
|
|
Failed to start mysqld.1
|
mysqltest failed but provided no output
|
|
|
- saving '/home/anel/mariadb/in-source-10.1/mysql-test/var/2/log/main.grant5/' to '/home/anel/mariadb/in-source-10.1/mysql-test/var/log/main.grant5/'
|
***Warnings generated in error logs during shutdown after running tests: main.grant5
|
|
2020-10-27 2:34:10 140128548090944 [Warning] option 'lock_wait_timeout': unsigned value 0 adjusted to 1
|
2020-10-27 2:34:10 140128548090944 [ERROR] /home/anel/mariadb/in-source-10.1/sql/mysqld: Error while setting value '-xx' to 'lock_wait_timeout'
|
2020-10-27 2:34:10 140128548090944 [ERROR] Aborting
|
|
--------------------------------------------------------------------------
|
The servers were restarted 1 times
|
Spent 0.000 of 4 seconds executing testcases
|
|
Completed: Failed 3/3 tests, 0.00% were successful.
|
|
Failing test(s): main.1st main.grant5
|
|
The log files in var/log may give you some hint of what went wrong.
|
|
If you want to report this error, please read first the documentation
|
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
|
|
Errors/warnings were found in logfiles during server shutdown after running the
|
following sequence(s) of tests:
|
main.1st
|
main.1st
|
main.grant5
|
mysql-test-run: *** ERROR: there were failing test cases
|
|
|
Thanks! May be this description about how mtr server communicates with workers should be in the KB? Might be difficult to find it here in the future.
Anyway, it seems the bug is that the server sends a new TESTCASE command after WARNINGS. WARNINGS is used only for warnings-on-shutdown, so whenever the worker sends WARNINGS it should already know what to do and doesn't need a new test case.
Your new fix does exactly that — avoids sending a TESTCASE command as a reply to WARNINGS.
But why do you need $wait_on_result ? In what case a worker doesn't know what to do and needs a command (like BYE) after WARNINGS? I tried just
return ("Warnings in log", 1, $completed, $extra_warnings);
|
}
|
+ next;
|
}
|
and it seemed to work fine.
|
|
Thanks for the review Serg.
Sure I can write it i KB also .
Regarding the wait_on_result my guess was to wait if another worker is in phase of executing the test case, so this variable is just signal to guard that another worker has finished the test case, is my logic correct ?
I haven't dispatched behavior without variable, but I can take a look and give some conclusion about tomorrow.
Thanks.
|
|
Hi serg, yes indeed above line is sufficient meaning the server got the WARNINGS command and said to worker to finish its work (doesn't supply any other command), my patch was focusing on anything that can happen after BYE command (in this bug other worker didn't finish the work), but your patch is better.
So I guess this should be your patch I understood how things work .
|
|
Good, then go ahead and push it.
Thanks for helping me understand what's going on there!
|
|
Ok serg I prepared it for 10.2, just one time check.
I was also wondering when ending, when the socket is closed?
Based on example when remove is called, close is also needed and we don't use it.
So do we need something like this:
diff --git a/mysql-test/mysql-test-run.pl b/mysql-test/mysql-test-run.pl
|
index a493c813fea..5c2dc4bb5d0 100755
|
--- a/mysql-test/mysql-test-run.pl
|
+++ b/mysql-test/mysql-test-run.pl
|
@@ -802,6 +802,7 @@ sub run_test_server ($$$) {
|
# Client disconnected
|
mtr_verbose("Child closed socket");
|
$s->remove($sock);
|
+ $sock->close;
|
if (--$childs == 0){
|
return ("Completed", $test_failure, $completed, $extra_warnings);
|
}
|
I tested on some examples and it works.
|
|
Looks good too.
I think it's fine either way, clients don't reconnect — there's a small number of clients (<100) they connect once and stay connected up to the end. And at the end perl will close everything automatically. So whatever you prefer — you can close sockets there or keep the code as is.
|
|
Thanks Serg.
I decided to add line also .
Closing the MDEV with patch cd927dd34555a34e7 for 10.2.
Thanks for the review.
|