[MDEV-30221] rpl.rpl_ssl1, rpl.rpl_multi_engine fail in BB Valgrind builder
Created: 2022-12-13  Updated: 2023-05-02  Resolved: 2023-05-01

Status: Closed
Project: MariaDB Server
Component/s: Replication, Tests
Affects Version/s: 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0
Fix Version/s: 11.1.1, 10.11.3, 11.0.2, 10.4.29, 10.5.20, 10.6.13, 10.9.6, 10.10.4

Type: Bug
Priority: Major
Reporter: Angelique Sklavounos (Inactive)
Assignee: Angelique Sklavounos (Inactive)
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Relates

 Description   

https://buildbot.mariadb.org/#/builders/192/builds/14741

10.6 a8a5c8a1b

rpl.rpl_ssl1 'mix'                       w34 [ fail ]
        Test ended at 2022-12-13 16:11:40
 
CURRENT_TEST: rpl.rpl_ssl1
mysqltest: In included file "./include/rpl_init.inc": 
included from ./include/master-slave.inc at line 38:
included from /buildbot/amd64-ubuntu-1804-valgrind/build/mysql-test/suite/rpl/t/rpl_ssl1.test at line 2:
At line 165: query 'SET GLOBAL gtid_slave_pos= ""' failed: ER_SLAVE_MUST_STOP (1198): This operation cannot be performed as you have a running slave ''; run STOP SLAVE '' first



 Comments   
Comment by Angelique Sklavounos (Inactive) [ 2023-04-25 ]

Looking at the tests that run before these failing tests:

rpl.rpl_ssl 'mix'                        w1 [ skipped ]  Need "--big-test" when running with Valgrind
binlog.binlog_server_id 'stmt'           w34 [ pass ]    115
binlog.binlog_stm_ctype_cp932 'mix'      w16 [ pass ]  13276
binlog.binlog_stm_drop_tbl 'mix'         w16 [ pass ]    148
binlog.binlog_spurious_ddl_errors 'innodb,stmt' w40 [ pass ]   2447
rpl.rpl_ps 'row'                         w24 [ pass ]   1308
rpl.rpl_ssl1 'mix'                       w1 [ fail ]

rpl.rpl_mdev12179 'innodb,mix'           w4 [ skipped ]  Need "--big-test" when running with Valgrind
sys_vars.explicit_defaults_for_timestamp_on w6 [ pass ]    908
oqgraph.general-innodb 'innodb'          w12 [ pass ]  14198
oqgraph.general-MyISAM                   w30 [ pass ]   3327
versioning.replace 'heap,innodb,pk'      w33 [ pass ]    214
main.log_state_bug33693                  w37 [ pass ]     11
rpl.rpl_temporary 'mix'                  w40 [ pass ]   4474
rpl.rpl_set_statement_default_master 'row' w8 [ pass ]    923
rpl.rpl_gtid_until 'innodb,stmt'         w11 [ pass ]  11348
versioning.replace 'heap,innodb,sec'     w33 [ skipped ]  pk or unique only
rpl.rpl_row_trig001 'row'                w34 [ pass ]   1818
oqgraph.invalid_operations               w30 [ pass ]    123
rpl.rpl_multi_engine 'innodb,mix'        w4 [ fail ]

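Both rpl.rpl_ssl and rpl.rpl_mdev12179 source no_valgrind_without_big.inc AFTER rpl_init.inc has already run. The slave server is therefore started with START SLAVE before the test is skipped, and nothing stops it afterwards. The next replication test on the same worker (rpl.rpl_ssl1 on w1, rpl.rpl_multi_engine on w4) then fails in rpl_init.inc with ER_SLAVE_MUST_STOP. The slave error log shows the slave being started during the skipped test and still running when the next test begins: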

CURRENT_TEST: rpl.rpl_ssl
2023-04-24 10:47:39 265 [Note] Deleted Master_info file '/buildbot/amd64-ubuntu-2204-valgrind/build/mysql-test/var/1/mysqld.2/data/master.info'.
2023-04-24 10:47:39 265 [Note] Deleted Master_info file '/buildbot/amd64-ubuntu-2204-valgrind/build/mysql-test/var/1/mysqld.2/data/relay-log.info'.
2023-04-24 10:47:39 265 [Note] Master connection name: ''  Master_info_file: 'master.info'  Relay_info_file: 'relay-log.info'
2023-04-24 10:47:39 265 [Note] 'CHANGE MASTER TO executed'. Previous state master_host='127.0.0.1', master_port='16000', master_log_file='', master_log_pos='4'. New state master_host='127.0.0.1', master_port='16000', master_log_file='master-bin.000001', master_log_pos='4'.
2023-04-24 10:47:39 267 [Note] Slave I/O thread: Start asynchronous replication to master 'root@127.0.0.1:16000' in log 'master-bin.000001' at position 4
2023-04-24 10:47:39 268 [Note] Slave SQL thread initialized, starting replication in log 'master-bin.000001' at position 4, relay log './slave-relay-bin.000001' position: 4
2023-04-24 10:47:39 267 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:16000',replication started in log 'master-bin.000001' at position 4
CURRENT_TEST: rpl.rpl_ssl1
2023-04-24 10:47:40 0 [Note] /buildbot/amd64-ubuntu-2204-valgrind/build/sql/mysqld (initiated by: unknown): Normal shutdown
2023-04-24 10:47:40 0 [Note] Event Scheduler: Purging the queue. 0 events
2023-04-24 10:47:40 268 [Note] Error reading relay log event: slave SQL thread was killed
2023-04-24 10:47:40 268 [Note] Slave SQL thread exiting, replication stopped in log 'master-bin.000001' at position 329

This can be reproduced locally with ./mtr rpl.rpl_ssl,mix rpl.rpl_ssl1,mix --no-reorder --valgrind, plus a local hack to the MTR script that prevents big tests from being enabled automatically when test cases are specified.

The fix is to call no_valgrind_without_big.inc before the master and slave servers are set up.
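
As a minimal sketch of the ordering described above (not the actual patch; only the two include file names are taken from this report, and the rest of each test file is omitted), the top of an affected test would look roughly like this:

```
# Sketch only: intended include order at the top of an affected rpl test.
# Checks that may skip the test run first, while no slave has been started.
--source include/no_valgrind_without_big.inc
# Replication is set up only after those checks pass
# (master-slave.inc sources rpl_init.inc, which starts the slave).
--source include/master-slave.inc
```

With this order, a skip under Valgrind without --big-test happens before START SLAVE is ever issued, so no running slave is left behind for the next test on that worker.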

Comment by Angelique Sklavounos (Inactive) [ 2023-04-28 ]

The fix has been introduced, along with a re-ordering so that the other macros that check test environment capabilities also run before master/slave is set up.
In the following test runs, rpl.rpl_ssl1 and rpl.rpl_multi_engine passed:
https://buildbot.mariadb.org/#/builders/551/builds/1671/steps/7/logs/stdio
https://buildbot.mariadb.org/#/builders/551/builds/1664/steps/7/logs/stdio
