[MDEV-10149] sys_vars.rpl_init_slave_func fails sporadically in buildbot Created: 2016-05-29  Updated: 2020-10-27  Resolved: 2020-10-22

Status: Closed
Project: MariaDB Server
Component/s: Tests
Affects Version/s: 5.5, 10.0, 10.1, 10.2
Fix Version/s: 10.1.48, 10.2.35, 10.3.26, 10.4.16, 10.5.7

Type: Bug
Priority: Critical
Reporter: Elena Stepanova
Assignee: Sujatha Sivakumar (Inactive)
Resolution: Fixed
Votes: 1
Labels: synchronization

Issue Links:
Blocks
blocks MDEV-7069 Fix buildbot failures in main server ... Stalled
Sprint: 10.2.1-5

 Description   

The failure has been happening, although rarely, for over a year.
It comes in two flavors:

http://buildbot.askmonty.org/buildbot/builders/bld-dan-release/builds/3581/steps/test/logs/stdio

@@global.max_connections = @start_max_connections

CURRENT_TEST: sys_vars.rpl_init_slave_func
mysqltest: In included file "./include/assert.inc": 
included from /opt/buildbot-slave/mariadb/dan_demeter2/build/mysql-test/suite/sys_vars/t/rpl_init_slave_func.test at line 69:
At line 168: Test assertion failed in assertion.inc
 
The result from queries just before the failure was:
< snip >
Log_name	File_size
master-bin.000001	322
 
**** SHOW BINLOG EVENTS on server_1 ****
binlog_name = 'master-bin.000001'
SHOW BINLOG EVENTS IN 'master-bin.000001';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
master-bin.000001	4	Format_desc	1	249	Server ver: 10.2.0-MariaDB, Binlog ver: 4
master-bin.000001	249	Gtid_list	1	278	[]
master-bin.000001	278	Binlog_checkpoint	1	322	master-bin.000001
 
**** SHOW RELAYLOG EVENTS on server_1 ****
relaylog_name = 'No such row'
SHOW RELAYLOG EVENTS IN 'No such row';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
connection slave;
Assertion text: '@@global.max_connections = @start_max_connections'
Assertion condition: '@@global.max_connections = @start_max_connections'
Assertion condition, interpolated: '@@global.max_connections = @start_max_connections'
Assertion result: '0'

http://buildbot.askmonty.org/buildbot/builders/p8-rhel7-bintar/builds/1222/steps/test/logs/stdio

@@global.max_connections = @start_max_connections + 1

sys_vars.rpl_init_slave_func 'mix'       w3 [ fail ]
        Test ended at 2016-04-19 19:09:05
 
CURRENT_TEST: sys_vars.rpl_init_slave_func
mysqltest: In included file "./include/assert.inc": 
included from /home/buildbot/maria-slave/power8-vlp03-bintar/build/mysql-test/suite/sys_vars/t/rpl_init_slave_func.test at line 87:
At line 168: Test assertion failed in assertion.inc
 
The result from queries just before the failure was:
< snip >
 
**** SHOW BINARY LOGS on server_1 ****
SHOW BINARY LOGS;
Log_name	File_size
master-bin.000001	245
 
**** SHOW BINLOG EVENTS on server_1 ****
binlog_name = 'master-bin.000001'
SHOW BINLOG EVENTS IN 'master-bin.000001';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
master-bin.000001	4	Format_desc	1	245	Server ver: 5.5.48-MariaDB, Binlog ver: 4
 
**** SHOW RELAYLOG EVENTS on server_1 ****
relaylog_name = 'No such row'
SHOW RELAYLOG EVENTS IN 'No such row';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
Assertion text: '@@global.max_connections = @start_max_connections + 1'
Assertion condition: '@@global.max_connections = @start_max_connections + 1'
Assertion condition, interpolated: '@@global.max_connections = @start_max_connections + 1'
Assertion result: '0'



 Comments   
Comment by Laurynas Biveinis [ 2017-03-07 ]

In Percona Server and MySQL:

A slave SQL thread sets its Running state to Yes very early in its
initialisation, before the majority of initialisation actions,
including executing the init_slave command, are done. Thus the
testcase has a race condition where the initial replication setup
might finish executing later than the testcase SET GLOBAL init_slave,
making the testcase see its effect where it checks for its absence.

See https://github.com/percona/percona-server/pull/1464 for a fix
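
The race described above can be illustrated with a toy model. The sketch below is not MariaDB, MySQL, or Percona Server code: `ToySqlThread`, `run_scenario`, the `proceed` hook, and the starting value 100 are hypothetical names and values chosen for illustration. It assumes, based on the assertion text in the logs, that the testcase sets init_slave to a statement that increments max_connections, and it models only the first failure flavor (the increment showing up where the testcase expects it to be absent).

```python
import threading


class ToySqlThread:
    """Toy stand-in for a slave SQL thread (hypothetical, for illustration)."""

    def __init__(self, server):
        self.server = server
        self.running = threading.Event()    # models Running state = Yes
        self.init_done = threading.Event()  # models "initialisation finished"
        self.proceed = threading.Event()    # hook used to force an interleaving

    def run(self):
        # The Running state is set very early in initialisation...
        self.running.set()
        # ...while the remaining initialisation actions happen later.
        self.proceed.wait()
        init_slave = self.server["init_slave"]
        if init_slave is not None:
            init_slave(self.server)  # execute the init_slave command
        self.init_done.set()


def bump_max_connections(server):
    # Assumed effect of the init_slave statement used by the testcase.
    server["max_connections"] += 1


def run_scenario(wait_for_init):
    """Return max_connections as seen by the testcase's first assertion."""
    server = {"max_connections": 100, "init_slave": None}
    sql_thread = ToySqlThread(server)
    worker = threading.Thread(target=sql_thread.run)
    worker.start()

    # The testcase only waits until the SQL thread reports Running.
    sql_thread.running.wait()
    if wait_for_init:
        # Additionally wait until initialisation has really finished.
        sql_thread.proceed.set()
        sql_thread.init_done.wait()

    # Testcase: SET GLOBAL init_slave = <statement that bumps max_connections>
    server["init_slave"] = bump_max_connections

    sql_thread.proceed.set()
    worker.join()
    return server["max_connections"]


if __name__ == "__main__":
    print("racy: ", run_scenario(wait_for_init=False))
    print("fixed:", run_scenario(wait_for_init=True))
```

In the racy scenario the late initialisation picks up the freshly set init_slave and the value becomes 101, so a check for the unchanged value fails. Waiting for initialisation to finish before setting init_slave keeps it at 100. The actual patches linked in this issue may synchronise differently.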

Comment by Michael Widenius [ 2020-08-26 ]

I upgraded this to critical as it constantly fails in buildbot.

Comment by Sujatha Sivakumar (Inactive) [ 2020-10-13 ]

Hello Elkin,

Please review the fix for MDEV-10149.

Patch: https://github.com/MariaDB/server/commit/aa8f30f54fffc25bfc0048941d71ebc043a106be

BuildBot Test Results: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.2-sujatha

Thank you.

Comment by Sujatha Sivakumar (Inactive) [ 2020-10-21 ]

Hello Elkin,

Thank you for the review comments. I have addressed them as part of

https://github.com/MariaDB/server/commit/773c99323684883a32af79124c55367371ce8045

Please review them.

Comment by Andrei Elkin [ 2020-10-21 ]

Reviewed on GH, to approve.

Comment by Sujatha Sivakumar (Inactive) [ 2020-10-22 ]

The fix is implemented in 10.1.48.

The patch has been tested on higher versions.

10.2 had minor merge conflicts. After resolving them, the result file needs re-recording.
10.2 patch: https://github.com/MariaDB/server/commit/d1be56e231a52c6b4e215f2ed4fec455b9b14513

The patch is the same in all higher versions, with no further merge conflicts.
10.3 patch: https://github.com/MariaDB/server/commit/f709a3069580fa48605f6502fea4956758593f74
10.4 patch: https://github.com/MariaDB/server/commit/1d0ace556b385a6dc8f7a30523366453150a33db
10.5 patch: https://github.com/MariaDB/server/commit/80bd0cec5168a7720d91b8b22013e859b76b3683
