[MDEV-30929] Sporadic failure in spider.spider_fixes_part Created: 2023-03-26  Updated: 2023-11-28

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - Spider, Tests
Affects Version/s: 10.10, 10.11
Fix Version/s: 10.11

Type: Bug Priority: Critical
Reporter: Otto Kekäläinen Assignee: Yuchen Pei
Resolution: Unresolved Votes: 0
Labels: None


 Description   

While preparing MariaDB 10.11.2 for Debian I noticed in https://launchpadlibrarian.net/657664639/buildlog_ubuntu-lunar-ppc64el.mariadb_1%3A10.11.2-2~ubuntu23.04.1~1679807146.b55a2e269b8.dev.otto_BUILDING.txt.gz that test spider.spider_fixes_part failed with "mysqltest: At line 492: sync_with_master failed: 'select master_pos_wait('master-bin.000003', 692, 300, '')' returned NULL indicating slave SQL thread failure":

spider.spider_fixes_part                 w2 [ fail ]
        Test ended at 2023-03-26 05:57:41
 
CURRENT_TEST: spider.spider_fixes_part
analyze: sync_with_master
mysqltest: At line 492: sync_with_master failed: 'select master_pos_wait('master-bin.000003', 692, 300, '')' returned NULL indicating slave SQL thread failure
 
The result from queries just before the failure was:
< snip >
KEY idx1(b),
KEY idx2(c)
) MASTER_1_ENGINE MASTER_1_CHARSET MASTER_1_COMMENT3_P_2_1
INSERT INTO ta_l_int (a, b, c) VALUES (1, 2, 3);
INSERT INTO ta_l_int (a, b, c) SELECT a + 1, b + 1, c + 1 FROM ta_l_int;
INSERT INTO ta_l_int (a, b, c) SELECT a + 2, b + 2, c + 2 FROM ta_l_int;
INSERT INTO ta_l_int (a, b, c) SELECT a + 4, b + 4, c + 4 FROM ta_l_int;
INSERT INTO ta_l_int (a, b, c) SELECT a + 8, b + 8, c + 8 FROM ta_l_int;
connection master_1;
SELECT a, b, c FROM ta_l_int force index(primary, idx1, idx2)
WHERE a = 5 OR b = 5 OR c = 5 ORDER BY a;
a	b	c
3	4	5
4	5	6
5	6	7
 
2.26
auto_increment with partition
connection master_1;
connection slave1_1;
 
More results from queries before failure can be found in /<<PKGBUILDDIR>>/builddir/mysql-test/var/2/log/spider_fixes_part.log
 
 
 == /<<PKGBUILDDIR>>/builddir/mysql-test/var/2/tmp/analyze-sync_with_master-mysqld.1.1.err ==
 
############################## default ##############################
 
**** SHOW WARNINGS on default ****
SHOW WARNINGS;
Level	Code	Message
 
**** SELECT replication-related variables on default ****
SELECT NOW(), @@SERVER_ID;
NOW()	@@SERVER_ID
2023-03-26 05:57:41	1
 
**** SHOW SLAVE STATUS on default ****
SHOW SLAVE STATUS;
 
**** SHOW MASTER STATUS on default ****
SHOW MASTER STATUS;
File	master-bin.000003
Position	692
Binlog_Do_DB	
Binlog_Ignore_DB	
 
**** SHOW SLAVE HOSTS on default ****
SHOW SLAVE HOSTS;
Server_id	8
Host	localhost
Port	16007
Master_id	1
 
**** SHOW PROCESSLIST on default ****
SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info	Progress
92	system user		NULL	Daemon	NULL	Spider table background cardinality action handler	NULL	0.000
93	system user		NULL	Daemon	NULL	Spider table background cardinality action handler	NULL	0.000
94	system user		NULL	Daemon	NULL	Spider table background cardinality action handler	NULL	0.000
95	system user		NULL	Daemon	NULL	Spider table background cardinality action handler	NULL	0.000
96	system user		NULL	Daemon	NULL	Spider table background cardinality action handler	NULL	0.000
97	system user		NULL	Daemon	NULL	Spider table background statistics action handler	NULL	0.000
98	system user		NULL	Daemon	NULL	Spider table background cardinality action handler	NULL	0.000
99	system user		NULL	Daemon	NULL	Spider table background statistics action handler	NULL	0.000
100	system user		NULL	Daemon	NULL	Spider table background cardinality action handler	NULL	0.000
101	system user		NULL	Daemon	NULL	Spider table background cardinality action handler	NULL	0.000
102	system user		NULL	Daemon	NULL	Spider table background cardinality action handler	NULL	0.000
103	system user		NULL	Daemon	NULL	Spider table background statistics action handler	NULL	0.000
104	system user		NULL	Daemon	NULL	Spider table background statistics action handler	NULL	0.000
105	system user		NULL	Daemon	NULL	Spider table background statistics action handler	NULL	0.000
106	system user		NULL	Daemon	NULL	Spider table background statistics action handler	NULL	0.000
107	system user		NULL	Daemon	NULL	Spider table background statistics action handler	NULL	0.000
108	system user		NULL	Sleep	1	Reset for next command	NULL	0.000
109	system user		NULL	Daemon	NULL	Spider table background statistics action handler	NULL	0.000
110	system user		NULL	Daemon	NULL	Spider table background statistics action handler	NULL	0.000
111	system user		NULL	Daemon	NULL	Spider table background cardinality action handler	NULL	0.000
112	root	localhost:57406	NULL	Binlog Dump	0	Master has sent all binlog to slave; waiting for more updates	NULL	0.000
113	root	localhost	NULL	Query	0	starting	SHOW PROCESSLIST	0.000
 
**** SHOW BINARY LOGS on default ****
SHOW BINARY LOGS;
Log_name	File_size
master-bin.000001	8704
master-bin.000002	419
master-bin.000003	692
 
**** SHOW BINLOG EVENTS on default ****
binlog_name = 'master-bin.000003'
SHOW BINLOG EVENTS IN 'master-bin.000003';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
master-bin.000003	4	Format_desc	1	256	Server ver: 10.11.2-MariaDB-2~ubuntu23.04.1~1679807146.b55a2e, Binlog ver: 4
master-bin.000003	256	Gtid_list	1	299	[0-1-39]
master-bin.000003	299	Binlog_checkpoint	1	343	master-bin.000002
master-bin.000003	343	Binlog_checkpoint	1	387	master-bin.000003
master-bin.000003	387	Gtid	1	429	BEGIN GTID 0-1-40
master-bin.000003	429	Query	1	620	use `mtr`; INSERT INTO test_suppressions (pattern) VALUES ( NAME_CONST('pattern',_latin1'unknown variable' COLLATE 'latin1_swedish_ci'))
master-bin.000003	620	Query	1	692	COMMIT
 
**** SHOW RELAYLOG EVENTS on default ****
relaylog_name = 'No such row'
SHOW RELAYLOG EVENTS IN 'No such row';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
connection default;



 Comments   
Comment by Sergei Golubchik [ 2023-05-19 ]

According to the buildbot xref, it fails on 10.10 and above (but not on 10.9) on various buildbot hosts, amd64 and aarch64 too.
It's not ppc64el specific.

Comment by Yuchen Pei [ 2023-11-08 ]

xref link:

https://buildbot.mariadb.net/ci/reports/cross_reference#branch=&revision=&platform=&fail_name=spider.spider_fixes_part&fail_variant=&fail_info_full=sync_with_master+failed&typ=&info=&dt=&limit=100&fail_info_short=

Comment by Oleksandr Byelkin [ 2023-11-15 ]

I made it higher priority because it fails on buildbot rather frequent

Generated at Thu Feb 08 10:19:57 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.