[MDEV-30676] rpl.parallel_backup* tests sometimes fail with Result length mismatch Created: 2023-02-17  Updated: 2024-01-22

Status: Open
Project: MariaDB Server
Component/s: Replication, Tests
Affects Version/s: 10.5, 10.6, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1
Fix Version/s: 10.5, 10.6, 10.11, 11.0, 11.1

Type: Bug Priority: Major
Reporter: Angelique Sklavounos (Inactive) Assignee: Andrei Elkin
Resolution: Unresolved Votes: 0
Labels: None


 Description   

https://buildbot.mariadb.org/#/builders/236/builds/8645

10.6 67a6ad0a4

rpl.parallel_backup_lsu_off 'innodb,mix' w14 [ fail ]
        Test ended at 2023-02-16 09:31:38
 
CURRENT_TEST: rpl.parallel_backup_lsu_off
--- /home/buildbot/amd64-ubuntu-2004-fulltest/build/mysql-test/suite/rpl/r/parallel_backup_lsu_off.result	2023-02-16 08:50:33.000000000 +0000
+++ /home/buildbot/amd64-ubuntu-2004-fulltest/build/mysql-test/suite/rpl/r/parallel_backup_lsu_off.reject	2023-02-16 09:31:37.703046027 +0000
@@ -187,6 +187,18 @@
 BACKUP STAGE START;
 BACKUP STAGE BLOCK_COMMIT;
 connection aux_slave;
+Timeout in wait_condition.inc for SELECT COUNT(*) = 0 FROM information_schema.processlist WHERE state = "Waiting for prior transaction to commit"
+Id	User	Host	db	Command	Time	State	Info	Progress
+5	root	localhost:56620	test	Sleep	36		NULL	0.000
+6	root	localhost:56626	test	Sleep	37		NULL	0.000
+9	root	localhost:56628	test	Sleep	32		NULL	0.000
+10	root	localhost:56636	test	Sleep	36		NULL	0.000
+11	root	localhost:56640	test	Execute	0	starting	show full processlist	0.000
+16	root	localhost:56656	test	Sleep	32		NULL	0.000
+33	system user		NULL	Slave_IO	32	Waiting for master to send event	NULL	0.000
+35	system user		NULL	Slave_worker	32	Waiting for backup lock	NULL	0.000
+36	system user		test	Slave_worker	32	Waiting for prior transaction to commit	XA COMMIT X'31',X'',1	0.000
+34	system user		NULL	Slave_SQL	32	Slave has read all relay log; waiting for more updates	NULL	0.000
 ROLLBACK;
 connection backup_slave;
 BACKUP STAGE END;
 
mysqltest: Result length mismatch

Tests affected
rpl.parallel_backup_lsu_off
rpl.parallel_backup
rpl.parallel_backup_slave_binlog_off

Seems to mostly happen on s390x.



 Comments   
Comment by Angelique Sklavounos (Inactive) [ 2023-02-17 ]

Output can vary:
https://buildbot.mariadb.org/#/builders/192/builds/16121

rpl.parallel_backup 'innodb,mix'         w29 [ fail ]
        Test ended at 2023-01-31 14:55:24
 
CURRENT_TEST: rpl.parallel_backup
--- /buildbot/amd64-ubuntu-1804-valgrind/build/mysql-test/suite/rpl/r/parallel_backup.result	2023-01-31 14:43:00.000000000 +0000
+++ /buildbot/amd64-ubuntu-1804-valgrind/build/mysql-test/suite/rpl/r/parallel_backup.reject	2023-01-31 14:55:20.151212882 +0000
@@ -176,6 +176,15 @@
 SET @@global.slave_transaction_retries=0;
 include/start_slave.inc
 connection aux_slave;
+Timeout in wait_condition.inc for SELECT COUNT(*) = 1 FROM information_schema.processlist WHERE state = "Waiting for prior transaction to commit"
+Id	User	Host	db	Command	Time	State	Info	Progress
+5	root	localhost:37062	test	Sleep	62		NULL	0.000
+6	root	localhost:37070	test	Sleep	63		NULL	0.000
+9	root	localhost:37072	test	Sleep	41		NULL	0.000
+10	root	localhost:37080	test	Sleep	61		NULL	0.000
+11	root	localhost:43078	test	Query	0	starting	show full processlist	0.000
+16	root	localhost:43086	test	Sleep	46		NULL	0.000
+33	system user		NULL	Slave_IO	44	Waiting for master to send event	NULL	0.000
 # Xid '1' must be in the output:
 XA RECOVER;
 formatID	gtrid_length	bqual_length	data
@@ -185,6 +194,15 @@
 BACKUP STAGE BLOCK_COMMIT;
 connection aux_slave;
 ROLLBACK;
+Timeout in wait_condition.inc for SELECT COUNT(*) = 1 FROM information_schema.processlist WHERE state = "Waiting for backup lock"
+Id	User	Host	db	Command	Time	State	Info	Progress
+5	root	localhost:37062	test	Sleep	104		NULL	0.000
+6	root	localhost:37070	test	Sleep	105		NULL	0.000
+9	root	localhost:37072	test	Sleep	83		NULL	0.000
+10	root	localhost:37080	test	Sleep	103		NULL	0.000
+11	root	localhost:43078	test	Query	0	starting	show full processlist	0.000
+16	root	localhost:43086	test	Sleep	41		NULL	0.000
+33	system user		NULL	Slave_IO	85	Waiting for master to send event	NULL	0.000
 connection backup_slave;
 BACKUP STAGE END;
 connection slave;
 
mysqltest: Result length mismatch

Comment by Otto Kekäläinen [ 2023-10-09 ]

Seen in https://launchpadlibrarian.net/691101423/buildlog_ubuntu-mantic-riscv64.mariadb_1%3A10.11.5-3~bpo23.10.1~1696830433.49186426d72+feature.re.enable.all.tests_BUILDING.txt.gz for MariaDB 10.11.5 on riscv64

rpl.parallel_backup_slave_binlog_off 'innodb,mix' w5 [ fail ]
        Test ended at 2023-10-09 11:21:39
 
CURRENT_TEST: rpl.parallel_backup_slave_binlog_off
--- /<<PKGBUILDDIR>>/mysql-test/suite/rpl/r/parallel_backup_slave_binlog_off.result	2023-08-11 06:35:40.000000000 +0000
+++ /<<PKGBUILDDIR>>/mysql-test/suite/rpl/r/parallel_backup_slave_binlog_off.reject	2023-10-09 11:21:35.273185963 +0000
@@ -145,6 +145,15 @@
 BACKUP STAGE BLOCK_COMMIT;
 connection aux_slave;
 ROLLBACK;
+Timeout in wait_condition.inc for SELECT COUNT(*) = 1 FROM information_schema.processlist WHERE state = "Waiting for backup lock"
+Id	User	Host	db	Command	Time	State	Info	Progress
+4	root	localhost:59878	test	Sleep	59		NULL	0.000
+5	root	localhost:59880	test	Sleep	61		NULL	0.000
+8	root	localhost:59882	test	Sleep	46		NULL	0.000
+9	root	localhost:59884	test	Sleep	59		NULL	0.000
+10	root	localhost:59886	test	Query	0	starting	show full processlist	0.000
+15	root	localhost:59888	test	Sleep	46		NULL	0.000
+24	system user		NULL	Slave_IO	47	Waiting for master to send event	NULL	0.000
 connection backup_slave;
 BACKUP STAGE END;
 connection slave;
 
mysqltest: Result length mismatch
 
 - saving '/<<PKGBUILDDIR>>/builddir/mysql-test/var/5/log/rpl.parallel_backup_slave_binlog_off-innodb,mix/' to '/<<PKGBUILDDIR>>/builddir/mysql-test/var/log/rpl.parallel_backup_slave_binlog_off-innodb,mix/'
 
Retrying test rpl.parallel_backup_slave_binlog_off, attempt(2/3)...
 
worker[5] > Restart  - not started
worker[5] > Restart  - not started
***Warnings generated in error logs during shutdown after running tests: rpl.parallel_backup_slave_binlog_off
 
2023-10-09 11:20:36 26 [Warning] Slave: Lock wait timeout exceeded; try restarting transaction Error_code: 1205
2023-10-09 11:20:36 26 [Warning] Slave: Deadlock found when trying to get lock; try restarting transaction Error_code: 1213
2023-10-09 11:20:36 27 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2023-10-09 11:21:26 34 [Warning] Slave: Lock wait timeout exceeded; try restarting transaction Error_code: 1205
2023-10-09 11:21:26 34 [Warning] Slave: Deadlock found when trying to get lock; try restarting transaction Error_code: 1213
2023-10-09 11:21:27 35 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964

It didn't repeat, seems like a sporadic issue.

Comment by Brandon Nesterenko [ 2024-01-22 ]

Also can occur with Slave connection was killed (seen on Ubuntu 20.04)

rpl.parallel_backup 'innodb,mix'         w4 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2024-01-22 13:24:46
line
2024-01-22 13:24:45 35 [Warning] Slave: Connection was killed Error_code: 1927
2024-01-22 13:24:45 35 [Warning] Slave: Connection was killed Error_code: 1927
^ Found warnings in /home/buildbot/amd64-ubuntu-2004/build/mysql-test/var/4/log/mysqld.2.err

Generated at Thu Feb 08 10:18:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.