[MDEV-29369] rpl.rpl_semi_sync_shutdown_await_ack fails regularly with Result content mismatch
Created: 2022-08-24  Updated: 2024-01-25

Status: In Review
Project: MariaDB Server
Component/s: Replication, Tests
Affects Version/s: 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug
Priority: Major
Reporter: Angelique Sklavounos (Inactive)
Assignee: Andrei Elkin
Resolution: Unresolved
Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-11853 semisync thread can be killed after s... Closed
relates to MDEV-32551 "Read semi-sync reply magic number er... Closed

 Description   

This test fails at least every few days, with the following (or very similar) output:

10.5 55c648a73

rpl.rpl_semi_sync_shutdown_await_ack 'mix' w2 [ fail ]
        Test ended at 2022-08-23 06:27:25
 
CURRENT_TEST: rpl.rpl_semi_sync_shutdown_await_ack
--- /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/suite/rpl/r/rpl_semi_sync_shutdown_await_ack.result	2022-08-23 05:25:37.000000000 +0000
+++ /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/suite/rpl/r/rpl_semi_sync_shutdown_await_ack.reject	2022-08-23 06:27:14.537615682 +0000
@@ -213,7 +213,7 @@
 Rpl_semi_sync_master_no_tx	1
 connection server_1_con2;
 # Check logs to ensure shutdown was delayed
-FOUND 2 /Delaying shutdown to await semi-sync ACK/ in mysqld.1.err
+FOUND 1 /Delaying shutdown to await semi-sync ACK/ in mysqld.1.err
 # Validate slave data is in correct state
 connection server_2;
 select count(*)=0 from t1;
@@ -331,7 +331,7 @@
 Rpl_semi_sync_master_no_tx	0
 connection server_1_con2;
 # Check logs to ensure shutdown was delayed
-FOUND 3 /Delaying shutdown to await semi-sync ACK/ in mysqld.1.err
+FOUND 2 /Delaying shutdown to await semi-sync ACK/ in mysqld.1.err
 # Validate slave data is in correct state
 connection server_2;
 select count(*)=0 from t1;
@@ -455,7 +455,7 @@
 Rpl_semi_sync_master_no_tx	0
 connection server_1_con2;
 # Check logs to ensure shutdown was delayed
-FOUND 4 /Delaying shutdown to await semi-sync ACK/ in mysqld.1.err
+FOUND 3 /Delaying shutdown to await semi-sync ACK/ in mysqld.1.err
 # Validate slave data is in correct state
 connection server_2;
 select count(*)=0 from t1;
 
mysqltest: Result content mismatch

Sometimes the failure looks like this instead, with the yes_tx and no_tx counters swapped:

10.10 90c3b2835

rpl.rpl_semi_sync_shutdown_await_ack 'row' w3 [ fail ]
        Test ended at 2022-06-10 17:42:00
 
CURRENT_TEST: rpl.rpl_semi_sync_shutdown_await_ack
--- /buildbot/amd64-ubuntu-1804-valgrind/build/mysql-test/suite/rpl/r/rpl_semi_sync_shutdown_await_ack.result	2022-06-08 12:15:43.000000000 +0000
+++ /buildbot/amd64-ubuntu-1804-valgrind/build/mysql-test/suite/rpl/r/rpl_semi_sync_shutdown_await_ack.reject	2022-06-10 17:41:59.807669171 +0000
@@ -449,10 +449,10 @@
 #-- Ensure either ACK was received (yes_tx=1) or timeout (no_tx=1)
 show status like 'Rpl_semi_sync_master_yes_tx';
 Variable_name	Value
-Rpl_semi_sync_master_yes_tx	1
+Rpl_semi_sync_master_yes_tx	0
 show status like 'Rpl_semi_sync_master_no_tx';
 Variable_name	Value
-Rpl_semi_sync_master_no_tx	0
+Rpl_semi_sync_master_no_tx	1
 connection server_1_con2;
 # Check logs to ensure shutdown was delayed
 FOUND 4 /Delaying shutdown to await semi-sync ACK/ in mysqld.1.err
 
mysqltest: Result content mismatch

The first failure appears to be https://buildbot.mariadb.org/#/builders/192/builds/9384, at commit a83c7ab1e.

The test usually passes on retry. Failures occur mostly on amd64-ubuntu-1804-valgrind and amd64-ubuntu-1804-clang10-asan, sometimes on amd64-ubuntu-2004-msan, and once on kvm-asan, so they are probably caused by the slowness of these builders.
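For readers unfamiliar with the `FOUND N /.../ in mysqld.1.err` lines in the diffs above: they report how many times a pattern matched in the server error log, so a count one lower than expected suggests the "Delaying shutdown" message was logged one time fewer than the recorded result assumes. The sketch below only illustrates that kind of count in Python. It is not the test suite's actual implementation; the function names and the sample log text are hypothetical.

```python
import re

# Pattern copied from the test output quoted in this report.
PATTERN = r"Delaying shutdown to await semi-sync ACK"


def count_matches(log_text: str, pattern: str = PATTERN) -> int:
    """Return the number of non-overlapping matches of pattern in log_text."""
    return len(re.findall(pattern, log_text))


def found_line(log_text: str, file_name: str, pattern: str = PATTERN) -> str:
    """Format the count in the same style as the result file lines."""
    return f"FOUND {count_matches(log_text, pattern)} /{pattern}/ in {file_name}"


if __name__ == "__main__":
    # Made-up log excerpt for illustration only; not real server output.
    sample = "\n".join([
        "[Note] Delaying shutdown to await semi-sync ACK",
        "[Note] some unrelated line",
        "[Note] Delaying shutdown to await semi-sync ACK",
    ])
    print(found_line(sample, "mysqld.1.err"))
```

Because the count accumulates over the whole log file, one missing message early in the test shifts every later expected count by one, which matches the 2/1, 3/2, 4/3 pattern in the first diff.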



 Comments   
Comment by Brandon Nesterenko [ 2024-01-11 ]

Hi Elkin!

This is ready for review: PR-2998.

The changes are isolated to the test itself rather than the server code.

Generated at Thu Feb 08 10:08:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.