[MDEV-12908] binlog_encryption.binlog_xa_recover, binlog.binlog_xa_recover failed in bb with extra checkpoint Created: 2017-05-24  Updated: 2022-12-07  Resolved: 2022-12-07

Status: Closed
Project: MariaDB Server
Component/s: Replication, Tests
Affects Version/s: 10.1, 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.2.38, 10.3.29, 10.4.19, 10.5.10

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Angelique Sklavounos (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates MDEV-8517 binlog.binlog_xa_recover fails sporad... Closed

 Description   

http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest2/builds/8256/steps/test_2/logs/stdio

binlog_encryption.binlog_xa_recover 'innodb_plugin' w4 [ fail ]
        Test ended at 2017-05-22 18:27:56
 
CURRENT_TEST: binlog_encryption.binlog_xa_recover
--- /mnt/buildbot/build/mariadb-10.1.24/mysql-test/suite/binlog_encryption/binlog_xa_recover.result	2017-05-22 16:54:43.000000000 +0300
+++ /mnt/buildbot/build/mariadb-10.1.24/mysql-test/suite/binlog_encryption/binlog_xa_recover.reject	2017-05-22 18:27:56.000000000 +0300
@@ -146,6 +146,7 @@
 master-bin.000004	#	Write_rows_v1	#	#	table_id: # flags: STMT_END_F
 master-bin.000004	#	Xid	#	#	COMMIT /* XID */
 master-bin.000004	#	Binlog_checkpoint	#	#	master-bin.000002
+master-bin.000004	#	Binlog_checkpoint	#	#	master-bin.000003
 master-bin.000004	#	Binlog_checkpoint	#	#	master-bin.000004
 Now crash the server
 SET SESSION debug_dbug="+d,crash_commit_after_log";
 
mysqltest: Result length mismatch



 Comments   
Comment by Elena Stepanova [ 2017-07-06 ]

http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest2/builds/8717/steps/test_2/logs/stdio

binlog_encryption.binlog_xa_recover 'innodb' w2 [ fail ]
        Test ended at 2017-07-05 13:30:35
 
CURRENT_TEST: binlog_encryption.binlog_xa_recover
--- /mnt/buildbot/build/mariadb-10.2.7/mysql-test/suite/binlog_encryption/binlog_xa_recover.result	2017-07-05 11:55:35.000000000 +0300
+++ /mnt/buildbot/build/mariadb-10.2.7/mysql-test/suite/binlog_encryption/binlog_xa_recover.reject	2017-07-05 13:30:35.000000000 +0300
@@ -155,11 +155,15 @@
 SET GLOBAL debug_dbug="+d,binlog_background_checkpoint_processed";
 SET DEBUG_SYNC= "now SIGNAL con12_cont";
 connection con12;
+Warnings:
+Warning	1639	debug sync point wait timed out
 connection default;
 SET DEBUG_SYNC= "now WAIT_FOR binlog_background_checkpoint_processed";
 SET GLOBAL debug_dbug= @old_dbug;
 SET DEBUG_SYNC= "now SIGNAL con11_cont";
 connection con11;
+Warnings:
+Warning	1639	debug sync point wait timed out
 connection default;
 Checking that master-bin.000004 is the last binlog checkpoint
 include/show_binlog_events.inc
@@ -174,6 +178,7 @@
 master-bin.000004	#	Write_rows_v1	#	#	table_id: # flags: STMT_END_F
 master-bin.000004	#	Xid	#	#	COMMIT /* XID */
 master-bin.000004	#	Binlog_checkpoint	#	#	master-bin.000002
+master-bin.000004	#	Binlog_checkpoint	#	#	master-bin.000003
 master-bin.000004	#	Binlog_checkpoint	#	#	master-bin.000004
 Now crash the server
 SET SESSION debug_dbug="+d,crash_commit_after_log";
 
mysqltest: Result length mismatch

Comment by Sergey Vojtovich [ 2019-05-08 ]

The reason for debug sync point wait timeout is as following:
> SET GLOBAL debug_dbug="+d,binlog_background_checkpoint_processed";
The above does internally "SET debug_sync= now SIGNAL binlog_background_checkpoint_processed". It is immediately followed by
> SET DEBUG_SYNC= "now SIGNAL con12_cont";

It may happen so that concurrent thread fails to consume binlog_background_checkpoint_processed by the time it gets overwritten by con12_cont. Which eventually means it is not going to wake up.

I guess there must be something like "now WAIT_FOR binlog_background_checkpoint_processed" in between.

But in this particular it most probably happens vice versa and con12_cont gets overwritten by binlog_background_checkpoint_processed. Which most probably needs some other fix.

Comment by Alice Sherepa [ 2021-01-12 ]

binlog.binlog_xa_recover also fails (it calls the same inc file inside the test)
http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/5455/steps/mtr/logs/stdio

10.3 c742346e5041d85ca

binlog.binlog_xa_recover 'innodb,row'    w3 [ fail ]
        Test ended at 2020-12-16 18:38:11
 
CURRENT_TEST: binlog.binlog_xa_recover
--- /usr/share/mysql-test/suite/binlog/r/binlog_xa_recover.result	2020-12-16 16:35:53.000000000 +0000
+++ /dev/shm/var/3/log/binlog_xa_recover.reject	2020-12-16 18:38:11.387734410 +0000
@@ -150,11 +150,15 @@
 SET GLOBAL debug_dbug="+d,binlog_background_checkpoint_processed";
 SET DEBUG_SYNC= "now SIGNAL con12_cont";
 connection con12;
+Warnings:
+Warning	1639	debug sync point wait timed out
 connection default;
 SET DEBUG_SYNC= "now WAIT_FOR binlog_background_checkpoint_processed";
 SET GLOBAL debug_dbug= @old_dbug;
 SET DEBUG_SYNC= "now SIGNAL con11_cont";
 connection con11;
+Warnings:
+Warning	1639	debug sync point wait timed out
 connection default;
 Checking that master-bin.000004 is the last binlog checkpoint
 include/show_binlog_events.inc
@@ -168,6 +172,7 @@
 master-bin.000004	#	Write_rows_v1	#	#	table_id: # flags: STMT_END_F
 master-bin.000004	#	Xid	#	#	COMMIT /* XID */
 master-bin.000004	#	Binlog_checkpoint	#	#	master-bin.000002
+master-bin.000004	#	Binlog_checkpoint	#	#	master-bin.000003
 master-bin.000004	#	Binlog_checkpoint	#	#	master-bin.000004
 Now crash the server
 SET SESSION debug_dbug="+d,crash_commit_after_log";
 
mysqltest: Result length mismatch

Comment by Andrei Elkin [ 2021-01-13 ]

This must be a pure test issue. The extra line is not prohibited from showing. It may occur non-deterministically. The test objective at this point is
the fact of master-bin.000004 is the last binlog checkpoint - see the paste below, and it's there.

--let $show_statement= SHOW BINLOG EVENTS IN "master-bin.000004"
--let $field= Info
--let $condition= = "master-bin.000004"
--source include/wait_show_condition.inc
 
--echo Checking that master-bin.000004 is the last binlog checkpoint
--source include/show_binlog_events.inc
 
--echo Now crash the server

So the check of the mere presence of master-bin.000004 in the last binlog checkpoint event is too coarse and needs refinement.

Comment by Angelique Sklavounos (Inactive) [ 2022-12-07 ]

This test failure has not been seen for over a year. I believe it was fixed by: https://github.com/MariaDB/server/commit/3c021485c913828b83510967c1ff277011a9d59a

The last fails of these tests with this output occurred on branches without this commit:
https://buildbot.mariadb.org/#/builders/195/builds/1067
https://buildbot.mariadb.org/#/builders/168/builds/618

Generated at Thu Feb 08 08:01:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.