Details
- Bug
- Status: Closed
- Critical
- Resolution: Fixed
- 10.11, 11.4, 11.8, 12.0(EOL)
- None

Description
Test failure:
main.mdl_sync                            w25 [ fail ]
        Test ended at 2025-04-21 01:50:53

CURRENT_TEST: main.mdl_sync
--- /home/buildbot/amd64-fedora-40-valgrind/build/mysql-test/main/mdl_sync.result	2025-04-19 16:41:39.000000000 +0000
+++ /home/buildbot/amd64-fedora-40-valgrind/build/mysql-test/main/mdl_sync.reject	2025-04-21 01:50:52.064219508 +0000
@@ -2655,6 +2655,13 @@
 SET debug_sync='now WAIT_FOR parked_flush';
 SET debug_sync='now SIGNAL go_truncate';
 # Ensure that truncate waits for a exclusive lock
+Timeout in wait_condition.inc for SELECT COUNT(*)=1 FROM information_schema.processlist
+WHERE state='Waiting for table metadata lock' AND info='TRUNCATE TABLE t1'
+Id	User	Host	db	Command	Time	State	Info	Progress
+4	root	localhost	test	Query	0	starting	show full processlist	0.000
+34	root	localhost	test	Sleep	33		NULL	0.000
+35	root	localhost	test	Sleep	33		NULL	0.000
+36	root	localhost	test	Query	450	Waiting for table metadata lock	FLUSH TABLES t1	0.000
 SET debug_sync= 'now SIGNAL go_show';
 connection con1;
 # Reaping...
@@ -2663,10 +2670,14 @@
 # Reaping...
 Field	Type	Null	Key	Default	Extra
 a	int(11)	YES		NULL	
+Warnings:
+Warning	1639	debug sync point wait timed out
 connection default;
 SET debug_sync= 'now SIGNAL go_flush';
 connection con3;
 # Reaping...
+Warnings:
+Warning	1639	debug sync point wait timed out
 disconnect con1;
 disconnect con2;
 disconnect con3;

Result length mismatch

 - saving '/home/buildbot/amd64-fedora-40-valgrind/build/mysql-test/var/25/log/main.mdl_sync/' to '/home/buildbot/amd64-fedora-40-valgrind/build/mysql-test/var/log/main.mdl_sync/'

The problem here appears to be:
1. TRUNCATE (after "now SIGNAL go_truncate") starts waiting for an MDL_EXCLUSIVE lock on t1.
2. The InnoDB purge thread chimes in, attempts to take an MDL_SHARED lock on t1, fails, and retries in a loop.
Given valgrind scheduling specifics, the InnoDB purge thread occupies the whole CPU and never yields it to user connections, causing the sync point timeout.
A fix for this issue was proposed a while ago: https://github.com/MariaDB/server/commit/0c6c580137146492e234570df30d302cafd94131
The test for Bug#42643 fails consistently without the fix (mtr --repeat=100 --valgrind --parallel=20). No failures were observed with the fix.
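For anyone wanting to repeat the check, the mtr options quoted above could be wrapped roughly as follows. This is only a sketch: the test name main.mdl_sync is an assumption taken from the failing log (the report does not say which test name was passed to mtr), and the script assumes it is run from the mysql-test directory of a build tree where ./mtr exists.

```shell
#!/bin/sh
# Options exactly as quoted in the report; the test name is an assumption
# based on the failing test shown in the log.
MTR_ARGS="--repeat=100 --valgrind --parallel=20 main.mdl_sync"

if [ -x ./mtr ]; then
  # Left unquoted on purpose so the options are split into separate words.
  ./mtr $MTR_ARGS
else
  echo "mtr not found; run this from the build's mysql-test directory" >&2
fi
```

Running it once on a tree without the fix and once with the fix applied should show whether the timeout still occurs under valgrind.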
Issue Links
- is part of MDEV-36647 No red leaves in the forest (Open)