[MDEV-20148] innodb.alter_large_dml failed in buildbot with debug sync point wait timeout
Created: 2019-07-24
Updated: 2023-04-27

Status: Stalled
Project: MariaDB Server
Component/s: Tests
Affects Version/s: 10.3, 10.4, 10.5, 10.6
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug
Priority: Major
Reporter: Elena Stepanova
Assignee: Marko Mäkelä
Resolution: Unresolved
Votes: 0
Labels: need_rr


 Description   

http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest2-big/builds/3337

10.3 ef44ec4afaa70521789dbb7886a0a21d

innodb.alter_large_dml 'innodb'          w2 [ fail ]
        Test ended at 2019-07-20 01:00:41
 
CURRENT_TEST: innodb.alter_large_dml
--- /mnt/buildbot/build/mariadb-10.3.17/mysql-test/suite/innodb/r/alter_large_dml.result	2019-07-19 05:37:57.000000000 -0400
+++ /mnt/buildbot/build/mariadb-10.3.17/mysql-test/suite/innodb/r/alter_large_dml.reject	2019-07-20 01:00:40.703730940 -0400
@@ -28,8 +28,12 @@
 SET DEBUG_SYNC = 'now SIGNAL dml_done';
 connect con2, localhost,root,,test;
 SET DEBUG_SYNC = 'now WAIT_FOR ddl_start';
+Warnings:
+Warning	1639	debug sync point wait timed out
 CREATE TABLE t2(f1 INT NOT NULL)ENGINE=InnoDB;
 connection default;
+Warnings:
+Warning	1639	debug sync point wait timed out
 SHOW CREATE TABLE t1;
 Table	Create Table
 t1	CREATE TABLE `t1` (
 
mysqltest: Result length mismatch



 Comments   
Comment by Marko Mäkelä [ 2021-07-30 ]

I suspected that the test might be using DEBUG_SYNC incorrectly, but I could not find any misuse when I read the test carefully. The only peculiar thing is at the start of the test:

SET DEBUG_SYNC = 'inplace_after_index_build SIGNAL rebuilt WAIT_FOR dml_pause';
SET DEBUG_SYNC = 'alter_table_inplace_before_lock_upgrade SIGNAL dml_restart WAIT_FOR  dml_done';
SET DEBUG_SYNC = 'row_log_table_apply2_before SIGNAL ddl_start';
--send
ALTER TABLE t1 FORCE, ALGORITHM=INPLACE;

The SIGNAL actions will fire in that order. The subsequent statements in the test perform now WAIT_FOR and now SIGNAL, so that at any point in time at most one SIGNAL is in flight. (The signal queue holds only one signal.)
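
To illustrate why having at most one SIGNAL in flight matters, here is a toy Python model of a one-signal queue. It is only a sketch under the assumption that a newer signal overwrites an older one that nobody has consumed yet; it is not MariaDB's DEBUG_SYNC implementation. The class OneSlotSignal and its methods are hypothetical names, and the signal names are borrowed from the test purely for illustration. The emission order shown is not a diagnosis of this failure.

```python
import threading


class OneSlotSignal:
    """Toy model (hypothetical): remembers only the most recent signal name."""

    def __init__(self):
        self._cond = threading.Condition()
        self._current = None

    def signal(self, name):
        # A new signal replaces whatever was stored before (assumed behaviour).
        with self._cond:
            self._current = name
            self._cond.notify_all()

    def wait_for(self, name, timeout):
        # Returns True if the stored signal matches within the timeout,
        # False if the wait times out.
        with self._cond:
            return self._cond.wait_for(lambda: self._current == name, timeout)


slot = OneSlotSignal()

# Two signals are emitted back to back, before any waiter runs.
slot.signal("ddl_start")
slot.signal("dml_done")  # overwrites "ddl_start" in this model

lost = slot.wait_for("ddl_start", timeout=0.1)  # the first signal is gone
seen = slot.wait_for("dml_done", timeout=0.1)   # the latest signal is visible

print("ddl_start observed:", lost)
print("dml_done observed:", seen)
```

In this model a waiter for the overwritten signal can only time out, which is the kind of symptom that "debug sync point wait timed out" would indicate if two signals ever overlapped.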

I also tried and failed to reproduce the hang on a local 10.4 build. I have no further ideas about what could be wrong. Maybe someone could reproduce this under rr record?
