[MDEV-28682] gcol.gcol_purge contaminates further execution of innodb.gap_locks Created: 2022-05-27  Updated: 2023-12-05  Resolved: 2023-11-30

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Tests
Affects Version/s: 10.5, 10.6, 10.7, 10.8
Fix Version/s: 10.5.24, 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Vladislav Lesin
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by MDEV-32859 gcol.gcol_purge causes innodb.gap_loc... Closed

 Description   

10.5 ea40c75c

$ perl ./mtr gcol.gcol_purge innodb.gap_locks --noreorder
 
gcol.gcol_purge 'innodb'                 [ pass ]  11825
innodb.gap_locks 'innodb'                [ fail ]
        Test ended at 2022-05-27 23:10:05
 
CURRENT_TEST: innodb.gap_locks
--- /mnt8t/bld/10.5-debug-nightly/mysql-test/suite/innodb/r/gap_locks.result	2021-09-14 19:07:00.000000000 +0300
+++ /mnt8t/bld/10.5-debug-nightly/mysql-test/suite/innodb/r/gap_locks.reject	2022-05-27 23:10:05.289693153 +0300
@@ -8,28 +8,28 @@
 DELETE FROM t1 WHERE b='2' AND c=2;
 SHOW ENGINE INNODB STATUS;
 Type	Name	Status
-InnoDB		2 lock struct(s), 1 row lock(s)
+InnoDB		0 lock struct(s), 0 row lock(s)
 ROLLBACK;
 SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
 BEGIN;
 DELETE FROM t1 WHERE b='2' AND c=2;
 SHOW ENGINE INNODB STATUS;
 Type	Name	Status
-InnoDB		2 lock struct(s), 1 row lock(s)
+InnoDB		0 lock struct(s), 0 row lock(s)
 ROLLBACK;
 SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
 BEGIN;
 DELETE FROM t1 WHERE b='2' AND c=2;
 SHOW ENGINE INNODB STATUS;
 Type	Name	Status
-InnoDB		1 lock struct(s), 0 row lock(s)
+InnoDB		0 lock struct(s), 0 row lock(s)
 ROLLBACK;
 SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
 BEGIN;
 DELETE FROM t1 WHERE b='2' AND c=2;
 SHOW ENGINE INNODB STATUS;
 Type	Name	Status
-InnoDB		1 lock struct(s), 0 row lock(s)
+InnoDB		0 lock struct(s), 0 row lock(s)
 ROLLBACK;
 SET GLOBAL INNODB_STATUS_OUTPUT_LOCKS = @save_locks;
 DROP TABLE t1;
 
mysqltest: Result content mismatch



 Comments   
Comment by Elena Stepanova [ 2022-05-27 ]

Apparently these tests happen to be executed by the same worker on amd64-ubuntu-2004-msan in the new buildbot, so it fails often there.

Comment by Thirunarayanan Balathandayuthapani [ 2022-06-03 ]

Purge thread does create the transaction when it tries to open the table virtual column computation.
The following stack trace shows the problem:

(rr) where
#0  trx_sys_t::register_trx (this=0x55b2c4b9c580 <trx_sys>, trx=0x7f4ca3c03690) at /home/thiru/mariarepo/server/10.5/storage/innobase/include/trx0sys.h:1121
#1  0x000055b2c303b146 in trx_create () at /home/thiru/mariarepo/server/10.5/storage/innobase/trx/trx0trx.cc:377
#2  0x000055b2c2db3637 in innobase_trx_allocate (thd=0x55b2c666fe98) at /home/thiru/mariarepo/server/10.5/storage/innobase/handler/ha_innodb.cc:2415
#3  0x000055b2c2db3716 in check_trx_exists (thd=0x55b2c666fe98) at /home/thiru/mariarepo/server/10.5/storage/innobase/handler/ha_innodb.cc:2440
#4  0x000055b2c2dd04e6 in ha_innobase::extra (this=0x7f4c40223e70, operation=HA_EXTRA_IS_ATTACHED_CHILDREN) at /home/thiru/mariarepo/server/10.5/storage/innobase/handler/ha_innodb.cc:15375
#5  0x000055b2c28cbb7e in tc_acquire_table (thd=0x55b2c666fe98, element=0x7f4c40041fd8) at /home/thiru/mariarepo/server/10.5/sql/table_cache.cc:401
#6  0x000055b2c28cd75d in tdc_acquire_share (thd=0x55b2c666fe98, tl=0x7f4c880028f0, flags=1, out_table=0x7f4ca0eef158) at /home/thiru/mariarepo/server/10.5/sql/table_cache.cc:862
#7  0x000055b2c2592960 in open_table (thd=0x55b2c666fe98, table_list=0x7f4c880028f0, ot_ctx=0x7f4ca0eef440) at /home/thiru/mariarepo/server/10.5/sql/sql_base.cc:1852
#8  0x000055b2c25ca138 in open_purge_table (thd=0x55b2c666fe98, db=0x7f4ca0eef4d0 "test", dblen=4, tb=0x7f4ca0eef5a0 "t1", tblen=2) at /home/thiru/mariarepo/server/10.5/sql/sql_class.cc:4860
#9  0x000055b2c2dd73c0 in innodb_find_table_for_vc (thd=0x55b2c666fe98, table=0x7f4c400412a8) at /home/thiru/mariarepo/server/10.5/storage/innobase/handler/ha_innodb.cc:20373
#10 0x000055b2c2dd7872 in innobase_allocate_row_for_vcol (thd=0x55b2c666fe98, index=0x7f4c401d07a8, heap=0x7f4ca0eef7b0, table=0x7f4ca0eef758, storage=0x7f4ca0eef790) at /home/thiru/mariarepo/server/10.5/storage/innobase/handler/ha_innodb.cc:20471
#11 0x000055b2c2f56594 in ib_vcol_row::record (this=0x7f4ca0eef790, thd=0x55b2c666fe98, index=0x7f4c401d07a8, table=0x7f4ca0eef758) at /home/thiru/mariarepo/server/10.5/storage/innobase/include/row0mysql.h:886
#12 0x000055b2c2fde5b9 in row_vers_build_clust_v_col (row=0x7f4c88002358, clust_index=0x7f4c40102e28, index=0x7f4c401d07a8, heap=0x7f4c880022b8) at /home/thiru/mariarepo/server/10.5/storage/innobase/row/row0vers.cc:463
#13 0x000055b2c2fdf8df in row_vers_old_has_index_entry (also_curr=true, rec=0x7f4ca390007d "\200", mtr=0x7f4ca0eef8f0, index=0x7f4c401d07a8, ientry=0x7f4c88001e98, roll_ptr=844424950514371, trx_id=29) at /home/thiru/mariarepo/server/10.5/storage/innobase/row/row0vers.cc:929
#14 0x000055b2c2fa0325 in row_purge_poss_sec (node=0x55b2c661eba8, index=0x7f4c401d07a8, entry=0x7f4c88001e98, sec_pcur=0x7f4ca0eefe20, sec_mtr=0x7f4ca0ef00b0, is_tree=false) at /home/thiru/mariarepo/server/10.5/storage/innobase/row/row0purge.cc:246
#15 0x000055b2c2fa0cc8 in row_purge_remove_sec_if_poss_leaf (node=0x55b2c661eba8, index=0x7f4c401d07a8, entry=0x7f4c88001e98) at /home/thiru/mariarepo/server/10.5/storage/innobase/row/row0purge.cc:467
#16 0x000055b2c2fa118e in row_purge_remove_sec_if_poss (node=0x55b2c661eba8, index=0x7f4c401d07a8, entry=0x7f4c88001e98) at /home/thiru/mariarepo/server/10.5/storage/innobase/row/row0purge.cc:568
#17 0x000055b2c2fa13a4 in row_purge_del_mark (node=0x55b2c661eba8) at /home/thiru/mariarepo/server/10.5/storage/innobase/row/row0purge.cc:638
#18 0x000055b2c2fa2db0 in row_purge_record_func (node=0x55b2c661eba8, undo_rec=0x55b2c662d388 "", thr=0x55b2c661e9c8, updated_extern=false) at /home/thiru/mariarepo/server/10.5/storage/innobase/row/row0purge.cc:1049
#19 0x000055b2c2fa307f in row_purge (node=0x55b2c661eba8, undo_rec=0x55b2c662d388 "", thr=0x55b2c661e9c8) at /home/thiru/mariarepo/server/10.5/storage/innobase/row/row0purge.cc:1110
#20 0x000055b2c2fa31fe in row_purge_step (thr=0x55b2c661e9c8) at /home/thiru/mariarepo/server/10.5/storage/innobase/row/row0purge.cc:1159
#21 0x000055b2c2f0c0e0 in que_thr_step (thr=0x55b2c661e9c8) at /home/thiru/mariarepo/server/10.5/storage/innobase/que/que0que.cc:946
#22 0x000055b2c2f0c377 in que_run_threads_low (thr=0x55b2c661e9c8) at /home/thiru/mariarepo/server/10.5/storage/innobase/que/que0que.cc:1008
#23 0x000055b2c2f0c5c5 in que_run_threads (thr=0x55b2c661e9c8) at /home/thiru/mariarepo/server/10.5/storage/innobase/que/que0que.cc:1048
#24 0x000055b2c30164b7 in trx_purge (n_tasks=1, truncate=false) at /home/thiru/mariarepo/server/10.5/storage/innobase/trx/trx0purge.cc:1306
#25 0x000055b2c2fe7b61 in srv_do_purge (n_total_purged=0x7f4ca0ef0cb8) at /home/thiru/mariarepo/server/10.5/storage/innobase/srv/srv0srv.cc:1946
#26 0x000055b2c2fe80f1 in purge_coordinator_callback_low () at /home/thiru/mariarepo/server/10.5/storage/innobase/srv/srv0srv.cc:2048
#27 0x000055b2c2fe81d2 in purge_coordinator_callback () at /home/thiru/mariarepo/server/10.5/storage/innobase/srv/srv0srv.cc:2077
#28 0x000055b2c31eeef6 in tpool::task_group::execute (this=0x55b2c4b92e60 <purge_coordinator_task_group>, t=0x55b2c4b92f00 <purge_coordinator_task>) at /home/thiru/mariarepo/server/10.5/tpool/task_group.cc:55
#29 0x000055b2c31ef244 in tpool::task::execute (this=0x55b2c4b92f00 <purge_coordinator_task>) at /home/thiru/mariarepo/server/10.5/tpool/task.cc:47
#30 0x000055b2c31e8652 in tpool::thread_pool_generic::worker_main (this=0x55b2c64537e0, thread_var=0x55b2c6463380) at /home/thiru/mariarepo/server/10.5/tpool/tpool_generic.cc:546
#31 0x000055b2c31ebe0f in std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__f=@0x7f4c94001448: (void (tpool::thread_pool_generic::*)(tpool::thread_pool_generic * const, tpool::worker_data *)) 0x55b2c31e85ba <tpool::thread_pool_generic::worker_main(tpool::worker_data*)>, __t=@0x7f4c94001440: 0x55b2c64537e0) at /usr/include/c++/8/bits/invoke.h:73
#32 0x000055b2c31eafb1 in std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__fn=@0x7f4c94001448: (void (tpool::thread_pool_generic::*)(tpool::thread_pool_generic * const, tpool::worker_data *)) 0x55b2c31e85ba <tpool::thread_pool_generic::worker_main(tpool::worker_data*)>) at /usr/include/c++/8/bits/invoke.h:95
#33 0x000055b2c31eec33 in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul> (this=0x7f4c94001438) at /usr/include/c++/8/thread:244
#34 0x000055b2c31eebd2 in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator() (this=0x7f4c94001438) at /usr/include/c++/8/thread:253
#35 0x000055b2c31eebb6 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run (this=0x7f4c94001430) at /usr/include/c++/8/thread:196
#36 0x00007f4cae7de4c0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#37 0x00007f4caed226db in start_thread (arg=0x7f4ca0ef1700) at pthread_create.c:463
#38 0x00007f4cade8461f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

we fail to free the transaction, it leads to failure of the reported issue.

Comment by Otto Kekäläinen [ 2023-04-02 ]

Current mainline 10.6 (https://github.com/MariaDB/server/tree/10.6) is failing on amd64-ubuntu-2204-msan with this. From commit dc1d621 Buildbot run https://buildbot.mariadb.org/#/builders/561/builds/138:

innodb.gap_locks 'innodb'                w2 [ fail ]
        Test ended at 2023-03-31 15:41:35
CURRENT_TEST: innodb.gap_locks
--- /home/buildbot/amd64-ubuntu-2204-msan/build/mysql-test/suite/innodb/r/gap_locks.result	2023-03-31 11:17:40.000000000 +0000
+++ /home/buildbot/amd64-ubuntu-2204-msan/build/mysql-test/suite/innodb/r/gap_locks.reject	2023-03-31 15:41:34.547696152 +0000
@@ -8,28 +8,28 @@
 DELETE FROM t1 WHERE b='2' AND c=2;
 SHOW ENGINE INNODB STATUS;
 Type	Name	Status
-InnoDB		2 lock struct(s), 1 row lock(s)
+InnoDB		0 lock struct(s), 0 row lock(s)
 ROLLBACK;
 SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
 BEGIN;
 DELETE FROM t1 WHERE b='2' AND c=2;
 SHOW ENGINE INNODB STATUS;
 Type	Name	Status
-InnoDB		2 lock struct(s), 1 row lock(s)
+InnoDB		0 lock struct(s), 0 row lock(s)
 ROLLBACK;
 SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
 BEGIN;
 DELETE FROM t1 WHERE b='2' AND c=2;
 SHOW ENGINE INNODB STATUS;
 Type	Name	Status
-InnoDB		1 lock struct(s), 0 row lock(s)
+InnoDB		0 lock struct(s), 0 row lock(s)
 ROLLBACK;
 SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
 BEGIN;
 DELETE FROM t1 WHERE b='2' AND c=2;
 SHOW ENGINE INNODB STATUS;
 Type	Name	Status
-InnoDB		1 lock struct(s), 0 row lock(s)
+InnoDB		0 lock struct(s), 0 row lock(s)
 ROLLBACK;
 SET GLOBAL INNODB_STATUS_OUTPUT_LOCKS = @save_locks;
 DROP TABLE t1;
mysqltest: Result content mismatch

I've also seen this same failure on amd64-ubuntu-2004-debug. Thus seems to be sporadic issue affecting Ubuntu 22.04+.

Comment by Marko Mäkelä [ 2023-05-15 ]

https://buildbot.mariadb.org/#/builders/369/builds/9559/steps/7/logs/stdio (a mandatory staging build) today:

10.6 a3e5b5c4db47cd29dc090d27db1d7f915cf774f6

gcol.gcol_purge 'innodb'                 w6 [ pass ]    215
… (skip 46 lines)
innodb.gap_locks 'innodb'                w6 [ fail ]

Comment by Yuchen Pei [ 2023-08-23 ]

Happening today in a build at 10.5 0d88365bd8f0f239ee8bb6389df2aa6a0e4e474c:

https://buildbot.mariadb.org/#/builders/534/builds/8833

Comment by Vladislav Lesin [ 2023-11-28 ]

See the comment for details.

Comment by Matthias Leich [ 2023-11-29 ]

origin/10.6-MDEV-28682-gap-locks d3bccbcac0a4dd412f88359bc9c347b51529883a 2023-11-28T16:56:26+03:00
origin/10.5-MDEV-28682-gap-locks 8bf6ac0ffebfebbee5a4b5de6df4775d8e57dafe 2023-11-28T16:54:05+03:00
performed well in RQG testing.

Generated at Thu Feb 08 10:02:40 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.