[MDEV-32938] Inconsistency in Galera caused by ALTER being aborted before entering TOI mode Created: 2023-12-04  Updated: 2023-12-06  Resolved: 2023-12-05

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4.32
Fix Version/s: 10.4.33, 10.5.24, 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3

Type: Bug Priority: Major
Reporter: Denis Protivensky Assignee: Julius Goryavsky
Resolution: Fixed Votes: 0
Labels: None


 Description   

An ALTER may be BF-aborted by another running TOI operation before it enters TOI mode, because of an MDL conflict:

2023-12-04 16:11:31 23 [Note] WSREP: MDL conflict
schema:  test
request: (19    seqno 1627  wsrep (toi, exec, aborted) cmd 3 3  ALTER TABLE tt_1_fk ADD COLUMN mtN93 MEDIUMTEXT AFTER ipkey, LOCK=EXCLUSIVE, ALGORITHM=DEFAULT)
granted: (17    seqno -1    wsrep (local, exec, aborted) cmd 3 7    DELETE FROM tt_1 WHERE ipkey >= 13863 AND ipkey <= 77403)
2023-12-04 16:11:31 23 [Note] WSREP: MDL ticket: type: MDL_SHARED_READ space: TABLE db: test name: tt_1_fk (Waiting for table metadata lock)
2023-12-04 16:11:31 23 [Note] WSREP: MDL conflict-> BF abort
schema:  test
request: (19    seqno 1627  wsrep (toi, exec, aborted) cmd 3 3  ALTER TABLE tt_1_fk ADD COLUMN mtN93 MEDIUMTEXT AFTER ipkey, LOCK=EXCLUSIVE, ALGORITHM=DEFAULT)
granted: (17    seqno -1    wsrep (local, exec, aborted) cmd 3 7    DELETE FROM tt_1 WHERE ipkey >= 13863 AND ipkey <= 77403)
2023-12-04 16:11:31 23 [Note] WSREP: MDL ticket: type: MDL_SHARED_READ space: TABLE db: test name: tt_1_fk (Waiting for table metadata lock)
2023-12-04 16:11:31 23 [Note] WSREP: wsrep_abort_thd not effective: bf 139693854844672 victim 139693855459072 wsrep 1 wsrep_on 1 RSU 0 TOI 1 aborting 1
2023-12-04 16:11:31 23 [Note] WSREP: MDL conflict db=test table=tt_1_fk ticket=3 solved by abort
2023-12-04 16:11:31 23 [Note] WSREP: MDL conflict
schema:  test
request: (19    seqno 1627  wsrep (toi, exec, aborted) cmd 3 3  ALTER TABLE tt_1_fk ADD COLUMN mtN93 MEDIUMTEXT AFTER ipkey, LOCK=EXCLUSIVE, ALGORITHM=DEFAULT)
granted: (16    seqno -1    wsrep (local, exec, aborted) cmd 3 3    ALTER TABLE tt_1_fk DROP COLUMN cN102, LOCK=SHARED, ALGORITHM=DEFAULT)
2023-12-04 16:11:31 23 [Note] WSREP: MDL ticket: type: MDL_SHARED_HIGH_PRIO space: TABLE db: test name: tt_1_fk (Waiting for table metadata lock)
2023-12-04 16:11:31 23 [Note] WSREP: MDL conflict-> BF abort
schema:  test
request: (19    seqno 1627  wsrep (toi, exec, aborted) cmd 3 3  ALTER TABLE tt_1_fk ADD COLUMN mtN93 MEDIUMTEXT AFTER ipkey, LOCK=EXCLUSIVE, ALGORITHM=DEFAULT)
granted: (16    seqno -1    wsrep (local, exec, aborted) cmd 3 3    ALTER TABLE tt_1_fk DROP COLUMN cN102, LOCK=SHARED, ALGORITHM=DEFAULT)
2023-12-04 16:11:31 23 [Note] WSREP: MDL ticket: type: MDL_SHARED_HIGH_PRIO space: TABLE db: test name: tt_1_fk (Waiting for table metadata lock)
2023-12-04 16:11:31 23 [Note] WSREP: MDL conflict db=test table=tt_1_fk ticket=2 solved by abort
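
To check whether a node's error log shows the same pattern, a small script can look for "MDL conflict-> BF abort" reports whose granted side is a local ALTER, meaning one that has not yet entered TOI mode. The following is only a minimal sketch and not part of the fix: the function name, the sample data, and the regular expression are hypothetical, the group names mode, state and trx_state are assumed labels for the three values printed after "wsrep", and the line layout is inferred solely from the excerpt above.

```python
import re

# Matches the "request:" / "granted:" lines of a WSREP MDL conflict report.
# The group names are assumed labels; the layout is taken from the log above.
ENTRY_RE = re.compile(
    r"^(?P<role>request|granted):\s*\((?P<thd>\d+)\s+seqno\s+(?P<seqno>-?\d+)\s+"
    r"wsrep\s+\((?P<mode>\w+),\s*(?P<state>\w+),\s*(?P<trx_state>\w+)\)\s+"
    r"cmd\s+\d+\s+\d+\s+(?P<sql>.*)\)\s*$"
)


def local_alters_bf_aborted(log_text):
    """Return the granted-side local ALTERs found in BF-abort reports."""
    victims = []
    in_bf_abort = False
    for raw in log_text.splitlines():
        line = raw.strip()
        entry = ENTRY_RE.match(line)
        if entry is None:
            # "schema:" lines belong to the current report; any other line
            # either starts a BF-abort report or ends the current one.
            if not line.startswith("schema:"):
                in_bf_abort = "MDL conflict-> BF abort" in line
            continue
        if (in_bf_abort
                and entry["role"] == "granted"
                and entry["mode"] == "local"
                and entry["sql"].upper().startswith("ALTER")):
            victims.append({
                "thd": int(entry["thd"]),
                "seqno": int(entry["seqno"]),
                "sql": entry["sql"],
            })
    return victims


# Sample input copied from the log excerpt in this report.
SAMPLE_LOG = """\
2023-12-04 16:11:31 23 [Note] WSREP: MDL conflict-> BF abort
schema:  test
request: (19    seqno 1627  wsrep (toi, exec, aborted) cmd 3 3  ALTER TABLE tt_1_fk ADD COLUMN mtN93 MEDIUMTEXT AFTER ipkey, LOCK=EXCLUSIVE, ALGORITHM=DEFAULT)
granted: (17    seqno -1    wsrep (local, exec, aborted) cmd 3 7    DELETE FROM tt_1 WHERE ipkey >= 13863 AND ipkey <= 77403)
2023-12-04 16:11:31 23 [Note] WSREP: MDL conflict-> BF abort
schema:  test
request: (19    seqno 1627  wsrep (toi, exec, aborted) cmd 3 3  ALTER TABLE tt_1_fk ADD COLUMN mtN93 MEDIUMTEXT AFTER ipkey, LOCK=EXCLUSIVE, ALGORITHM=DEFAULT)
granted: (16    seqno -1    wsrep (local, exec, aborted) cmd 3 3    ALTER TABLE tt_1_fk DROP COLUMN cN102, LOCK=SHARED, ALGORITHM=DEFAULT)
"""

if __name__ == "__main__":
    for victim in local_alters_bf_aborted(SAMPLE_LOG):
        print(victim["thd"], victim["seqno"], victim["sql"])
```

On the sample, only the report whose granted side is the local ALTER on thread 16 is kept; the report whose granted side is the DELETE is skipped.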



 Comments   
Comment by Julius Goryavsky [ 2023-12-05 ]

Thanks, the tests pass without errors and the fix did not cause regressions.

Comment by Julius Goryavsky [ 2023-12-05 ]

The fix has been added to the main branch: https://github.com/MariaDB/server/pull/2894/commits/4de415754f4b603ca012abc3a901646e7f30a94f
