[MDEV-33133] GCF-1060 test causes a server crash Created: 2023-12-28  Updated: 2024-01-18

Status: In Testing
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.6.16
Fix Version/s: 10.6

Type: Bug
Priority: Major
Reporter: Denis Protivensky
Assignee: Julius Goryavsky
Resolution: Unresolved
Votes: 0
Labels: None

Attachments: mysqld.1.err, mysqld.2.err
Issue Links:
Relates
relates to MDEV-32160 GCF-1060 test failure due to wsrep MD... Stalled

Description

The server crashes with:

2023-12-28 17:29:32 89 [Note] WSREP: ::rollback() thread: 89, client_state exec client_mode high priority trans_state aborted killed 0
2023-12-28 17:29:32 76 [Note] WSREP: MDL conflict
schema:  test
request: (75    seqno 626   wsrep (toi, exec, aborted) cmd 3 8  TRUNCATE TABLE t1)
granted: (89    seqno -1    wsrep (high priority, exec, aborting) cmd 0 161     (null))
2023-12-28 17:29:32 76 [Note] WSREP: MDL ticket: type: MDL_SHARED_WRITE space: TABLE db: test name: t1 (Waiting for table metadata lock)
2023-12-28 17:29:32 76 [Note] WSREP: MDL BF-BF conflict
schema:  test
request: (75    seqno 626   wsrep (toi, exec, aborted) cmd 3 8  TRUNCATE TABLE t1)
granted: (89    seqno -1    wsrep (high priority, exec, aborted) cmd 0 161  (null))
2023-12-28 17:29:32 76 [Note] WSREP: MDL ticket: type: MDL_SHARED_WRITE space: TABLE db: test name: t1 (Waiting for table metadata lock)
2023-12-28 17:29:32 76 [ERROR] Aborting

The problem is as follows:
An applied SR (streaming replication) transaction gets BF-aborted, but there is a time window in which it has already been rolled back and its wsrep transaction state is set to aborted, while its MDL locks have not yet been released. MDL conflict handling misinterprets this state: the server aborts because it believes that two high-priority transactions (a TOI and a high-priority applier) are in conflict, which should never happen.
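The race described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the types `ClientMode`, `TrxState`, `LockParty`, `Decision` and the functions `naive_decision` and `state_aware_decision` are hypothetical names invented for this illustration. They are not MariaDB or wsrep-lib identifiers, and the second function is not the actual fix, just one way to express the distinction the conflict handler appears to be missing.

```cpp
// Hypothetical model of the states seen in the log above.
// None of these names come from the MariaDB or wsrep-lib sources.
enum class ClientMode { Local, HighPriority, Toi };
enum class TrxState { Executing, Aborting, Aborted };

struct LockParty {
    ClientMode mode;   // how the thread runs (local, applier, TOI)
    TrxState state;    // wsrep transaction state
    bool holds_mdl;    // true while its MDL tickets are not yet released
};

enum class Decision { AbortServer, WaitForRelease, BfAbortGranted };

// A party is "brute force" (BF) if it is an applier or a TOI operation.
inline bool is_bf(const LockParty& p) {
    return p.mode == ClientMode::HighPriority || p.mode == ClientMode::Toi;
}

// Naive check: any BF requester blocked by a BF lock holder is treated
// as an impossible situation, so the server is aborted.
inline Decision naive_decision(const LockParty& requester,
                               const LockParty& granted) {
    if (is_bf(requester) && is_bf(granted)) return Decision::AbortServer;
    if (is_bf(requester)) return Decision::BfAbortGranted;
    return Decision::WaitForRelease;
}

// State-aware check (assumption, not the real patch): a BF lock holder
// that is already aborting or aborted but still holds MDL locks is only
// in the middle of cleanup, so the requester can wait for the release.
inline Decision state_aware_decision(const LockParty& requester,
                                     const LockParty& granted) {
    if (is_bf(requester) && is_bf(granted)) {
        const bool rolling_back = granted.state == TrxState::Aborting ||
                                  granted.state == TrxState::Aborted;
        if (rolling_back && granted.holds_mdl) return Decision::WaitForRelease;
        return Decision::AbortServer;
    }
    if (is_bf(requester)) return Decision::BfAbortGranted;
    return Decision::WaitForRelease;
}
```

With a TOI requester (the `TRUNCATE TABLE t1` in the log) and a high-priority holder that is in the aborted state but still holds its MDL locks, the naive check yields `AbortServer`, matching the crash, whereas the state-aware check yields `WaitForRelease`.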

