[MDEV-31075] KILL QUERY maintains nodes data consistency but breaks GTID sequence Created: 2023-04-18  Updated: 2023-06-07  Resolved: 2023-06-06

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.5.17
Fix Version/s: 10.5.22, 10.6.15, 10.9.8, 10.10.6, 10.11.5

Type: Bug
Priority: Critical
Reporter: Claudio Nanni
Assignee: Jan Lindström
Resolution: Fixed
Votes: 0
Labels: None

Attachments: PNG File 1_Starting_Condition.png     PNG File 2_Starting_Coords_And_LastCommitted.png     PNG File 3_AfterFirstPlainSlapTest.png     PNG File 4_NextTestWillKillQueries.png     PNG File 5_End_SameGaleraLastCommitted_SkippedGTIDonSlaveNodes.png     Text File Alternate_Test-galera_gtid_drift_kill_query.txt    
Issue Links:
PartOf
is part of MDEV-29293 MariaDB stuck on starting commit stat... Closed

 Description   

The GTID sequence drifts between the writer node and the applier nodes when KILL QUERY statements are issued on a Galera node that is executing user transactions.
Data consistency is maintained and wsrep_last_committed is in sync.
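
The drift described above can be quantified by comparing the GTID positions reported by each node. The following is a minimal sketch, not part of the original report: it assumes the position strings were collected separately on each node (for example with `SELECT @@gtid_binlog_pos`) and that they use the MariaDB `domain_id-server_id-seq_no` format. The function names `parse_gtid_pos` and `gtid_drift` and the sample values are hypothetical.

```python
from typing import Dict


def parse_gtid_pos(pos: str) -> Dict[int, int]:
    """Map domain_id -> seq_no for a GTID position such as '0-1-100,1-2-7'."""
    result: Dict[int, int] = {}
    for gtid in pos.split(","):
        gtid = gtid.strip()
        if not gtid:
            continue
        # MariaDB GTID format: domain_id-server_id-seq_no
        domain, _server, seq = gtid.split("-")
        result[int(domain)] = int(seq)
    return result


def gtid_drift(writer_pos: str, applier_pos: str) -> Dict[int, int]:
    """Per-domain seq_no difference (applier minus writer); in-sync domains are omitted."""
    writer = parse_gtid_pos(writer_pos)
    applier = parse_gtid_pos(applier_pos)
    drift: Dict[int, int] = {}
    for domain in sorted(set(writer) | set(applier)):
        diff = applier.get(domain, 0) - writer.get(domain, 0)
        if diff != 0:
            drift[domain] = diff
    return drift


if __name__ == "__main__":
    # Hypothetical values, not taken from the attached screenshots.
    print(gtid_drift("0-1-100", "0-1-103"))
```

A positive value means the applier is ahead of the writer in that domain, a negative value means the writer is ahead, and an empty result means the sequences match. Comparing this with `wsrep_last_committed` on each node would show the case reported here: Galera state in sync while GTID sequence numbers differ.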

NOTE: Issuing a single KILL QUERY is not always sufficient. I still need to do more fine-grained research to determine the specific circumstances; in my current tests I reproduced it by killing random mysqlslap queries.

GTID sequence de-alignment breaks, among other things, MaxScale auto-failover.
To make things more complicated, in the test shown in the attached pictures the Writer is behind, but in other tests the Writer went ahead (see the attached txt for the 2nd test).



 Comments   
Comment by Jan Lindström [ 2023-05-22 ]

If there was an issue with KILL and GTID, it was fixed in MDEV-29293. However, there is a repeatable test case in which INSERT IGNORE with a duplicate key causes GTID sequence breakage, and there is also a problem with an INSERT that has at least one successful write and one duplicate key.
