[MDEV-31596] Client connection loss makes MariaDB commit a non semi-sync ACKed transaction Created: 2023-06-30  Updated: 2023-11-24

Status: Open
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.4.29, 11.2.0
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug Priority: Critical
Reporter: Luca Pistolesi Assignee: Andrei Elkin
Resolution: Unresolved Votes: 0
Labels: replication
Environment:

Tested on both Amazon Linux and Docker.


Attachments: Text File error.log     File source_my.cnf    
Issue Links:
Relates
relates to MDEV-28700 MariaDB will produce binlog even thou... Open

 Description   

While testing the semi-sync replication with just two servers (one Source and a Replica), I noticed this interesting and worrying fact: if a client connection is closed while the Source is waiting for an ACK from a Replica, the transaction is committed regardless the ACK being received.

The experiment looks like the following:

  1. Start the Source with semi_sync enabled
    1. rpl_semi_sync_master_wait_no_slave = ON
    2. rpl_semi_sync_master_wait_point = AFTER_SYNC
    3. rpl_semi_sync_master_timeout = 3155695200000
    4. rpl_semi_sync_master_wait_no_slave = ON
  2. No replicas are connected to the source
  3. The client connects to the source (as root) and issue either an INSERT or an UPDATE
    1. I used both MySQL WorkBench and Mysql prompt
  4. The client hangs as the Source server is blocked on waiting for a replica to ACK the transaction
  5. I aborted the transaction on the client side
    1. Clicking the abort button on MySQL WorkBench or CTRL+C on the MySql prompt
  6. The table shows the new data immediately (as WorkBench does not close the connection) and even after reconnection.

The behavior I expected was that the transaction would not get committed and the DB would be stuck until a replica comes online to ACK it. Alternatively, a timeout for a long running query could be triggered kill the transaction. In both cases, I would not expect to see the transaction being committed.

Occasionally, when I repeat the experiment three or more time, the DB crashes and I can still see all the transactions being committed after a restart.

I searched for a similar problem and I found something close happened in MySQL. Here are the links to the issues I found:

  1. Innodb autocommits if query is aborted or killed midway executing UPDATE (Bug 45923)
  2. InnoDB does not rollback for delete and update queries if query was killed (Bug #45309)
  3. Semi-synchronous replication executes un-acknowledged transactions (Bug #99370)
  4. MDEV-28700

I attached the my.cnf file I used and the container log (in this case is MariaDB-11-latest while I tested also on 10.4.10).



 Comments   
Comment by Luca Pistolesi [ 2023-11-14 ]

It has been a while since the creation of the ticket: are there updates?
I see that the ticket is not assigned anymore.

Generated at Thu Feb 08 10:25:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.