[MDEV-742] LP:803649 - Xa recovery failed on client disconnection Created: 2011-06-29  Updated: 2024-01-08  Resolved: 2020-03-16

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Fix Version/s: 10.5.2

Type: Task Priority: Critical
Reporter: Stephane VAROQUI (Inactive) Assignee: Andrei Elkin
Resolution: Fixed Votes: 2
Labels: Launchpad, upstream-fixed, verified

Attachments: XML File LPexportBug803649.xml    
Issue Links:
Blocks
blocks MDEV-21168 Active XA transactions stop slave fro... Closed
blocks MDEV-21854 xa commit 'xid' one phase for already... Closed
is blocked by MDEV-21659 XA rollback 'foreign_xid' is allowed ... Closed
is blocked by MDEV-21856 XID_t::formatID has to be contrained ... Closed
Duplicate
duplicates MDEV-21304 kill session deletes XA prepared tran... Closed
is duplicated by MDEV-7974 backport fix for mysql bug#12161 (XA ... Closed
Problem/Incident
causes MDEV-22733 XA PREPARE breaks MDL in pseudo_slave... Stalled
causes MDEV-26652 xa transactions binlogged in wrong order Open
causes MDEV-26682 slave lock timeout with xa and gap locks Closed
causes MDEV-29410 abort-and-replay prepared XA transact... Open
causes MDEV-31949 slow parallel replication of user xa In Review
Relates
relates to MDEV-15532 XA: Assertion `!log->same_pk' failed ... Closed
relates to MDEV-21469 Implement crash-safe logging of the u... Stalled
relates to MDEV-21766 Forbid XID with empty 'gtrid' Closed
relates to MDEV-21777 Implement crash-safe execution the us... Open
relates to MDEV-25055 XA ROLLBACK reports ER_XAER_NOTA for ... Open
relates to MDEV-29819 Shutdown unexpectedly executes XA ROL... Open
relates to MDEV-32813 read-only xa prepare may disappear up... Open
relates to MDEV-11675 Lag Free Alter On Slave Closed
relates to MDEV-21168 Active XA transactions stop slave fro... Closed
relates to MDEV-21602 CREATE TABLE…PRIMARY KEY…SELECT worka... Closed
relates to MDEV-21644 Assertion `thd->transaction.xid_state... Closed
relates to MDEV-22445 Crash on HANDLER READ NEXT after XA P... Closed
relates to MDEV-22656 Document what rollback-xa mariabackup... Closed
relates to MDEV-25117 rpl.rpl_parallel_xa_same_xid failed i... Open
relates to MDEV-25616 Binlog event for XA COMMIT is generat... Closed
relates to MDEV-29642 Server Crash During XA Prepare Can Br... Closed
relates to MDEV-32020 XA transaction replicates incorrectly... In Progress

 Description   

Even though the prepare phase has been reached, XA RECOVER returns nothing if the client gets disconnected before the commit.

In the following PHP script, an error is generated on purpose (see the invalid statement XA COMMITEE) to make the COMMIT fail and close the connection.

 
<?php
 
$dt = date_create();
$xid = date_timestamp_get($dt);
echo "xid : ".$xid;
 
$logger1 = new mysqli('localhost', 'root', 'xxxxx', 'test');
if (mysqli_connect_error()) {
    die('Connection error (' . mysqli_connect_errno() . ') '. mysqli_connect_error());
}
$potest = new mysqli('192.168.45.166', 'test', 'xxxxxxx', 'test');
if (mysqli_connect_error()) {
    $logger1->close();
    die('Connection error (' . mysqli_connect_errno() . ') '. mysqli_connect_error());
}
 
echo 'Success logger1 ' . $logger1->host_info . "\n";
echo 'Success potest ' . $potest->host_info . "\n";
 
 
##################################
$queryXA = 'XA START "'.$xid.'"';
 
$queryRes = $logger1->query($queryXA);
if (!$queryRes) {
    $message  = 'Invalid query on logger1: ' . $logger1->error . "\n";
    $message .= 'Full query: ' . $queryXA;
    $logger1->close();
    $potest->close();
    die($message);
}
$queryRes = $potest->query($queryXA);
if (!$queryRes) {
    $message  = 'Invalid query on potest: ' . $potest->error . "\n";
    $message .= 'Full query: ' . $queryXA;
    $logger1->close();
    $potest->close();
    die($message);
}
##################################
 
 
##################################
$queryXA = 'INSERT INTO mthoxa(xamtho_tc) values ("'.$xid.'")';
$queryRes = $logger1->query($queryXA);
if (!$queryRes) {
    $message  = 'Invalid query on logger1: ' . $logger1->error . "\n";
    $message .= 'Full query: ' . $queryXA;
    $logger1->close();
    $potest->close();
    die($message);
}
 
$queryRes = $potest->query($queryXA);
if (!$queryRes) {
    $message  = 'Invalid query on potest: ' . $potest->error . "\n";
    $message .= 'Full query: ' . $queryXA;
    $logger1->close();
    $potest->close();
    die($message);
}
##################################
 
 
##################################
$queryXA = 'XA END "'.$xid.'"';
 
$queryRes = $logger1->query($queryXA);
if (!$queryRes) {
    $message  = 'Invalid query on logger1: ' . $logger1->error . "\n";
    $message .= 'Full query: ' . $queryXA;
    $logger1->close();
    $potest->close();
    die($message);
}
$queryRes = $potest->query($queryXA);
if (!$queryRes) {
    $message  = 'Invalid query on potest: ' . $potest->error . "\n";
    $message .= 'Full query: ' . $queryXA;
    $logger1->close();
    $potest->close();
    die($message);
}
##################################
 
 
##################################
$queryXA = 'XA PREPARE "'.$xid.'"';
$queryRes = $logger1->query($queryXA);
if (!$queryRes) {
    $message  = 'Invalid query on logger1: ' . $logger1->error . "\n";
    $message .= 'Full query: ' . $queryXA;
    $logger1->close();
    $potest->close();
    die($message);
}
 
$queryRes = $potest->query($queryXA);
if (!$queryRes) {
    $message  = 'Invalid query on potest: ' . $potest->error . "\n";
    $message .= 'Full query: ' . $queryXA;
    $logger1->close();
    $potest->close();
    die($message);
}
 
##################################
 
 
##################################
$queryXA = 'XA COMMIT "'.$xid.'"';
$queryRes = $logger1->query($queryXA);
if (!$queryRes) {
    $message  = 'Invalid query on logger1: ' . $logger1->error . "\n";
    $message .= 'Full query: ' . $queryXA;
    $logger1->close();
    $potest->close();
    die($message);
}
 
// Deliberately misspelled statement: makes the commit on potest fail after XA PREPARE.
$queryXA = 'XA COMMITEE "'.$xid.'"';
$queryRes = $potest->query($queryXA);
if (!$queryRes) {
    $message  = 'Invalid query on potest: ' . $potest->error . "\n";
    $message .= 'Full query: ' . $queryXA;
    
    $queryXA = 'XA RECOVER';
    $queryRes = $potest->query($queryXA);
    while ($r=$queryRes->fetch_assoc())
    	print_r($r);
    
    
    $logger1->close();
    $potest->close();
    die($message);
}
##################################
 
$logger1->close();
$potest->close();
 
echo "Finish !\n"
 
?>



 Comments   
Comment by Rasmus Johansson (Inactive) [ 2011-06-29 ]

Launchpad bug id: 803649

Comment by Elena Stepanova [ 2013-06-02 ]

Also reproducible on all MySQL versions (ancient bug http://bugs.mysql.com/bug.php?id=12161)

MTR test case

--source include/have_innodb.inc
 
--enable_connect_log
create table t1 (i int) engine=InnoDB;
 
--source include/count_sessions.inc
 
--connect (con1,localhost,root,,)
xa start 'xid';
insert into t1 values (1);
xa end 'xid';
xa prepare 'xid';
xa recover;
--disconnect con1
 
--connection default
--source include/wait_until_count_sessions.inc
 
xa recover;
 
drop table t1;

Comment by Andrei Elkin [ 2019-08-23 ]

maxmether, corrected. Thanx!

Comment by steen bartholdy [ 2019-12-12 ]

Hi
This bug is fixed in MySQL 5.7.7:
https://bugs.mysql.com/bug.php?id=12161
[14 Jul 2015 19:55] Ant Kutschera
I had a problem with this bug, and I can verify that it is no longer a problem with 5.7 on Linux FC 21. See https://developer.jboss.org/message/935799 for the test case. Thank you!
I have tested on MySQL 5.7.28: the prepared transaction isn't deleted when the session is killed.
kind regards
Steen

Comment by Sergei Petrunia [ 2020-01-13 ]

Take-aways from discussions in Frankfurt:

detach_prepared_tx() is ok
The transaction remains prepared inside RocksDB.
MyRocks' Rdb_transaction object is de-associated from the RocksDB transaction and
is destroyed. This is fine.
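
The detach behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyEngine, EngineTx and Session are hypothetical names invented for the illustration, not MyRocks or MariaDB types. It shows the intended outcome, namely that the per-session wrapper is destroyed on disconnect while a prepared transaction stays owned by the engine and remains visible to a recover call.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an engine-level transaction (not a real MyRocks type).
struct EngineTx {
    std::string xid;
    bool prepared = false;
};

// Hypothetical engine that owns prepared transactions independently of sessions.
class ToyEngine {
public:
    // Take ownership of a prepared transaction so that it outlives its session.
    void adopt_prepared(std::unique_ptr<EngineTx> tx) {
        assert(tx && tx->prepared);
        std::string key = tx->xid;
        prepared_[key] = std::move(tx);
    }

    // Roughly what XA RECOVER would list: the XIDs of all prepared transactions.
    std::vector<std::string> recover() const {
        std::vector<std::string> xids;
        for (const auto& entry : prepared_) xids.push_back(entry.first);
        return xids;
    }

private:
    std::map<std::string, std::unique_ptr<EngineTx>> prepared_;
};

// Hypothetical per-connection wrapper around the engine transaction.
class Session {
public:
    explicit Session(ToyEngine& engine) : engine_(engine) {}

    void xa_start(const std::string& xid) {
        tx_ = std::make_unique<EngineTx>();
        tx_->xid = xid;
    }

    void xa_prepare() {
        assert(tx_);
        tx_->prepared = true;
    }

    // On disconnect a prepared transaction is detached and handed to the engine;
    // an unprepared one is simply dropped (rolled back in a real server).
    void disconnect() {
        if (tx_ && tx_->prepared) engine_.adopt_prepared(std::move(tx_));
        tx_.reset();
    }

private:
    ToyEngine& engine_;
    std::unique_ptr<EngineTx> tx_;
};
```

In this model, a session that prepares 'trx1' and then disconnects leaves 'trx1' in the list returned by recover(), whereas a session that disconnects before XA PREPARE leaves nothing behind.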

replace_native_transaction_in_thd is redundant

The idea behind this method is that it is needed on the slave. A slave worker
may run the following sequence of operations:

XA BEGIN 'trx1'
... -- actions made by trx1. 
XA END 'trx1';
XA PREPARE 'trx1';
-- at this point, all actions by trx1 have been done;
-- the only two actions that remain possible are commit or rollback.
 
XA BEGIN 'trx2';
... -- actions made by trx2
XA END 'trx2';
XA PREPARE 'trx2';
 
XA (COMMIT|ROLLBACK) 'trx1';
-- This is where we would want to "re-attach" to the prepared transaction.

The problem with re-attaching is that Rdb_transaction holds quite a bit of context about the transaction outside the m_rocksdb_tx member.
If one just replaces m_rocksdb_tx, we end up with a mismatch between Rdb_transaction members.

It is much better if the SQL layer uses the existing API and calls

engine_hton->(commit|rollback)_by_xid(...);
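
As a rough illustration of why completing a transaction by XID avoids the re-attach problem, here is a minimal sketch. It assumes a hypothetical ToyXaEngine class whose method names merely echo the commit_by_xid / rollback_by_xid handlerton calls; the real signatures are not reproduced here. The point is that the caller passes only the XID and never touches any session-level transaction object.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical engine model: prepared transactions are keyed by XID and each one
// carries its pending rows. Nothing here is tied to a connection or a THD.
class ToyXaEngine {
public:
    // Register a transaction as prepared under the given XID.
    void prepare(const std::string& xid, std::vector<int> rows) {
        assert(!xid.empty());
        prepared_[xid] = std::move(rows);
    }

    // Commit a prepared transaction found by XID; false if the XID is unknown.
    bool commit_by_xid(const std::string& xid) {
        auto it = prepared_.find(xid);
        if (it == prepared_.end()) return false;
        committed_.insert(committed_.end(), it->second.begin(), it->second.end());
        prepared_.erase(it);
        return true;
    }

    // Roll back a prepared transaction found by XID; false if the XID is unknown.
    bool rollback_by_xid(const std::string& xid) {
        return prepared_.erase(xid) == 1;
    }

    std::size_t prepared_count() const { return prepared_.size(); }
    const std::vector<int>& committed_rows() const { return committed_; }

private:
    std::map<std::string, std::vector<int>> prepared_;
    std::vector<int> committed_;
};
```

With this shape, a worker that has prepared 'trx1' and 'trx2' can later finish 'trx1' from any context by calling commit_by_xid("trx1"), without swapping a native transaction pointer into its own session state.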

There was another argument: both InnoDB and MyRocks try to re-use their internal transaction object, and this is why replace_trx_in_thd saves it away. This is how we end up with method signatures like this:

+innodb_replace_trx_in_thd(
+       THD*    thd,
+       void*   new_trx_arg,
+       void**  ptr_trx_arg)
+{

My opinion is that this sort of caching should not be exposed through the storage engine API.

Comment by Andrei Elkin [ 2020-01-20 ]

psergey, thanks for your notes!

I think you have a good point about replace_native_transaction_in_thd, which indeed arranges a form
of caching that the SE manages on its own anyway. Relying on the SE's own caching instead may not cost much performance-wise on the slave. We can measure the impact anyway, which I'll try to do.
I am fine with removing this extension for now (and permanently if the measurement shows that is justified).

Comment by Andrei Elkin [ 2020-02-03 ]

Howdy, Sergei!

Could you please check the patch? Its latest version is in bb-10.5-mdev_742.
I am also asking Marko, Sergei P., and Sergei Voitovich to take a look. Sergei V could take a close
look at the usage of LF_HASH in the parallel slave code.

Thanks and cheers,

Andrei

Comment by Marko Mäkelä [ 2020-03-13 ]

I do not think that the XA functionality can meaningfully be tested with DDL operations before MDEV-15532 and MDEV-21602 have been fixed.

Comment by Sergey Vojtovich [ 2020-03-13 ]

marko, definitely. Although the problem is not new: recovery wouldn't recover MDL locks. This patch just extends the problem a bit further.

Generated at Thu Feb 08 06:31:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.