[MDEV-10532] XA (two-phase commit) transaction crash Galera clusters Created: 2016-08-10  Updated: 2019-05-17  Resolved: 2019-05-17

Status: Closed
Project: MariaDB Server
Component/s: Data Manipulation - Delete, Data Manipulation - Update, Galera, Platform RedHat, Storage Engine - InnoDB, XA
Affects Version/s: 10.1.14
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Don Wolford Assignee: Jan Lindström (Inactive)
Resolution: Incomplete Votes: 1
Labels: 2PC, XA, galera, two-phase_commit
Environment:

RHEL 6.7 (Santiago)



 Description   

If XA DML is submitted to an operating Galera cluster and innodb_support_xa is on, the cluster will crash as soon as an XA transaction is committed, and it must be rebuilt as a new cluster. If innodb_support_xa is off, the cluster won't crash, but XA transactions committed are not propagated to the other nodes, which puts them out of sync, causing the nodes to fail later when a contradictory change is posted.

I understand that XA is documented as not supported in a Galera cluster. The purpose of this issue is to have such transactions rejected when attempted, instead of causing the cluster to fail. (And, of course, it would be great if XA were supported in Galera, but that's not a bug...)
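
Until the server rejects such statements itself, the requested behaviour can be approximated on the client side. The following is a minimal sketch in Python, not the server's behaviour: it assumes the application can inspect each statement before sending it and already knows whether the node runs with wsrep_on enabled (for example from SHOW GLOBAL VARIABLES LIKE 'wsrep_on'). The names is_xa_statement, guard_statement, and XaNotSupportedError are hypothetical.

```python
import re

# Matches the XA statement family (XA START/BEGIN, END, PREPARE, COMMIT,
# ROLLBACK, RECOVER), ignoring case and leading whitespace.
_XA_RE = re.compile(
    r"^\s*XA\s+(START|BEGIN|END|PREPARE|COMMIT|ROLLBACK|RECOVER)\b",
    re.IGNORECASE,
)


class XaNotSupportedError(Exception):
    """Hypothetical error raised instead of sending XA to a Galera node."""


def is_xa_statement(sql: str) -> bool:
    """Return True if the statement starts with an XA command."""
    return _XA_RE.match(sql) is not None


def guard_statement(sql: str, wsrep_on: bool) -> str:
    """Return sql unchanged, or raise if it is XA and the node is in Galera.

    wsrep_on is an assumption: the caller is expected to have read it
    from the server beforehand.
    """
    if wsrep_on and is_xa_statement(sql):
        raise XaNotSupportedError(
            "XA statements are not supported on a Galera node: " + sql.strip()
        )
    return sql
```

With this guard, guard_statement("XA START 'xid1'", wsrep_on=True) raises before the statement reaches the cluster, while ordinary DML passes through unchanged.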



 Comments   
Comment by Ondra Chaloupka [ 2017-02-02 ]

I am experiencing the same issue with a Java application running on WildFly, using MariaDB Cluster 10.1.10 with JDBC driver version 1.5.4.

I tried to reproduce the issue by running the SQL commands directly against the database, but I could not reproduce it that way. My attempt was:

XA START 0x00000000000000000000FFFF7F000001C7C103BA58934BC30000001C31,0x00000000000000000000FFFF7F000001C7C103BA58934BC3000000220000000100000000,131077
SELECT * from TEST where id = 1
update TEST set NAME='ondrej' where ID=1
XA END 0x00000000000000000000FFFF7F000001C7C103BA58934BC30000001C31,0x00000000000000000000FFFF7F000001C7C103BA58934BC3000000220000000100000000,131077
XA COMMIT 0x00000000000000000000FFFF7F000001C7C103BA58934BC30000001C31,0x00000000000000000000FFFF7F000001C7C103BA58934BC3000000220000000100000000,131077 ONE PHASE

When the same sequence is run by the Java application, the MariaDB cluster node crashes and the commit returns this exception:

Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Could not read resultset: unexpected end of stream, read 0 bytes from 4
Query is : XA COMMIT 0x00000000000000000000FFFF7F000001C7C103BA58934BC30000001C31,0x00000000000000000000FFFF7F000001C7C103BA58934BC3000000220000000100000000,131077 ONE PHASE
    at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.getResult(AbstractQueryProtocol.java:1063)
    at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executeQuery(AbstractQueryProtocol.java:136)
    at org.mariadb.jdbc.MariaDbStatement.executeInternal(MariaDbStatement.java:251)
    ... 65 more
Caused by: java.io.EOFException: unexpected end of stream, read 0 bytes from 4
    at org.mariadb.jdbc.internal.packet.read.ReadPacketFetcher.getReusableBuffer(ReadPacketFetcher.java:178)
    at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.getResult(AbstractQueryProtocol.java:1054)
    ... 67 more

and every subsequent connection returns this exception:

Caused by: java.lang.RuntimeException: Sql execution operation failed. 
Status code 500, reason: java.sql.SQLNonTransientConnectionException: 
(conn:29) WSREP has not yet prepared node for application use
Query is : SELECT 1
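
For anyone trying to replay the sequence above outside the Java application, the statements can be generated from the raw xid parts. This is a minimal sketch in Python, assuming the xid is rendered as shown in this comment (hex gtrid, hex bqual, decimal format ID) and that both byte strings are non-empty. The helper names format_xid and one_phase_sequence are hypothetical, and the sketch only builds the statement text; it does not connect to a server.

```python
from typing import List


def format_xid(gtrid: bytes, bqual: bytes, format_id: int) -> str:
    """Render an xid as 0x<gtrid hex>,0x<bqual hex>,<format id>.

    Assumption for this sketch: both parts are 1 to 64 bytes long.
    """
    for name, part in (("gtrid", gtrid), ("bqual", bqual)):
        if not 1 <= len(part) <= 64:
            raise ValueError(name + " must be 1 to 64 bytes in this sketch")
    return "0x{},0x{},{}".format(
        gtrid.hex().upper(), bqual.hex().upper(), format_id
    )


def one_phase_sequence(xid: str, dml: List[str]) -> List[str]:
    """Build the XA START / DML / XA END / XA COMMIT ... ONE PHASE sequence."""
    return (
        ["XA START " + xid]
        + list(dml)
        + ["XA END " + xid, "XA COMMIT " + xid + " ONE PHASE"]
    )
```

For example, one_phase_sequence(format_xid(gtrid, bqual, 131077), ["update TEST set NAME='ondrej' where ID=1"]) yields the same four-step shape as the statements in this comment, minus the SELECT.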

Comment by Jan Lindström (Inactive) [ 2018-07-19 ]

Hi, I could not reproduce your problem using the given instructions:

jan@jan-laptop-asus:~/mysql/10.1-galera/mysql-test$ ./mtr galera_xa
Logging: ./mtr  galera_xa
vardir: /home/jan/mysql/10.1-galera/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/jan/mysql/10.1-galera/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.35-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Sphinx 'indexer' binary not found, sphinx suite will be skipped
Collecting tests...
sh: 1: ctest: not found
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
create table test (id integer not null primary key,
name char(200)) engine=innodb;
insert into test values (1,'test');
XA START 'xid1';
SELECT * from test where id = 1;
id	name
1	test
update test set NAME='newtest' where ID=1;
XA END 'xid1';
XA PREPARE 'xid1';
XA COMMIT 'xid1';
SELECT * FROM test;
id	name
1	newtest
SELECT * FROM test;
id	name
1	test
DROP TABLE test;
galera.galera_xa 'innodb_plugin'         [ pass ]   2065
create table test (id integer not null primary key,
name char(200)) engine=innodb;
insert into test values (1,'test');
XA START 'xid1';
SELECT * from test where id = 1;
id	name
1	test
update test set NAME='newtest' where ID=1;
XA END 'xid1';
XA PREPARE 'xid1';
XA COMMIT 'xid1';
SELECT * FROM test;
id	name
1	newtest
SELECT * FROM test;
id	name
1	test
DROP TABLE test;
galera.galera_xa 'xtradb'                [ pass ]   2059

If you have error logs from the server that crashes, they could be useful for identifying the issue.

Comment by Jan Lindström (Inactive) [ 2019-05-17 ]

XA transactions are not really supported on Galera 3, so I am closing this bug.

Generated at Thu Feb 08 07:42:56 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.