[MDEV-4470] the replication broken when use the xa transcation Created: 2013-05-02  Updated: 2013-05-02  Resolved: 2013-05-02

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.2
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: boyce (Inactive) Assignee: Elena Stepanova
Resolution: Not a Bug Votes: 0
Labels: None


 Description   

the replication broken when use the xa transaction
master

mysql> create table t(id int auto_increment primary key, a int) engine=innodb;
mysql> xa start '111';
mysql> insert into t(a) values(1);
mysql> xa end '111';
mysql> xa prepare '111';
kill -9 master_pid

restart the master

mysql> xa recover;
+----------+--------------+--------------+------+
| formatID | gtrid_length | bqual_length | data |
+----------+--------------+--------------+------+
|        1 |            3 |            0 |  111 |
+----------+--------------+--------------+------+
mysql> xa commit '111';
mysql> select * from t;
Empty set (0.00 sec)

why? I did the xa commit '111', but no result return, maybe another bug?
do the test continue

mysql> xa start '222';
mysql> insert into t(a) values(2);
mysql> xa end '222';
mysql> prepare '222';
mysql> xa commit '111';
mysql> select * from t;
+----+------+
| id |    a |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+

value(1,1) return at this time, why?

what's happen on the slave?

mysql> select * from t;
+----+------+
| id |    a |
+----+------+
|  2 |    2 |
+----+------+

the replication is broken now!!



 Comments   
Comment by boyce (Inactive) [ 2013-05-02 ]

the replication broken when use the xa transaction
master
mysql> create table t(id int auto_increment primary key, a int) engine=innodb;
mysql> xa start '111';
mysql> insert into t(a) values(1);
mysql> xa end '111';
mysql> xa prepare '111';
kill -9 master_pid
restart the master
mysql> xa recover;
--------------------------------------+

formatID gtrid_length bqual_length data

--------------------------------------+

1 3 0 111

--------------------------------------+
mysql> xa commit '111';
mysql> select * from t;
Empty set (0.00 sec)
why? I did the xa commit '111', but no result return, maybe another bug?
do the test continue
mysql> xa start '222';
mysql> insert into t(a) values(2);
mysql> xa end '222';
mysql> prepare '222';
mysql> xa commit '111';
mysql> select * from t;
--------+

id a

--------+

1 1
2 2

--------+
value(1,1) return at this time, why?

what's hanppen on the slave?
mysql> select * from t;
--------+

id a

--------+

2 2

--------+
the replication is broken now!!

Comment by Elena Stepanova [ 2013-05-02 ]

Hi,

So, there were two questions in here:

  • why the record (1,1) didn't make it to the slave;
  • why the record (1,1) didn't show up on the first SELECT.

The answer to the first question is that it's a limitation of MySQL XA transactions, see MySQL manual (http://dev.mysql.com/doc/refman/5.6/en/xa-restrictions.html):
" If an XA transaction has reached the PREPARED state and the MySQL server is killed (for example, with kill -9 on Unix) or shuts down abnormally, the transaction can be continued after the server restarts. However, if the client reconnects and commits the transaction, the transaction will be absent from the binary log even though it has been committed. This means the data and the binary log have gone out of synchrony. An implication is that XA cannot be used safely together with replication. "

This is exactly your scenario.

The answer to the second question is less obvious. From all I see in your scenario, it's not an exact quote from your MySQL client, but rather a manual compilation of different bits and pieces. There are several indications of it:

a) prepare '222';
would have caused a syntax error (XA is missing)

b) the second "xa commit '111';" (after "prepare '222'") would have caused "Unknown XID" error or such, since you had already committed '111' before;

c) since you had never issued "xa commit '222'", there is no way (2,2) would have appeared on slave.

So, something is obviously missing here, and something is wrong. My best guess is that by the time of your first "select * from t" you hadn't actually issued "xa commit '111'" yet, in this case you would have received an empty set indeed, even although the transaction is in prepared state.

Please let us know whether the above answers your questions.

Thanks.

Comment by boyce (Inactive) [ 2013-05-02 ]
  • why the record (1,1) didn't make it to the slave
    Thanks for your answer.
  • why the record (1,1) didn't show up on the first SELECT
    sorry , It's my fault, there maybe some input errors. so I test again and copy the result blew:

mysql> truncate table t;
Query OK, 0 rows affected (0.00 sec)

mysql> xa start '111';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t(a) values(1);
Query OK, 1 row affected (0.00 sec)

mysql> xa end '111';
Query OK, 0 rows affected (0.00 sec)

mysql> xa prepare '111';
Query OK, 0 rows affected (0.00 sec)

kill -9 pid at this time

mysql> xa recover;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 2
Current database: mydb

--------------------------------------+

formatID gtrid_length bqual_length data

--------------------------------------+

1 3 0 111

--------------------------------------+
1 row in set (0.00 sec)

mysql> select * from t;
Empty set (0.00 sec)

mysql> xa commit '111';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t;
Empty set (0.00 sec)

why? the value(1,1) not return!

mysql> xa start '222';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t(a) values(2);
Query OK, 1 row affected (0.00 sec)

mysql> xa end '222';
Query OK, 0 rows affected (0.00 sec)

mysql> xa prepare '222';
Query OK, 0 rows affected (0.01 sec)

mysql> xa commit '222';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t;
--------+

id a

--------+

1 1
2 2

--------+
2 rows in set (0.00 sec)

why? return the value(1,1),(2,2)?

Comment by Elena Stepanova [ 2013-05-02 ]

Thanks, it makes sense now. The important difference is yet another SELECT which you ran between XA RECOVER '111' and XA COMMIT '111'.
I assume you have query cache enabled (you can check the value of query_cache_size in your config file or in SHOW VARIABLES).
With the query cache, I'm getting the problem as well:

MariaDB [test]> create table t(id int auto_increment primary key, a int) engine=innodb;
Query OK, 0 rows affected (0.24 sec)

MariaDB [test]> xa start '111';
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> insert into t(a) values(1);
Query OK, 1 row affected (0.00 sec)

MariaDB [test]> xa end '111';
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> xa prepare '111';
Query OK, 0 rows affected (0.27 sec)

MariaDB [test]> system killall -s 9 mysqld
MariaDB [test]> system start_server.sh --query-cache-size=1048576

MariaDB [test]> xa recover;
ERROR 2013 (HY000): Lost connection to MySQL server during query
MariaDB [test]> xa recover;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 2
Current database: test

--------------------------------------+

formatID gtrid_length bqual_length data

--------------------------------------+

1 3 0 111

--------------------------------------+
1 row in set (0.01 sec)

MariaDB [test]> select * from t;
Empty set (0.00 sec)

MariaDB [test]> xa commit '111';
Query OK, 0 rows affected (0.10 sec)

MariaDB [test]> select * from t;
Empty set (0.00 sec)

MariaDB [test]> select sql_no_cache * from t;
--------+

id a

--------+

1 1

--------+
1 row in set (0.00 sec)

MariaDB [test]> show status like 'Qcache_hits';
--------------------+

Variable_name Value

--------------------+

Qcache_hits 1

--------------------+
1 row in set (0.00 sec)

As you can see above, the SELECT hits the query cache. I'd say it's a bug, but I suggest to file it separately.

Comment by boyce (Inactive) [ 2013-05-02 ]

Thanks
I want to check all the data is ok, so do the select between 'xa recover' and 'xa commit' , to my surprise, I hit this bug, I'll create another issue for it.
do test without the select, it's right, thanks again.

mysql> truncate table t;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql>
mysql> select * from t;
Empty set (0.00 sec)

mysql> xa start '111';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t(a) values(1);
Query OK, 1 row affected (0.00 sec)

mysql> xa end '111';
Query OK, 0 rows affected (0.00 sec)

mysql> xa prepare '111';
Query OK, 0 rows affected (0.00 sec)

mysql> xa recover;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 2
Current database: mydb

--------------------------------------+

formatID gtrid_length bqual_length data

--------------------------------------+

1 3 0 111

--------------------------------------+
1 row in set (0.00 sec)

mysql> xa commit '111';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t;
--------+

id a

--------+

1 1

--------+
1 row in set (0.00 sec)

Comment by Elena Stepanova [ 2013-05-02 ]

As discussed in the comments, the issue with replication is a known MySQL XA limitation, so I am closing it as 'Not a bug'.
A problem with SELECT has been filed separately as MDEV-4471.

Comment by Jeremy Cole [ 2013-05-02 ]

For tracking purposes maybe it makes sense to create a new resolution of "Known bug with upstream" and use that for such cases. This is very definitely a bug, just not a bug with MariaDB exclusively.

Comment by Elena Stepanova [ 2013-05-02 ]

For tracking purposes we have that, it's label 'upstream'.

But which part do you mean saying "definitely a bug" – replication issue or the SELECT with query cache?
For the replication issue, let me re-quote again the manual:
(http://dev.mysql.com/doc/refman/5.6/en/xa-restrictions.html):
" If an XA transaction has reached the PREPARED state and the MySQL server is killed (for example, with kill -9 on Unix) or shuts down abnormally, the transaction can be continued after the server restarts. However, if the client reconnects and commits the transaction, the transaction will be absent from the binary log even though it has been committed. This means the data and the binary log have gone out of synchrony. An implication is that XA cannot be used safely together with replication. "

I wouldn't even bother to file it as a bug at bugs.mysql.com, it will be closed immediately as a documented limitation.

SELECT is a bug, it's been filed separately, as said before.

Generated at Thu Feb 08 06:56:41 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.