[MDEV-633] LP:1024058 - mysqld XA crash in replication slave Created: 2012-07-12 Updated: 2013-01-21 Resolved: 2013-01-21 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 5.5.28 |
| Fix Version/s: | 5.5.29 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Rich Prohaska | Assignee: | Sergei Golubchik |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | Launchpad, replication | ||
| Attachments: |
|
| Description |
|
We found a simple XA transaction that crashes MySQL 5.5 replication. This simple transaction merely inserts into InnoDB and TokuDB tables. The bug was caused by a flaw in the logging code exposed by the transaction’s use of two XA storage engines (TokuDB and InnoDB) and was fixed in the TokuDB 6.0.1 release. Here are some details. Suppose that a database contains the following tables.
The following transaction
causes the replication slave to crash. The crash occurs when mysqld tries to dereference a NULL pointer.
The bug is fixed in lp:~prohaska7/5.5-xa-rpl-crash-fix. Also see the mariadb-developers email thread. |
| Comments |
| Comment by Rasmus Johansson (Inactive) [ 2012-07-12 ] |
|
Launchpad bug id: 1024058 |
| Comment by Sergei Golubchik [ 2012-11-14 ] |
|
Although none of your patch is present in the current MariaDB 5.5, I failed to reproduce the crash with InnoDB and PBXT using your test case. If you can provide more info so that I am able to reproduce it, feel free to reopen this bug. |
| Comment by Sergei Golubchik [ 2012-12-14 ] |
|
Got more info from the reporter. |
| Comment by Sergei Golubchik [ 2013-01-17 ] |
|
The problem here is very simple to explain. For 2PC the server can use either an mmap-based transaction coordinator or the binary log. 2PC always uses the binary log if binary logging is enabled. But even if it is enabled globally, it is usually disabled in the replication slave thread unless --log-slave-updates is specified. One would probably get the same crash without replication by disabling the binary log manually with SET SQL_LOG_BIN=0; Possible fixes:
The last approach seems to be preferable. But in the future, if we start recovering transactions from the binary log (doing only one sync per 2PC transaction), we will have this problem again, because then we will need the actual changes to be logged, not just the Xid. knielsen - opinion? |
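
The condition described in the comment above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `Session`, `pick_coordinator`, `slave_thread_session`, and `is_mismatch` are hypothetical names invented for illustration and do not correspond to MariaDB source code. It shows only the decision logic as the comment states it (a server-wide coordinator choice versus per-thread binary logging), not the crash itself or any of the proposed fixes.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical per-thread state; not a real MariaDB structure."""
    sql_log_bin: bool  # False after SET SQL_LOG_BIN=0


def pick_coordinator(log_bin_enabled: bool) -> str:
    """Server-wide choice: the binary log coordinates 2PC whenever it is enabled."""
    return "binlog" if log_bin_enabled else "mmap"


def slave_thread_session(log_slave_updates: bool) -> Session:
    """Assumption from the comment: a slave thread writes to the binary log
    only when --log-slave-updates is specified."""
    return Session(sql_log_bin=log_slave_updates)


def is_mismatch(log_bin_enabled: bool, session: Session) -> bool:
    """True when the binary log is the 2PC coordinator but this thread
    does not write to it, which is the situation the comment describes."""
    return pick_coordinator(log_bin_enabled) == "binlog" and not session.sql_log_bin


if __name__ == "__main__":
    # Slave thread, binary log enabled globally, no --log-slave-updates.
    print(is_mismatch(True, slave_thread_session(log_slave_updates=False)))
    # Same server, but the slave thread logs its updates.
    print(is_mismatch(True, slave_thread_session(log_slave_updates=True)))
    # Binary log disabled globally: the mmap-based coordinator is used.
    print(is_mismatch(False, Session(sql_log_bin=False)))
```

In this model the mismatch arises only in the first case, which matches the comment's observation that the same situation could probably be reached without replication by running SET SQL_LOG_BIN=0 in an ordinary session.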