[MDEV-13437] InnoDB fails to return error for XA COMMIT or XA ROLLBACK in read-only mode Created: 2017-08-03  Updated: 2023-05-01  Resolved: 2017-08-29

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, XA
Affects Version/s: 10.0, 10.1
Fix Version/s: 10.0.33, 10.1.27, 10.2.9, 10.3.2

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Andrei Elkin
Resolution: Fixed Votes: 0
Labels: mtr

Issue Links:
Relates
relates to MDEV-13438 Assertion `(&(&LOCK_log)->m_mutex)->c... Closed
relates to MDEV-13606 XA PREPARE transactions should surviv... Closed
relates to MDEV-13269 Upgrade tests for InnoDB undo log Closed

 Description   

--source include/have_innodb.inc
--source include/have_binlog_format_mixed.inc
CREATE TABLE t1 (i INT) ENGINE=InnoDB;
INSERT IGNORE INTO t1 VALUES (1);
--let $shutdown_timeout= 0
--let $restart_parameters= --innodb-force-recovery=3
--source include/restart_mysqld.inc
--let $shutdown_timeout= 30
--let $restart_parameters= --innodb-force-recovery=0 --innodb-read-only=1 --tc-heuristic-recover=ROLLBACK
--source include/restart_mysqld.inc

10.0 60c9485a8e5ebf410d9fa60caa7937f3fe3ca9b8

170803 17:26:09 [Note] Heuristic crash recovery mode
2017-08-03 17:26:09 7f3ac8727080  InnoDB: Starting recovery for XA transactions...
2017-08-03 17:26:09 7f3ac8727080  InnoDB: Transaction 1283 in prepared state after recovery
2017-08-03 17:26:09 7f3ac8727080  InnoDB: Transaction contains changes to 1 rows
2017-08-03 17:26:09 7f3ac8727080  InnoDB: 1 transactions in prepared state after recovery
170803 17:26:09 [Note] Found 1 prepared transaction(s) in InnoDB
170803 17:26:09 [Note] rollback xid 'MySQLXid\1\0\0\0\0\0\0\0\26\0\0\0\0\0\0\0'
2017-08-03 17:26:09 7f3ac8727080  InnoDB: Assertion failure in thread 139890447773824 in file mtr0mtr.cc line 322
InnoDB: Failing assertion: !srv_read_only_mode

#5  0x00007f3ac63a43fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007f3abebbcf9f in mtr_commit (mtr=0x7ffd79958de0) at /data/src/10.0/storage/innobase/mtr/mtr0mtr.cc:322
#7  0x00007f3abec67ec1 in btr_pcur_commit_specify_mtr (pcur=0x7f3ac587f6e8, mtr=0x7ffd79958de0) at /data/src/10.0/storage/innobase/include/btr0pcur.ic:386
#8  0x00007f3abec68735 in row_undo_ins_remove_clust_rec (node=0x7f3ac587f678) at /data/src/10.0/storage/innobase/row/row0uins.cc:170
#9  0x00007f3abec68f72 in row_undo_ins (node=0x7f3ac587f678) at /data/src/10.0/storage/innobase/row/row0uins.cc:461
#10 0x00007f3abec70af3 in row_undo (node=0x7f3ac587f678, thr=0x7f3ac583cc40) at /data/src/10.0/storage/innobase/row/row0undo.cc:304
#11 0x00007f3abec70c21 in row_undo_step (thr=0x7f3ac583cc40) at /data/src/10.0/storage/innobase/row/row0undo.cc:351
#12 0x00007f3abebf861f in que_thr_step (thr=0x7f3ac583cc40) at /data/src/10.0/storage/innobase/que/que0que.cc:1077
#13 0x00007f3abebf883f in que_run_threads_low (thr=0x7f3ac583cc40) at /data/src/10.0/storage/innobase/que/que0que.cc:1141
#14 0x00007f3abebf89cb in que_run_threads (thr=0x7f3ac583cc40) at /data/src/10.0/storage/innobase/que/que0que.cc:1182
#15 0x00007f3abecac7d0 in trx_rollback_to_savepoint_low (trx=0x7f3ac587cc78, savept=0x0) at /data/src/10.0/storage/innobase/trx/trx0roll.cc:113
#16 0x00007f3abecacb34 in trx_rollback_for_mysql_low (trx=0x7f3ac587cc78) at /data/src/10.0/storage/innobase/trx/trx0roll.cc:176
#17 0x00007f3abecaceac in trx_rollback_for_mysql (trx=0x7f3ac587cc78) at /data/src/10.0/storage/innobase/trx/trx0roll.cc:211
#18 0x00007f3abeb4141f in innobase_rollback_trx (trx=0x7f3ac587cc78) at /data/src/10.0/storage/innobase/handler/ha_innodb.cc:4056
#19 0x00007f3abeb54045 in innobase_rollback_by_xid (hton=0x7f3ac0d17270, xid=0x7f3abb000070) at /data/src/10.0/storage/innobase/handler/ha_innodb.cc:14479
#20 0x000000000083b2b2 in xarecover_handlerton (unused=0x0, plugin=0x7ffd79959888, arg=0x7ffd799598f0) at /data/src/10.0/sql/handler.cc:1855
#21 0x0000000000662379 in plugin_foreach_with_mask (thd=0x0, func=0x83afcf <xarecover_handlerton(THD*, plugin_ref, void*)>, type=1, state_mask=4294967287, arg=0x7ffd799598f0) at /data/src/10.0/sql/sql_plugin.cc:2336
#22 0x000000000083b522 in ha_recover (commit_list=0x0) at /data/src/10.0/sql/handler.cc:1898
#23 0x0000000000917fb1 in TC_LOG::using_heuristic_recover (this=0x17acd00 <mysql_bin_log>) at /data/src/10.0/sql/log.cc:8938
#24 0x000000000091808d in MYSQL_BIN_LOG::open (this=0x17acd00 <mysql_bin_log>, opt_name=0x7f3ac1bbf838 "master-bin") at /data/src/10.0/sql/log.cc:8961
#25 0x00000000005925e9 in init_server_components () at /data/src/10.0/sql/mysqld.cc:4905
#26 0x0000000000593167 in mysqld_main (argc=138, argv=0x7f3ac5878ac0) at /data/src/10.0/sql/mysqld.cc:5371
#27 0x0000000000589c76 in main (argc=23, argv=0x7ffd79959d78) at /data/src/10.0/sql/main.cc:25



 Comments   
Comment by Marko Mäkelä [ 2017-08-07 ]

Andrei, can you debug this further? I tried with the following patch to both storage/innobase and storage/xtradb, and it will fail in binlog:

diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index 526836646d2..031ab54d4c5 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -14446,6 +14446,10 @@ innobase_commit_by_xid(
 
 	DBUG_ASSERT(hton == innodb_hton_ptr);
 
+	if (high_level_read_only) {
+		return(XAER_RMFAIL);
+	}
+
 	trx = trx_get_trx_by_xid(xid);
 
 	if (trx) {
@@ -14473,6 +14477,10 @@ innobase_rollback_by_xid(
 
 	DBUG_ASSERT(hton == innodb_hton_ptr);
 
+	if (high_level_read_only) {
+		return(XAER_RMFAIL);
+	}
+
 	trx = trx_get_trx_by_xid(xid);
 
 	if (trx) {

The failure was like this:

170807 14:19:29 [Note] Heuristic crash recovery mode
2017-08-07 14:19:29 7fc95dcc0540  InnoDB: Starting recovery for XA transactions...
2017-08-07 14:19:29 7fc95dcc0540  InnoDB: Transaction 1283 in prepared state after recovery
2017-08-07 14:19:29 7fc95dcc0540  InnoDB: Transaction contains changes to 1 rows
2017-08-07 14:19:29 7fc95dcc0540  InnoDB: 1 transactions in prepared state after recovery
170807 14:19:29 [Note] Found 1 prepared transaction(s) in InnoDB
170807 14:19:29 [Note] rollback xid 'MySQLXid\1\0\0\0\0\0\0\0\26\0\0\0\0\0\0\0'
170807 14:19:29 [Note] Please restart mysqld without --tc-heuristic-recover
mysqld: /mariadb/10.0/sql/log.cc:3258: bool MYSQL_BIN_LOG::open(const char *, enum_log_type, const char *, enum cache_type, ulong, bool, bool): Assertion `(&(&LOCK_log)->m_mutex)->count > 0 && pthread_equal(pthread_self(), (&(&LOCK_log)->m_mutex)->thread)' failed.
170807 14:19:29 [ERROR] mysqld got signal 6 ;

Stack trace:

#7  0x00007fc95c446f12 in __GI___assert_fail (assertion=0x10c994a "(&(&LOCK_log)->m_mutex)->count > 0 && pthread_equal(pthread_self(), (&(&LOCK_log)->m_mutex)->thread)", file=0x10c8c0e "/mariadb/10.0/sql/log.cc", line=3258, function=0x10c99af "bool MYSQL_BIN_LOG::open(const char *, enum_log_type, const char *, enum cache_type, ulong, bool, bool)") at assert.c:101
#8  0x0000000000934a7c in MYSQL_BIN_LOG::open (this=0x18b7820 <mysql_bin_log>, log_name=0x3c13d98 "master-bin", log_type_arg=LOG_BIN, new_name=0x0, io_cache_type_arg=WRITE_CACHE, max_size_arg=1073741824, null_created_arg=false, need_mutex=true) at /mariadb/10.0/sql/log.cc:3258
#9  0x0000000000943bde in MYSQL_BIN_LOG::open (this=0x18b7820 <mysql_bin_log>, opt_name=0x3c13d98 "master-bin") at /mariadb/10.0/sql/log.cc:8964
#10 0x000000000057e8f2 in init_server_components () at /mariadb/10.0/sql/mysqld.cc:4905
#11 0x000000000057b5c4 in mysqld_main (argc=137, argv=0x3af0388) at /mariadb/10.0/sql/mysqld.cc:5371
#12 0x00000000005782c2 in main (argc=22, argv=0x7ffc17d97628) at /mariadb/10.0/sql/main.cc:25

It appears that a mutex is not being released in an error handling code path.

Comment by Marko Mäkelä [ 2017-08-15 ]

Also, it looks like we should revert part of this InnoDB change in MySQL 5.0.3 (making innodb_force_recovery=1 and innodb_force_recovery=2 initiate rollback of a transaction that reached the XA PREPARE state before previous crash or shutdown).
It seems that the option tc_heuristic_recover=ROLLBACK was introduced for exactly this purpose; there is no need to convolute the logic of an InnoDB parameter.

Comment by Andrei Elkin [ 2017-08-21 ]

Elena, hello.

There are small changes to mtr that's why you.

I am asking Marko to review the rest of this work.
The latest patch is pushed to bb-10.0-mdev_13437-andrei.

Thanks!

Andrei

Comment by Elena Stepanova [ 2017-08-21 ]

The only change I see in MTR is exporting of an extra variable. It's not dangerous, so if it works for you, I have no objections. In general, we should invent a better way to fail-start the server, but it doesn't have to be done now.

Re-assigning to Marko for real review.

Comment by Marko Mäkelä [ 2017-08-21 ]

Elkin, the new logic looks OK to me.
I only requested some minor clean-up to both the code and the test case.

Comment by Marko Mäkelä [ 2017-08-24 ]

Looks OK now; the newest revision to the test even works with Valgrind:

main.tc_heuristic_recover 'innodb_plugin' [ pass ]  54201
main.tc_heuristic_recover 'xtradb'       [ pass ]  73606
valgrind_report                          [ pass ]

Generated at Thu Feb 08 08:05:36 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.