[MDEV-22770] trx_undo_report_rename() fails to release page latches Created: 2020-06-02  Updated: 2020-06-02  Resolved: 2020-06-02

Status: Closed
Project: MariaDB Server
Component/s: Locking, Storage Engine - InnoDB
Affects Version/s: 10.3.5, 10.4.0, 10.5.0
Fix Version/s: 10.5.4, 10.3.24, 10.4.14

Type: Bug Priority: Blocker
Reporter: Roel Van de Paar Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: not-10.1, not-10.2, regression

Issue Links:
Problem/Incident
is caused by MDEV-15090 Reduce the overhead of writing undo l... Closed

 Description   

USE test;
SET SQL_MODE='';
SET @@GLOBAL.innodb_trx_rseg_n_slots_debug=1;
CREATE TABLE t (b VARCHAR(10) NOT NULL UNIQUE) ENGINE=InnoDB;
INSERT INTO t VALUES (12662),(54592);
SET GLOBAL innodb_monitor_enable='buffer_flush_batches';
INSERT INTO t VALUES (2822.75);
CREATE TABLE t2(a INT NOT NULL PRIMARY KEY, b INT) ENGINE=InnoDB SELECT * FROM t LOCK IN SHARE MODE;
INSERT INTO t VALUES (25215);
CREATE TEMPORARY TABLE m(a INT) ENGINE=INNODB;
SELECT SLEEP(5);
SELECT SLEEP(5);
# Then exit CLI, and shutdown using mysqladmin shutdown

Leads to:

10.5.4 8569dac1ec9f6853a0b2f3ea9bcbda67644ead24

2020-06-02 10:09:01 0 [Note] /test/MD260520-mariadb-10.5.4-linux-x86_64-dbg/bin/mysqld: ready for connections.
Version: '10.5.4-MariaDB-debug'  socket: '/test/MD260520-mariadb-10.5.4-linux-x86_64-dbg/socket.sock'  port: 13236  MariaDB Server
2020-06-02 10:09:16 4 [Warning] InnoDB: Cannot find a free slot for an undo log. Do you have too many active transactions running concurrently?
2020-06-02 10:09:16 4 [Warning] InnoDB: Unable to delete statistics for table test.t2: Too many concurrent transactions. They can be deleted later using DELETE FROM mysql.innodb_index_stats WHERE database_name = 'test' AND table_name = 't2'; DELETE FROM mysql.innodb_table_stats WHERE database_name = 'test' AND table_name = 't2';
2020-06-02 10:09:16 4 [Note] InnoDB: Deferring DROP TABLE `test`.`t2`; renaming to test/#sql-ib21
2020-06-02 10:09:16 4 [Warning] InnoDB: Cannot find a free slot for an undo log. Do you have too many active transactions running concurrently?
########################################
DEADLOCK of threads detected!
rw-lock 0x1553ab400ff8 --Thread 23449040844544 has waited at trx0rseg.ic line 44 for 0.00 seconds the semaphore:
X-lock on RW-latch at 0x1553ab400ff8 created in file buf0buf.cc line 1325
a writer (thread id 23449576269568) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time write locked in file trx0rseg.ic line 44
Locked: thread 23449576269568 file trx0rseg.ic line 44  X-LOCK
Locked: thread 23449576269568 file trx0rseg.ic line 44  X-LOCK
2020-06-02 10:09:26 4 [Note] InnoDB: Mutex 0x562fdc4be6c0 owned by thread 23449040844544 file /test/10.5_dbg/storage/innobase/dict/dict0stats.cc line 2434
--Thread 23449576269568 has waited at dict0dict.cc line 316 for 0.00 seconds the semaphore:
Mutex at 0x562fdc4be6c0, Mutex DICT_SYS created dict0dict.cc:1016, lock var 2
Last time reserved in file /test/10.5_dbg/storage/innobase/dict/dict0stats.cc line 2434
2020-06-02 10:09:26 4 [ERROR] [FATAL] InnoDB: ######################################## Deadlock Detected!
200602 10:09:26 [ERROR] mysqld got signal 6 ;

10.5.4 8569dac1ec9f6853a0b2f3ea9bcbda67644ead24

Core was generated by `/test/MD260520-mariadb-10.5.4-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
    at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
[Current thread is 1 (Thread 0x154acf08f700 (LWP 76783))]
(gdb) bt
(gdb) (gdb) #0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
#1  0x0000558a24bbcd7a in my_write_core (sig=sig@entry=6) at /test/10.5_dbg/mysys/stacktrace.c:518
#2  0x0000558a24362385 in handle_fatal_signal (sig=6) at /test/10.5_dbg/sql/signal_handler.cc:330
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#5  0x0000154acd7d3801 in __GI_abort () at abort.c:79
#6  0x0000558a249795f6 in ib::fatal::~fatal (this=<optimized out>, __in_chrg=<optimized out>) at /test/10.5_dbg/storage/innobase/ut/ut0ut.cc:601
#7  0x0000558a249182e0 in sync_array_wait_event (arr=arr@entry=0x154acced3218, cell=@0x154acf08e738: 0x154accf810d8) at /test/10.5_dbg/storage/innobase/sync/sync0arr.cc:463
#8  0x0000558a2473edb8 in TTASEventMutex<GenericPolicy>::enter (line=<optimized out>, filename=<optimized out>, max_delay=<optimized out>, max_spins=60, this=<optimized out>) at /test/10.5_dbg/storage/innobase/include/ib0mutex.h:477
#9  PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x558a2588a6c0 <dict_sys>, n_spins=30, n_delay=4, name=0x558a2506b0b0 "/test/10.5_dbg/storage/innobase/dict/dict0dict.cc", line=316) at /test/10.5_dbg/storage/innobase/include/ib0mutex.h:594
#10 0x0000558a24a4b007 in dict_table_close (table=0x154aac057098, dict_locked=dict_locked@entry=false, try_drop=try_drop@entry=false, thd=thd@entry=0x0, mdl=mdl@entry=0x0) at /test/10.5_dbg/storage/innobase/dict/dict0dict.cc:316
#11 0x0000558a24899ac0 in row_prebuilt_free (prebuilt=0x154aac17c098, dict_locked=dict_locked@entry=0) at /test/10.5_dbg/storage/innobase/row/row0mysql.cc:1041
#12 0x0000558a24711ad5 in ha_innobase::close (this=0x154aac1550a0) at /test/10.5_dbg/storage/innobase/handler/ha_innodb.cc:6171
#13 0x0000558a243695d8 in handler::ha_close (this=0x154aac1550a0) at /test/10.5_dbg/sql/handler.cc:2975
#14 0x0000558a241b957d in closefrm (table=table@entry=0x154aac173088) at /test/10.5_dbg/sql/table.cc:4271
#15 0x0000558a242c806d in THD::close_temporary_table (this=this@entry=0x154aac015088, table=table@entry=0x154aac173088) at /test/10.5_dbg/sql/temporary_tables.cc:1240
#16 0x0000558a242c8ca0 in THD::free_temporary_table (this=this@entry=0x154aac015088, table=table@entry=0x154aac173088) at /test/10.5_dbg/sql/temporary_tables.cc:1490
#17 0x0000558a242c8e00 in THD::close_temporary_tables (this=this@entry=0x154aac015088) at /test/10.5_dbg/sql/temporary_tables.cc:531
#18 0x0000558a2405603e in THD::cleanup (this=this@entry=0x154aac015088) at /test/10.5_dbg/sql/sql_class.cc:1543
#19 0x0000558a23fb4c6b in unlink_thd (thd=0x154aac015088) at /test/10.5_dbg/sql/mysqld.cc:2600
#20 0x0000558a2420ac98 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x154aadc453a8, put_in_cache=put_in_cache@entry=true) at /test/10.5_dbg/sql/sql_connect.cc:1422
#21 0x0000558a2420ae5b in handle_one_connection (arg=arg@entry=0x154aadc453a8) at /test/10.5_dbg/sql/sql_connect.cc:1313
#22 0x0000558a2466a14e in pfs_spawn_thread (arg=0x154accc45888) at /test/10.5_dbg/storage/perfschema/pfs.cc:2201
#23 0x0000154ace4b66db in start_thread (arg=0x154acf08f700) at pthread_create.c:463
#24 0x0000154acd8b488f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 10.5.4 (dbg)

Bug confirmed not present in:
MariaDB: 10.1.46 (dbg), 10.1.46 (opt), 10.2.33 (dbg), 10.2.33 (opt), 10.3.24 (dbg), 10.3.24 (opt), 10.4.14 (dbg), 10.4.14 (opt), 10.5.4 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.47 (dbg), 5.6.47 (opt), 5.7.29 (dbg), 5.7.29 (opt), 8.0.19 (dbg), 8.0.19 (opt)

MariaDB 10.4.14 (dbg) does not crash in any way;

10.4.14 ea7830eef48333e28f98a9b91f05a95735b465a3

2020-06-02 10:15:25 0 [Note] /test/MD260520-mariadb-10.4.14-linux-x86_64-dbg/bin/mysqld: ready for connections.
Version: '10.4.14-MariaDB-debug'  socket: '/test/MD260520-mariadb-10.4.14-linux-x86_64-dbg/socket.sock'  port: 14979  MariaDB Server
2020-06-02 10:15:26 9 [Note] InnoDB: Deferring DROP TABLE `test`.`t2`; renaming to test/#sql-ib21
2020-06-02 10:16:10 0 [Note] /test/MD260520-mariadb-10.4.14-linux-x86_64-dbg/bin/mysqld (initiated by: root[root] @ localhost []): Normal shutdown
2020-06-02 10:16:10 0 [Note] InnoDB: FTS optimize thread exiting.
2020-06-02 10:16:10 0 [Note] Event Scheduler: Purging the queue. 0 events
2020-06-02 10:16:11 0 [Note] InnoDB: Starting shutdown...
2020-06-02 10:16:11 0 [Note] InnoDB: Dumping buffer pool(s) to /test/MD260520-mariadb-10.4.14-linux-x86_64-dbg/data/ib_buffer_pool
2020-06-02 10:16:11 0 [Note] InnoDB: Buffer pool(s) dump completed at 200602 10:16:11
2020-06-02 10:16:12 0 [Note] InnoDB: Shutdown completed; log sequence number 76351; transaction id 64
2020-06-02 10:16:12 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2020-06-02 10:16:12 0 [Note] /test/MD260520-mariadb-10.4.14-linux-x86_64-dbg/bin/mysqld: Shutdown complete



 Comments   
Comment by Marko Mäkelä [ 2020-06-02 ]

I can repeat this with both settings of innodb_stats_persistent. I debugged it with rr replay as follows:

  1. logs_empty_and_mark_files_at_shutdown() was blocked by buf_page_cleaner_is_active
  2. buf_flush_page() was blocked by a page latch on the TRX_SYS page (0:6)
  3. watch -l block.debug_list.count revealed the problem:

diff --git a/storage/innobase/trx/trx0rec.cc b/storage/innobase/trx/trx0rec.cc
index 940c90ee3e3..b282bb177af 100644
--- a/storage/innobase/trx/trx0rec.cc
+++ b/storage/innobase/trx/trx0rec.cc
@@ -1934,10 +1934,9 @@ dberr_t trx_undo_report_rename(trx_t* trx, const dict_table_t* table)
 				}
 			}
 		}
-
-		mtr.commit();
 	}
 
+	mtr.commit();
 	return err;
 }

The bug was introduced to the logging of RENAME operations (MDEV-14717) by MDEV-15090. I did not test 10.3 or 10.4 (and did not investigate why the test did not fail there), but I verified the above fix on 10.5.

Comment by Roel Van de Paar [ 2020-06-02 ]

Double checked 10.3.24 (dbg) and 10.4.14 (dbg) manually, and confirmed they do not crash and that shutdown completes cleanly.

Comment by Marko Mäkelä [ 2020-06-02 ]

Roel, in your test, when the duplicate key error handling for the CREATE TABLE t2…SELECT invokes ha_innobase::delete_table(), the transaction is still open and the work-around mentioned in MDEV-21602 should kick in.

A more reliable version of the test might invoke RENAME TABLE directly while innodb_trx_rseg_n_slots_debug=1 is in effect.

A possible explanation why the test might not be effective before 10.5 might be MDEV-16678.

Generated at Thu Feb 08 09:17:20 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.