[MDEV-24410] Deadlock between freeing and allocating undo pages Created: 2020-12-15  Updated: 2020-12-15  Resolved: 2020-12-15

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6
Fix Version/s: 10.6.0

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: hang, performance
Environment:

Debian GNU/Linux Sid, GCC 10.2.1, CMAKE_BUILD_TYPE=RelWithDebInfo CMAKE_CXX_FLAGS=-DSRW_LOCK_DUMMY


Issue Links:
Relates
relates to MDEV-24142 rw_lock_t has unnecessarily complex w... Closed
relates to MDEV-24167 InnoDB unnecessarily uses complex rw-... Closed

 Description   

The work done in MDEV-24142 and MDEV-24167 seems to have revealed a pre-existing deadlock condition in the InnoDB undo log code.

I observed a hang where a DML statement was waiting for a U-lock (before MDEV-24142 it was called SX-lock) on a newly allocated undo log page, while holding the allocation bitmap page U-lock. At the same time, the undo page latch was held by another thread that was freeing undo log pages:

10.6 43d3dad11468bf45d2098ebca70002c49104f298

#8  0x000055b32ae2734d in buf_page_get_low (page_id={m_id = 0}, zip_size=<optimized out>, rw_latch=<optimized out>, guess=<optimized out>, mode=10, mtr=<optimized out>, err=<optimized out>, 
    allow_ibuf_merge=<optimized out>) at /mariadb/10.6/storage/innobase/buf/buf0buf.cc:3280
#9  0x000055b32ae7132c in xdes_get_descriptor (space=space@entry=0x55b32cf53c28, offset=offset@entry=397, xdes=xdes@entry=0x7f0984ff8008, mtr=mtr@entry=0x7f0984ff8790)
    at /mariadb/10.6/storage/innobase/include/buf0types.h:127
#10 0x000055b32ae73d27 in fseg_free_step (header=0x7f09fb34803c "", mtr=mtr@entry=0x7f0984ff8790) at /mariadb/10.6/storage/innobase/fsp/fsp0fsp.cc:2665
#11 0x000055b32add9370 in trx_purge_free_segment (hdr_addr=<optimized out>, hdr_addr=<optimized out>, rseg=0x55b32cf8b8c8) at /mariadb/10.6/storage/innobase/trx/trx0purge.cc:412
#12 trx_purge_truncate_rseg_history (limit=<optimized out>, rseg=
      @0x55b32cf8b8c8: {id = 47, mutex = {m_impl = {m_lock_word = {<std::__atomic_base<unsigned int>> = {_M_i = 1}, <No data fields>}, m_event = 0x55b32cf8b9c8, m_policy = {m_filename = 0x55b32b146990 "/mariadb/10.6/storage/innobase/trx/trx0rseg.cc", m_line = 403, m_count = {m_spins = 0, m_waits = 0, m_calls = 0, m_enabled = false}, m_id = LATCH_ID_REDO_RSEG}}, m_ptr = 0x7f0a05ae7d40}, space = 0x55b32cf53c28, page_no = 219, curr_size = 2, undo_list = {count = 0, start = 0x0, end = 0x0, node = &trx_undo_t::undo_list}, undo_cached = {count = 0, start = 0x0, end = 0x0, node = &trx_undo_t::undo_list}, old_insert_list = {count = 0, start = 0x0, end = 0x0, node = &trx_undo_t::undo_list}, last_page_no = 4294967295, last_offset = 12164, last_commit = 89679, needs_purge = true, trx_ref_count = 0, skip_allocation = false})
    at /mariadb/10.6/storage/innobase/trx/trx0purge.cc:495
#13 trx_purge_truncate_history () at /mariadb/10.6/storage/innobase/trx/trx0purge.cc:572

I reliably but nondeterministically repeated this for a few tests of a --big-test run. On one occasion, the following tests hung:

CURRENT_TEST: versioning.auto_increment
CURRENT_TEST: innodb.innodb-alter-table
CURRENT_TEST: gcol.innodb_virtual_basic
CURRENT_TEST: innodb_gis.kill_server
CURRENT_TEST: stress.ddl_innodb

After I applied the following patch, no tests would hang, except one:

diff --git a/storage/innobase/include/trx0sys.h b/storage/innobase/include/trx0sys.h
index ef23f620c49..014d6816d2f 100644
--- a/storage/innobase/include/trx0sys.h
+++ b/storage/innobase/include/trx0sys.h
@@ -159,7 +159,7 @@ from older MySQL or MariaDB versions. */
 
 /** The number of rollback segments; rollback segment id must fit in
 the 7 bits reserved for it in DB_ROLL_PTR. */
-#define	TRX_SYS_N_RSEGS			128
+#define	TRX_SYS_N_RSEGS			1
 /** Maximum number of undo tablespaces (not counting the system tablespace) */
 #define TRX_SYS_MAX_UNDO_SPACES		(TRX_SYS_N_RSEGS - 1)
 
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index 0aedc9c0bd4..99754fb2fa2 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -1960,7 +1960,7 @@ bool recv_sys_t::parse(lsn_t checkpoint_lsn, store_t *store, bool apply)
           if (!srv_is_undo_tablespace(space_id) ||
               page_no != SRV_UNDO_TABLESPACE_SIZE_IN_PAGES)
             goto record_corrupted;
-          static_assert(UT_ARR_SIZE(truncated_undo_spaces) ==
+          static_assert(UT_ARR_SIZE(truncated_undo_spaces) >=
                         TRX_SYS_MAX_UNDO_SPACES, "compatibility");
           truncated_undo_spaces[space_id - srv_undo_space_id_start]=
             { recovered_lsn, page_no };

This suggests to me that algorithm in trx_purge_free_segment() actually relies on rseg->mutex to prevent deadlocks. With the default setting (all 128 rollback segments being located in the system tablespace), we can get this kind of deadlocks. I do not think that the deadlocks are caused by MDEV-24142 or MDEV-24167; the simpler rw-lock implementation is merely improving the chances of this occurring.

With the above patch, one test still hung in both redundant and compact combinations:

CURRENT_TEST: innodb.temp_table_savepoint
mysqltest: At line 107: query 'reap' failed: 2013: Lost connection to MySQL server during query

Each time, this would occur between two ROLLBACK. One thread was executing row_undo_try_truncate().

If I compile with the default settings, then everything would seem to work. I only got these hangs with the settings CMAKE_BUILD_TYPE=RelWithDebInfo and CMAKE_CXX_FLAGS=-'DSRW_LOCK_DUMMY -Og -march=native -mtune=native.

My full build invocation was as follows:

cmake -DCMAKE_C{,XX}_FLAGS='-DSRW_LOCK_DUMMY -Og -march=native -mtune=native' -DWITH_DBUG_TRACE=OFF -DCONC_WITH_{UNITTEST,SSL}=OFF -DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=RelWithDebInfo -DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO -DWITH_{SAFEMALLOC,NUMA}=OFF -DWITH_{SSL,ZLIB}=system -G Ninja /mariadb/10.6
ninja



 Comments   
Comment by Marko Mäkelä [ 2020-12-15 ]

Sorry, this could be a false alarm, or a problem with 10.6 only. The debug build is triggering a lot of assertion failures when the SRW_LOCK_DUMMY implementation is being used.

Comment by Marko Mäkelä [ 2020-12-15 ]

It turns out that this problem was caused by the very commit that I indicated: rw_tryrdlock() and rw_trywrlock() would return 0 on success and nonzero on failure, and not the other way around.

My bad, I should have tested SRW_LOCK_DUMMY also in the debug build. On our CI systems, everything worked, because we only have Windows and Linux, not any BSD or anything else.

Generated at Thu Feb 08 09:29:47 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.