Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23888

Potential server hang on replication with InnoDB




      On an InnoDB lock conflict, the call to thd_lock_wait_timeout() could cause a server hang. In 5.5, the code appeared to be still fine in srv_suspend_mysql_thread():

      	/* InnoDB system transactions (such as the purge, and
      	incomplete transactions that are being rolled back after crash
      	recovery) will use the global value of
      	innodb_lock_wait_timeout, because trx->mysql_thd == NULL. */
      	lock_wait_timeout = thd_lock_wait_timeout(trx->mysql_thd);

      In MySQL 5.6.2, this function was replaced with lock_wait_suspend_thread(), which is holding InnoDB mutexes while calling the function that may potentially acquire LOCK_global_system_variables:

      	lock_wait_timeout = thd_lock_wait_timeout(trx->mysql_thd);

      At some later point, the code was cleaned up so that the above call appears to occur while no InnoDB mutexes are being held.

      In MySQL 5.7.3, the bug was probably fixed outside InnoDB. A claim in the commit message about InnoDB does hold as of that point of time:

      According to InnoDB team this doesn't happen in the current code, as InnoDB tries to avoid accessing session variables while holding internal locks, but this has happened in the past.

      This change was never merged to MariaDB server.

      This bug was flagged by SAFE_MUTEX in the MDEV-21452 branch. Specifically, an inconsistent ordering of lock_sys.mutex, LOCK_global_system_variables and LOCK_commit_ordered was identified.

      The warnings are avoided if we cache the value to trx_t in innobase_trx_init(), like we do for a few other session variables. There does not seem to be any other possible fix. Even though the caller of thd_lock_wait_timeout() is not holding any InnoDB mutexes, we have the problem that the caller of InnoDB transaction commit may be holding LOCK_commit_ordered.

      A deadlock with 3 threads ought to be possible, with one thread committing and another being in a lock wait, and a third thread doing something that involves at least one of the 3 mutexes.

      With a small patch to the debug build, we can see that LOCK_commit_ordered can indeed be held during a transaction commit (which would acquire trx_t::mutex or lock_sys_t::mutex):

      diff --git a/storage/innobase/trx/trx0trx.cc b/storage/innobase/trx/trx0trx.cc
      index 13b4efb973b..3c917568d8f 100644
      --- a/storage/innobase/trx/trx0trx.cc
      +++ b/storage/innobase/trx/trx0trx.cc
      @@ -1671,6 +1671,8 @@ trx_erase_lists(
      +extern mysql_mutex_t LOCK_commit_ordered;
       Commits a transaction in memory. */
      @@ -1685,6 +1687,7 @@ trx_commit_in_memory(
       				/*!< in: true if serialisation log was
       				written */
      +	mysql_mutex_assert_not_owner(&LOCK_commit_ordered);
       	trx->must_flush_log_later = false;
       	if (trx_is_autocommit_non_locking(trx)) {

      10.2 295e2d500b31819422c97ad77beb6226b961c207 with the above patch

      CURRENT_TEST: binlog.binlog_commit_wait
      mysqltest: At line 48: query 'reap' failed: 2013: Lost connection to MySQL server during query
      #7  0x0000560141e54aa5 in trx_commit_in_memory (trx=0x7f28350e8268, mtr=0x7f2834471a88, serialised=true) at /mariadb/10.2o/storage/innobase/trx/trx0trx.cc:1690
      #8  trx_commit_low (trx=0x7f28350e8268, mtr=<optimized out>) at /mariadb/10.2o/storage/innobase/trx/trx0trx.cc:1964
      #9  0x0000560141e55314 in trx_commit (trx=0x7f28350e8268) at /mariadb/10.2o/storage/innobase/trx/trx0trx.cc:1988
      #10 0x0000560141e567f4 in trx_commit_for_mysql (trx=0x7f28350e8268) at /mariadb/10.2o/storage/innobase/trx/trx0trx.cc:2197
      #11 0x0000560141ca056b in innobase_commit_low (trx=0x7f28350e8268) at /mariadb/10.2o/storage/innobase/handler/ha_innodb.cc:4420
      #12 innobase_commit_ordered_2 (trx=0x7f28350e8268, thd=<optimized out>) at /mariadb/10.2o/storage/innobase/handler/ha_innodb.cc:4544
      #13 0x0000560141ca0d98 in innobase_commit_ordered (hton=<optimized out>, thd=0x7f27d4000d28, all=true) at /mariadb/10.2o/storage/innobase/handler/ha_innodb.cc:4604
      #14 0x0000560141bcb1ea in TC_LOG::run_commit_ordered (this=<optimized out>, thd=0x7f27d4000d28, all=<optimized out>) at /mariadb/10.2o/sql/log.cc:8838
      #15 0x0000560141bcae47 in MYSQL_BIN_LOG::trx_group_commit_leader (this=<optimized out>, leader=<optimized out>) at /mariadb/10.2o/sql/log.cc:8065
      #16 0x0000560141bc95b3 in MYSQL_BIN_LOG::write_transaction_to_binlog_events (this=0x5601428ee288 <mysql_bin_log>, entry=0x7f28344721c8) at /mariadb/10.2o/sql/log.cc:7641
      #17 0x0000560141bc955c in MYSQL_BIN_LOG::write_transaction_to_binlog (this=0x5601428ee288 <mysql_bin_log>, thd=<optimized out>, cache_mngr=0x0, end_ev=0x7f283a2b1db1 <__GI_raise+321>, all=false, using_stmt_cache=80, using_trx_cache=<optimized out>) at /mariadb/10.2o/sql/log.cc:7315
      #18 0x0000560141bd0a0b in binlog_flush_cache (thd=0x7f27d4000d28, cache_mngr=0x7f27d401dce8, end_ev=0x7f2834472308, all=true, using_stmt=true, using_trx=true) at /mariadb/10.2o/sql/log.cc:1743
      #19 0x0000560141bced24 in binlog_commit_flush_xid_caches (thd=0x7f27d4000d28, cache_mngr=0x7f27d401dce8, all=true, xid=32) at /mariadb/10.2o/sql/log.cc:1851
      #20 MYSQL_BIN_LOG::log_and_order (this=<optimized out>, thd=0x7f27d4000d28, xid=32, all=true, need_prepare_ordered=<optimized out>, need_commit_ordered=<optimized out>) at /mariadb/10.2o/sql/log.cc:9621
      #21 0x0000560141adfb4c in ha_commit_trans (thd=0x7f27d4000d28, all=<optimized out>) at /mariadb/10.2o/sql/handler.cc:1496
      #22 0x0000560141a15a1d in trans_commit (thd=0x7f27d4000d28) at /mariadb/10.2o/sql/transaction.cc:307
      #23 0x000056014191f273 in mysql_execute_command (thd=<optimized out>) at /mariadb/10.2o/sql/sql_parse.cc:5299
      #24 0x0000560141919807 in mysql_parse (thd=0x7f27d4000d28, rawbuf=0x7f27d4010870 "COMMIT", length=<optimized out>, parser_state=0x7f28344736f0, is_com_multi=<optimized out>, is_next_command=false) at /mariadb/10.2o/sql/sql_parse.cc:7733
      #25 0x0000560141915743 in dispatch_command (command=<optimized out>, thd=0x7f27d4000d28, packet=0x7f27d4010870 "COMMIT", packet_length=0, is_com_multi=<optimized out>, is_next_command=<optimized out>) at /mariadb/10.2o/sql/sql_parse.cc:1823
      #26 0x0000560141918105 in do_command (thd=0x7f27d4000d28) at /mariadb/10.2o/sql/sql_parse.cc:1377
      #27 0x0000560141a06cda in do_handle_one_connection (connect=<optimized out>) at /mariadb/10.2o/sql/sql_connect.cc:1336
      #28 0x0000560141a06a53 in handle_one_connection (arg=0x56014526d058) at /mariadb/10.2o/sql/sql_connect.cc:1241
      #29 0x00005601420c1c16 in pfs_spawn_thread (arg=0x5601451d5508) at /mariadb/10.2o/storage/perfschema/pfs.cc:1869
      #30 0x00007f283ad17ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #31 0x00007f283a373eaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

      Because LOCK_commit_ordered is being acquired outside InnoDB, we can only fix this potential hang by avoiding the thd_lock_wait_timeout() in the lock conflict resolution. It turns out the SAFE_MUTEX warnings will be silenced if we cache the THDVAR() value in trx_t, like we do for a few other session variables in innobase_trx_init().

      This ordering violation might explain some server hangs in replication, such as MDEV-10067.


          Issue Links



              marko Marko Mäkelä
              marko Marko Mäkelä
              0 Vote for this issue
              1 Start watching this issue