  1. MariaDB Server
  2. MDEV-33136

GCF-1060 test hangs when BF-abort logic mistreats transactions with explicit MDL locks

Details

    Description

      The following else-if branch of wsrep_handle_mdl_conflict():

          else if (granted_thd->lex->sql_command == SQLCOM_FLUSH ||
                   granted_thd->mdl_context.has_explicit_locks())
          {
            WSREP_DEBUG("BF thread waiting for FLUSH");
      

      does not take into account that regular transactions may also hold explicit MDL locks.

      Example output:

      2023-12-28 17:44:17 2 [Note] WSREP: Wsrep_high_priority_service::apply_toi: 1831
      2023-12-28 17:44:17 2 [Note] WSREP: assigned new next query and  trx id: 4379
      T@4    : 17:44:17.637771 Query_log_event::do_apply_event: query: TRUNCATE TABLE t1
      T@4    : 17:44:17.637797 reset_current_stmt_binlog_format_row: debug: temporary_tables: no, in_sub_stmt: no, system_thread: SYSTEM_THREAD_SLAVE_SQL
      2023-12-28 17:44:17 2 [Note] WSREP: MDL conflict
      schema:  test
      request: (2     seqno 1831  wsrep (toi, exec, committed) cmd 0 8    TRUNCATE TABLE t1)
      granted: (237   seqno -1    wsrep (local, exec, executing) cmd 3 5  INSERT INTO t1 VALUE (4, 'z'))
      2023-12-28 17:44:17 2 [Note] WSREP: MDL ticket: type: MDL_SHARED_WRITE space: TABLE db: test name: t1 (Waiting for table metadata lock)
      2023-12-28 17:44:17 2 [Note] WSREP: BF thread waiting for FLUSH
      2023-12-28 17:44:17 2 [Note] WSREP: MDL ticket: type: MDL_SHARED_WRITE space: TABLE db: test name: t1 (Waiting for table metadata lock)
      

      With the debug output in that branch changed to print the SQL query of the granted thread, the log becomes:

      2023-12-28 20:09:11 2 [Note] WSREP: Wsrep_high_priority_service::apply_toi: 6599
      2023-12-28 20:09:11 2 [Note] WSREP: assigned new next query and  trx id: 16942
      T@4    : 20:09:11.867236 Query_log_event::do_apply_event: query: TRUNCATE TABLE t1
      T@4    : 20:09:11.867266 reset_current_stmt_binlog_format_row: debug: temporary_tables: no, in_sub_stmt: no, system_thread: SYSTEM_THREAD_SLAVE_SQL
      2023-12-28 20:09:11 2 [Note] WSREP: MDL conflict
      schema:  test
      request: (2     seqno 6599  wsrep (toi, exec, committed) cmd 0 8    TRUNCATE TABLE t1)
      granted: (907   seqno -1    wsrep (local, exec, executing) cmd 3 5  INSERT INTO t1 VALUE (4, 'z'))
      2023-12-28 20:09:11 2 [Note] WSREP: MDL ticket: type: MDL_SHARED_WRITE space: TABLE db: test name: t1 (Waiting for table metadata lock)
      2023-12-28 20:09:11 2 [Note] WSREP: BF thread waiting for INSERT INTO t1 VALUE (4, 'z')
      2023-12-28 20:09:11 2 [Note] WSREP: MDL ticket: type: MDL_SHARED_WRITE space: TABLE db: test name: t1 (Waiting for table metadata lock)
      

      In this case no BF-abort happens: the DML operation INSERT INTO t1 VALUE (4, 'z') holds explicit MDL locks and is therefore handled as if it were FLUSH TABLES, which it is not. This prevents the operation from being aborted, so the BF thread keeps waiting for it.
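
The misclassification can be reproduced in isolation with a small model of the quoted condition. This is only a sketch: GrantedThd, SqlCommand, BfAction and both decide_* functions are hypothetical stand-ins, not the server's THD or MDL_context, and the other branches of wsrep_handle_mdl_conflict() are left out. The narrowed variant is shown purely for contrast and is an assumption, not the fix applied to the server.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the server types (assumption).
enum class SqlCommand { Insert, Flush, Truncate };

struct GrantedThd {
  SqlCommand sql_command;   // models granted_thd->lex->sql_command
  bool has_explicit_locks;  // models granted_thd->mdl_context.has_explicit_locks()
};

enum class BfAction { WaitForLockHolder, AbortLockHolder };

// Mirrors only the condition quoted in the report: a FLUSH statement OR any
// holder of explicit MDL locks makes the BF thread wait instead of aborting.
BfAction decide_as_reported(const GrantedThd &granted) {
  if (granted.sql_command == SqlCommand::Flush || granted.has_explicit_locks)
    return BfAction::WaitForLockHolder;
  return BfAction::AbortLockHolder;
}

// Contrast case (assumption, not the upstream fix): wait only when the lock
// holder really is executing FLUSH.
BfAction decide_narrowed(const GrantedThd &granted) {
  if (granted.sql_command == SqlCommand::Flush)
    return BfAction::WaitForLockHolder;
  return BfAction::AbortLockHolder;
}

// The situation from the log: a local INSERT that holds explicit MDL locks.
void check_reported_case() {
  const GrantedThd insert_with_locks{SqlCommand::Insert, true};
  // The reported condition makes the BF thread wait for a plain DML statement.
  assert(decide_as_reported(insert_with_locks) == BfAction::WaitForLockHolder);
  // With the narrowed condition the same INSERT would be a BF-abort candidate.
  assert(decide_narrowed(insert_with_locks) == BfAction::AbortLockHolder);
}
```

In this model a genuine FLUSH is still waited for under both variants; only the treatment of a non-FLUSH statement that holds explicit locks differs.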

      The reason why a DML operation may hold explicit locks is an open question.

      Attachments

        1. backtrace_all
          48 kB
        2. mysqld.1.err
          2.48 MB
        3. mysqld.2.err
          2.33 MB

            People

              sysprg Julius Goryavsky
              denis.protivensky Denis Protivensky