Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33798

ROW base optimistic deadlock with concurrent writes on same table row and multi domain

Details

    Description

      Following Brandon advice to open a ticket after Zulip (the initial conversation was done via DM, but an official Zulip topic has been created for this since with more details (e.g. variables, show slave status, etc): link)

      Issue:
      Replication sql thread stopped in optimistic but ok in conservative

      Context:
      Concurrent writes on 2 tables with same name but different DB

      Workaround
      Moving to conservative fixed the replication sql thread regular stop

      select * from api_check;
      +----+---------------------+
      | id | last_check |
      +----+---------------------+
      | 1 | 2024-03-29 09:49:10 |
      +----+---------------------+
      1 row in set (0.001 sec)
       
      MariaDB [mixr_user]> show create table api_check;
      +-----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Table | Create Table |
      +-----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | api_check | CREATE TABLE api_check (
      id tinyint(3) unsigned NOT NULL,
      last_check datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
      PRIMARY KEY (id)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
      +-----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.000 sec)
      

      Processlist:

      | 3455587 | system user | | wpmixrgen | Slave_worker | 20 | Update_rows_log_event::find_row(-1) on table api_check | update api_check set last_check = '2024-03-29 00:34:28' where id = 1 | 0.000 |
      | 3455588 | system user | | mixr_user | Slave_worker | 20 | Update_rows_log_event::find_row(-1) on table api_check | update api_check set last_check = '2024-03-29 00:34:27' where id = 1 | 0.000 |
      |
      

      No views in the middle

      MariaDB [wpmixrgen]> show full tables like 'api_check';
      +---------------------------------+------------+
      | Tables_in_wpmixrgen (api_check) | Table_type |
      +---------------------------------+------------+
      | api_check                       | BASE TABLE |
      +---------------------------------+------------+
      1 row in set (0.002 sec)
      Database changed
      MariaDB [mixr_user]> show full tables like 'api_check';
      +---------------------------------+------------+
      | Tables_in_mixr_user (api_check) | Table_type |
      +---------------------------------+------------+
      | api_check                       | BASE TABLE |
      +---------------------------------+------------+
      1 row in set (0.001 sec)
      

      Bonus:
      Stop all slaves could stale when in such deadlock

      | 3460647 | root | 10.48.0.1:55828 | NULL | Killed | 186 | Killing slave | stop all slaves
      

      Metadata lock state:

       
      +-----------+----------------------+---------------+---------------------+--------------+----------------+
      | THREAD_ID | LOCK_MODE | LOCK_DURATION | LOCK_TYPE | TABLE_SCHEMA | TABLE_NAME |
      +-----------+----------------------+---------------+---------------------+--------------+----------------+
      | 3470443 | MDL_BACKUP_TRANS_DML | NULL | Backup lock | | |
      | 3470438 | MDL_BACKUP_TRANS_DML | NULL | Backup lock | | |
      | 3470445 | MDL_SHARED_WRITE | NULL | Table metadata lock | mixr_user | api_check |
      | 3470438 | MDL_SHARED_WRITE | NULL | Table metadata lock | mixr_user | api_check |
      | 3470442 | MDL_SHARED_WRITE | NULL | Table metadata lock | wpmixrgen | api_check |
      | 3470443 | MDL_SHARED_WRITE | NULL | Table metadata lock | wpmixrgen | api_check |
      | 3470442 | MDL_SHARED_WRITE | NULL | Table metadata lock | mysql | gtid_slave_pos |
      | 3470445 | MDL_SHARED_WRITE | NULL | Table metadata lock | mysql | gtid_slave_pos |
      | 3470444 | MDL_SHARED_WRITE | NULL | Table metadata lock | mysql | gtid_slave_pos |
      | 3470444 | MDL_SHARED_WRITE | NULL | Table metadata lock | wpmixrgen | schedule_log |
      +-----------+----------------------+---------------+---------------------+--------------+----------------+
      10 rows in set (0.001 sec)
      

      INNODB_TRX:

      trx_id: 2669046603
      trx_state: LOCK WAIT
      trx_started: 2024-03-29 15:48:18
      trx_requested_lock_id: 2669046603:27938:3:2
      trx_wait_started: 2024-03-29 15:48:18
      trx_weight: 2
      trx_mysql_thread_id: 3470443
      trx_query: update api_check set last_check = '2024-03-29 04:44:09' where id = 1
      trx_operation_state: starting index read
      trx_tables_in_use: 1
      trx_tables_locked: 1
      trx_lock_structs: 2
      trx_lock_memory_bytes: 1128
      trx_rows_locked: 1
      trx_rows_modified: 0
      trx_concurrency_tickets: 0
      trx_isolation_level: REPEATABLE READ
      trx_unique_checks: 1
      trx_foreign_key_checks: 1
      trx_last_foreign_key_error: NULL
      trx_is_read_only: 0
      trx_autocommit_non_locking: 0
      ---
      trx_id: 2669046602
                       trx_state: LOCK WAIT
                     trx_started: 2024-03-29 15:48:18
           trx_requested_lock_id: 2669046602:27887:3:2
                trx_wait_started: 2024-03-29 15:48:18
                      trx_weight: 2
             trx_mysql_thread_id: 3470438
                       trx_query: update `api_check` set `last_check` = '2024-03-29 04:44:13' where `id` = 1
             trx_operation_state: starting index read
               trx_tables_in_use: 1
               trx_tables_locked: 1
                trx_lock_structs: 2
           trx_lock_memory_bytes: 1128
                 trx_rows_locked: 1
               trx_rows_modified: 0
         trx_concurrency_tickets: 0
             trx_isolation_level: REPEATABLE READ
               trx_unique_checks: 1
          trx_foreign_key_checks: 1
      trx_last_foreign_key_error: NULL
                trx_is_read_only: 0
      trx_autocommit_non_locking: 0
      
      

      InnoDB Status:

      MariaDB [information_schema]> show engine innodb status\G
      *************************** 1. row ***************************
        Type: InnoDB
        Name:
      Status:
      =====================================
      2024-03-29 15:51:38 0x7f696c107640 INNODB MONITOR OUTPUT
      =====================================
      Per second averages calculated from the last 1 seconds
      -----------------
      BACKGROUND THREAD
      -----------------
      srv_master_thread loops: 2 srv_active, 0 srv_shutdown, 3704919 srv_idle
      srv_master_thread log flush and writes: 3704803
      ----------
      SEMAPHORES
      ----------
      ------------
      TRANSACTIONS
      ------------
      Trx id counter 2669046610
      Purge done for trx's n:o < 2669046590 undo n:o < 0 state: running
      History list length 2
      LIST OF TRANSACTIONS FOR EACH SESSION:
      ---TRANSACTION 2669046609, ACTIVE 50 sec starting index read
      mysql tables in use 1, locked 1
      LOCK WAIT 2 lock struct(s), heap size 1128, 1 row lock(s)
      MariaDB thread id 3470438, OS thread handle 140090653918784, query id 380459332 Update_rows_log_event::find_row(-1) on table `api_check`
      update `api_check` set `last_check` = '2024-03-29 04:44:13' where `id` = 1
      ------- TRX HAS BEEN WAITING 49554452 ns FOR THIS LOCK TO BE GRANTED:
      RECORD LOCKS space id 27887 page no 3 n bits 8 index PRIMARY of table `mixr_user`.`api_check` trx id 2669046609 lock_mode X locks rec but not gap waiting
      Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
       0: len 1; hex 01; asc  ;;
       1: len 6; hex 00009f166b2e; asc     k.;;
       2: len 7; hex 350000c0030110; asc 5      ;;
       3: len 5; hex 99b2fa4b0e; asc    K ;;
       
      ------------------
      ---TRANSACTION (0x7f6aa4544380), not started
      0 lock struct(s), heap size 1128, 0 row lock(s)
      ---TRANSACTION (0x7f6aa4543880), not started
      0 lock struct(s), heap size 1128, 0 row lock(s)
      ---TRANSACTION (0x7f6aa4542d80), not started
      0 lock struct(s), heap size 1128, 0 row lock(s)
      ---TRANSACTION 2669046575, ACTIVE (PREPARED) 550 sec
      3 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 2
      MariaDB thread id 3470442, OS thread handle 140090649925184, query id 380457775 Waiting for prior transaction to commit
      ---TRANSACTION 2669046608, ACTIVE 50 sec starting index read
      mysql tables in use 1, locked 1
      LOCK WAIT 2 lock struct(s), heap size 1128, 1 row lock(s)
      MariaDB thread id 3470443, OS thread handle 140090649310784, query id 380459331 Update_rows_log_event::find_row(-1) on table `api_check`
      update `api_check` set `last_check` = '2024-03-29 04:44:09' where `id` = 1
      ------- TRX HAS BEEN WAITING 49554452 ns FOR THIS LOCK TO BE GRANTED:
      RECORD LOCKS space id 27938 page no 3 n bits 8 index PRIMARY of table `wpmixrgen`.`api_check` trx id 2669046608 lock_mode X locks rec but not gap waiting
      Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
       0: len 1; hex 01; asc  ;;
       1: len 6; hex 00009f166b2f; asc     k/;;
       2: len 7; hex 360000c0020110; asc 6      ;;
       3: len 5; hex 99b2fa4b0d; asc    K ;;
       
      ------------------
      ---TRANSACTION 2669046585, ACTIVE (PREPARED) 550 sec
      2 lock struct(s), heap size 1128, 0 row lock(s), undo log entries 2
      MariaDB thread id 3470444, OS thread handle 140090649003584, query id 380457783 Waiting for prior transaction to commit
      ---TRANSACTION 2669046574, ACTIVE (PREPARED) 550 sec
      3 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 2
      MariaDB thread id 3470445, OS thread handle 140090648389184, query id 380457773 Waiting for prior transaction to commit
      ---TRANSACTION (0x7f6aa453f680), not started
      0 lock struct(s), heap size 1128, 0 row lock(s)
      ---TRANSACTION (0x7f6aa453eb80), not started
      0 lock struct(s), heap size 1128, 0 row lock(s)
      --------
      FILE I/O
      --------
      Pending flushes (fsync): 0
      73352 OS file reads, 300770765 OS file writes, 2527252 OS fsyncs
      0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
      ---
      LOG
      ---
      Log sequence number 3210682942432
      Log flushed up to   3210682942432
      Pages flushed up to 3210682942432
      Last checkpoint at  3210682942408
      ----------------------
      BUFFER POOL AND MEMORY
      ----------------------
      Total large memory allocated 3808428032
      Dictionary memory allocated 23218440
      Buffer pool size   228160
      Free buffers       129541
      Database pages     98619
      Old database pages 36423
      Modified db pages  0
      Percent of dirty pages(LRU & free pages): 0.000
      Max dirty pages percent: 40.000
      Pending reads 0
      Pending writes: LRU 0, flush list 0
      Pages made young 170619, not young 32960930
      0.00 youngs/s, 0.00 non-youngs/s
      Pages read 72866, created 30503592, written 115666167
      0.00 reads/s, 0.00 creates/s, 0.00 writes/s
      Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
      Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
      LRU len: 98619, unzip_LRU len: 0
      I/O sum[0]:cur[0], unzip sum[0]:cur[0]
      --------------
      ROW OPERATIONS
      --------------
      0 read views open inside InnoDB
      state: sleeping
      ----------------------------
      END OF INNODB MONITOR OUTPUT
      ============================
       
      1 row in set (0.001 sec)
      

      Attachments

        Issue Links

          Activity

            The master is 10.3.32-MariaDB and possibly affected by MDEV-21953

            stephane@skysql.com VAROQUI Stephane added a comment - The master is 10.3.32-MariaDB and possibly affected by MDEV-21953

            After upgrading the master to 10.11.7 we still observe the deadlock

            stephane@skysql.com VAROQUI Stephane added a comment - After upgrading the master to 10.11.7 we still observe the deadlock

            Thanks to Stephane's helpful assistance with reproducing this on a test server, I found the root cause for this.

            The problem occurs when there is a conflict between transactions that are replicated in different domain_id and/or different master connections. The deadlock kill mechanism is currently disabled when either of these differ.

            In the test environment, this caused a hang because T1 and T2 both modified the same table, but had different domain id. T1 would do a row lock wait on T2, and T2 a wait_for_prior_commit on T1 (GTID was disabled on the test server).

            I believe the production issue may be slightly different due to using GTID, but can hang with transactions T1,T2 in one domain/master connection and transactions U1, U2 in another:

            T1 row low wait on U2
            U2 wait_for_prior_commit on U1
            U1 row lock wait on T2
            T2 wait_for_prior_commit on T1

            My idea for a fix is that we can extend to deadlock kill T2 and U2 in these cases. In general, we will deadlock kill a transaction if it blocks a transaction in a different domain/master connection, if that transaction is being speculatively replicated (SPECULATE_OPTIMISTIC).

            This should prevent these kinds of hangs, because a blocking transaction will now be deadlock killed whenever a cycle occurs where wait_for_prior_commit is called while holding row locks.

            In the above example, killing both T2 and U2 is overly conservative (just one of them would be sufficient). But conflicts between different domains are expected to be rare (as they are technically not supported / are not replicated in deterministic order), so this I think is not an issue.

            I have pushed an initial patch along these ideas to branch bb-10.11-MDEV-33798-knielsen-pkgtest

            knielsen Kristian Nielsen added a comment - Thanks to Stephane's helpful assistance with reproducing this on a test server, I found the root cause for this. The problem occurs when there is a conflict between transactions that are replicated in different domain_id and/or different master connections. The deadlock kill mechanism is currently disabled when either of these differ. In the test environment, this caused a hang because T1 and T2 both modified the same table, but had different domain id. T1 would do a row lock wait on T2, and T2 a wait_for_prior_commit on T1 (GTID was disabled on the test server). I believe the production issue may be slightly different due to using GTID, but can hang with transactions T1,T2 in one domain/master connection and transactions U1, U2 in another: T1 row low wait on U2 U2 wait_for_prior_commit on U1 U1 row lock wait on T2 T2 wait_for_prior_commit on T1 My idea for a fix is that we can extend to deadlock kill T2 and U2 in these cases. In general, we will deadlock kill a transaction if it blocks a transaction in a different domain/master connection, if that transaction is being speculatively replicated (SPECULATE_OPTIMISTIC). This should prevent these kinds of hangs, because a blocking transaction will now be deadlock killed whenever a cycle occurs where wait_for_prior_commit is called while holding row locks. In the above example, killing both T2 and U2 is overly conservative (just one of them would be sufficient). But conflicts between different domains are expected to be rare (as they are technically not supported / are not replicated in deterministic order), so this I think is not an issue. I have pushed an initial patch along these ideas to branch bb-10.11- MDEV-33798 -knielsen-pkgtest
            danblack Daniel Black added a comment -

            Packages containing this fix can be obtained using the the .repo / .list files under the appropriate platform under the URL https://ci.mariadb.org/46168/

            A container containing this fix can be obtained as: quay.io/mariadb-foundation/mariadb-devel:10.11-mdev-33798-knielsen-pkgtest

            Feedback for knielsen would be much appreciated.

            danblack Daniel Black added a comment - Packages containing this fix can be obtained using the the .repo / .list files under the appropriate platform under the URL https://ci.mariadb.org/46168/ A container containing this fix can be obtained as: quay.io/mariadb-foundation/mariadb-devel:10.11-mdev-33798-knielsen-pkgtest Feedback for knielsen would be much appreciated.

            @Daniel Black in production looks good so far let's wait 1 week or 2, so great to have quay

            stephane@skysql.com VAROQUI Stephane added a comment - @Daniel Black in production looks good so far let's wait 1 week or 2, so great to have quay

            I've pushed a fix for this to 10.6, but we can keep it open while we wait and see if the patch runs stable in Stephane's environment.

            knielsen Kristian Nielsen added a comment - I've pushed a fix for this to 10.6, but we can keep it open while we wait and see if the patch runs stable in Stephane's environment.

            People

              knielsen Kristian Nielsen
              stephane@skysql.com VAROQUI Stephane
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.