MariaDB Server / MDEV-30423

Deadlock on Replica during BACKUP STAGE BLOCK_COMMIT on XA transactions

Details

    Description

      Note: This bug fix is not complete. For a complete fix of this issue, MDEV-35110 also needs to be fixed.

      We are seeing deadlocks on the slave SQL thread during a backup, which cause the slave SQL thread to get stuck. The affected version is 10.6.11.

      show processlist;

      | 3994 | system user | | bankfrm | Slave_worker | 47515 | Waiting for prior transaction to commit | XA COMMIT ... | 0.000 |
      | 3996 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
      | 3995 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
      | 3997 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
      | 3991 | system user | | NULL | Slave_SQL | 44523 | Waiting for room in worker thread event queue | NULL | 0.000 |
      | 5114 | ...... | 10.93.99.158:52012 | NULL | Query | 0 | Optimizing | SELECT Event_schema, Event_name FROM information_schema.EVENTS WHERE Status = 'ENABLED' | 0.000 |
      | 715112 | ..oper | localhost | NULL | Query | 47515 | Waiting for backup lock | BACKUP STAGE BLOCK_COMMIT | 0.000 |
      | 724545 | ....frm | 10.93.97.49:44948 | ....frm | Query | 2291 | Waiting for backup lock | XA ROLLBACK ... | 0.000 |
      | 751381 | ....frm | 10.93.97.50:46208 | ....frm | Query | 1310 | Waiting for backup lock | XA ROLLBACK ... | 0.000 |
      | 752581 | myoper | localhost | NULL | Query | 0 | starting | show processlist | 0.000 |
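
The processlist above can be summarised by wait state to see which threads stalled together. The following is only a rough sketch: the helper `parse_rows` and the `FIELDS` labels are names chosen for this illustration (they follow the standard SHOW PROCESSLIST column order), and the rows are pasted from the output above rather than read from a live server.

```python
from collections import Counter

# Rows copied from the processlist output above. The two unrelated client
# rows (the information_schema query and "show processlist") are left out.
ROWS = """
| 3994 | system user | | bankfrm | Slave_worker | 47515 | Waiting for prior transaction to commit | XA COMMIT ... | 0.000 |
| 3996 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
| 3995 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
| 3997 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
| 3991 | system user | | NULL | Slave_SQL | 44523 | Waiting for room in worker thread event queue | NULL | 0.000 |
| 715112 | ..oper | localhost | NULL | Query | 47515 | Waiting for backup lock | BACKUP STAGE BLOCK_COMMIT | 0.000 |
| 724545 | ....frm | 10.93.97.49:44948 | ....frm | Query | 2291 | Waiting for backup lock | XA ROLLBACK ... | 0.000 |
| 751381 | ....frm | 10.93.97.50:46208 | ....frm | Query | 1310 | Waiting for backup lock | XA ROLLBACK ... | 0.000 |
"""

# Column labels for illustration, in SHOW PROCESSLIST order.
FIELDS = ("id", "user", "host", "db", "command", "time", "state", "info", "progress")


def parse_rows(text):
    """Split pipe-delimited processlist rows into dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        row = dict(zip(FIELDS, cells))
        row["time"] = int(row["time"])
        rows.append(row)
    return rows


rows = parse_rows(ROWS)
by_state = Counter(r["state"] for r in rows)

# Threads that share the longest wait time most likely stalled together.
oldest = max(r["time"] for r in rows)
stalled_together = sorted(int(r["id"]) for r in rows if r["time"] == oldest)

for state, count in by_state.most_common():
    print(count, state)
print("waiting for", oldest, "s:", stalled_together)
```

In this excerpt the four worker threads and the BACKUP STAGE BLOCK_COMMIT connection all report the same wait time of 47515 seconds, which is consistent with them having blocked one another at the same moment.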
      

      show replica status\G
       
           Connection_name: 
                     Slave_SQL_State: Slave has read all relay log; waiting for more updates
                      Slave_IO_State: Waiting for master to send event
                         Master_Host: 10.93.99.101
                         Master_User: ......
                         Master_Port: 6603
                       Connect_Retry: 10
                     Master_Log_File: bin_log.001019
                 Read_Master_Log_Pos: 30985295
                      Relay_Log_File: relay_log.000131
                       Relay_Log_Pos: 2570503
               Relay_Master_Log_File: bin_log.001019
                    Slave_IO_Running: Yes
                   Slave_SQL_Running: Yes
                     Replicate_Do_DB: 
                 Replicate_Ignore_DB: 
                  Replicate_Do_Table: 
              Replicate_Ignore_Table: 
             Replicate_Wild_Do_Table: 
         Replicate_Wild_Ignore_Table: 
                          Last_Errno: 0
                          Last_Error: 
                        Skip_Counter: 0
                 Exec_Master_Log_Pos: 2570206
                     Relay_Log_Space: 31275705
                     Until_Condition: None
                      Until_Log_File: 
                       Until_Log_Pos: 0
                  Master_SSL_Allowed: No
                  Master_SSL_CA_File: 
                  Master_SSL_CA_Path: 
                     Master_SSL_Cert: 
                   Master_SSL_Cipher: 
                      Master_SSL_Key: 
               Seconds_Behind_Master: 162
       Master_SSL_Verify_Server_Cert: No
                       Last_IO_Errno: 0
                       Last_IO_Error: 
                      Last_SQL_Errno: 0
                      Last_SQL_Error: 
         Replicate_Ignore_Server_Ids: 
                    Master_Server_Id: 2
                      Master_SSL_Crl: 
                  Master_SSL_Crlpath: 
                          Using_Gtid: Slave_Pos
                         Gtid_IO_Pos: 1-2-6301093730
             Replicate_Do_Domain_Ids: 
         Replicate_Ignore_Domain_Ids: 
                       Parallel_Mode: optimistic
                           SQL_Delay: 0
                 SQL_Remaining_Delay: NULL
             Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
                    Slave_DDL_Groups: 46
      Slave_Non_Transactional_Groups: 19
          Slave_Transactional_Groups: 23249343
                Retried_transactions: 1
                  Max_relay_log_size: 268435456
                Executed_log_entries: 86984839
           Slave_received_heartbeats: 78647
              Slave_heartbeat_period: 5.000
                      Gtid_Slave_Pos: 1-2-6301025975
      

      +------------------------------------------+
      | WhoLocksWho                              |
      +------------------------------------------+
      | Thread 715112 IS LOCKED BY Thread 715112 |
      | Thread 715112 IS LOCKED BY Thread 3994   |
      | Thread 715112 IS LOCKED BY Thread 3993   |
      | Thread 3993 IS LOCKED BY Thread 715112   |
      | Thread 3993 IS LOCKED BY Thread 3994     |
      | Thread 3993 IS LOCKED BY Thread 3993     |
      +------------------------------------------+
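
The WhoLocksWho output above can be read as a wait-for graph, where a deadlock shows up as a cycle. The sketch below treats each "A IS LOCKED BY B" row as an edge from waiter A to blocker B. It assumes that the rows where a thread is locked by itself are artifacts of the diagnostic query and can be ignored; that is an interpretation, not something the report states. The function name `find_cycle` is hypothetical.

```python
# (waiter, blocker) pairs copied from the WhoLocksWho table above.
EDGES = [
    (715112, 715112),
    (715112, 3994),
    (715112, 3993),
    (3993, 715112),
    (3993, 3994),
    (3993, 3993),
]


def find_cycle(edges):
    """Return one cycle in the wait-for graph as a list of thread ids, or None."""
    graph = {}
    for waiter, blocker in edges:
        # Assumption: self-edges are query artifacts, not real waits.
        if waiter != blocker:
            graph.setdefault(waiter, set()).add(blocker)

    def dfs(node, path):
        # Reaching a node already on the current path closes a cycle.
        if node in path:
            return path[path.index(node):] + [node]
        for blocker in sorted(graph.get(node, ())):
            found = dfs(blocker, path + [node])
            if found:
                return found
        return None

    for start in sorted(graph):
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None


cycle = find_cycle(EDGES)
print(cycle)
```

Under that assumption the only cycle is between thread 715112 (the BACKUP STAGE BLOCK_COMMIT connection) and thread 3993, each waiting on the other. Thread 3994 blocks both but waits on neither in this table. Note that thread 3993 does not appear in the processlist excerpt above.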
      

            People

              monty Michael Widenius
              pandi.gurusamy Pandikrishnan Gurusamy