Details
- Issue Type: Bug
- Status: Closed
- Priority: Blocker
- Resolution: Incomplete
- Affects Version/s: 10.6.11
- None
Description
Note: This bug fix is not complete. To get a complete fix for this issue, MDEV-35110 also needs to be fixed.
We are seeing deadlocks on the slave SQL thread during backup, which cause the slave SQL thread to get stuck. The affected version is 10.6.11.
show processlist;
| Id | User | Host | db | Command | Time | State | Info | Progress |
| 3994 | system user | | bankfrm | Slave_worker | 47515 | Waiting for prior transaction to commit | XA COMMIT ... | 0.000 |
| 3996 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
| 3995 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
| 3997 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
| 3991 | system user | | NULL | Slave_SQL | 44523 | Waiting for room in worker thread event queue | NULL | 0.000 |
| 5114 | ...... | 10.93.99.158:52012 | NULL | Query | 0 | Optimizing | SELECT Event_schema, Event_name FROM information_schema.EVENTS WHERE Status = 'ENABLED' | 0.000 |
| 715112 | ..oper | localhost | NULL | Query | 47515 | Waiting for backup lock | BACKUP STAGE BLOCK_COMMIT | 0.000 |
| 724545 | ....frm | 10.93.97.49:44948 | ....frm | Query | 2291 | Waiting for backup lock | XA ROLLBACK ... | 0.000 |
| 751381 | ....frm | 10.93.97.50:46208 | ....frm | Query | 1310 | Waiting for backup lock | XA ROLLBACK ... | 0.000 |
| 752581 | myoper | localhost | NULL | Query | 0 | starting | show processlist | 0.000 |
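To make the pattern in the process list easier to see, the rows can be grouped by state. The following is a minimal sketch, not part of the original report: the row tuples are copied by hand from the output above, and the helper names (`stuck_threads`, `count_by_state`) and the 600-second threshold are assumptions chosen for illustration.

```python
from collections import Counter

# (Id, Command, Time in seconds, State), copied from the SHOW PROCESSLIST output above.
ROWS = [
    (3994, "Slave_worker", 47515, "Waiting for prior transaction to commit"),
    (3996, "Slave_worker", 47515, "Waiting for prior transaction to commit"),
    (3995, "Slave_worker", 47515, "Waiting for prior transaction to commit"),
    (3997, "Slave_worker", 47515, "Waiting for prior transaction to commit"),
    (3991, "Slave_SQL", 44523, "Waiting for room in worker thread event queue"),
    (5114, "Query", 0, "Optimizing"),
    (715112, "Query", 47515, "Waiting for backup lock"),
    (724545, "Query", 2291, "Waiting for backup lock"),
    (751381, "Query", 1310, "Waiting for backup lock"),
    (752581, "Query", 0, "starting"),
]


def stuck_threads(rows, min_seconds=600):
    """Return rows that have been in a 'Waiting for ...' state for at least min_seconds.

    The threshold is an arbitrary illustrative value, not a server setting.
    """
    return [r for r in rows if r[2] >= min_seconds and r[3].startswith("Waiting for")]


def count_by_state(rows):
    """Count how many threads are in each state."""
    return Counter(r[3] for r in rows)


if __name__ == "__main__":
    for thread_id, command, seconds, state in stuck_threads(ROWS):
        print(f"{thread_id:>7} {command:<12} {seconds:>6}s  {state}")
    print(count_by_state(ROWS).most_common())
```

Grouped this way, the replication workers all wait on a prior commit while the backup connection and the two XA ROLLBACK clients wait on the backup lock.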
Show replica status\G
Connection_name:
Slave_SQL_State: Slave has read all relay log; waiting for more updates
Slave_IO_State: Waiting for master to send event
Master_Host: 10.93.99.101
Master_User: ......
Master_Port: 6603
Connect_Retry: 10
Master_Log_File: bin_log.001019
Read_Master_Log_Pos: 30985295
Relay_Log_File: relay_log.000131
Relay_Log_Pos: 2570503
Relay_Master_Log_File: bin_log.001019
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 2570206
Relay_Log_Space: 31275705
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 162
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 1-2-6301093730
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: optimistic
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Slave_DDL_Groups: 46
Slave_Non_Transactional_Groups: 19
Slave_Transactional_Groups: 23249343
Retried_transactions: 1
Max_relay_log_size: 268435456
Executed_log_entries: 86984839
Slave_received_heartbeats: 78647
Slave_heartbeat_period: 5.000
Gtid_Slave_Pos: 1-2-6301025975
+------------------------------------------+
| WhoLocksWho                              |
+------------------------------------------+
| Thread 715112 IS LOCKED BY Thread 715112 |
| Thread 715112 IS LOCKED BY Thread 3994   |
| Thread 715112 IS LOCKED BY Thread 3993   |
| Thread 3993 IS LOCKED BY Thread 715112   |
| Thread 3993 IS LOCKED BY Thread 3994     |
| Thread 3993 IS LOCKED BY Thread 3993     |
+------------------------------------------+
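The WhoLocksWho rows can be read as a wait-for graph, where "Thread A IS LOCKED BY Thread B" means A waits for B. The sketch below looks for a cycle in that graph with a depth-first search. It is an illustration only: the query that produced WhoLocksWho is not shown in the report, and dropping the self-referencing rows is an assumption made here so that only waits between distinct threads are considered.

```python
import re

WHO_LOCKS_WHO = """
Thread 715112 IS LOCKED BY Thread 715112
Thread 715112 IS LOCKED BY Thread 3994
Thread 715112 IS LOCKED BY Thread 3993
Thread 3993 IS LOCKED BY Thread 715112
Thread 3993 IS LOCKED BY Thread 3994
Thread 3993 IS LOCKED BY Thread 3993
"""

LINE_RE = re.compile(r"Thread (\d+) IS LOCKED BY Thread (\d+)")


def parse_edges(text):
    """Build {waiter: {holders}} from the table rows, skipping self-references."""
    edges = {}
    for waiter, holder in LINE_RE.findall(text):
        w, h = int(waiter), int(holder)
        if w != h:  # assumption: self-references are not real waits
            edges.setdefault(w, set()).add(h)
    return edges


def find_cycle(edges):
    """Return a list of thread ids that wait on each other in a cycle, or None."""
    path, finished = [], set()

    def visit(node):
        if node in finished:
            return None
        if node in path:
            return path[path.index(node):]
        path.append(node)
        for nxt in sorted(edges.get(node, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        finished.add(node)
        return None

    for start in sorted(edges):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(parse_edges(WHO_LOCKS_WHO)))
```

On the rows above, the search reports threads 3993 and 715112 waiting on each other: 715112 is the BACKUP STAGE BLOCK_COMMIT connection, while 3993 is not listed in the process list excerpt.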
Attachments
Issue Links
- is part of
  - MDEV-35110 Deadlock on Replica during BACKUP STAGE BLOCK_COMMIT on XA transactions (Closed)
- relates to
  - MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel replication (Closed)
  - MDEV-33921 Replication fails when XA transactions are used where the slave has replicate_do_db set and the client has touched a different database when running DML such as inserts. (Closed)
- split to
  - MDEV-30459 XID_cache_element can be modified after deletion (Open)
Activity
Field | Original Value | New Value |
---|---|---|
Link |
This issue blocks |
Summary | Deadlock on Slave during BACKUP STAGE BLOCK_COMMIT on XA transactions | Deadlock on Replica during BACKUP STAGE BLOCK_COMMIT on XA transactions |
Assignee | Andrei Elkin [ elkin ] |
Link |
This issue relates to |
Description |
We are seeing deadlocks on slave sql thread, during the backup, it causes the slave_sql_thread to stuck.
show processlist;
==================================================
| 3994 | system user | | bankfrm | Slave_worker | 47515 | Waiting for prior transaction to commit | XA COMMIT X'00000000000000000000ffff0a5d63a10605f71c63bef4d3002d89d931',X'00000000000000000000ffff0a | 0.000 |
| 3996 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
| 3995 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
| 3997 | system user | | NULL | Slave_worker | 47515 | Waiting for prior transaction to commit | NULL | 0.000 |
| 3991 | system user | | NULL | Slave_SQL | 44523 | Waiting for room in worker thread event queue | NULL | 0.000 |
| 5114 | repuser | 10.93.99.158:52012 | NULL | Query | 0 | Optimizing | SELECT Event_schema, Event_name FROM information_schema.EVENTS WHERE Status = 'ENABLED' | 0.000 |
| 715112 | myoper | localhost | NULL | Query | 47515 | Waiting for backup lock | BACKUP STAGE BLOCK_COMMIT | 0.000 |
| 724545 | bankfrm | 10.93.97.49:44948 | bankfrm | Query | 2291 | Waiting for backup lock | XA ROLLBACK 0x00000000000000000000ffff0a5d639e1845417663bef4d2002d4ffd31,0x00000000000000000000ffff0 | 0.000 |
| 751381 | bankfrm | 10.93.97.50:46208 | bankfrm | Query | 1310 | Waiting for backup lock | XA ROLLBACK 0x00000000000000000000FFFF0A5D61255072B02963BEF55C002F14D131,0x00000000000000000000FFFF0 | 0.000 |
| 752581 | myoper | localhost | NULL | Query | 0 | starting | show processlist | 0.000 |
==================================================
Show replica status\G
Connection_name:
Slave_SQL_State: Slave has read all relay log; waiting for more updates
Slave_IO_State: Waiting for master to send event
Master_Host: 10.93.99.101
Master_User: repuser
Master_Port: 6603
Connect_Retry: 10
Master_Log_File: bin_log.001019
Read_Master_Log_Pos: 30985295
Relay_Log_File: relay_log.000131
Relay_Log_Pos: 2570503
Relay_Master_Log_File: bin_log.001019
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 2570206
Relay_Log_Space: 31275705
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 162
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 1-2-6301093730
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: optimistic
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Slave_DDL_Groups: 46
Slave_Non_Transactional_Groups: 19
Slave_Transactional_Groups: 23249343
Retried_transactions: 1
Max_relay_log_size: 268435456
Executed_log_entries: 86984839
Slave_received_heartbeats: 78647
Slave_heartbeat_period: 5.000
Gtid_Slave_Pos: 1-2-6301025975
==================================================
+------------------------------------------+
| WhoLocksWho                              |
+------------------------------------------+
| Thread 715112 IS LOCKED BY Thread 715112 |
| Thread 715112 IS LOCKED BY Thread 3994   |
| Thread 715112 IS LOCKED BY Thread 3993   |
| Thread 3993 IS LOCKED BY Thread 715112   |
| Thread 3993 IS LOCKED BY Thread 3994     |
| Thread 3993 IS LOCKED BY Thread 3993     |
+------------------------------------------+
=============================================
***********************************************************************************************************
20230112_033302 : show engine innodb status
*************************** 1. row ***************************
Type: InnoDB
Name:
Status:
=====================================
2023-01-12 03:33:02 0x7fac80a1d700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 0 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 0 srv_active, 0 srv_shutdown, 476268 srv_idle
srv_master_thread log flush and writes: 476243
----------
SEMAPHORES
----------
------------
TRANSACTIONS
------------
Trx id counter 12672207462
Purge done for trx's n:o < 12672207461 undo n:o < 0 state: running but idle
History list length 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 12672207325, ACTIVE 129 sec
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
MariaDB thread id 3995, OS thread handle 140387326506752, query id 65321900 Waiting for prior transaction to commit
---TRANSACTION 12672207307, ACTIVE 129 sec
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
MariaDB thread id 3997, OS thread handle 140387327047424, query id 65321892 Waiting for prior transaction to commit
---TRANSACTION 12672207456, ACTIVE 129 sec
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
MariaDB thread id 3992, OS thread handle 140378919474944, query id 65321958 Waiting for prior transaction to commit
---TRANSACTION 12672207461, ACTIVE 129 sec
mysql tables in use 1, locked 1
1 lock struct(s), heap size 1128, 0 row lock(s), undo log entries 1
MariaDB thread id 3993, OS thread handle 140387326236416, query id 65321959 Waiting for backup lock
---TRANSACTION 12672207049, ACTIVE (PREPARED) 129 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672207040, ACTIVE (PREPARED) 129 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672207037, ACTIVE (PREPARED) 129 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672207034, ACTIVE (PREPARED) 129 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672207185, ACTIVE (PREPARED) 129 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672207029, ACTIVE (PREPARED) 129 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672207025, ACTIVE (PREPARED) 129 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION (0x7fb2b42a6c80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 12672206985, ACTIVE (PREPARED) 130 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672206978, ACTIVE (PREPARED) 130 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672207428, ACTIVE (PREPARED) 129 sec
3 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 2
MariaDB thread id 3996, OS thread handle 140387327317760, query id 65321941 Waiting for prior transaction to commit
---TRANSACTION 12672206714, ACTIVE (PREPARED) 130 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672206696, ACTIVE (PREPARED) 130 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672207266, ACTIVE (PREPARED) 129 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672206663, ACTIVE (PREPARED) 130 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672206637, ACTIVE (PREPARED) 130 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672206543, ACTIVE (PREPARED) 130 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672206501, ACTIVE (PREPARED) 130 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672206498, ACTIVE (PREPARED) 130 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION 12672207022, ACTIVE (PREPARED) 129 sec recovered trx
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
---TRANSACTION (0x7fb2b42a5680), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fb2b42a4b80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
--------
FILE I/O
--------
Pending flushes (fsync) log: 0; buffer pool: 0
425519 OS file reads, 34974886 OS file writes, 34851466 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 56, seg size 58, 0 merges
merged operations:
insert 0, delete mark 0, delete 0
discarded operations:
insert 0, delete mark 0, delete 0
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 6625444758596
Log flushed up to 6625444758596
Pages flushed up to 6625270847160
Last checkpoint at 6625270847160
0 pending log flushes, 0 pending chkp writes
34849934 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 17314086912
Dictionary memory allocated 106833720
Buffer pool size 1038336
Free buffers 592737
Database pages 445599
Old database pages 164468
Modified db pages 20616
Percent of dirty pages(LRU & free pages): 1.985
Max dirty pages percent: 50.000
Pending reads 0
Pending writes: LRU 0, flush list 0
Pages made young 877, not young 19
0.00 youngs/s, 0.00 non-youngs/s
Pages read 425248, created 35045, written 124879
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 445599, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 read views open inside InnoDB
Process ID=0, Main thread ID=0, state: sleeping
Number of rows inserted 3578883, updated 10127835, deleted 3569689, read 112264653
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
Number of system rows inserted 23249404, updated 0, deleted 23249342, read 23249343
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================ |
Link |
This issue blocks |
Attachment | perf_mon_20230112_033302.log [ 67793 ] |
Component/s | Replication [ 14976 ] | |
Component/s | Replication [ 10100 ] | |
Key |
|
|
Issue Type | Task [ 3 ] | Bug [ 1 ] |
Project | MariaDB Server [ 10000 ] | MariaDB Enterprise [ 11500 ] |
Affects Version/s | 10.6.11-6 [ 28428 ] |
Fix Version/s | 10.6 [ 24027 ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Component/s | Replication [ 10100 ] | |
Component/s | Replication [ 14976 ] | |
Fix Version/s | 10.6 [ 24028 ] | |
Fix Version/s | 10.6 [ 24027 ] | |
Key |
|
|
Affects Version/s | 10.6.11-6 [ 28428 ] | |
Project | MariaDB Enterprise [ 11500 ] | MariaDB Server [ 10000 ] |
Affects Version/s | 10.6.11 [ 28441 ] |
Priority | Major [ 3 ] | Blocker [ 1 ] |
Assignee | Andrei Elkin [ elkin ] | Brandon Nesterenko [ JIRAUSER48702 ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Assignee | Brandon Nesterenko [ JIRAUSER48702 ] | Andrei Elkin [ elkin ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Fix Version/s | 10.5.19 [ 28511 ] | |
Fix Version/s | 10.6.12 [ 28513 ] | |
Fix Version/s | 10.7.8 [ 28515 ] | |
Fix Version/s | 10.8.7 [ 28517 ] | |
Fix Version/s | 10.9.5 [ 28519 ] | |
Fix Version/s | 10.10.3 [ 28521 ] | |
Fix Version/s | 10.6 [ 24028 ] | |
Resolution | Fixed [ 1 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Link | This issue split to MDEV-30459 [ MDEV-30459 ] |
Link | This issue blocks MENT-1713 [ MENT-1713 ] |
Link |
This issue relates to |
Zendesk Related Tickets | 115837 |
Assignee | Andrei Elkin [ elkin ] | Michael Widenius [ monty ] |
Resolution | Fixed [ 1 ] | |
Status | Closed [ 6 ] | Stalled [ 10000 ] |
Link |
This issue is part of |
Resolution | Incomplete [ 4 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Howdy Brandon.
The patch is pushed as {{012c8120399 HEAD -> bb-10.5-andrei }}; so far it has passed
only the regression tests.
Please start the review soon; meanwhile I'll be watching the BB processing.
Cheers,
Andrei