[MDEV-21953] deadlock between BACKUP STAGE BLOCK_COMMIT and parallel replication Created: 2020-03-16  Updated: 2023-01-23  Resolved: 2020-07-21

Status: Closed
Project: MariaDB Server
Component/s: mariabackup, Replication
Affects Version/s: 10.4.12
Fix Version/s: 10.4.14, 10.5.5

Type: Bug Priority: Critical
Reporter: Rick Pizzi Assignee: Michael Widenius
Resolution: Fixed Votes: 6
Labels: None

Issue Links:
Relates
relates to MDEV-23586 Mariabackup: GTID saved for replicati... Closed
relates to MDEV-30423 Deadlock on Replica during BACKUP STA... Closed

 Description   

It looks like there is a race condition between BACKUP STAGE BLOCK_COMMIT in mariabackup and parallel replication.

Replication got deadlocked and we needed to kill the backup process in order to recover.
Show processlist and show engine innodb below.

Rick

+----------+--------------+--------------------+------------------------+--------------+--------+-----------------------------------------------+----------------------------------------------------------------------------------------------+----------+
| Id       | User         | Host               | db                     | Command      | Time   | State                                         | Info                                                                                         | Progress |
+----------+--------------+--------------------+------------------------+--------------+--------+-----------------------------------------------+----------------------------------------------------------------------------------------------+----------+
|        1 | system user  |                    | NULL                   | Daemon       |   NULL | InnoDB purge coordinator                      | NULL                                                                                         |    0.000 |
|        2 | system user  |                    | NULL                   | Daemon       |   NULL | InnoDB purge worker                           | NULL                                                                                         |    0.000 |
|        3 | system user  |                    | NULL                   | Daemon       |   NULL | InnoDB purge worker                           | NULL                                                                                         |    0.000 |
|        4 | system user  |                    | NULL                   | Daemon       |   NULL | InnoDB purge worker                           | NULL                                                                                         |    0.000 |
|        5 | system user  |                    | NULL                   | Daemon       |   NULL | InnoDB shutdown handler                       | NULL                                                                                         |    0.000 |
|       10 | system user  |                    | NULL                   | Slave_IO     | 913176 | Waiting for master to send event              | NULL                                                                                         |    0.000 |
|       12 | system user  |                    | NULL                   | Slave_worker | 913176 | Waiting for work from SQL thread              | NULL                                                                                         |    0.000 |
|       13 | system user  |                    | NULL                   | Slave_worker | 913176 | Waiting for work from SQL thread              | NULL                                                                                         |    0.000 |
|       14 | system user  |                    | NULL                   | Slave_worker | 913176 | Waiting for work from SQL thread              | NULL                                                                                         |    0.000 |
|       15 | system user  |                    | NULL                   | Slave_worker | 913176 | Waiting for work from SQL thread              | NULL                                                                                         |    0.000 |
|       16 | system user  |                    | NULL                   | Slave_worker | 913176 | Waiting for work from SQL thread              | NULL                                                                                         |    0.000 |
|       17 | system user  |                    | NULL                   | Slave_worker | 913176 | Waiting for work from SQL thread              | NULL                                                                                         |    0.000 |
|       18 | system user  |                    | NULL                   | Slave_worker | 913176 | Waiting for work from SQL thread              | NULL                                                                                         |    0.000 |
|       19 | system user  |                    | NULL                   | Slave_worker | 913176 | Waiting for work from SQL thread              | NULL                                                                                         |    0.000 |
|       20 | system user  |                    | NULL                   | Slave_worker |  31214 | Waiting for backup lock                       | NULL                                                                                         |    0.000 |
|       21 | system user  |                    | NULL                   | Slave_worker |  31215 | Waiting for backup lock                       | UPDATE `customer_schema`.`heartbeat` SET ts='2020-03-16T01:12:54.009860' WHERE id='1' |    0.000 |
|       22 | system user  |                    | NULL                   | Slave_worker |  31215 | Waiting for backup lock                       | NULL                                                                                         |    0.000 |
|       23 | system user  |                    | NULL                   | Slave_worker |  31215 | Waiting for prior transaction to commit       | NULL                                                                                         |    0.000 |
|       25 | system user  |                    | NULL                   | Slave_worker |  31215 | Waiting for backup lock                       | NULL                                                                                         |    0.000 |
|       26 | system user  |                    | NULL                   | Slave_worker |  31215 | Waiting for backup lock                       | NULL                                                                                         |    0.000 |
|       24 | system user  |                    | NULL                   | Slave_worker |  31215 | Waiting for prior transaction to commit       | NULL                                                                                         |    0.000 |
|       27 | system user  |                    | NULL                   | Slave_worker |  31215 | Waiting for backup lock                       | NULL                                                                                         |    0.000 |
|       11 | system user  |                    | NULL                   | Slave_SQL    |  31224 | Waiting for room in worker thread event queue | NULL                                                                                         |    0.000 |
|       63 | monyogmon    | 192.168.4.5:32729  | NULL                   | Sleep        |      0 |                                               | NULL                                                                                         |    0.000 |
|     2473 | newrelic     | 127.0.0.1:39359    | NULL                   | Sleep        |      5 |                                               | NULL                                                                                         |    0.000 |
|     5259 | monyogmon    | 192.168.4.5:32790  | NULL                   | Sleep        |     18 |                                               | NULL                                                                                         |    0.000 |
| 29385185 | mariabackup  | localhost          | NULL                   | Query        |  31215 | Waiting for backup lock                       | BACKUP STAGE BLOCK_COMMIT                                                                    |    0.000 |
| 29941894 | user | 192.168.4.40:21990 | customer_schema | Sleep        |      0 |                                               | NULL                                                                                         |    0.000 |
| 31892219 | root         | localhost          | mysql                  | Query        |   2552 | Waiting for backup lock                       | CREATE USER IF NOT EXISTS 'user'@'192.168.4.40'                                     |    0.000 |
| 31994422 | root         | localhost          | NULL                   | Sleep        |   1282 |                                               | NULL                                                                                         |    0.000 |
| 31994463 | mariadbadmin | localhost          | NULL                   | Query        |      0 | Init                                          | show processlist                                                                             |    0.000 |
+----------+--------------+--------------------+------------------------+--------------+--------+-----------------------------------------------+----------------------------------------------------------------------------------------------+----------+

---TRANSACTION 99732954594, ACTIVE (PREPARED) 31143 sec
6 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 2
MySQL thread id 24, OS thread handle 139941516424960, query id 2210114407 Waiting for prior transaction to commit
---TRANSACTION 421513941321144, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 99732954650, ACTIVE 31143 sec
3 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 2
MySQL thread id 26, OS thread handle 139941516629760, query id 2210114591 Waiting for backup lock
---TRANSACTION 99732954630, ACTIVE 31143 sec
3 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 3
MySQL thread id 25, OS thread handle 139941516220160, query id 2210114496 Waiting for backup lock
---TRANSACTION 99732954619, ACTIVE 31143 sec
3 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 3
MySQL thread id 22, OS thread handle 139941517039360, query id 2210114460 Waiting for backup lock
---TRANSACTION 99732954598, ACTIVE (PREPARED) 31143 sec
2 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 2
MySQL thread id 23, OS thread handle 139941516834560, query id 2210114412 Waiting for prior transaction to commit
---TRANSACTION 99732954623, ACTIVE 31143 sec
3 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 2
MySQL thread id 27, OS thread handle 139941516015360, query id 2210114468 Waiting for backup lock
---TRANSACTION 99732954626, ACTIVE 31143 sec
2 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 4451
MySQL thread id 20, OS thread handle 139941517448960, query id 2210114475 Waiting for backup lock
---TRANSACTION 421513941291352, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421513941287096, not started
0 lock struct(s), heap size 1136, 0 row lock(s)



 Comments   
Comment by Rick Pizzi [ 2020-07-10 ]

It appears BACKUP LOCKS implementation has serious bugs, this is affecting multiple customers of ours.

Please have a look, in the meantime we are reverting everybody to good old FTWRL.

Thanks
Rick

Comment by Rick Pizzi [ 2020-07-10 ]

Looking at source code, FTWRL has been removed completely....

 --no-backup-locks   This option controls if backup locks should be used
                      instead of FLUSH TABLES WITH READ LOCK on the backup
                      stage. The option has no effect when backup locks are not
                      default, disable with --no-backup-locks.

The above option has no meaning anymore.

So we have no workaround for this issue.

Comment by Rick Pizzi [ 2020-07-10 ]

I would suggest to reintroduce FTWRL as an option.... like it used to be in 10.3

Comment by Rick Pizzi [ 2020-07-13 ]

Here's the all threads backtrace while the problem is occurring

     16 epoll_wait,io_poll_wait,get_event,worker_main,pfs_spawn_thread,start_thread,clone
     10 __io_getevents_0_4,LinuxAIOHandler::collect,LinuxAIOHandler::poll,os_aio_handler,fil_aio_wait,io_handler_thread,start_thread,clone
      3 pthread_cond_wait,os_event_wait_low,srv_resume_thread[clone.prop.71],srv_worker_thread,start_thread,clone
      3 pthread_cond_wait,os_event_wait_low,buf_flush_page_cleaner_worker,start_thread,clone
      3 pthread_cond_timedwait,MDL_wait::timed_wait,MDL_context::acquire_lock,ha_commit_trans,trans_commit,Xid_log_event::do_apply_event,apply_event_and_update_pos_apply,rpt_handle_event[clone.isra.272],handle_rpl_parallel_thread,pfs_spawn_thread,start_thread,clone
      1 pthread_cond_wait,wait_for_commit::wait_for_prior_commit2,ha_commit_one_phase,ha_commit_trans,trans_commit,Xid_log_event::do_apply_event,apply_event_and_update_pos_apply,rpt_handle_event[clone.isra.272],handle_rpl_parallel_thread,pfs_spawn_thread,start_thread,clone
      1 pthread_cond_wait,thd_destructor_proxy,pfs_spawn_thread,start_thread,clone
      1 pthread_cond_wait,rpl_parallel_entry::choose_thread,rpl_parallel::do_event,handle_slave_sql,pfs_spawn_thread,start_thread,clone
      1 pthread_cond_wait,os_event_wait_low,srv_purge_coordinator_thread,start_thread,clone
      1 pthread_cond_wait,os_event_wait_low,buf_resize_thread,start_thread,clone
      1 pthread_cond_wait,os_event_wait_low,buf_dump_thread,start_thread,clone
      1 pthread_cond_wait,handle_slave_background,pfs_spawn_thread,start_thread,clone
      1 pthread_cond_timedwait,timer_thread,pfs_spawn_thread,start_thread,clone
      1 pthread_cond_timedwait,timer_handler,pfs_spawn_thread,start_thread,clone
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,srv_monitor_thread,start_thread,clone
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,srv_error_monitor_thread,start_thread,clone
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,lock_wait_timeout_thread,start_thread,clone
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,ib_wqueue_timedwait,fts_optimize_thread,start_thread,clone
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,dict_stats_thread,start_thread,clone
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,buf_flush_page_cleaner_coordinator,start_thread,clone
      1 pthread_cond_timedwait,my_service_thread_sleep,ma_checkpoint_background,pfs_spawn_thread,start_thread,clone
      1 pthread_cond_timedwait,MDL_wait::timed_wait,MDL_context::acquire_lock,MDL_context::upgrade_shared_lock,run_backup_stage,mysql_execute_command,mysql_parse,dispatch_command,do_command,tp_callback,worker_main,pfs_spawn_thread,start_thread,clone
      1 pthread_cond_timedwait,get_event,worker_main,pfs_spawn_thread,start_thread,clone
      1 poll,vio_io_wait,vio_socket_io_wait,vio_read,vio_read_buff,my_real_read,my_net_read_packet_reallen,cli_safe_read_reallen,handle_slave_io,pfs_spawn_thread,start_thread,clone
      1 poll,handle_connections_sockets,mysqld_main,__libc_start_main,_start
      1 nanosleep,os_thread_sleep,srv_master_thread,start_thread,clone
      1 nanosleep,os_thread_sleep,btr_defragment_thread,start_thread,clone
      1 do_sigwaitinfo,__GI___sigwaitinfo,signal_hand,pfs_spawn_thread,start_thread,clone

show processlist;
+------+--------------+-----------------------+--------+--------------+--------+-----------------------------------------------+---------------------------+----------+
| Id   | User         | Host                  | db     | Command      | Time   | State                                         | Info                      | Progress |
+------+--------------+-----------------------+--------+--------------+--------+-----------------------------------------------+---------------------------+----------+
|    3 | system user  |                       | NULL   | Daemon       |   NULL | InnoDB purge worker                           | NULL                      |    0.000 |
|    1 | system user  |                       | NULL   | Daemon       |   NULL | InnoDB purge coordinator                      | NULL                      |    0.000 |
|    2 | system user  |                       | NULL   | Daemon       |   NULL | InnoDB purge worker                           | NULL                      |    0.000 |
|    4 | system user  |                       | NULL   | Daemon       |   NULL | InnoDB purge worker                           | NULL                      |    0.000 |
|    5 | system user  |                       | NULL   | Daemon       |   NULL | InnoDB shutdown handler                       | NULL                      |    0.000 |
|   52 | system user  |                       | NULL   | Slave_IO     | 870885 | Waiting for master to send event              | NULL                      |    0.000 |
|   54 | system user  |                       | NULL   | Slave_worker |   2506 | Waiting for backup lock                       | NULL                      |    0.000 |
|   55 | system user  |                       | NULL   | Slave_worker |   2506 | Waiting for prior transaction to commit       | NULL                      |    0.000 |
|   56 | system user  |                       | NULL   | Slave_worker |   2506 | Waiting for backup lock                       | NULL                      |    0.000 |
|   57 | system user  |                       | NULL   | Slave_worker |   2506 | Waiting for backup lock                       | NULL                      |    0.000 |
|   53 | system user  |                       | NULL   | Slave_SQL    |   3559 | Waiting for room in worker thread event queue | NULL                      |    0.000 |
|  180 | monyog_rdba  | 10.2.0.5:47098        | NULL   | Sleep        |      0 |                                               | NULL                      |    0.000 |
|  181 | monyog_rdba  | 10.2.0.5:47104        | NULL   | Sleep        |    125 |                                               | NULL                      |    0.000 |
| 1237 | user123      | 10.90.1.100:59360     | one-db | Sleep        |    275 |                                               | NULL                      |    0.000 |
| 1238 | user123      | 10.90.1.100:59363     | one-db | Sleep        |    275 |                                               | NULL                      |    0.000 |
| 1247 | mariadbadmin | localhost             | NULL   | Sleep        |   9766 |                                               | NULL                      |    0.000 |
| 1251 | user123      | 10.90.1.100:52294     | one-db | Sleep        |    164 |                                               | NULL                      |    0.000 |
| 1292 | backupuser   | localhost             | NULL   | Query        |   2506 | Waiting for backup lock                       | BACKUP STAGE BLOCK_COMMIT |    0.000 |
| 1298 | devlite      | 192.168.150.114:64120 | one-db | Sleep        |      8 |                                               | NULL                      |    0.000 |
| 1303 | mariadbadmin | localhost             | NULL   | Query        |      0 | Init                                          | show processlist          |    0.000 |
+------+--------------+-----------------------+--------+--------------+--------+-----------------------------------------------+---------------------------+----------+
20 rows in set (0.000 sec)

------------
TRANSACTIONS
------------
Trx id counter 37772378328
Purge done for trx's n:o < 37772378274 undo n:o < 0 state: running but idle
History list length 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421184685828312, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421184685824056, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421184685819800, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421184685794264, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421184685815544, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421184685798520, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421184685790008, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421184685811288, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 37772378221, ACTIVE 2608 sec
2 lock struct(s), heap size 1128, 0 row lock(s), undo log entries 2
MySQL thread id 54, OS thread handle 139698558646016, query id 1086641617 Waiting for backup lock
---TRANSACTION 37772378171, ACTIVE 2608 sec
2 lock struct(s), heap size 1128, 0 row lock(s), undo log entries 2
MySQL thread id 55, OS thread handle 139698555881216, query id 1086641623 Waiting for prior transaction to commit
---TRANSACTION 37772378220, ACTIVE 2608 sec
2 lock struct(s), heap size 1128, 0 row lock(s), undo log entries 2
MySQL thread id 56, OS thread handle 139709649602304, query id 1086641619 Waiting for backup lock
---TRANSACTION 37772378265, ACTIVE 2608 sec
2 lock struct(s), heap size 1128, 0 row lock(s), undo log entries 2
MySQL thread id 57, OS thread handle 139698561103616, query id 1086641621 Waiting for backup lock
---TRANSACTION 421184685772984, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421184685785752, not started
0 lock struct(s), heap size 1128, 0 row lock(s)

select @@slave_parallel_mode, @@slave_parallel_threads;
+-----------------------+--------------------------+
| @@slave_parallel_mode | @@slave_parallel_threads |
+-----------------------+--------------------------+
| conservative          |                        4 |
+-----------------------+--------------------------+
1 row in set (0.000 sec)

Comment by Michael Widenius [ 2020-07-17 ]

I checked the latest 10.4.14 code and the comment for the option no-backup-locks is:
--no-backup-locks This option controls if backup locks should be used
instead of FLUSH TABLES WITH READ LOCK on the backup
stage. The option has no effect when backup locks are not
supported by the server. This option is enabled by
default, disable with --no-backup-locks.

In other words, if the --no-backup-locks option is used or the server doesn't support backup locks, then FLUSH TABLES WITH READ LOCK is used.

The old comment tried to say the same thing, but the above updated comment is hopefully more clear.

There are no plans to remove FLUSH TABLES WITH READ LOCK support

Comment by Rick Pizzi [ 2020-07-17 ]

monty I checked the source code and FTWRL has been removed long ago it seems.

Only the option you mention remained, which does nothing.

Comment by Michael Widenius [ 2020-07-20 ]

The issue was:
T1, a parallel slave worker thread, is waiting for another worker thread to
commit. While waiting, it has the MDL_BACKUP_COMMIT lock.
T2, working for mariabackup, is doing BACKUP STAGE BLOCK_COMMIT and blocks
all commits.
This causes a deadlock as the thread T1 is waiting for can't commit and T2 can't never get the
MDL LOCK.

Fixed by moving locking of MDL_BACKUP_COMMIT from ha_commit_trans() to
commit_one_phase_2()

Comment by Michael Widenius [ 2020-07-21 ]

Merging & re-testing
Pushed into 10.4

Comment by Marko Mäkelä [ 2020-07-21 ]

While merging this to 10.5, I omitted the changes to sql_class.cc:

diff --git a/sql/sql_class.cc b/sql/sql_class.cc
index 40e606425c5..15088148e02 100644
--- a/sql/sql_class.cc
+++ b/sql/sql_class.cc
@@ -1383,7 +1383,11 @@ void THD::update_all_stats()
 void THD::init_for_queries()
 {
   set_time(); 
-  ha_enable_transaction(this,TRUE);
+  /*
+    We don't need to call ha_enable_transaction() as we can't have
+    any active transactions that has to be commited
+  */
+  transaction.on= TRUE;
 
   reset_root_defaults(mem_root, variables.query_alloc_block_size,
                       variables.query_prealloc_size);

With the above change (or transaction->on instead of transaction.on), replication XA tests would crash in 10.5. I believed that the change is not wanted in 10.5 due to MDEV-22531 and related changes. All tests passed with that omission.

Comment by Marko Mäkelä [ 2020-08-10 ]

An after-merge fix was part of the 10.5.5 release.

Comment by Rick Pizzi [ 2020-08-12 ]

marko is this fixed in 10.4.14? The release notes do not mention it.

Comment by Rick Pizzi [ 2020-08-26 ]

This fix has introduced a regression and now the GTID position is wrong when using 10.4.14. Filing a new bug.

Comment by Marko Mäkelä [ 2020-08-26 ]

Yes, the change is included of the MariaDB 10.4.14 and 10.5.5 releases.

Generated at Thu Feb 08 09:11:05 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.