[MDEV-20867] Perform careful review of "Server crashes with BACKUP STAGE and FLUSH TABLE table_name" Created: 2019-10-20  Updated: 2023-05-16  Resolved: 2020-02-12

Status: Closed
Project: MariaDB Server
Component/s: Backup
Fix Version/s: 10.5.1, 10.4.13

Type: Task Priority: Critical
Reporter: Sergey Vojtovich Assignee: Sergey Vojtovich
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
Problem/Incident
is caused by MDEV-18067 Server crash in backup_end or asserti... Closed
is caused by MDEV-18068 Assertion `this == ticket->get_ctx()'... Closed
is caused by MDEV-18069 Server hang or crash in MDL_lock::inc... Closed

 Description   

Revision: https://github.com/MariaDB/server/commit/c2e0a0b17544bf984e21094252528110e5a323f6

It definitely has a few minor glitches, doesn't seem to go well inline with the design.



 Comments   
Comment by Sergey Vojtovich [ 2019-10-22 ]

The following diff crashes the server:

diff --git a/mysql-test/main/backup_interaction.test b/mysql-test/main/backup_interaction.test
index 8ac905c..cdf30d3 100644
--- a/mysql-test/main/backup_interaction.test
+++ b/mysql-test/main/backup_interaction.test
@@ -509,8 +509,10 @@ BACKUP STAGE END;
 
 CREATE OR REPLACE TABLE t1 (pk INT PRIMARY KEY, f INT);
 BACKUP STAGE START;
+BACKUP LOCK t1;
 FLUSH TABLE t1 FOR EXPORT;
 UNLOCK TABLES;
+BACKUP LOCK t1;
 BACKUP STAGE END;
 DROP TABLE t1;

Comment by Sergey Vojtovich [ 2019-10-22 ]

FLUSH TABLE t1 FOR EXPORT is allowed under BACKUP STAGE while LOCK TABLE t1 READ is forbidden.
It sounds inconsistent because the former takes stronger lock.

Comment by Sergey Vojtovich [ 2019-10-22 ]

mdl.cc change doesn't go inline with what we have in THD::leave_locked_tables_mode(). To be inline with that code the fix should look as following:

diff --git a/sql/sql_class.cc b/sql/sql_class.cc
index 1ffb7fe..ab4d7d5 100644
--- a/sql/sql_class.cc
+++ b/sql/sql_class.cc
@@ -5591,6 +5591,8 @@ void THD::leave_locked_tables_mode()
       when leaving LTM.
     */
     global_read_lock.set_explicit_lock_duration(this);
+    if (backup_flush_ticket)
+      mdl_context.set_lock_duration(backup_flush_ticket, MDL_EXPLICIT);
     /* Also ensure that we don't release metadata locks for open HANDLERs. */
     if (handler_tables_hash.records)
       mysql_ha_set_explicit_lock_duration(this);

Comment by Michael Widenius [ 2020-02-11 ]

The above patch will not work, as backup_flush_ticket is set once when backup is started and never reset. At end of backup it will point to something that doesn't exist anymore.

I agree with Svoj's first comment that if one has a backup lock, one should not be able to do FLUSH TABLES FOR EXPORT when one has backup stage's active. A better fix is to give an error for this command if thd->read_only_protection() is set.

Comment by Michael Widenius [ 2020-02-11 ]

Need a better fix

Comment by Sergey Vojtovich [ 2020-02-11 ]

Which patch was reviewed, this one?

commit 5c508f09bc10b3d24638142eb71967b7a1fd36f4
Author: Sergey Vojtovich <svoj@mariadb.org>
Date:   Fri Nov 1 19:18:47 2019 +0400
 
    MDEV-20867 - Perform careful review of "Server crashes with BACKUP STAGE and FLUSH TABLE table_name"
 
    Reverted original patch (c2e0a0b).
 
    For consistency with "LOCK TABLE <table_name> READ" and "FLUSH TABLES
    WITH READ LOCK", which are forbidden under "BACKUP STAGE", forbid "FLUSH
    TABLE <table_name> FOR EXPORT" and "FLUSH TABLE <table_name> WITH READ
    LOCK" as well.
 
    It'd allow consistent fixes for problems like MDEV-18643, which make
    BACKUP hardly useful.

Comment by Michael Widenius [ 2020-02-12 ]

I reviewed the patch that was part of the Jira entry (2 line patch for sql_class.cc)
Will now review the full commit

Generated at Thu Feb 08 09:02:47 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.