[MDEV-18643] 'BACKUP STAGE BLOCK_DDL ' failed: 1213: Deadlock found Created: 2019-02-19  Updated: 2023-05-16

Status: Open
Project: MariaDB Server
Component/s: Locking
Affects Version/s: 10.4.6
Fix Version/s: 10.4

Type: Bug Priority: Critical
Reporter: Matthias Leich Assignee: Michael Widenius
Resolution: Unresolved Votes: 0
Labels: affects-tests

Attachments: File bs_2.log     File bs_2.tgz     File ml_31_M.test     HTML File prt    
Issue Links:
Blocks
blocks MDEV-18213 Unexpected ER_LOCK_DEADLOCK upon BACK... Open

 Description   

Problem found during RQG testing with Mariabackup.
 
1. Connection con_backup executes 
     BACKUP STAGE START;
     BACKUP STAGE FLUSH;
2. Some observing connection "sees"
     CONNECTION  DB  COMMAND STATE   INFO    STAGE   MAX_STAGE   PROGRESS    EXAMINED_ROWS   INFO_BINARY
     con_backup  test    Sleep       NULL    0   0   0.000   0   NULL
     con_ddl test    Sleep       NULL    0   0   0.000   0   NULL
     CONNECTION  LOCK_MODE   LOCK_TYPE   TABLE_SCHEMA    TABLE_NAME
    con_backup  MDL_BACKUP_FLUSH    Backup lock 
3. Connection con_ddl sends to the server a
     ALTER TABLE t1 COMMENT = " <some value> "
     t1 uses the storage engine MyISAM.
     I assume the problem is valid for storage engines which do not offer atomic statements.
     At least neither InnoDB nor Aria showed the bad effect.
4. The connection con_observer waits some seconds so that con_ddl can do as much as it could
     do in the moment and "sees"
     CONNECTION  DB  COMMAND STATE   INFO    STAGE   MAX_STAGE   PROGRESS    EXAMINED_ROWS   INFO_BINARY
    con_backup  test    Sleep       NULL    0   0   0.000   0   NULL
    con_ddl test    Query   Waiting for backup lock ALTER TABLE t1 COMMENT = " 5 "  0   0   0.000   0   ALTER TABLE t1 COMMENT = " 5 "
   CONNECTION  LOCK_MODE   LOCK_TYPE   TABLE_SCHEMA    TABLE_NAME
   con_backup  MDL_BACKUP_FLUSH    Backup lock
   con_ddl MDL_SHARED_UPGRADABLE   Table metadata lock test    t1
   con_ddl MDL_INTENTION_EXCLUSIVE Schema metadata lock    test
   con_ddl MDL_BACKUP_DDL  Backup lock
5. Now the connection con_backup issues a
    BACKUP STAGE BLOCK_DDL ;
    which harvests immediate a
    ERROR 40001: Deadlock found when trying to get lock; try restarting transaction
 
Please use the uploaded ml31_M.test for replaying that problem.
     
MariaDB
10.4.6 commit 592fe954ef82be1bc08b29a8e54f7729eb1e1343
   
MariaDB Enterprise
 10.3.16 commit 2da55440ee4c83219b030b1f217f5c273a2541e8
 10.2.25 commit fd21ff94a2136f4c1ebb896987b7607b1d161cad
 
MariaDB (non enterprise) 10.3.15 is not affected because it does not support BACKUP STAGE ...



 Comments   
Comment by Matthias Leich [ 2019-02-19 ]

 
Please ignore this outdated comment.
The comment explains how to replay the most probably same problem 
based on the replay test in bs_2.tgz.
bs_2.tgz and the comment could be removed in case it turns out that
ml_31_M.test was sufficient for fixing the problem.
 
How to replay the problem?
-------------------------------------------
cd <source>/mysql-test
tar xvzf bs_2.tgz
--> mysqltest_background.sh  
      A shellscript for running session 1 and 2 activity in background +
     signaling when failures happened etc.
--> main/bs_2.test
      The MTR test
cd <source with binaries>/mysql-test
./mysql-test-run.pl --mem bs_2.test | tee prt
 
prt itself tells if the test failed but lacks sometimes important details.
Example:
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
main.bs_2 'innodb'                       [ fail ]
        Test ended at 2019-02-18 22:12:05
 
CURRENT_TEST: main.bs_2
mysqltest: At line 124:
 
The result from queries just before the failure was:
< snip >
   ALTER TABLE t0 DROP KEY `idx1`;
}
--echo # End   of SQL_FILE reached
End   of content of '/home/mleich/Server/10.4/bld_debug/mysql-test/var/tmp/sql.1' ----
Start executing '/home/mleich/Server/10.4/bld_debug/mysql-test/var/tmp/sql.1' at TS: 1550524323
End   executing '/home/mleich/Server/10.4/bld_debug/mysql-test/var/tmp/sql.1' at TS: 1550524323 and status was 1
Begin of       output ----
# Begin of SQL_FILE
SET AUTOCOMMIT = 0 ;
SET SESSION lock_wait_timeout = 2  ;
SET SESSION innodb_lock_wait_timeout = 1 ;
not ok
End   of       output ----
Begin of error output ----
mysqltest: At line 20: command "file_exists" failed with wrong error: 1  my_errno: 2  errno: 2
End   of error output ----
The file '/home/mleich/Server/10.4/bld_debug/mysql-test/var/tmp/run' does no more exist.
This means some concurrent session has observed a problem and deleted that file.
Assuming that this is the reason for my exit status > 0.
ENDE 1550524323
var/log/bs_2.log  is test output with details.
Shortened example:
CREATE TABLE t0 ( col1 INT) ENGINE = MyISAM  ;
# One of the concurrent SQL runner has removed the file '/home/mleich/Server/10.4/bld_debug/mysql-test/var/tmp/run'.
# == Hit an unexpected effect.
MY_CURRENT_TS : 1550524313 , MY_START_TS : 1550524323 , MY_END_TS : 1550524383
...
# Begin of SQL_FILE
SET SESSION wait_timeout = 2147483, max_statement_time = 0;
SET SESSION lock_wait_timeout = 31536000;
not ok
End   of       output ----
Begin of error output ----
mysqltest: At line 30: query 'BACKUP STAGE BLOCK_DDL ' failed: 1213: Deadlock found when trying to get lock; try restarting transaction
End   of error output ----
ERROR: My exit status > 0 means I have hit some not expected problem.
Removing the file '/home/mleich/Server/10.4/bld_debug/mysql-test/var/tmp/run' and giving up.
I apologize in advance in case my mysqltest_background.sh shows somehow bad properties.

Generated at Thu Feb 08 08:45:39 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.