[MDEV-13470] DELETE IGNORE should not ignore deadlocks (again) Created: 2017-08-08  Updated: 2017-08-09  Resolved: 2017-08-09

Status: Closed
Project: MariaDB Server
Component/s: Data Manipulation - Delete, Server
Affects Version/s: 10.2.8
Fix Version/s: 10.2.8

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-117 LP:922146 - Assertion: prebuilt->sql_... Closed

 Description   

This is basically a duplicate or a reincarnation of MDEV-117. For some reason, the test innodb.mdev-117 started failing in 10.2.

It is uncertain when this test started failing. The test is nondeterministic, because there is a race condition between the concurrently executing DELETE IGNORE and DELETE statements.

When a deadlock is reported for DELETE IGNORE, the SQL layer would call handler::print_error() but then proceed to the next row, as if no error had happened (which is the purpose of DELETE IGNORE). So, when it proceeded to handler::ha_rnd_next(), InnoDB would hit an assertion failure, because the transaction no longer exists, and we are not executing at the start of a statement.



 Comments   
Comment by Marko Mäkelä [ 2017-08-08 ]

bb-10.2-marko contains two clean-up commits on top of the actual fix.

Comment by Elena Stepanova [ 2017-08-08 ]

It is uncertain when this test started failing.

It first failed with the assertion failure (Assertion `prebuilt->sql_stat_start || trx->state == TRX_STATE_ACTIVE') On July 30th on bb-10.2-mariarocks, once; and has been failing on 10.2 since Aug 7th, 29ad6284918d4be81f40f05214220c42 (RocksDB merge) – 10 times as of now. While it's not 100%-reliable, 10 failures over the past 36 hours and never before is hardly a statistical error.

Comment by Marko Mäkelä [ 2017-08-09 ]

elenst, thank you for the observation. Indeed, it looks like MDEV-117 could have been reintroduced by the RocksDB merge done by psergey:

PAGER=cat git diff 11948d75862941e2780e550cbc5411895e1cc1c7..29ad6284918d4be81f40f05214220c42 sql/handler.cc

Note that the macro `SET_FATAL_ERROR` is assigning a local variable, and due to the `DBUG_VOID_RETURN` that local variable will no longer be examined, so the `SET_FATAL_ERROR` essentially became a no-op here.
The comment of the commit that introduced this change explains that the intention was to provide more detail in the `ER_LOCK_DEADLOCK` message. But why introduce the `DBUG_VOID_RETURN` and skip the fatal-error handling?

commit c90753e671e961269f8d052c35445144abbe1b00
Author: Sergei Petrunia <psergey@askmonty.org>
Date:   Sun Jul 30 09:03:42 2017 +0000
 
    Follow the upstream MyRocks: provide details in the ER_LOCK_DEADLOCK message
    
    This fixes result mismatches in rocksdb.issue111, rocksdb.hermitage,
    rocksdb.rocksdb_locks
 
diff --git a/sql/handler.cc b/sql/handler.cc
index a4a2297bd2f..a336d9916f1 100644
--- a/sql/handler.cc
+++ b/sql/handler.cc
@@ -3490,10 +3490,17 @@ void handler::print_error(int error, myf errflag)
     textno=ER_LOCK_TABLE_FULL;
     break;
   case HA_ERR_LOCK_DEADLOCK:
-    textno=ER_LOCK_DEADLOCK;
+  {
+    String str, full_err_msg(ER_DEFAULT(ER_LOCK_DEADLOCK), system_charset_info);
+
     /* cannot continue. the statement was already aborted in the engine */
     SET_FATAL_ERROR;
-    break;
+
+    get_error_message(error, &str);
+    full_err_msg.append(str);
+    my_printf_error(ER_LOCK_DEADLOCK, "%s", errflag, full_err_msg.c_ptr_safe());
+    DBUG_VOID_RETURN;
+  }
   case HA_ERR_READ_ONLY_TRANSACTION:
     textno=ER_READ_ONLY_TRANSACTION;
     break;

Generated at Thu Feb 08 08:05:50 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.