[MDEV-14401] Stored procedure that declares a handler that catches ER_LOCK_DEADLOCK error causes thd->is_error() assertion Created: 2017-11-15  Updated: 2019-08-13  Resolved: 2017-12-07

Status: Closed
Project: MariaDB Server
Component/s: Galera, Tests
Fix Version/s: 10.1.30, 10.2.12

Type: Task Priority: Major
Reporter: Andrii Nikitin (Inactive) Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Attachments: File galera_bf_lock_wait.test    
Issue Links:
PartOf
is part of MDEV-12837 WSREP: BF lock wait long Closed

 Description   

reproduce in mtr



 Comments   
Comment by Andrii Nikitin (Inactive) [ 2017-11-15 ]

First tried to reproduce with stored procedures, no error problem happens in ~3 min

<content deleted - see in history if needed>

Comment by Andrii Nikitin (Inactive) [ 2017-11-15 ]

Following galera test always quickly triggers the problem for 10.2.10 :

--source include/galera_cluster.inc
--source include/big_test.inc
 
CREATE TABLE t1 ENGINE=InnoDB select 1 as a, 1 as b union select 2, 2;
ALTER TABLE t1 add primary key(a);
 
DELIMITER |;
 
CREATE PROCEDURE p1()
BEGIN
        DECLARE CONTINUE HANDLER FOR SQLEXCEPTION rollback;
        WHILE 1 DO
                start transaction;
                update t1 set b=connection_id() where a=1;
                commit;
        END WHILE;
END|
 
 
DELIMITER ;|
 
--connect node_1_p1, 127.0.0.1, root, , test, $NODE_MYPORT_1
send call p1;
--connect node_1_p2, 127.0.0.1, root, , test, $NODE_MYPORT_1
send call p1;
--connect node_2_p1, 127.0.0.1, root, , test, $NODE_MYPORT_2
send call p1;
--connect node_2_p2, 127.0.0.1, root, , test, $NODE_MYPORT_2
send call p1;
 
connection default;
let $counter=10;
let $sleep_period=10;
 
echo checking error log for 'BF lock wait long' message for $counter times every $sleep_period seconds ...;
while($counter)
{
  eval do sleep($sleep_period);
#  show processlist;
# use error 0,1 instead if want test to continue
  --error 1
  exec grep 'BF lock wait long' $MYSQLTEST_VARDIR/log/mysqld.*.err;
  let $counter=$counter-1;
}
 
drop table t1;

Output:

$ ./mtr MDEV-12837 --mem
Logging: ./mtr  MDEV-12837 --mem
vardir: /home/a/env1/_depot/m-tar/10.2.10/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/a/env1/_depot/m-tar/10.2.10/mysql-test/var'...
 - symlinking 'var' to '/dev/shm/var_auto_4KMJ'
Checking supported features...
MariaDB Version 10.2.10-MariaDB
 - SSL connections supported
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
CREATE TABLE t1 ENGINE=InnoDB select 1 as a, 1 as b union select 2, 2;
ALTER TABLE t1 add primary key(a);
CREATE PROCEDURE p1()
BEGIN
DECLARE CONTINUE HANDLER FOR SQLEXCEPTION rollback;
WHILE 1 DO
start transaction;
update t1 set b=connection_id() where a=1;
commit;
END WHILE;
END|
connect node_1_p1, 127.0.0.1, root, , test, $NODE_MYPORT_1;
call p1;
connect node_1_p2, 127.0.0.1, root, , test, $NODE_MYPORT_1;
call p1;
connect node_2_p1, 127.0.0.1, root, , test, $NODE_MYPORT_2;
call p1;
connect node_2_p2, 127.0.0.1, root, , test, $NODE_MYPORT_2;
call p1;
connection default;
checking error log for 'BF lock wait long' message for 10 times every 10 seconds ...
do sleep(10);
do sleep(10);
do sleep(10);
do sleep(10);
do sleep(10);
do sleep(10);
galera.MDEV-12837 'innodb'               [ fail ]
        Test ended at 2017-11-15 12:54:25
 
CURRENT_TEST: galera.MDEV-12837
mysqltest: At line 44: command "grep 'BF lock wait long' $MYSQLTEST_VARDIR/log/mysqld.*.err" succeeded - should have failed with errno 1...
 
 - saving '/home/a/env1/_depot/m-tar/10.2.10/mysql-test/var/log/galera.MDEV-12837-innodb/' to '/home/a/env1/_depot/m-tar/10.2.10/mysql-test/var/log/galera.MDEV-12837-innodb/'
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 70 seconds executing testcases
 
Failure: Failed 1/1 tests, 0.00% were successful.
 
Failing test(s): galera.MDEV-12837
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
mysql-test-run: *** ERROR: there were failing test cases

Error log node2 (node1 has no messages past sst completion):

Version: '10.2.10-MariaDB-log'  socket: '/home/a/env1/_depot/m-tar/10.2.10/mysql-test/var/tmp/mysqld.2.sock'  port: 16001  MariaDB Server
2017-11-15 12:53:22 139663952140032 [Note] WSREP: Synchronized with group, ready for connections
2017-11-15 12:53:22 139663952140032 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-11-15 12:53:23 139663712954112 [Note] WSREP: (9339150e, 'tcp://0.0.0.0:16005') turning message relay requesting off
WSREP: BF lock wait long
WSREP: BF lock wait long
WSREP: BF lock wait long
 
=====================================
2017-11-15 12:54:18 0x7f05ddffb700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 56 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 2 srv_active, 0 srv_shutdown, 53 srv_idle
srv_master_thread log flush and writes: 55
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 9
OS WAIT ARRAY INFO: signal count 7
RW-shared spins 0, rounds 19, OS waits 6
RW-excl spins 0, rounds 0, OS waits 0
RW-sx spins 0, rounds 0, OS waits 0
Spin rounds per wait: 19.00 RW-shared, 0.00 RW-excl, 0.00 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 1851
Purge done for trx's n:o < 1849 undo n:o < 0 state: running but idle
History list length 25
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421138315865624, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 1850, ACTIVE 53 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 2, OS thread handle 139663952140032, query id 84 Update_rows_log_event::find_row(20)
update t1 set b=connection_id() where a=1
------- TRX HAS BEEN WAITING 53 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 5 page no 3 n bits 72 index PRIMARY of table `test`.`t1` trx id 1850 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 8; hex 8000000000000001; asc         ;;
 1: len 6; hex 000000000737; asc      7;;
 2: len 7; hex 22000001560110; asc "   V  ;;
 3: len 8; hex 800000000000000c; asc         ;;
 
------------------
TABLE LOCK table `test`.`t1` trx id 1850 lock mode IX
RECORD LOCKS space id 5 page no 3 n bits 72 index PRIMARY of table `test`.`t1` trx id 1850 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 8; hex 8000000000000001; asc         ;;
 1: len 6; hex 000000000737; asc      7;;
 2: len 7; hex 22000001560110; asc "   V  ;;
 3: len 8; hex 800000000000000c; asc         ;;
 
---TRANSACTION 1847, ACTIVE 53 sec
2 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 1
MySQL thread id 12, OS thread handle 139663277164288, query id 85 localhost 127.0.0.1 root Unlocking tables
commit
TABLE LOCK table `test`.`t1` trx id 1847 lock mode IX
RECORD LOCKS space id 5 page no 3 n bits 72 index PRIMARY of table `test`.`t1` trx id 1847 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 8; hex 8000000000000001; asc         ;;
 1: len 6; hex 000000000737; asc      7;;
 2: len 7; hex 22000001560110; asc "   V  ;;
 3: len 8; hex 800000000000000c; asc         ;;
 
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (write thread)
I/O thread 5 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0] , aio writes: [0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 0; buffer pool: 0
311 OS file reads, 235 OS file writes, 46 OS fsyncs
0.23 reads/s, 16384 avg bytes/read, 4.18 writes/s, 0.80 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
0.00 hash searches/s, 2.82 non-hash searches/s
---
LOG
---
Log sequence number 1664626
Log flushed up to   1664626
Pages flushed up to 1664626
Last checkpoint at  1664617
0 pending log flushes, 0 pending chkp writes
44 log i/o's done, 0.64 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 8601600
Dictionary memory allocated 62527
Buffer pool size   511
Free buffers       180
Database pages     331
Old database pages 0
Modified db pages  0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 174, created 157, written 193
0.23 reads/s, 2.80 creates/s, 3.45 writes/s
Buffer pool hit rate 992 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 331, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=26624, Main thread ID=139663201265408, state: sleeping
Number of rows inserted 2, updated 18, deleted 0, read 22
0.04 inserts/s, 0.32 updates/s, 0.00 deletes/s, 0.39 reads/s
Number of system rows inserted 0, updated 0, deleted 0, read 0
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
WSREP: BF lock wait long
WSREP: BF lock wait long
WSREP: BF lock wait long
WSREP: BF lock wait long
WSREP: BF lock wait long
WSREP: BF lock wait long

Comment by Jan Lindström (Inactive) [ 2017-12-07 ]

commit ba576c5b787de0b2218843d83f4524279ae7848b
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Thu Dec 7 12:43:26 2017 +0200

MDEV-14401: Stored procedure that declares a handler that catches ER_LOCK_DEADLOCK error causes thd->is_error() assertion

This was missing bug fix from MySQL wsrep i.e. Galera.
Problem was that if stored procedure declares a handler that
catches deadlock error, then the error may have been
cleared in method sp_rcontext::handle_sql_condition().
Use wsrep_conflict_state correctly to determine is the
error already sent to client.

Add test case for both this bug and MDEV-12837: WSREP: BF
lock wait long. Test requires both fixes to pass.

Generated at Thu Feb 08 08:13:15 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.