[MDEV-29545] InnoDB replace statement returns "can't find record" error during bulk insert operation. Created: 2022-09-15  Updated: 2023-06-07  Resolved: 2023-03-24

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6.10, 10.7.6
Fix Version/s: 11.1.1, 10.11.3, 10.6.13, 10.8.8, 10.9.6, 10.10.4

Type: Bug Priority: Major
Reporter: Matthias Leich Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 0
Labels: regression-10.6, rr-profile-analyzed

Attachments: File TBR-403-mini.yy     File TBR-403.cfg    
Issue Links:
Relates
relates to MDEV-515 innodb bulk insert Closed
relates to MDEV-24818 Concurrent use of InnoDB table is imp... Closed

 Description   

InnoDB blindly return "DB_END_OF_INDEX" when bulk insert operation transaction
is visible to current transaction even though the search tuple exist in index.

origin/10.7 f3785f099c2f0f251f39632928e822328abe9a02 2022-09-13T08:48:40+03:00
 
Scenario in the RQG test:
Two concurrent sessions run a DDL/DML mix.
One or both sessions go set  SET foreign_key_checks = 0, unique_checks = 0 ;
After a few seconds with DDL/DML load one session catches
REPLACE INTO t6 (col1,col2, col_int, col_string, col_text) VALUES /* 839 */ (839,839,839,REPEAT(SUBSTR(CAST( 839 AS CHAR),1,1), 10),REPEAT(SUBSTR(CAST( 839 AS CHAR),1,1), @fill_amount) ), (839,839,839,REPEAT(SUBSTR(CAST( 839 AS CHAR),1,1), 10),REPEAT(SUBSTR(CAST( 839 AS CHAR),1,1), @fill_amount) )  /* E_R Thread1 QNO 27 CON_ID 17 */  failed: 1032 , errstr: Can't find record in 't6'
RQG reacts with killing the DB server with SIGABRT.
 
pluto:/data/results/1663181785/TBR-403$ _RR_TRACE_DIR=./1/rr/ rr replay
 
Maybe better rr trace
REPLACE INTO t3 (col1) VALUES /* 1894 */ (1894), (1894)  /* E_R Thread1 QNO 149 CON_ID 17 */  failed: 1032 , errstr: Can't find record in 't3'
pluto:/data/results/1663591301/MDEV-29545$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio
 
The problem is on 
origin/HEAD, origin/10.6 8ff10969996a5349e906fab44b45c6e3ec53eea0 2022-08-24T17:06:57+03:00
too.
No replay on 10.5.18
origin/10.5 23a8654cdbd84336fd2004bdd2b0cb4a0ecd3d4a 2022-09-15T12:20:50+04:00



 Comments   
Comment by Marko Mäkelä [ 2022-09-20 ]

I started to debug the first trace as follows:

ssh pluto
rr replay /data/results/1663181785/TBR-403/1/rr/latest-trace

continue
break my_error.cc:114
condition 1 nr==1032
reverse-continue
backtrace

In this build, if you set a breakpoint on a function (such as my_error), the parameter values will initially be incorrect. A breakpoint inside the function body works. Error 1032 is ER_KEY_NOT_FOUND.

Comment by Thirunarayanan Balathandayuthapani [ 2023-02-23 ]

Test case for the scenario:

--source include/have_innodb.inc
set unique_checks=0, foreign_key_checks=0;
create table t1(c1 int primary key)engine=innodb row_format=compact;
begin;
insert into t1 values(3331);
set DEBUG_SYNC="now SIGNAL con1_start";
 
connect(con1,localhost,root,,,);
begin;
set DEBUG_SYNC="now WAIT_FOR con1_start";
select c1 from t1;
set DEBUG_SYNC="now SIGNAL con_def_commit";
 
connection default;
SET DEBUG_SYNC="now WAIT_FOR con_def_commit";
commit;
SET DEBUG_SYNC="now SIGNAL replace_start";
 
connection con1;
set DEBUG_SYNC="now WAIT_FOR replace_start";
replace into t1 values(1984), (1984);
commit;
 
connection default;
DROP TABLE t1;

mysqltest: At line 22: query 'replace into t1 values(1984), (1984)' failed: ER_KEY_NOT_FOUND (1032): Can't find record in 't1'

Comment by Thirunarayanan Balathandayuthapani [ 2023-03-22 ]

Patch is in bb-10.6-MDEV-29545

Comment by Marko Mäkelä [ 2023-03-22 ]

Can we check prebuilt->select_lock_type instead of checking whether the current transaction has already modified the table?

I think that it would be good to treat all locking reads in the same way. I believe that also without the involvement of MDEV-515 bulk insert, a locking read would read the latest version of a record. By definition, the latest version must be a committed version, because otherwise the locking read would conflict with a lock that is held by another transaction.

Generated at Thu Feb 08 10:09:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.