[MDEV-24224] Gap lock on delete in 10.5 using READ COMMITTED - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.5.0, 10.5.8
Fix Version/s: 10.5.9
Component/s: Locking, Storage Engine - InnoDB
Labels:
- performance
- regression
Environment:
docker mariadb:10.5, centos 7

Description

During regression testing of our application on MariaDB 10.5 we hit a gap lock that doesn't occur in previous versions. Our transactions use isolation level READ COMMITTED so there shouldn't be any gap locks.

Test:

DROP TABLE IF EXISTS `test`;

CREATE TABLE `test` (

	ID varchar(40) NOT NULL,

	TEST1 varchar(40) DEFAULT NULL,

	TEST2 varchar(15) NOT NULL,

	TEST3 bigint(20) DEFAULT NULL,

	KEY `IDX_TEST` (TEST1, TEST2, TEST3)

) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC;

INSERT INTO test (ID, TEST1, TEST2, TEST3) VALUES ('row_a', 'A.123', 'C', 3);

INSERT INTO test (ID, TEST1, TEST2, TEST3) VALUES ('row_a', 'B.456', 'C', 3);

SET GLOBAL INNODB_STATUS_OUTPUT_LOCKS = 'ON';

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

BEGIN;

DELETE FROM test WHERE TEST1 = 'A.123a' and TEST2 = 'C' and TEST3 = 3;

SHOW ENGINE INNODB STATUS\G

The delete statement matches 0 rows, but creates a gap lock.

Using a different delete statement doesn't create a gap lock.

DELETE FROM test WHERE TEST1 = 'G.123a' and TEST2 = 'X' and TEST3 = 31;

Expected Behavior:
no gap lock on the delete statement.

Attachments

Issue Links

is caused by

MDEV-19544 Remove innodb_locks_unsafe_for_binlog

Closed

relates to

MDEV-16232 Use fewer mini-transactions

Stalled

MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration

Closed

MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2020-11-17 05:35

I do not think that we have made any significant changes to the locking subsystem since MariaDB 10.3, when trx_sys.rw_trx_hash was introduced.

Write operations will always use the same type of locking, independent of the transaction isolation level. The only special isolation level is SERIALIZABLE, which will cause all reads to be locking ones, as if LOCK IN SHARE MODE had been appended to each SELECT statement.

The READ COMMITTED isolation level behaves just like the default REPEATABLE READ, with the exception that a new read view will be created at the start of each SQL statement.

The READ UNCOMMITTED isolation level will avoid creating a read view altogether, and always return the latest available version. For locking operations, the latest available version is the only possible version.

A DELETE or UPDATE always consists of a locking read (which will acquire an explicit record lock until we address MDEV-16232) followed by the modification. A DELETE will update the record header and the hidden columns DB_TRX_ID,DB_ROLL_PTR in the clustered index record. In secondary indexes, only the record header and the PAGE_MAX_TRX_ID will be updated.

When it comes to gap locks, there is the open bug ~~MDEV-20605~~.

Note that InnoDB does not have a concept of a ‘table row lock’. It has a concept of ‘index record locks’. You are defining one secondary index, and the DELETE statement might use it.

The observed change in behavior might be due to a different query plan. Which version had you tested earlier?

Marko Mäkelä added a comment - 2020-11-17 05:35 I do not think that we have made any significant changes to the locking subsystem since MariaDB 10.3, when trx_sys.rw_trx_hash was introduced. Write operations will always use the same type of locking, independent of the transaction isolation level. The only special isolation level is SERIALIZABLE , which will cause all reads to be locking ones, as if LOCK IN SHARE MODE had been appended to each SELECT statement. The READ COMMITTED isolation level behaves just like the default REPEATABLE READ , with the exception that a new read view will be created at the start of each SQL statement. The READ UNCOMMITTED isolation level will avoid creating a read view altogether, and always return the latest available version. For locking operations, the latest available version is the only possible version. A DELETE or UPDATE always consists of a locking read (which will acquire an explicit record lock until we address MDEV-16232 ) followed by the modification. A DELETE will update the record header and the hidden columns DB_TRX_ID,DB_ROLL_PTR in the clustered index record. In secondary indexes, only the record header and the PAGE_MAX_TRX_ID will be updated. When it comes to gap locks, there is the open bug MDEV-20605 . Note that InnoDB does not have a concept of a ‘table row lock’. It has a concept of ‘index record locks’. You are defining one secondary index, and the DELETE statement might use it. The observed change in behavior might be due to a different query plan. Which version had you tested earlier?

Ben Owen added a comment - 2020-11-17 15:28

I ran the same sequence in 10.4, 10.3 and 10.2 and it doesn't perform a gap lock. Now this could be something else like a difference in optimizers that we never hit before. However today it prevents us from using 10.5 with our code as-is as we are not expecting a lock on the index.

To compare create a SQL file 'mariadb-10.5-defect.sql' with the content:

CREATE DATABASE IF NOT EXISTS `test`;

USE `test`;

DROP TABLE IF EXISTS `test`;

CREATE TABLE `test` (

	ID varchar(40) NOT NULL,

	TEST1 varchar(40) DEFAULT NULL,

	TEST2 varchar(15) NOT NULL,

	TEST3 bigint(20) DEFAULT NULL,

	KEY `IDX_TEST` (TEST1, TEST2, TEST3)

) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC;

INSERT INTO test (ID, TEST1, TEST2, TEST3) VALUES ('row_a', 'A.123', 'C', 3);

INSERT INTO test (ID, TEST1, TEST2, TEST3) VALUES ('row_a', 'B.456', 'C', 3);

SET GLOBAL INNODB_STATUS_OUTPUT_LOCKS = 'ON';

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

BEGIN;

DELETE FROM test WHERE TEST1 = 'A.123a' and TEST2 = 'C' and TEST3 = 3;

SHOW ENGINE INNODB STATUS\G

ROLLBACK;

Then run 10.4 and 10.5 versions in docker

docker run -p 13306:3306 -d -e MYSQL_ALLOW_EMPTY_PASSWORD=true --name mariadb104 mariadb:10.4

mysql --port=13306 -uroot -h 127.0.0.1 < mariadb-10.5-defect.sql

docker run -p 23306:3306 -d -e MYSQL_ALLOW_EMPTY_PASSWORD=true --name mariadb105 mariadb:10.5

mysql --port=23306 -uroot -h 127.0.0.1 < mariadb-10.5-defect.sql

The 10.4 reads:

LIST OF TRANSACTIONS FOR EACH SESSION:

---TRANSACTION 36, ACTIVE 0 sec

1 lock struct(s), heap size 1128, 0 row lock(s)

MySQL thread id 8, OS thread handle 140406844811008, query id 12 172.17.0.1 root Init

SHOW ENGINE INNODB STATUS

TABLE LOCK table `test`.`test` trx id 36 lock mode IX

The 10.5 reads

---TRANSACTION 36, ACTIVE 0 sec

2 lock struct(s), heap size 1128, 1 row lock(s)

MySQL thread id 3, OS thread handle 139746906982144, query id 12 172.17.0.1 root starting

SHOW ENGINE INNODB STATUS

TABLE LOCK table `test`.`test` trx id 36 lock mode IX

RECORD LOCKS space id 5 page no 4 n bits 72 index IDX_TEST of table `test`.`test` trx id 36 lock_mode X locks gap before rec

Record lock, heap no 3 PHYSICAL RECORD: n_fields 4; compact format; info bits 0

 0: len 5; hex 422e343536; asc B.456;;

 1: len 1; hex 43; asc C;;

 2: len 8; hex 8000000000000003; asc         ;;

 3: len 6; hex 000000000201; asc       ;;

Ben Owen added a comment - 2020-11-17 15:28 I ran the same sequence in 10.4, 10.3 and 10.2 and it doesn't perform a gap lock. Now this could be something else like a difference in optimizers that we never hit before. However today it prevents us from using 10.5 with our code as-is as we are not expecting a lock on the index. To compare create a SQL file 'mariadb-10.5-defect.sql' with the content: CREATE DATABASE IF NOT EXISTS `test`; USE `test`; DROP TABLE IF EXISTS `test`; CREATE TABLE `test` ( ID varchar (40) NOT NULL , TEST1 varchar (40) DEFAULT NULL , TEST2 varchar (15) NOT NULL , TEST3 bigint (20) DEFAULT NULL , KEY `IDX_TEST` (TEST1, TEST2, TEST3) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT= DYNAMIC ; INSERT INTO test (ID, TEST1, TEST2, TEST3) VALUES ( 'row_a' , 'A.123' , 'C' , 3); INSERT INTO test (ID, TEST1, TEST2, TEST3) VALUES ( 'row_a' , 'B.456' , 'C' , 3); SET GLOBAL INNODB_STATUS_OUTPUT_LOCKS = 'ON' ; SET TRANSACTION ISOLATION LEVEL READ COMMITTED ; BEGIN ; DELETE FROM test WHERE TEST1 = 'A.123a' and TEST2 = 'C' and TEST3 = 3; SHOW ENGINE INNODB STATUS\G ROLLBACK ; Then run 10.4 and 10.5 versions in docker docker run -p 13306:3306 -d -e MYSQL_ALLOW_EMPTY_PASSWORD=true --name mariadb104 mariadb:10.4 mysql --port=13306 -uroot -h 127.0.0.1 < mariadb-10.5-defect.sql docker run -p 23306:3306 -d -e MYSQL_ALLOW_EMPTY_PASSWORD=true --name mariadb105 mariadb:10.5 mysql --port=23306 -uroot -h 127.0.0.1 < mariadb-10.5-defect.sql The 10.4 reads: LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 36, ACTIVE 0 sec 1 lock struct(s), heap size 1128, 0 row lock(s) MySQL thread id 8, OS thread handle 140406844811008, query id 12 172.17.0.1 root Init SHOW ENGINE INNODB STATUS TABLE LOCK table `test`.`test` trx id 36 lock mode IX The 10.5 reads ---TRANSACTION 36, ACTIVE 0 sec 2 lock struct(s), heap size 1128, 1 row lock(s) MySQL thread id 3, OS thread handle 139746906982144, query id 12 172.17.0.1 root starting SHOW ENGINE INNODB STATUS TABLE LOCK table `test`.`test` trx id 36 lock mode IX RECORD LOCKS space id 5 page no 4 n bits 72 index IDX_TEST of table `test`.`test` trx id 36 lock_mode X locks gap before rec Record lock, heap no 3 PHYSICAL RECORD: n_fields 4; compact format; info bits 0 0: len 5; hex 422e343536; asc B.456;; 1: len 1; hex 43; asc C;; 2: len 8; hex 8000000000000003; asc ;; 3: len 6; hex 000000000201; asc ;;

Marko Mäkelä added a comment - 2020-11-18 06:42

I confirmed this with the following test case:

--source include/have_innodb.inc

CREATE TABLE test (

	ID varchar(40) NOT NULL,

	TEST1 varchar(40) DEFAULT NULL,

	TEST2 varchar(15) NOT NULL,

	TEST3 bigint(20) DEFAULT NULL,

	KEY IDX_TEST (TEST1, TEST2, TEST3)

) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC;

INSERT INTO test (ID, TEST1, TEST2, TEST3) VALUES ('row_a', 'A.123', 'C', 3);

INSERT INTO test (ID, TEST1, TEST2, TEST3) VALUES ('row_a', 'B.456', 'C', 3);

SET @save_locks= @@GLOBAL.innodb_status_output_locks;

SET GLOBAL INNODB_STATUS_OUTPUT_LOCKS = 'ON';

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

BEGIN;

EXPLAIN DELETE FROM test WHERE TEST1 = 'A.123a' and TEST2 = 'C' and TEST3 = 3;

DELETE FROM test WHERE TEST1 = 'A.123a' and TEST2 = 'C' and TEST3 = 3;

--replace_regex /.*(\d+ lock struct...), heap size \d+(, \d+ row lock...).*/\1\2/

SHOW ENGINE INNODB STATUS;

SET GLOBAL INNODB_STATUS_OUTPUT_LOCKS = @save_locks;

ROLLBACK;

DROP TABLE test;

The observed filtered output of SHOW ENGINE INNODB STATUS is as follows:

Type	Name	Status

InnoDB		2 lock struct(s), 1 row lock(s)

The expected output is: “1 lock struct(s), 0 row lock(s).” This regression was introduced almost at the start of the history of the 10.5 branch, by ~~MDEV-19544~~.

Marko Mäkelä added a comment - 2020-11-18 06:42 I confirmed this with the following test case: --source include/have_innodb.inc CREATE TABLE test ( ID varchar (40) NOT NULL , TEST1 varchar (40) DEFAULT NULL , TEST2 varchar (15) NOT NULL , TEST3 bigint (20) DEFAULT NULL , KEY IDX_TEST (TEST1, TEST2, TEST3) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT= DYNAMIC ; INSERT INTO test (ID, TEST1, TEST2, TEST3) VALUES ( 'row_a' , 'A.123' , 'C' , 3); INSERT INTO test (ID, TEST1, TEST2, TEST3) VALUES ( 'row_a' , 'B.456' , 'C' , 3); SET @save_locks= @@ GLOBAL .innodb_status_output_locks; SET GLOBAL INNODB_STATUS_OUTPUT_LOCKS = 'ON' ; SET TRANSACTION ISOLATION LEVEL READ COMMITTED ; BEGIN ; EXPLAIN DELETE FROM test WHERE TEST1 = 'A.123a' and TEST2 = 'C' and TEST3 = 3; DELETE FROM test WHERE TEST1 = 'A.123a' and TEST2 = 'C' and TEST3 = 3; --replace_regex /.*(\d+ lock struct...), heap size \d+(, \d+ row lock...).*/\1\2/ SHOW ENGINE INNODB STATUS; SET GLOBAL INNODB_STATUS_OUTPUT_LOCKS = @save_locks; ROLLBACK ; DROP TABLE test; The observed filtered output of SHOW ENGINE INNODB STATUS is as follows: Type Name Status InnoDB 2 lock struct(s), 1 row lock(s) The expected output is: “1 lock struct(s), 0 row lock(s).” This regression was introduced almost at the start of the history of the 10.5 branch, by MDEV-19544 .

Marko Mäkelä added a comment - 2020-11-18 07:49

I narrowed down the cause of this to the following hunks in ~~MDEV-19544~~:

diff a/storage/innobase/row/row0sel.cc b/storage/innobase/row/row0sel.cc

--- a/storage/innobase/row/row0sel.cc

+++ b/storage/innobase/row/row0sel.cc

@@ -4444,22 +4428,16 @@ row_search_mvcc(

 	      || prebuilt->table->no_rollback()

 	      || srv_read_only_mode);

-	if (trx->isolation_level <= TRX_ISO_READ_COMMITTED

-	    && prebuilt->select_lock_type != LOCK_NONE

-	    && trx->mysql_thd != NULL

-	    && thd_is_select(trx->mysql_thd)) {

-		/* It is a plain locking SELECT and the isolation

-		level is low: do not lock gaps */

-		set_also_gap_locks = FALSE;

-	}

+	/* Do not lock gaps for plain SELECT

+	at READ UNCOMMITTED or READ COMMITTED isolation level */

+	const bool set_also_gap_locks =

+		prebuilt->select_lock_type != LOCK_NONE

+		&& (trx->isolation_level > TRX_ISO_READ_COMMITTED

+		    || !thd_is_select(trx->mysql_thd))

 #ifdef WITH_WSREP

-	else if (wsrep_thd_skip_locking(trx->mysql_thd)) {

-		ut_ad(!strcmp(wsrep_get_sr_table_name(),

-			      prebuilt->table->name.m_name));

-		set_also_gap_locks = FALSE;

-	}

+		&& !wsrep_thd_skip_locking(trx->mysql_thd)

 #endif /* WITH_WSREP */

+		;

 	/* Note that if the search mode was GE or G, then the cursor

 	naturally moves upward (in fetch next) in alphabetical order,

@@ -4820,17 +4786,7 @@ row_search_mvcc(

 		if (0 != cmp_dtuple_rec(search_tuple, rec, offsets)) {

 			if (set_also_gap_locks

-			    && !(srv_locks_unsafe_for_binlog

-				 || trx->isolation_level

-				 <= TRX_ISO_READ_COMMITTED)

-			    && prebuilt->select_lock_type != LOCK_NONE

 			    && !dict_index_is_spatial(index)) {

-				/* Try to place a gap lock on the index

-				record only if innodb_locks_unsafe_for_binlog

-				option is not set or this session is not

-				using a READ COMMITTED or lower isolation level. */

 				err = sel_set_rec_lock(

 					pcur,

 					rec, index, offsets,

Previously, the definition of set_also_gap_locks was rather convoluted: initialized to TRUE and then based on some conditions, later assigned FALSE. I tried to fold the condition prebuilt->select_lock_type != LOCK_NONE into the new definition, so that further checks for that could be omitted. Apparently, I got it wrong in this particular code path.
The following one-liner would fix this test case:

diff --git a/storage/innobase/row/row0sel.cc b/storage/innobase/row/row0sel.cc

index d7a0490db3d..c07cd8413b8 100644

--- a/storage/innobase/row/row0sel.cc

+++ b/storage/innobase/row/row0sel.cc

@@ -4900,6 +4900,7 @@ row_search_mvcc(

 		if (0 != cmp_dtuple_rec(search_tuple, rec, offsets)) {

 			if (set_also_gap_locks

+			    && trx->isolation_level > TRX_ISO_READ_COMMITTED

 			    && !dict_index_is_spatial(index)) {

 				err = sel_set_rec_lock(

 					pcur,

There is a similar regression in some other code paths. I think that a fix along the following lines would be better:

diff --git a/storage/innobase/row/row0sel.cc b/storage/innobase/row/row0sel.cc

index d7a0490db3d..6e1cd2592d6 100644

--- a/storage/innobase/row/row0sel.cc

+++ b/storage/innobase/row/row0sel.cc

@@ -4542,12 +4542,11 @@ row_search_mvcc(

 	      || prebuilt->table->no_rollback()

 	      || srv_read_only_mode);

-	/* Do not lock gaps for plain SELECT

-	at READ UNCOMMITTED or READ COMMITTED isolation level */

+	/* Do not lock gaps at READ UNCOMMITTED or READ COMMITTED

+	isolation level */

 	const bool set_also_gap_locks =

 		prebuilt->select_lock_type != LOCK_NONE

-		&& (trx->isolation_level > TRX_ISO_READ_COMMITTED

-		    || !thd_is_select(trx->mysql_thd))

+		&& trx->isolation_level > TRX_ISO_READ_COMMITTED

 #ifdef WITH_WSREP

 		&& !wsrep_thd_skip_locking(trx->mysql_thd)

 #endif /* WITH_WSREP */

@@ -4755,7 +4754,6 @@ row_search_mvcc(

 	if (page_rec_is_supremum(rec)) {

 		if (set_also_gap_locks

-		    && trx->isolation_level > TRX_ISO_READ_COMMITTED

 		    && !dict_index_is_spatial(index)) {

 			/* Try to place a lock on the index record */

@@ -5020,7 +5018,7 @@ row_search_mvcc(

 			goto no_gap_lock;

-		if (!set_also_gap_locks

+		if (!set_also_gap_locks /* FIXME? */

 		    || (unique_search && !rec_get_deleted_flag(rec, comp))

 		    || dict_index_is_spatial(index)) {

In the last hunk, to keep the code equivalent to how it was before ~~MDEV-19544~~, we might have to evaluate thd_is_select(trx->mysql_thd). The broader flow of execution is as follows:

	if (prebuilt->select_lock_type != LOCK_NONE) {

		/* Try to place a lock on the index record; note that delete

		marked records are a special case in a unique search. If there

		is a non-delete marked record, then it is enough to lock its

		existence with LOCK_REC_NOT_GAP. */

		unsigned lock_type;

		if (trx->isolation_level <= TRX_ISO_READ_COMMITTED) {

			/* At READ COMMITTED or READ UNCOMMITTED

			isolation levels, do not lock committed

			delete-marked records. */

			if (!rec_get_deleted_flag(rec, comp)) {

				goto no_gap_lock;

…

			} else {

				/* The secondary index record does not

				point to a delete-marked clustered index

				record that belongs to an active transaction.

				Ignore the secondary index record, because

				it is not locked. */

				goto next_rec;

			goto no_gap_lock;

		if (!set_also_gap_locks /* FIXME? */

		    || (unique_search && !rec_get_deleted_flag(rec, comp))

		    || dict_index_is_spatial(index)) {

			goto no_gap_lock;

		} else {

			lock_type = LOCK_ORDINARY;

The /* FIXME? */ line is only reachable if the isolation level is REPEATABLE READ or SERIALIZABLE. In all branches of the READ UNCOMMITTED or READ COMMITTED case, we will always goto somewhere else. Let us review the proposed change of definition once again:

	const bool set_also_gap_locks =

		prebuilt->select_lock_type != LOCK_NONE

		&& (trx->isolation_level > TRX_ISO_READ_COMMITTED

		    || !thd_is_select(trx->mysql_thd))

The first condition prebuilt->select_lock_type != LOCK_NONE always holds in this code branch. Also trx->isolation_level > TRX_ISO_READ_COMMITTED will always hold at the FIXME line. So, the value of thd_is_select() does not matter at all there, and we can indeed safely remove that condition.

Marko Mäkelä added a comment - 2020-11-18 07:49 I narrowed down the cause of this to the following hunks in MDEV-19544 : diff a/storage/innobase/row/row0sel.cc b/storage/innobase/row/row0sel.cc --- a/storage/innobase/row/row0sel.cc +++ b/storage/innobase/row/row0sel.cc @@ -4444,22 +4428,16 @@ row_search_mvcc( || prebuilt->table->no_rollback() || srv_read_only_mode); - if (trx->isolation_level <= TRX_ISO_READ_COMMITTED - && prebuilt->select_lock_type != LOCK_NONE - && trx->mysql_thd != NULL - && thd_is_select(trx->mysql_thd)) { - /* It is a plain locking SELECT and the isolation - level is low: do not lock gaps */ - - set_also_gap_locks = FALSE; - } + /* Do not lock gaps for plain SELECT + at READ UNCOMMITTED or READ COMMITTED isolation level */ + const bool set_also_gap_locks = + prebuilt->select_lock_type != LOCK_NONE + && (trx->isolation_level > TRX_ISO_READ_COMMITTED + || !thd_is_select(trx->mysql_thd)) #ifdef WITH_WSREP - else if (wsrep_thd_skip_locking(trx->mysql_thd)) { - ut_ad(!strcmp(wsrep_get_sr_table_name(), - prebuilt->table->name.m_name)); - set_also_gap_locks = FALSE; - } + && !wsrep_thd_skip_locking(trx->mysql_thd) #endif /* WITH_WSREP */ + ; /* Note that if the search mode was GE or G, then the cursor naturally moves upward (in fetch next) in alphabetical order, @@ -4820,17 +4786,7 @@ row_search_mvcc( if (0 != cmp_dtuple_rec(search_tuple, rec, offsets)) { if (set_also_gap_locks - && !(srv_locks_unsafe_for_binlog - || trx->isolation_level - <= TRX_ISO_READ_COMMITTED) - && prebuilt->select_lock_type != LOCK_NONE && !dict_index_is_spatial(index)) { - - /* Try to place a gap lock on the index - record only if innodb_locks_unsafe_for_binlog - option is not set or this session is not - using a READ COMMITTED or lower isolation level. */ - err = sel_set_rec_lock( pcur, rec, index, offsets, Previously, the definition of set_also_gap_locks was rather convoluted: initialized to TRUE and then based on some conditions, later assigned FALSE . I tried to fold the condition prebuilt->select_lock_type != LOCK_NONE into the new definition, so that further checks for that could be omitted. Apparently, I got it wrong in this particular code path. The following one-liner would fix this test case: diff --git a/storage/innobase/row/row0sel.cc b/storage/innobase/row/row0sel.cc index d7a0490db3d..c07cd8413b8 100644 --- a/storage/innobase/row/row0sel.cc +++ b/storage/innobase/row/row0sel.cc @@ -4900,6 +4900,7 @@ row_search_mvcc( if (0 != cmp_dtuple_rec(search_tuple, rec, offsets)) { if (set_also_gap_locks + && trx->isolation_level > TRX_ISO_READ_COMMITTED && !dict_index_is_spatial(index)) { err = sel_set_rec_lock( pcur, There is a similar regression in some other code paths. I think that a fix along the following lines would be better: diff --git a/storage/innobase/row/row0sel.cc b/storage/innobase/row/row0sel.cc index d7a0490db3d..6e1cd2592d6 100644 --- a/storage/innobase/row/row0sel.cc +++ b/storage/innobase/row/row0sel.cc @@ -4542,12 +4542,11 @@ row_search_mvcc( || prebuilt->table->no_rollback() || srv_read_only_mode); - /* Do not lock gaps for plain SELECT - at READ UNCOMMITTED or READ COMMITTED isolation level */ + /* Do not lock gaps at READ UNCOMMITTED or READ COMMITTED + isolation level */ const bool set_also_gap_locks = prebuilt->select_lock_type != LOCK_NONE - && (trx->isolation_level > TRX_ISO_READ_COMMITTED - || !thd_is_select(trx->mysql_thd)) + && trx->isolation_level > TRX_ISO_READ_COMMITTED #ifdef WITH_WSREP && !wsrep_thd_skip_locking(trx->mysql_thd) #endif /* WITH_WSREP */ @@ -4755,7 +4754,6 @@ row_search_mvcc( if (page_rec_is_supremum(rec)) { if (set_also_gap_locks - && trx->isolation_level > TRX_ISO_READ_COMMITTED && !dict_index_is_spatial(index)) { /* Try to place a lock on the index record */ @@ -5020,7 +5018,7 @@ row_search_mvcc( goto no_gap_lock; } - if (!set_also_gap_locks + if (!set_also_gap_locks /* FIXME? */ || (unique_search && !rec_get_deleted_flag(rec, comp)) || dict_index_is_spatial(index)) { In the last hunk, to keep the code equivalent to how it was before MDEV-19544 , we might have to evaluate thd_is_select(trx->mysql_thd) . The broader flow of execution is as follows: if (prebuilt->select_lock_type != LOCK_NONE) { /* Try to place a lock on the index record; note that delete marked records are a special case in a unique search. If there is a non-delete marked record, then it is enough to lock its existence with LOCK_REC_NOT_GAP. */ unsigned lock_type; if (trx->isolation_level <= TRX_ISO_READ_COMMITTED) { /* At READ COMMITTED or READ UNCOMMITTED isolation levels, do not lock committed delete-marked records. */ if (!rec_get_deleted_flag(rec, comp)) { goto no_gap_lock; … } else { /* The secondary index record does not point to a delete-marked clustered index record that belongs to an active transaction. Ignore the secondary index record, because it is not locked. */ goto next_rec; } goto no_gap_lock; } if (!set_also_gap_locks /* FIXME? */ || (unique_search && !rec_get_deleted_flag(rec, comp)) || dict_index_is_spatial(index)) { goto no_gap_lock; } else { lock_type = LOCK_ORDINARY; } The /* FIXME? */ line is only reachable if the isolation level is REPEATABLE READ or SERIALIZABLE . In all branches of the READ UNCOMMITTED or READ COMMITTED case, we will always goto somewhere else. Let us review the proposed change of definition once again: const bool set_also_gap_locks = prebuilt->select_lock_type != LOCK_NONE && (trx->isolation_level > TRX_ISO_READ_COMMITTED || !thd_is_select(trx->mysql_thd)) The first condition prebuilt->select_lock_type != LOCK_NONE always holds in this code branch. Also trx->isolation_level > TRX_ISO_READ_COMMITTED will always hold at the FIXME line. So, the value of thd_is_select() does not matter at all there, and we can indeed safely remove that condition.

MariaDB Server

Gap lock on delete in 10.5 using READ COMMITTED

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration