[MDEV-24818] Concurrent use of InnoDB table is impossible until the first transaction is finished Created: 2021-02-09  Updated: 2023-09-26  Resolved: 2021-03-16

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6
Fix Version/s: 10.6.0

Type: Bug Priority: Blocker
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: regression

Issue Links:
Blocks
blocks MDEV-24993 pymysql - executemany test failing - ... Closed
Problem/Incident
causes MDEV-25297 InnoDB: Failing assertion: trx->roll_... Closed
causes MDEV-25315 Assertion `ptr' failed in ut_align_do... Closed
causes MDEV-25401 Assertion `!is_set() || (m_status == ... Closed
causes MDEV-25496 Assertion `trx->bulk_insert' failed i... Closed
causes MDEV-25534 Assertion `lock_table_has(trx, index-... Closed
causes MDEV-26121 [Note] InnoDB: Resetting invalid page... Closed
is caused by MDEV-515 innodb bulk insert Closed
Relates
relates to MDEV-24859 Assertion `static_cast<ins_node_t*>(t... Closed
relates to MDEV-24914 Assertion `lock_rec_get_first( lock_h... Closed
relates to MDEV-25036 use bulk insert optimization for mult... Open
relates to MDEV-28327 InnoDB persistent statistics fail to ... Closed
relates to MDEV-29545 InnoDB replace statement returns "can... Closed
relates to MDEV-30063 InnoDB: Failing assertion: ib_table->... Open
relates to MDEV-30173 Document magic 10.6+ spell for faster... Open
relates to MDEV-31985 IGNORE in INSERT and LOAD are being i... Confirmed
relates to MDEV-31835 LOAD DATA into InnoDB w/partitions: h... Closed
relates to MDEV-32250 To benefit from MDEV-515 , please mak... Open

 Description   

After the changes made in the scope of MDEV-515, this basic scenario will no longer work:

--source include/have_innodb.inc
create table t1 (pk int primary key) engine=InnoDB;
start transaction;
insert into t1 values (1);
--connect (con1,localhost,root,,)
insert into t1 values (2);

The second INSERT, which in earlier versions is instant, will now hang until the first connection finishes the transaction (or will time out if it doesn't happen within the timeout interval).

While the locking change was clearly intentional and explicitly mentioned at least in the commit comment to MDEV-515 patch, I still file it as a bug, because I think the negative effect of this change may by far exceed the improvements that MDEV-515 introduces.

The scope of MDEV-515 is a performance improvement of a very big LOAD DATA or INSERT SELECT into an empty table – big enough to make this performance improvement significant from the practical perspective. While certainly important, it is still a rather specific use case, and the gain is limited to faster execution.

On the other hand, "create/empty a table and start using it concurrently" scenario above is as basic as it gets; and the loss can be not only in performance (in case the first transaction is reasonably fast and other ones are just delayed), but in more extreme cases, when the first transaction remains open for a long time, it can render the application non-functional – everything else will be failing with timeouts.

The problem can be happening anywhere within an application, both in business logic and deep in maintenance/service jobs.

It will represent as seemingly sporadic not reproducible weird delays and timeouts and will be nearly impossible to investigate – and even if the cause happens to be revealed, changing this logic in all existing applications won't be realistic.

I don't think that even a possibility to "opt-out" the new behavior would be sufficient, as users are generally not aware of details of the internal implementation of applications they use, as well as database admins if, for example, they maintain a shared hosting.



 Comments   
Comment by Marko Mäkelä [ 2021-02-19 ]

I would propose new syntax to make the MDEV-515 ‘opt-in’ and also at the same time fix other limitations with it:

--echo # If any statement fails, the entire transaction will be rolled back.
START TRANSACTION ATOMIC;

This would have the following consequences:

  1. The MDEV-515 optimization would only apply to operations between START TRANSACTION ATOMIC and COMMIT. More backward compatibility.
  2. We would also make mysqldump emit this new syntax (version-conditionally) in order not to cause a regression for MDEV-515.
  3. In a follow-up ticket, we could extend InnoDB so that multiple INSERT into the same initially empty table will be optimized.
Comment by Rick Pizzi [ 2021-02-19 ]

MDEV-515 is very welcome . I agree the behaviour shown in this ticket is sub-optimal.
But we could make MDEV-515 trigger on LOAD DATA INFILE only , as no one would normally try to insert rows in a table where a LOAD DATA INFILE is happening . And if he does, then have him waiting.

Comment by Marko Mäkelä [ 2021-02-24 ]

I agree that my original proposed syntax is somewhat weird, because transactions are expected to be kind of atomic. Maybe we could make it explicitly refer to the fact that only full and no partial rollback is supported:

START TRANSACTION WITH ATOMIC ROLLBACK;

Like my original proposal, also this one would reuse existing reserved words.

Comment by Rick Pizzi [ 2021-02-25 ]

What is the problem with adding a new dedicated word?
Here's my take:

START TRANSACTION FOR BULK LOAD;

Comment by Sergei Golubchik [ 2021-02-26 ]

For the purpose of this bug fix and extra optimizations aside, it would be enough to enable the new optimization only in auto-commit more or under LOCK TABLES.

Comment by Sergei Golubchik [ 2021-03-02 ]

The optimization is moved to MDEV-25036

Comment by Marko Mäkelä [ 2021-03-03 ]

The following patch would make this feature ‘opt-in’:

diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc
index 65cb8321500..b652aa187e8 100644
--- a/storage/innobase/row/row0ins.cc
+++ b/storage/innobase/row/row0ins.cc
@@ -2660,6 +2660,7 @@ row_ins_clust_index_entry_low(
 	if (!(flags & BTR_NO_UNDO_LOG_FLAG)
 	    && page_is_empty(block->frame)
 	    && !entry->is_metadata() && !trx->duplicates
+	    && !trx->check_unique_secondary && !trx->check_foreigns
 	    && !trx->ddl && !trx->internal
 	    && block->page.id().page_no() == index->page
 	    && !index->table->skip_alter_undo

That is, explicit

SET UNIQUE_CHECKS=0;
SET FOREIGN_KEY_CHECKS=0;

would be required to enable the table-level locking and undo logging of the INSERT statement. The settings are used by the scripts that are generated by mysqldump.

We could extend this approach further to allow subsequent INSERT statements in the same transaction to suppress row-level undo logging in the same transaction. That should make loads of mysqldump fast out of the box. I do not think that we really need to detect LOCK TABLES to enable this logic, because unique_checks=0 should never be set unless one really is sure that no duplicate keys are possible. (It allows the InnoDB change buffer to corrupt data when the change buffer merge of a duplicate key eventually occurs.)

Comment by Marko Mäkelä [ 2021-03-12 ]

I found out that the sysbench prepare scripts are usually running in autocommit mode without transaction boundaries. The following would enable the slightly faster insert (and remove the need of subsequent purge of INSERT history that was introduced in MDEV-12288):

--- /usr/share/sysbench/oltp_update_index.lua	2020-06-05 16:47:03.000000000 +0300
+++ oltp_update_index.lua	2021-03-12 14:51:26.974791244 +0200
@@ -22,7 +22,9 @@
 require("oltp_common")
 
 function prepare_statements()
+   con:query("SET autocommit=0, unique_checks=0, foreign_key_checks=0")
    prepare_index_updates()
+   con:query("SET autocommit=1, unique_checks=1, foreign_key_checks=1")
 end
 
 function event()

Generated at Thu Feb 08 09:32:54 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.