Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-24818

Concurrent use of InnoDB table is impossible until the first transaction is finished

Details

    Description

      After the changes made in the scope of MDEV-515, this basic scenario will no longer work:

      --source include/have_innodb.inc
      create table t1 (pk int primary key) engine=InnoDB;
      start transaction;
      insert into t1 values (1);
      --connect (con1,localhost,root,,)
      insert into t1 values (2);
      

      The second INSERT, which in earlier versions is instant, will now hang until the first connection finishes the transaction (or will time out if it doesn't happen within the timeout interval).

      While the locking change was clearly intentional and explicitly mentioned at least in the commit comment to MDEV-515 patch, I still file it as a bug, because I think the negative effect of this change may by far exceed the improvements that MDEV-515 introduces.

      The scope of MDEV-515 is a performance improvement of a very big LOAD DATA or INSERT SELECT into an empty table – big enough to make this performance improvement significant from the practical perspective. While certainly important, it is still a rather specific use case, and the gain is limited to faster execution.

      On the other hand, "create/empty a table and start using it concurrently" scenario above is as basic as it gets; and the loss can be not only in performance (in case the first transaction is reasonably fast and other ones are just delayed), but in more extreme cases, when the first transaction remains open for a long time, it can render the application non-functional – everything else will be failing with timeouts.

      The problem can be happening anywhere within an application, both in business logic and deep in maintenance/service jobs.

      It will represent as seemingly sporadic not reproducible weird delays and timeouts and will be nearly impossible to investigate – and even if the cause happens to be revealed, changing this logic in all existing applications won't be realistic.

      I don't think that even a possibility to "opt-out" the new behavior would be sufficient, as users are generally not aware of details of the internal implementation of applications they use, as well as database admins if, for example, they maintain a shared hosting.

      Attachments

        Issue Links

          Activity

            I would propose new syntax to make the MDEV-515 ‘opt-in’ and also at the same time fix other limitations with it:

            --echo # If any statement fails, the entire transaction will be rolled back.
            START TRANSACTION ATOMIC;
            

            This would have the following consequences:

            1. The MDEV-515 optimization would only apply to operations between START TRANSACTION ATOMIC and COMMIT. More backward compatibility.
            2. We would also make mysqldump emit this new syntax (version-conditionally) in order not to cause a regression for MDEV-515.
            3. In a follow-up ticket, we could extend InnoDB so that multiple INSERT into the same initially empty table will be optimized.
            marko Marko Mäkelä added a comment - I would propose new syntax to make the MDEV-515 ‘opt-in’ and also at the same time fix other limitations with it: --echo # If any statement fails, the entire transaction will be rolled back. START TRANSACTION ATOMIC; This would have the following consequences: The MDEV-515 optimization would only apply to operations between START TRANSACTION ATOMIC and COMMIT . More backward compatibility. We would also make mysqldump emit this new syntax (version-conditionally) in order not to cause a regression for MDEV-515 . In a follow-up ticket, we could extend InnoDB so that multiple INSERT into the same initially empty table will be optimized.
            rpizzi Rick Pizzi (Inactive) added a comment - - edited

            MDEV-515 is very welcome . I agree the behaviour shown in this ticket is sub-optimal.
            But we could make MDEV-515 trigger on LOAD DATA INFILE only , as no one would normally try to insert rows in a table where a LOAD DATA INFILE is happening . And if he does, then have him waiting.

            rpizzi Rick Pizzi (Inactive) added a comment - - edited MDEV-515 is very welcome . I agree the behaviour shown in this ticket is sub-optimal. But we could make MDEV-515 trigger on LOAD DATA INFILE only , as no one would normally try to insert rows in a table where a LOAD DATA INFILE is happening . And if he does, then have him waiting.

            I agree that my original proposed syntax is somewhat weird, because transactions are expected to be kind of atomic. Maybe we could make it explicitly refer to the fact that only full and no partial rollback is supported:

            START TRANSACTION WITH ATOMIC ROLLBACK;
            

            Like my original proposal, also this one would reuse existing reserved words.

            marko Marko Mäkelä added a comment - I agree that my original proposed syntax is somewhat weird, because transactions are expected to be kind of atomic. Maybe we could make it explicitly refer to the fact that only full and no partial rollback is supported: START TRANSACTION WITH ATOMIC ROLLBACK ; Like my original proposal, also this one would reuse existing reserved words.

            What is the problem with adding a new dedicated word?
            Here's my take:

            START TRANSACTION FOR BULK LOAD;
            

            rpizzi Rick Pizzi (Inactive) added a comment - What is the problem with adding a new dedicated word? Here's my take: START TRANSACTION FOR BULK LOAD;

            For the purpose of this bug fix and extra optimizations aside, it would be enough to enable the new optimization only in auto-commit more or under LOCK TABLES.

            serg Sergei Golubchik added a comment - For the purpose of this bug fix and extra optimizations aside, it would be enough to enable the new optimization only in auto-commit more or under LOCK TABLES.

            The optimization is moved to MDEV-25036

            serg Sergei Golubchik added a comment - The optimization is moved to MDEV-25036

            The following patch would make this feature ‘opt-in’:

            diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc
            index 65cb8321500..b652aa187e8 100644
            --- a/storage/innobase/row/row0ins.cc
            +++ b/storage/innobase/row/row0ins.cc
            @@ -2660,6 +2660,7 @@ row_ins_clust_index_entry_low(
             	if (!(flags & BTR_NO_UNDO_LOG_FLAG)
             	    && page_is_empty(block->frame)
             	    && !entry->is_metadata() && !trx->duplicates
            +	    && !trx->check_unique_secondary && !trx->check_foreigns
             	    && !trx->ddl && !trx->internal
             	    && block->page.id().page_no() == index->page
             	    && !index->table->skip_alter_undo
            

            That is, explicit

            SET UNIQUE_CHECKS=0;
            SET FOREIGN_KEY_CHECKS=0;
            

            would be required to enable the table-level locking and undo logging of the INSERT statement. The settings are used by the scripts that are generated by mysqldump.

            We could extend this approach further to allow subsequent INSERT statements in the same transaction to suppress row-level undo logging in the same transaction. That should make loads of mysqldump fast out of the box. I do not think that we really need to detect LOCK TABLES to enable this logic, because unique_checks=0 should never be set unless one really is sure that no duplicate keys are possible. (It allows the InnoDB change buffer to corrupt data when the change buffer merge of a duplicate key eventually occurs.)

            marko Marko Mäkelä added a comment - The following patch would make this feature ‘opt-in’: diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc index 65cb8321500..b652aa187e8 100644 --- a/storage/innobase/row/row0ins.cc +++ b/storage/innobase/row/row0ins.cc @@ -2660,6 +2660,7 @@ row_ins_clust_index_entry_low( if (!(flags & BTR_NO_UNDO_LOG_FLAG) && page_is_empty(block->frame) && !entry->is_metadata() && !trx->duplicates + && !trx->check_unique_secondary && !trx->check_foreigns && !trx->ddl && !trx->internal && block->page.id().page_no() == index->page && !index->table->skip_alter_undo That is, explicit SET UNIQUE_CHECKS=0; SET FOREIGN_KEY_CHECKS=0; would be required to enable the table-level locking and undo logging of the INSERT statement. The settings are used by the scripts that are generated by mysqldump . We could extend this approach further to allow subsequent INSERT statements in the same transaction to suppress row-level undo logging in the same transaction. That should make loads of mysqldump fast out of the box. I do not think that we really need to detect LOCK TABLES to enable this logic, because unique_checks=0 should never be set unless one really is sure that no duplicate keys are possible. (It allows the InnoDB change buffer to corrupt data when the change buffer merge of a duplicate key eventually occurs.)

            I found out that the sysbench prepare scripts are usually running in autocommit mode without transaction boundaries. The following would enable the slightly faster insert (and remove the need of subsequent purge of INSERT history that was introduced in MDEV-12288):

            --- /usr/share/sysbench/oltp_update_index.lua	2020-06-05 16:47:03.000000000 +0300
            +++ oltp_update_index.lua	2021-03-12 14:51:26.974791244 +0200
            @@ -22,7 +22,9 @@
             require("oltp_common")
             
             function prepare_statements()
            +   con:query("SET autocommit=0, unique_checks=0, foreign_key_checks=0")
                prepare_index_updates()
            +   con:query("SET autocommit=1, unique_checks=1, foreign_key_checks=1")
             end
             
             function event()
            

            marko Marko Mäkelä added a comment - I found out that the sysbench prepare scripts are usually running in autocommit mode without transaction boundaries. The following would enable the slightly faster insert (and remove the need of subsequent purge of INSERT history that was introduced in MDEV-12288 ): --- /usr/share/sysbench/oltp_update_index.lua 2020-06-05 16:47:03.000000000 +0300 +++ oltp_update_index.lua 2021-03-12 14:51:26.974791244 +0200 @@ -22,7 +22,9 @@ require("oltp_common") function prepare_statements() + con:query("SET autocommit=0, unique_checks=0, foreign_key_checks=0") prepare_index_updates() + con:query("SET autocommit=1, unique_checks=1, foreign_key_checks=1") end function event()

            People

              marko Marko Mäkelä
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.