Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19577

Replication does not work with innodb_autoinc_lock_mode=2

Details

    Description

      In MDEV-17073, marko suggested that innodb_autoinc_lock_mode may only be needed for statement-based replication:

      Note: Comments in MySQL Bug #50413 suggest that innodb_autoinc_lock_mode settings 0 and 1 are equivalent in this respect. I’d also like to know whether this parameter is at all relevant outside statement-based replication (that is, when innodb_autoinc_lock_mode=2 could be safe to use). With the setting 2, InnoDB does not acquire any auto-increment lock within the transaction. With the settings 0 or 1, InnoDB will hold a lock until the end of the current statement. This would suggest that the setting only matters in statement-based replication.

      He wants to deprecate and remove the innodb_autoinc_lock_mode system variable, after setting it to 2 by default (MDEV-27844). For that to happen, the dependency on this variable would have to be removed for replication and binary logging.

      We have two bugs in replication:

      • binlog_format=STATEMENT is not being refused, even if it is unsafe in innodb_autoinc_lock_mode=2.
      • binlog_format=MIXED incorrectly uses STATEMENT-level replication for the affected operations

      Attachments

        Issue Links

          Activity

            Elkin Andrei Elkin added a comment -

            The statement binlog format can be extended to contain auto-increment intervals.
            There is a base 60272e750e921 to collect and make use at the "remote" side at applying.
            See handler::update_auto_increment() how it was supposed to work.
            The task effectively boils down

            • to collect intervals appearing as the result of innodb_autoinc_lock_mode=2
            • to encode them into Query_log_event
            • to decode them into execution context of the Query applying
            • to deprecate innodb_autoinc_lock_mode
            • and eventually remove the option.
            Elkin Andrei Elkin added a comment - The statement binlog format can be extended to contain auto-increment intervals. There is a base 60272e750e921 to collect and make use at the "remote" side at applying. See handler::update_auto_increment() how it was supposed to work. The task effectively boils down to collect intervals appearing as the result of innodb_autoinc_lock_mode=2 to encode them into Query_log_event to decode them into execution context of the Query applying to deprecate innodb_autoinc_lock_mode and eventually remove the option.
            Elkin Andrei Elkin added a comment - The following commit may be relevant: https://github.com/mysql/mysql-server/commit/72bca1d9a11e16b32585ce8e979ecfd8bbb349b1

            monty, you claimed some time ago that the auto-increment locks are unnecessary. It looks like you could be right. With the following simple patch, no mtr tests failed, apart from those that test this default value:

            diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
            index f79e58523f6..ceaa9795545 100644
            --- a/storage/innobase/handler/ha_innodb.cc
            +++ b/storage/innobase/handler/ha_innodb.cc
            @@ -19276,7 +19276,7 @@ static MYSQL_SYSVAR_LONG(autoinc_lock_mode, innobase_autoinc_lock_mode,
               " 1 => New style AUTOINC locking;"
               " 2 => No AUTOINC locking (unsafe for SBR)",
               NULL, NULL,
            -  AUTOINC_NEW_STYLE_LOCKING,	/* Default setting */
            +  AUTOINC_NO_LOCKING,	/* Default setting */
               AUTOINC_OLD_STYLE_LOCKING,	/* Minimum value */
               AUTOINC_NO_LOCKING, 0);	/* Maximum value */
             
            

            10.8 93756c992f18e86e380cb02078c4ba4cea123e5a

            Failing test(s): sys_vars.sysvars_innodb sys_vars.innodb_autoinc_lock_mode_basic
            

            Can you confirm that this is on the right track?

            If this really is so simple, the next steps would be to remove all code related to auto-increment locking, to deprecate and ignore this parameter, and of course, to run some replication stress tests.

            marko Marko Mäkelä added a comment - monty , you claimed some time ago that the auto-increment locks are unnecessary. It looks like you could be right. With the following simple patch, no mtr tests failed, apart from those that test this default value: diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc index f79e58523f6..ceaa9795545 100644 --- a/storage/innobase/handler/ha_innodb.cc +++ b/storage/innobase/handler/ha_innodb.cc @@ -19276,7 +19276,7 @@ static MYSQL_SYSVAR_LONG(autoinc_lock_mode, innobase_autoinc_lock_mode, " 1 => New style AUTOINC locking;" " 2 => No AUTOINC locking (unsafe for SBR)", NULL, NULL, - AUTOINC_NEW_STYLE_LOCKING, /* Default setting */ + AUTOINC_NO_LOCKING, /* Default setting */ AUTOINC_OLD_STYLE_LOCKING, /* Minimum value */ AUTOINC_NO_LOCKING, 0); /* Maximum value */ 10.8 93756c992f18e86e380cb02078c4ba4cea123e5a Failing test(s): sys_vars.sysvars_innodb sys_vars.innodb_autoinc_lock_mode_basic Can you confirm that this is on the right track? If this really is so simple, the next steps would be to remove all code related to auto-increment locking, to deprecate and ignore this parameter, and of course, to run some replication stress tests.

            Yes, as far as I know, you only need the variable for statement or mixed base logging (when mixed as run as statement, but this can trivially be checked for the statement).
            The reason is this works is because of two features:

            1) For binary logging, all rows are fully logged and thus there is no reason to lock the innodb increment. This will ensure that the slave will get the right data.
            2) Internally increment ranges is requested from the engine trough handler::update_auto_increment(), which ensures that two threads are not using the same auto_increment value.

            For doing efficient statement based logging with increments, then we would need save the used auto-increment intervals as part of the statement based logging, like Andrei suggested. For binary logging this should not be needed.

            monty Michael Widenius added a comment - Yes, as far as I know, you only need the variable for statement or mixed base logging (when mixed as run as statement, but this can trivially be checked for the statement). The reason is this works is because of two features: 1) For binary logging, all rows are fully logged and thus there is no reason to lock the innodb increment. This will ensure that the slave will get the right data. 2) Internally increment ranges is requested from the engine trough handler::update_auto_increment(), which ensures that two threads are not using the same auto_increment value. For doing efficient statement based logging with increments, then we would need save the used auto-increment intervals as part of the statement based logging, like Andrei suggested. For binary logging this should not be needed.
            marko Marko Mäkelä added a comment - - edited

            monty, thank you for the clarification.

            I have understood that with the default BINLOG_FORMAT=MIXED, statement-level logging will be used unless THD::current_stmt_binlog_format has been changed to BINLOG_FORMAT_ROW. There is some predicate for checking if an operation requires row-level logging.

            The uniqueness of AUTO_INCREMENT values is guaranteed by the dict_table_t::autoinc_mutex.

            If I understood correctly, the statement-level LOCK_AUTO_INC are needed for the BINLOG_FORMAT=STATEMENT format, to ensure that each statement will get a contiguous sequence of AUTO_INCREMENT values. Unfortunately, this locking will be needed in all operations that involve AUTO_INCREMENT values if there is any chance that any concurrent connection may use statement-level logging for AUTO_INCREMENT operations. That is a high price to pay for something that is rarely used (if my understanding is correct).

            We would seem to have to do something about AUTO_INCREMENT operations in BINLOG_FORMAT=STATEMENT mode. I can think of two options:

            1. Disallow such operations, similar to how we handled INSERT…ON DUPLICATE KEY UPDATE in MDEV-17073 until MDEV-17614 was fixed.
            2. Explicitly acquire an exclusive table lock for the duration of the statement, and downgrade it to intention-exclusive after the statement. That is, the exclusive InnoDB table lock would replace the role of the auto-increment lock. This would introduce new conflicts, for example, an UPDATE of some other than auto-increment column would not be able to execute concurrently.

            Obviously, disallowing would be easier. But it probably is unacceptable, because BINLOG_FORMAT=STATEMENT could still be widely used.

            We might need some storage engine API change for implementing the binlog-driven AUTO_INCREMENT locking, unless it is somehow doable with the existing member functions.

            marko Marko Mäkelä added a comment - - edited monty , thank you for the clarification. I have understood that with the default BINLOG_FORMAT=MIXED , statement-level logging will be used unless THD::current_stmt_binlog_format has been changed to BINLOG_FORMAT_ROW . There is some predicate for checking if an operation requires row-level logging. The uniqueness of AUTO_INCREMENT values is guaranteed by the dict_table_t::autoinc_mutex . If I understood correctly, the statement-level LOCK_AUTO_INC are needed for the BINLOG_FORMAT=STATEMENT format, to ensure that each statement will get a contiguous sequence of AUTO_INCREMENT values. Unfortunately, this locking will be needed in all operations that involve AUTO_INCREMENT values if there is any chance that any concurrent connection may use statement-level logging for AUTO_INCREMENT operations. That is a high price to pay for something that is rarely used (if my understanding is correct). We would seem to have to do something about AUTO_INCREMENT operations in BINLOG_FORMAT=STATEMENT mode. I can think of two options: Disallow such operations, similar to how we handled INSERT…ON DUPLICATE KEY UPDATE in MDEV-17073 until MDEV-17614 was fixed. Explicitly acquire an exclusive table lock for the duration of the statement, and downgrade it to intention-exclusive after the statement. That is, the exclusive InnoDB table lock would replace the role of the auto-increment lock. This would introduce new conflicts, for example, an UPDATE of some other than auto-increment column would not be able to execute concurrently. Obviously, disallowing would be easier. But it probably is unacceptable, because BINLOG_FORMAT=STATEMENT could still be widely used. We might need some storage engine API change for implementing the binlog-driven AUTO_INCREMENT locking, unless it is somehow doable with the existing member functions.
            Elkin Andrei Elkin added a comment -

            As a remark to MDEV-19577 description
            With the setting 2, InnoDB does not acquire any auto-increment lock within the transaction
            here we could (that is it's not done now) at least convert MIXED -> ROW.

            Elkin Andrei Elkin added a comment - As a remark to MDEV-19577 description With the setting 2, InnoDB does not acquire any auto-increment lock within the transaction here we could (that is it's not done now) at least convert MIXED -> ROW.
            Elkin Andrei Elkin added a comment -

            (A) As the STATEMENT binlog format will be undoubtedly required for a long time and
            (B) to remove @@ innodb_autoinc_lock_mode var means to stick to its current value 2
            the plan should be to teach binlog to collect intervals into INSERT Query_log_event.
            The slave applier part seems to be straightforward, the slave would repeat the master side interval allocations in this part of the stack which is common between them:

            #0  handler::update_auto_increment (this=0x61d000239728) at handler.cc:3838
            #1  0x000055555844c5d8 in ha_innobase::write_row (this=0x61d000239728, record=0x61a000095b38 "\377") at ha_innodb.cc:7740
            #2  0x0000555557ac42fd in handler::ha_write_row (this=0x61d000239728, buf=0x61a000095b38 "\377") at handler.cc:7519
            #3  0x0000555557134843 in write_record (thd=0x62b0001ab288, table=0x619000051a0
            

            As it looks there's nothing more requiring Engine changes for (B).

            OLD->NEW requirement constrains us to maintain @@ innodb_autoinc_lock_mode = 1 execution mode on slave, until the current INSERT Query_log_event format (with just one initial value) gets out of support. So we'd probably have to maintain the current Innodb mode=1 behavior until the relieve moment.

            Elkin Andrei Elkin added a comment - (A) As the STATEMENT binlog format will be undoubtedly required for a long time and (B) to remove @@ innodb_autoinc_lock_mode var means to stick to its current value 2 the plan should be to teach binlog to collect intervals into INSERT Query_log_event . The slave applier part seems to be straightforward, the slave would repeat the master side interval allocations in this part of the stack which is common between them: #0 handler::update_auto_increment (this=0x61d000239728) at handler.cc:3838 #1 0x000055555844c5d8 in ha_innobase::write_row (this=0x61d000239728, record=0x61a000095b38 "\377") at ha_innodb.cc:7740 #2 0x0000555557ac42fd in handler::ha_write_row (this=0x61d000239728, buf=0x61a000095b38 "\377") at handler.cc:7519 #3 0x0000555557134843 in write_record (thd=0x62b0001ab288, table=0x619000051a0 As it looks there's nothing more requiring Engine changes for (B). OLD->NEW requirement constrains us to maintain @@ innodb_autoinc_lock_mode = 1 execution mode on slave, until the current INSERT Query_log_event format (with just one initial value) gets out of support. So we'd probably have to maintain the current Innodb mode=1 behavior until the relieve moment.

            The following is implemented:

            • statement binlog format is switched to ROW if binlog format is MIXED and the statement changes autoincremented fields
            • warning is issued if innodb_autoinc_lock_mode == 2 and binlog format is STATEMENT
            vlad.lesin Vladislav Lesin added a comment - The following is implemented: statement binlog format is switched to ROW if binlog format is MIXED and the statement changes autoincremented fields warning is issued if innodb_autoinc_lock_mode == 2 and binlog format is STATEMENT

            People

              vlad.lesin Vladislav Lesin
              GeoffMontee Geoff Montee (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.