Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-24581

MDL race condition in trans_rollback_to_savepoint()

Details

    Description

      In MariaDB Server 10.0.11, a change from MySQL 5.6.16 was applied.

      As far as I understand, the intention of the ha_rollback_to_savepoint_can_release_mdl() check is to ensure that no other transaction inside InnoDB is waiting for a lock on the table that the ROLLBACK TO SAVEPOINT would release. But, the check seems to be misplaced. We could have the following happen:

      1. innobase_rollback_to_savepoint_can_release_mdl() returns true, because the transaction only had placed implicit record locks. (Until MDEV-16232 is implemented, this is only possible if the transaction performed nothing else than INSERT operations).
      2. Another transaction starts to wait for a record lock that our transaction had implicitly locked. The implicit lock will be converted to an explicit one.
      3. ha_rollback_to_savepoint() will be executed. No explicit locks will be released, because we did not roll back the entire transaction.
      4. The MDL will be released in thd->mdl_context.rollback_to_savepoint(), even though the other transaction will be blocked inside InnoDB until our transaction has been committed or rolled back.

      Note: In innobase_rollback_to_savepoint_can_release_mdl() there is another race that would be fixed by the first hunk of the following untested patch:

      diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
      index 7cc4401d0ff..bca47f011a6 100644
      --- a/storage/innobase/handler/ha_innodb.cc
      +++ b/storage/innobase/handler/ha_innodb.cc
      @@ -4999,15 +4999,14 @@ innobase_rollback_to_savepoint_can_release_mdl(
       	DBUG_ASSERT(hton == innodb_hton_ptr);
       
       	trx_t*	trx = check_trx_exists(thd);
      +	trx_mutex_enter(trx);
      +	ulint n_locks = UT_LIST_GET_LEN(trx->lock.trx_locks) == 0;
      +	trx_mutex_exit(trx);
      +
       
       	/* If transaction has not acquired any locks then it is safe
       	to release MDL after rollback to savepoint */
      -	if (UT_LIST_GET_LEN(trx->lock.trx_locks) == 0) {
      -
      -		DBUG_RETURN(true);
      -	}
      -
      -	DBUG_RETURN(false);
      +	DBUG_RETURN(!n_locks);
       }
       
       /*****************************************************************//**
      diff --git a/sql/transaction.cc b/sql/transaction.cc
      index 543e0b7ad38..66b40d23e80 100644
      --- a/sql/transaction.cc
      +++ b/sql/transaction.cc
      @@ -718,10 +718,6 @@ bool trans_rollback_to_savepoint(THD *thd, LEX_STRING name)
           For backward-compatibility reasons we always release MDL if binary
           logging is off.
         */
      -  bool mdl_can_safely_rollback_to_savepoint=
      -                (!(mysql_bin_log.is_open() && thd->variables.sql_log_bin) ||
      -                 ha_rollback_to_savepoint_can_release_mdl(thd));
      -
         if (ha_rollback_to_savepoint(thd, sv))
           res= TRUE;
         else if (((thd->variables.option_bits & OPTION_KEEP_LOG) ||
      @@ -733,7 +729,8 @@ bool trans_rollback_to_savepoint(THD *thd, LEX_STRING name)
       
         thd->transaction.savepoints= sv;
       
      -  if (!res && mdl_can_safely_rollback_to_savepoint)
      +  if (!res && (!(mysql_bin_log.is_open() && thd->variables.sql_log_bin) ||
      +               ha_rollback_to_savepoint_can_release_mdl(thd)))
           thd->mdl_context.rollback_to_savepoint(sv->mdl_savepoint);
       
         DBUG_RETURN(MY_TEST(res));
      

      Note: I did not review whether this could break something in other implementations of handlerton::savepoint_rollback_can_release_mdl.

      Attachments

        Issue Links

          Activity

            server part of the change is done in MDEV-26077, commit https://github.com/MariaDB/server/commit/2bf6f2c054c2

            serg Sergei Golubchik added a comment - server part of the change is done in MDEV-26077 , commit https://github.com/MariaDB/server/commit/2bf6f2c054c2

            I believe that this was fixed by MDEV-25902 and its MDEV-26077 backport.

            The function innobase_rollback_to_savepoint_can_release_mdl() checks if trx->lock.trx_locks is 0. That read is not protected by trx->mutex, but a dirty read should be fine, because if the value is 0, then there can’t be any table locks, and therefore also no implicit or explicit record locks may be created.

            As I had pointed out, the optimization is not optimal: perhaps we could release some MDL, but the API does not let InnoDB tell which MDL can be released: it is an "all or nothing" deal.

            marko Marko Mäkelä added a comment - I believe that this was fixed by MDEV-25902 and its MDEV-26077 backport. The function innobase_rollback_to_savepoint_can_release_mdl() checks if trx->lock.trx_locks is 0. That read is not protected by trx->mutex , but a dirty read should be fine, because if the value is 0, then there can’t be any table locks, and therefore also no implicit or explicit record locks may be created. As I had pointed out, the optimization is not optimal: perhaps we could release some MDL, but the API does not let InnoDB tell which MDL can be released: it is an "all or nothing" deal.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.