Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6321

close_temporary_tables() in format description event not serialised correctly

Details

    Description

      In Start_log_event_v3::do_apply_event(), there is code to detect when the
      format description event is the first one logged after master restart. In this
      case, it calls rli->close_temporary_tables() and
      cleanup_load_tmpdir(&rli->mi->cmp_connection_name) to cleanup any temporary
      tables left dangling.

      This is a problem in parallel replication, as the code currently does not have
      any serialisation for this case.

      The code needs to handle this:

      • before processing the master restart event, wait until all prior events
        have completed, in case they are referencing temporary tables.
      • Delay processing of all later events, in case they will open new temporary
        tables.

      Attachments

        Activity

          knielsen Kristian Nielsen added a comment - Patch for review: http://lists.askmonty.org/pipermail/commits/2014-July/006275.html

          Hm, I think that patch is not enough.

          We also need to handle the case where the master crashes after writing only part of a transaction into the binlog. In this case, the following format description event is used to notify of the partial event group that must be rolled back.

          So I think when the FD event is encountered, we need to queue some marker event that tells the worker thread with the partial event group to roll back, otherwise we will deadlock with FD event waiting for group to complete and group waiting for FD event to know it must roll back...

          knielsen Kristian Nielsen added a comment - Hm, I think that patch is not enough. We also need to handle the case where the master crashes after writing only part of a transaction into the binlog. In this case, the following format description event is used to notify of the partial event group that must be rolled back. So I think when the FD event is encountered, we need to queue some marker event that tells the worker thread with the partial event group to roll back, otherwise we will deadlock with FD event waiting for group to complete and group waiting for FD event to know it must roll back...
          knielsen Kristian Nielsen added a comment - Patches for review: http://lists.askmonty.org/pipermail/commits/2014-August/006421.html http://lists.askmonty.org/pipermail/commits/2014-August/006423.html

          Assigning to Monty for review, as agreed on the IRC

          knielsen Kristian Nielsen added a comment - Assigning to Monty for review, as agreed on the IRC

          Review for patch 1

          void
          +rpl_parallel::wait_for_workers_idle(THD *thd)
          +{
          + uint32 i, max_i;
          +

          Add a comment why we don't need a mutex while we loop over
          domain_hash. (I noticed we do this a lot in the code already, but
          didn't find any comment why this is safe).

          I assume this is because this is called by the driver thread and it's
          only the driver thread that can cause things to be inserted into
          domain_hash?

          + max_i= domain_hash.records;
          + for (i= 0; i < max_i; ++i)
          + {
          + bool active;
          + wait_for_commit my_orderer;
          + struct rpl_parallel_entry *e;
          +
          + e= (struct rpl_parallel_entry *)my_hash_element(&domain_hash, i);
          + mysql_mutex_lock(&e->LOCK_parallel_entry);
          + if ((active= (e->current_sub_id > e->last_committed_sub_id)))
          +

          { + wait_for_commit *waitee= &e->current_group_info->commit_orderer; + my_orderer.register_wait_for_prior_commit(waitee); + thd->wait_for_commit_ptr= &my_orderer; + }

          + mysql_mutex_unlock(&e->LOCK_parallel_entry);
          + if (active)
          +

          { + my_orderer.wait_for_prior_commit(thd); + thd->wait_for_commit_ptr= NULL; + }

          + }
          +}

          <cut>

          + if (typ == FORMAT_DESCRIPTION_EVENT)
          + {
          + Format_description_log_event *fdev=
          + static_cast<Format_description_log_event *>(ev);
          + if (fdev->created)
          +

          { + /* + This format description event marks a new binlog after a master server + restart. We are going to close all temporary tables to clean up any + possible left-overs after a prior master crash. + + Thus we need to wait for all prior events to execute to completion, + in case they need access to any of the temporary tables. + */ + wait_for_workers_idle(rli->sql_driver_thd); + }

          + }

          What happens if one of the thread we are waiting for fails and has to
          rollback or if wait_for_workers_idle() is interupted by a kill while we are
          in wait_for_prior_commit() ?

          Will rli->abort_slave or sql_thread_stopping be set in this case so
          that we don't drop the temporary tables while the other thread is
          retrying the statement ?

          Appart for the above questions, I don't have any other comments.

          Regards,
          Monty

          monty Michael Widenius added a comment - Review for patch 1 void +rpl_parallel::wait_for_workers_idle(THD *thd) +{ + uint32 i, max_i; + Add a comment why we don't need a mutex while we loop over domain_hash. (I noticed we do this a lot in the code already, but didn't find any comment why this is safe). I assume this is because this is called by the driver thread and it's only the driver thread that can cause things to be inserted into domain_hash? + max_i= domain_hash.records; + for (i= 0; i < max_i; ++i) + { + bool active; + wait_for_commit my_orderer; + struct rpl_parallel_entry *e; + + e= (struct rpl_parallel_entry *)my_hash_element(&domain_hash, i); + mysql_mutex_lock(&e->LOCK_parallel_entry); + if ((active= (e->current_sub_id > e->last_committed_sub_id))) + { + wait_for_commit *waitee= &e->current_group_info->commit_orderer; + my_orderer.register_wait_for_prior_commit(waitee); + thd->wait_for_commit_ptr= &my_orderer; + } + mysql_mutex_unlock(&e->LOCK_parallel_entry); + if (active) + { + my_orderer.wait_for_prior_commit(thd); + thd->wait_for_commit_ptr= NULL; + } + } +} <cut> + if (typ == FORMAT_DESCRIPTION_EVENT) + { + Format_description_log_event *fdev= + static_cast<Format_description_log_event *>(ev); + if (fdev->created) + { + /* + This format description event marks a new binlog after a master server + restart. We are going to close all temporary tables to clean up any + possible left-overs after a prior master crash. + + Thus we need to wait for all prior events to execute to completion, + in case they need access to any of the temporary tables. + */ + wait_for_workers_idle(rli->sql_driver_thd); + } + } What happens if one of the thread we are waiting for fails and has to rollback or if wait_for_workers_idle() is interupted by a kill while we are in wait_for_prior_commit() ? Will rli->abort_slave or sql_thread_stopping be set in this case so that we don't drop the temporary tables while the other thread is retrying the statement ? Appart for the above questions, I don't have any other comments. Regards, Monty

          Done. See comments

          monty Michael Widenius added a comment - Done. See comments
          knielsen Kristian Nielsen added a comment - - edited

          > What happens if one of the thread we are waiting for fails and has to
          > rollback or if wait_for_workers_idle() is interupted by a kill while we are
          > in wait_for_prior_commit() ?

          Good catch, if the SQL driver thread gets killed while in this wait, the code
          would drop temporary tables without completing the wait. I've fixed it to
          catch the kill and return an error in this case (normally the SQL driver
          thread should not be killed, but better to handle this correctly in any case).

          If one of the threads we are waiting for returns failure from
          wait_for_prior_commit(), then it means that this thread has received a fatal
          error and will no longer execute any events, so that's fine.

          (If a thread gets a temporary error and needs to retry a transaction, it will
          not wake up any wait_for_prior_commit() until the retry succeeded or a fatal
          error occured).

          Thanks for review!

          knielsen Kristian Nielsen added a comment - - edited > What happens if one of the thread we are waiting for fails and has to > rollback or if wait_for_workers_idle() is interupted by a kill while we are > in wait_for_prior_commit() ? Good catch, if the SQL driver thread gets killed while in this wait, the code would drop temporary tables without completing the wait. I've fixed it to catch the kill and return an error in this case (normally the SQL driver thread should not be killed, but better to handle this correctly in any case). If one of the threads we are waiting for returns failure from wait_for_prior_commit(), then it means that this thread has received a fatal error and will no longer execute any events, so that's fine. (If a thread gets a temporary error and needs to retry a transaction, it will not wake up any wait_for_prior_commit() until the retry succeeded or a fatal error occured). Thanks for review!

          Pushed to 10.0

          knielsen Kristian Nielsen added a comment - Pushed to 10.0

          People

            knielsen Kristian Nielsen
            knielsen Kristian Nielsen
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.