[MDEV-6321] close_temporary_tables() in format description event not serialised correctly - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.0.11
Fix Version/s: 10.0.14
Component/s: Replication
Labels:
- parallelslave
- replication

Description

In Start_log_event_v3::do_apply_event(), there is code to detect when the
format description event is the first one logged after master restart. In this
case, it calls rli->close_temporary_tables() and
cleanup_load_tmpdir(&rli->mi->cmp_connection_name) to cleanup any temporary
tables left dangling.

This is a problem in parallel replication, as the code currently does not have
any serialisation for this case.

The code needs to handle this:

before processing the master restart event, wait until all prior events
have completed, in case they are referencing temporary tables.

Delay processing of all later events, in case they will open new temporary
tables.

Attachments

Activity

Ascending order - Click to sort in descending order

Kristian Nielsen added a comment - 2014-07-02 13:53

Patch for review: http://lists.askmonty.org/pipermail/commits/2014-July/006275.html

Kristian Nielsen added a comment - 2014-07-02 13:53 Patch for review: http://lists.askmonty.org/pipermail/commits/2014-July/006275.html

Kristian Nielsen added a comment - 2014-07-02 15:34

Hm, I think that patch is not enough.

We also need to handle the case where the master crashes after writing only part of a transaction into the binlog. In this case, the following format description event is used to notify of the partial event group that must be rolled back.

So I think when the FD event is encountered, we need to queue some marker event that tells the worker thread with the partial event group to roll back, otherwise we will deadlock with FD event waiting for group to complete and group waiting for FD event to know it must roll back...

Kristian Nielsen added a comment - 2014-07-02 15:34 Hm, I think that patch is not enough. We also need to handle the case where the master crashes after writing only part of a transaction into the binlog. In this case, the following format description event is used to notify of the partial event group that must be rolled back. So I think when the FD event is encountered, we need to queue some marker event that tells the worker thread with the partial event group to roll back, otherwise we will deadlock with FD event waiting for group to complete and group waiting for FD event to know it must roll back...

Kristian Nielsen added a comment - 2014-08-19 15:29

Patches for review:

http://lists.askmonty.org/pipermail/commits/2014-August/006421.html

http://lists.askmonty.org/pipermail/commits/2014-August/006423.html

Kristian Nielsen added a comment - 2014-08-19 15:29 Patches for review: http://lists.askmonty.org/pipermail/commits/2014-August/006421.html http://lists.askmonty.org/pipermail/commits/2014-August/006423.html

Kristian Nielsen added a comment - 2014-08-19 16:15

Assigning to Monty for review, as agreed on the IRC

Kristian Nielsen added a comment - 2014-08-19 16:15 Assigning to Monty for review, as agreed on the IRC

Michael Widenius added a comment - 2014-08-19 21:17

Review for patch 1

void
+rpl_parallel::wait_for_workers_idle(THD *thd)
+{
+ uint32 i, max_i;
+

Add a comment why we don't need a mutex while we loop over
domain_hash. (I noticed we do this a lot in the code already, but
didn't find any comment why this is safe).

I assume this is because this is called by the driver thread and it's
only the driver thread that can cause things to be inserted into
domain_hash?

+ max_i= domain_hash.records;
+ for (i= 0; i < max_i; ++i)
+ {
+ bool active;
+ wait_for_commit my_orderer;
+ struct rpl_parallel_entry *e;
+
+ e= (struct rpl_parallel_entry *)my_hash_element(&domain_hash, i);
+ mysql_mutex_lock(&e->LOCK_parallel_entry);
+ if ((active= (e->current_sub_id > e->last_committed_sub_id)))
+

{ + wait_for_commit *waitee= &e->current_group_info->commit_orderer; + my_orderer.register_wait_for_prior_commit(waitee); + thd->wait_for_commit_ptr= &my_orderer; + }

+ mysql_mutex_unlock(&e->LOCK_parallel_entry);
+ if (active)
+

{ + my_orderer.wait_for_prior_commit(thd); + thd->wait_for_commit_ptr= NULL; + }

+ }
+}

<cut>

+ if (typ == FORMAT_DESCRIPTION_EVENT)
+ {
+ Format_description_log_event *fdev=
+ static_cast<Format_description_log_event *>(ev);
+ if (fdev->created)
+

{ + /* + This format description event marks a new binlog after a master server + restart. We are going to close all temporary tables to clean up any + possible left-overs after a prior master crash. + + Thus we need to wait for all prior events to execute to completion, + in case they need access to any of the temporary tables. + */ + wait_for_workers_idle(rli->sql_driver_thd); + }

+ }

What happens if one of the thread we are waiting for fails and has to
rollback or if wait_for_workers_idle() is interupted by a kill while we are
in wait_for_prior_commit() ?

Will rli->abort_slave or sql_thread_stopping be set in this case so
that we don't drop the temporary tables while the other thread is
retrying the statement ?

Appart for the above questions, I don't have any other comments.

Regards,
Monty

Michael Widenius added a comment - 2014-08-19 21:17 Review for patch 1 void +rpl_parallel::wait_for_workers_idle(THD *thd) +{ + uint32 i, max_i; + Add a comment why we don't need a mutex while we loop over domain_hash. (I noticed we do this a lot in the code already, but didn't find any comment why this is safe). I assume this is because this is called by the driver thread and it's only the driver thread that can cause things to be inserted into domain_hash? + max_i= domain_hash.records; + for (i= 0; i < max_i; ++i) + { + bool active; + wait_for_commit my_orderer; + struct rpl_parallel_entry *e; + + e= (struct rpl_parallel_entry *)my_hash_element(&domain_hash, i); + mysql_mutex_lock(&e->LOCK_parallel_entry); + if ((active= (e->current_sub_id > e->last_committed_sub_id))) + { + wait_for_commit *waitee= &e->current_group_info->commit_orderer; + my_orderer.register_wait_for_prior_commit(waitee); + thd->wait_for_commit_ptr= &my_orderer; + } + mysql_mutex_unlock(&e->LOCK_parallel_entry); + if (active) + { + my_orderer.wait_for_prior_commit(thd); + thd->wait_for_commit_ptr= NULL; + } + } +} <cut> + if (typ == FORMAT_DESCRIPTION_EVENT) + { + Format_description_log_event *fdev= + static_cast<Format_description_log_event *>(ev); + if (fdev->created) + { + /* + This format description event marks a new binlog after a master server + restart. We are going to close all temporary tables to clean up any + possible left-overs after a prior master crash. + + Thus we need to wait for all prior events to execute to completion, + in case they need access to any of the temporary tables. + */ + wait_for_workers_idle(rli->sql_driver_thd); + } + } What happens if one of the thread we are waiting for fails and has to rollback or if wait_for_workers_idle() is interupted by a kill while we are in wait_for_prior_commit() ? Will rli->abort_slave or sql_thread_stopping be set in this case so that we don't drop the temporary tables while the other thread is retrying the statement ? Appart for the above questions, I don't have any other comments. Regards, Monty

Michael Widenius added a comment - 2014-08-19 21:18

Done. See comments

Michael Widenius added a comment - 2014-08-19 21:18 Done. See comments

Kristian Nielsen added a comment - 2014-08-20 11:54 - edited

> What happens if one of the thread we are waiting for fails and has to
> rollback or if wait_for_workers_idle() is interupted by a kill while we are
> in wait_for_prior_commit() ?

Good catch, if the SQL driver thread gets killed while in this wait, the code
would drop temporary tables without completing the wait. I've fixed it to
catch the kill and return an error in this case (normally the SQL driver
thread should not be killed, but better to handle this correctly in any case).

If one of the threads we are waiting for returns failure from
wait_for_prior_commit(), then it means that this thread has received a fatal
error and will no longer execute any events, so that's fine.

(If a thread gets a temporary error and needs to retry a transaction, it will
not wake up any wait_for_prior_commit() until the retry succeeded or a fatal
error occured).

Thanks for review!

Kristian Nielsen added a comment - 2014-08-20 11:54 - edited > What happens if one of the thread we are waiting for fails and has to > rollback or if wait_for_workers_idle() is interupted by a kill while we are > in wait_for_prior_commit() ? Good catch, if the SQL driver thread gets killed while in this wait, the code would drop temporary tables without completing the wait. I've fixed it to catch the kill and return an error in this case (normally the SQL driver thread should not be killed, but better to handle this correctly in any case). If one of the threads we are waiting for returns failure from wait_for_prior_commit(), then it means that this thread has received a fatal error and will no longer execute any events, so that's fine. (If a thread gets a temporary error and needs to retry a transaction, it will not wake up any wait_for_prior_commit() until the retry succeeded or a fatal error occured). Thanks for review!

Kristian Nielsen added a comment - 2014-08-20 12:41

Pushed to 10.0

Kristian Nielsen added a comment - 2014-08-20 12:41 Pushed to 10.0

People

Assignee:: Kristian Nielsen

Reporter:: Kristian Nielsen

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 2014-06-10 13:08

Updated:: 2014-10-01 12:04

Resolved:: 2014-08-20 12:41

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server