[MDEV-6321] close_temporary_tables() in format description event not serialised correctly Created: 2014-06-10  Updated: 2014-10-01  Resolved: 2014-08-20

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.0.11
Fix Version/s: 10.0.14

Type: Bug Priority: Critical
Reporter: Kristian Nielsen Assignee: Kristian Nielsen
Resolution: Fixed Votes: 0
Labels: parallelslave, replication


 Description   

In Start_log_event_v3::do_apply_event(), there is code to detect when the
format description event is the first one logged after master restart. In this
case, it calls rli->close_temporary_tables() and
cleanup_load_tmpdir(&rli->mi->cmp_connection_name) to cleanup any temporary
tables left dangling.

This is a problem in parallel replication, as the code currently does not have
any serialisation for this case.

The code needs to handle this:

  • before processing the master restart event, wait until all prior events
    have completed, in case they are referencing temporary tables.
  • Delay processing of all later events, in case they will open new temporary
    tables.


 Comments   
Comment by Kristian Nielsen [ 2014-07-02 ]

Patch for review: http://lists.askmonty.org/pipermail/commits/2014-July/006275.html

Comment by Kristian Nielsen [ 2014-07-02 ]

Hm, I think that patch is not enough.

We also need to handle the case where the master crashes after writing only part of a transaction into the binlog. In this case, the following format description event is used to notify of the partial event group that must be rolled back.

So I think when the FD event is encountered, we need to queue some marker event that tells the worker thread with the partial event group to roll back, otherwise we will deadlock with FD event waiting for group to complete and group waiting for FD event to know it must roll back...

Comment by Kristian Nielsen [ 2014-08-19 ]

Patches for review:

http://lists.askmonty.org/pipermail/commits/2014-August/006421.html

http://lists.askmonty.org/pipermail/commits/2014-August/006423.html

Comment by Kristian Nielsen [ 2014-08-19 ]

Assigning to Monty for review, as agreed on the IRC

Comment by Michael Widenius [ 2014-08-19 ]

Review for patch 1

void
+rpl_parallel::wait_for_workers_idle(THD *thd)
+{
+ uint32 i, max_i;
+

Add a comment why we don't need a mutex while we loop over
domain_hash. (I noticed we do this a lot in the code already, but
didn't find any comment why this is safe).

I assume this is because this is called by the driver thread and it's
only the driver thread that can cause things to be inserted into
domain_hash?

+ max_i= domain_hash.records;
+ for (i= 0; i < max_i; ++i)
+ {
+ bool active;
+ wait_for_commit my_orderer;
+ struct rpl_parallel_entry *e;
+
+ e= (struct rpl_parallel_entry *)my_hash_element(&domain_hash, i);
+ mysql_mutex_lock(&e->LOCK_parallel_entry);
+ if ((active= (e->current_sub_id > e->last_committed_sub_id)))
+

{ + wait_for_commit *waitee= &e->current_group_info->commit_orderer; + my_orderer.register_wait_for_prior_commit(waitee); + thd->wait_for_commit_ptr= &my_orderer; + }

+ mysql_mutex_unlock(&e->LOCK_parallel_entry);
+ if (active)
+

{ + my_orderer.wait_for_prior_commit(thd); + thd->wait_for_commit_ptr= NULL; + }

+ }
+}

<cut>

+ if (typ == FORMAT_DESCRIPTION_EVENT)
+ {
+ Format_description_log_event *fdev=
+ static_cast<Format_description_log_event *>(ev);
+ if (fdev->created)
+

{ + /* + This format description event marks a new binlog after a master server + restart. We are going to close all temporary tables to clean up any + possible left-overs after a prior master crash. + + Thus we need to wait for all prior events to execute to completion, + in case they need access to any of the temporary tables. + */ + wait_for_workers_idle(rli->sql_driver_thd); + }

+ }

What happens if one of the thread we are waiting for fails and has to
rollback or if wait_for_workers_idle() is interupted by a kill while we are
in wait_for_prior_commit() ?

Will rli->abort_slave or sql_thread_stopping be set in this case so
that we don't drop the temporary tables while the other thread is
retrying the statement ?

Appart for the above questions, I don't have any other comments.

Regards,
Monty

Comment by Michael Widenius [ 2014-08-19 ]

Done. See comments

Comment by Kristian Nielsen [ 2014-08-20 ]

> What happens if one of the thread we are waiting for fails and has to
> rollback or if wait_for_workers_idle() is interupted by a kill while we are
> in wait_for_prior_commit() ?

Good catch, if the SQL driver thread gets killed while in this wait, the code
would drop temporary tables without completing the wait. I've fixed it to
catch the kill and return an error in this case (normally the SQL driver
thread should not be killed, but better to handle this correctly in any case).

If one of the threads we are waiting for returns failure from
wait_for_prior_commit(), then it means that this thread has received a fatal
error and will no longer execute any events, so that's fine.

(If a thread gets a temporary error and needs to retry a transaction, it will
not wake up any wait_for_prior_commit() until the retry succeeded or a fatal
error occured).

Thanks for review!

Comment by Kristian Nielsen [ 2014-08-20 ]

Pushed to 10.0

Generated at Thu Feb 08 07:11:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.