[MDEV-7929] record_gtid() for non-transactional event group calls wakeup_subsequent_commits() too early, causing slave hang Created: 2015-04-07  Updated: 2015-04-08  Resolved: 2015-04-08

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.0.17, 10.1.3
Fix Version/s: 10.0.18, 10.1.4

Type: Bug
Priority: Critical
Reporter: Kristian Nielsen
Assignee: Kristian Nielsen
Resolution: Fixed
Votes: 0
Labels: parallelslave, replication

Issue Links:
Relates
relates to MDEV-7888 ANALYZE TABLE does wakeup_subsequent_... Closed

 Description   

This was found together with MDEV-7888, but it is a logically different bug,
so filing separately.

The parallel replication worker threads can hang in some cases with
non-transactional event groups. The symptom is that worker threads are stuck
in "waiting for prior transaction to start commit".

The problem occurs when record_gtid() runs at the end of the non-transactional
update. It then needs to create its own transaction to update the
mysql.gtid_slave_pos table. Committing that transaction causes
ha_commit_trans() to call wakeup_subsequent_commits(), which is wrong because
it happens too early.

The hang then occurs because a following transaction thinks the prior
non-transactional event group is already done, so it deallocates the
corresponding group_commit_orderer object. A later worker thread then does
not get its wakeup, and the slave gets stuck.
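
To illustrate the ordering problem described above, here is a minimal toy
model in Python. It is not MariaDB server code and does not represent the
actual patch: the class, the method names, and the suppress_wakeup flag are
all hypothetical and exist only to show how a wakeup sent from the helper
commit arrives before the event group has really finished.

```python
from dataclasses import dataclass


@dataclass
class EventGroup:
    """Toy stand-in for one replicated event group (hypothetical, not server code)."""
    finished: bool = False      # True once the whole event group is done
    wakeup_sent: bool = False   # True once later commits were told to proceed
    early_wakeup: bool = False  # True if that wakeup came before `finished`

    def wakeup_subsequent(self) -> None:
        # Models waking up later commits; only the first call has any effect.
        if not self.wakeup_sent:
            self.wakeup_sent = True
            self.early_wakeup = not self.finished

    def commit_helper_trx(self, suppress_wakeup: bool) -> None:
        # Models committing the helper transaction that records the GTID
        # position. Unless suppressed (an assumed flag), the commit path
        # also wakes up later commits.
        if not suppress_wakeup:
            self.wakeup_subsequent()

    def run_non_transactional(self, suppress_wakeup: bool) -> None:
        # The position update happens at the end of the group's work,
        # but before the group is marked as finished.
        self.commit_helper_trx(suppress_wakeup)
        self.finished = True
        self.wakeup_subsequent()  # the intended wakeup point


def wakes_too_early(suppress_wakeup: bool) -> bool:
    """Return True if later commits were woken before the group finished."""
    group = EventGroup()
    group.run_non_transactional(suppress_wakeup)
    return group.early_wakeup
```

In this sketch, wakes_too_early(False) models the reported behaviour, where
the helper commit triggers the wakeup, and wakes_too_early(True) models an
ordering in which the wakeup is deferred until the group is done.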



 Comments   
Comment by Kristian Nielsen [ 2015-04-08 ]

http://lists.askmonty.org/pipermail/commits/2015-April/007723.html

Comment by Kristian Nielsen [ 2015-04-08 ]

10.1-specific part:

http://lists.askmonty.org/pipermail/commits/2015-April/007724.html
