[MDEV-20217] Semi_sync: Last_IO_Error: Fatal error: Failed to run 'after_queue_event' hook Created: 2019-07-30  Updated: 2022-10-21  Resolved: 2019-09-17

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.3.13
Fix Version/s: 10.3.19, 10.4.9

Type: Bug
Priority: Major
Reporter: Nilnandan Joshi
Assignee: Sujatha Sivakumar (Inactive)
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-29842 ERROR Semi-sync slave net_flush() rep... Closed

 Description   

In a semi-sync replication environment, the master server timed out waiting for a binlog reply:

2019-07-29 15:19:30 975211 [Warning] Timeout waiting for reply of binlog (file: bin_log.000346, pos: 247856273), semi-sync up to file bin_log.000346, position 247854688.
2019-07-29 15:19:30 975211 [Note] Semi-sync replication switched OFF.
2019-07-29 15:20:32 974875 [Note] Stop semi-sync binlog_dump to slave (server_id: 2)
2019-07-29 15:20:33 974874 [Note] Stop semi-sync binlog_dump to slave (server_id: 3)

The slave I/O thread then stopped with the error below.

2019-07-29 15:20:46 11 [ERROR] Semi-sync slave net_flush() reply failed
2019-07-29 15:20:46 11 [ERROR] Slave I/O: Fatal error: Failed to run 'after_queue_event' hook, Internal MariaDB error code: 1593
2019-07-29 15:20:46 11 [Note] Slave I/O thread exiting, read up to log 'bin_log.000346', position 247948930; GTID position 1-1-146008965
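
To gauge how far the acknowledged position lagged when the timeout fired, the two positions in the master warning can be compared. The following is a minimal sketch that assumes the warning format shown in the log above; the regular expression and the function name `ack_gap` are illustrative and not part of any MariaDB tool.

```python
import re

# Pattern modeled on the master warning quoted above.
# This is an assumption based on that one log line, not an official format spec.
TIMEOUT_RE = re.compile(
    r"Timeout waiting for reply of binlog "
    r"\(file: (?P<wait_file>\S+), pos: (?P<wait_pos>\d+)\), "
    r"semi-sync up to file (?P<ack_file>\S+), position (?P<ack_pos>\d+)\."
)


def ack_gap(line):
    """Return the byte distance between the waited-for position and the
    last acknowledged position, or None if the line does not match or the
    two positions are in different binlog files."""
    m = TIMEOUT_RE.search(line)
    if m is None or m["wait_file"] != m["ack_file"]:
        return None
    return int(m["wait_pos"]) - int(m["ack_pos"])


warning = (
    "2019-07-29 15:19:30 975211 [Warning] Timeout waiting for reply of binlog "
    "(file: bin_log.000346, pos: 247856273), semi-sync up to file "
    "bin_log.000346, position 247854688."
)
print(ack_gap(warning))  # 1585
```

For the warning in this report the gap is 1585 bytes, i.e. the master was waiting on an acknowledgement for an event shortly after the last acknowledged one.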

This looks like the upstream bug https://bugs.mysql.com/bug.php?id=45852

The upstream bug was resolved by MySQL with the note below.

semisynch: Last_IO_Error: Fatal error: Failed to run 'after_queue_event' hook
      
      Errors when send reply to master should never cause the IO thread
      to stop, because master can fall back to async replication if it
      does not get reply from slave.
      
      The problem is fixed by deliberately ignoring the return value of
      slaveReply.
     @ plugin/semisync/semisync_slave_plugin.cc
        Deliberately ignore the return value of slaveReply so that errors
        while sending slave reply will not cause the IO thread to stop.
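
The control flow described in the upstream note can be modeled as follows. This is a hedged Python sketch of the idea only, not the actual C++ change in `semisync_slave_plugin.cc`: the names `after_queue_event`, `send_reply`, `ReplyError`, and `failing_reply` are hypothetical stand-ins, and the convention that a nonzero hook result stops the I/O thread is assumed from the error reported above.

```python
import logging

log = logging.getLogger("semisync_slave_sketch")


class ReplyError(Exception):
    """Raised by the hypothetical send_reply when the network flush fails."""


def after_queue_event(send_reply, binlog_file, binlog_pos):
    """Hypothetical hook run after an event is queued to the relay log.

    Assumed convention: returning 0 lets the I/O thread continue, while a
    nonzero value would make it stop with a fatal error.
    """
    try:
        send_reply(binlog_file, binlog_pos)
    except ReplyError as exc:
        # Deliberately ignore the failure: the master falls back to
        # asynchronous replication if it does not get a reply.
        log.warning("semi-sync reply failed, continuing: %s", exc)
    return 0


def failing_reply(binlog_file, binlog_pos):
    """Stand-in for a reply whose network flush fails."""
    raise ReplyError("net_flush() reply failed")


result = after_queue_event(failing_reply, "bin_log.000346", 247948930)
print(result)  # 0
```

The point of the design is that a lost acknowledgement only costs the semi-sync guarantee for that event; it does not need to interrupt replication on the slave.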



 Comments   
Comment by Sujatha Sivakumar (Inactive) [ 2019-08-05 ]

Hello Sachin,

Can you please review the fix for MDEV-20217?

The upstream fix is being implemented.

Build Bot Link: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.3-sujatha

The patch has been mailed to the commits mailing list.

Thank you.

Comment by Sachin Setiya (Inactive) [ 2019-09-04 ]

http://lists.askmonty.org/pipermail/commits/2019-August/013922.html

Comment by Sujatha Sivakumar (Inactive) [ 2019-09-17 ]

The fix for this issue has been implemented in 10.3.19.

The fix was also tested on 10.4.

10.4 changes: https://github.com/MariaDB/server/commit/090940b4f646be03baef7bd7af1c56084d16b9b1

Generated at Thu Feb 08 08:57:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.