Details
-
Bug
-
Status: Closed (View Workflow)
-
Resolution: Fixed
-
None
-
None
Description
master: --binlog_annotate_rows_log_events
slave: --slave_annotate_rows_log_event --log_slave_updates=0
slave fails with error:
[ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master', Error_code: 1236
if master idle more than --slave_net_timeout.
Test-case in attachment.
I investigated it, and found the following:
1) master doesn't replicate Annotate_rows_log_event to slave if one of the following is true:
a) --replicate_annotate_rows_event is OFF
b) --log_slave_updates is OFF
2) While slave executes events, it calculates mi->master_log_pos (Exec_Master_Log_Pos).
If master doesn't replicate Annotate_rows_log_event to slave, mi->master_log_pos is incorrect (less than log_pos-on-master to sizeof(Annotate_rows_log_event) that weren't sent from master to slave).
3) After slave_net_timeout slave reconnects to master and sends mi->master_log_pos to it.
As a result, master tries to read event from its log from incorrect position.
Possible fixes:
1) always replicate Annotate_rows_log_event from master to slave. If you do this, option "--replicate_annotate_rows_events" doesn't have sense (always true) and should be removed
2) fix the Master Dump Thread - when "-replicate_annotate_rows_events" disabled or slave run with "log_slave_updates=0", master should notify slave about skipped events (probably as filtered event or send Rotate_log_event ) I don't sure what this is possible or correct fix (just idea)