As far as I have understood the issue, it is about master-side recovery and not slave-side recovery.
Currently, having rpl_semi_sync_master_enabled set causes binary logs to be truncated on the master side in the case where the slave has received the binary log event but the master dies before getting the semi-sync acknowledgement.
Truncation of the binary log should only happen when the server was previously a semi-sync master and has since been changed to a slave.
Truncation should not happen in the case the master continues to be a master or if semi-sync was not enabled when the server crashed.
I do not understand how setting just --init-rpl-role=SLAVE can tell the server if it can truncate the last event(s) from the binary log or not.
This is because the server cannot know the original value of rpl_semi_sync_master_enabled, as the value at restart may not be the same as it was when the server crashed. It also does not know if it was a master before the restart.
Because of this it may be better to have a dedicated variable to define if binary logs should be truncated or not.
There is another bigger problem:
- Truncation of the binary log should not be done automatically, because on an automatic master restart the server cannot know by itself whether it should continue as a master or as a slave. This decision needs to be made by other means, such as MaxScale or a human, who has to decide which server should be the new master. If the original master restarts quickly, it is better to continue with it as the master. If the master was down for a long time or is corrupted, then it is better to assign one of the slaves as the new master.
Here is a suggestion of how to solve this:
- Add option --init-rpl-role=UNKNOWN
- When the server starts and notices that UNKNOWN is used, it should leave the binary log untruncated, neither commit nor roll back any of the binary logged transactions, and wait for MaxScale (or a human) to connect and examine the state of the server. MaxScale can then examine the state of replication and decide for each server whether it should be a master or a slave by executing SET @@global.init_rpl_role=MASTER/SLAVE.
- We still need a separate variable or command to specify if the binary log should be truncated on the slave. Truncation should only be done in the case where the old master is now a slave and its binary log is one transaction before the new master and that transaction is not committed.
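To make the suggestion above concrete, here is a rough sketch of the decision an external monitor (MaxScale or a human) would make once the role is known. It is not server code. The names `ServerState`, `decide_recovery` and the three field names are hypothetical, and it assumes that "one transaction before the new master" means the old master's binary log holds a trailing transaction that the new master does not have.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    UNKNOWN = "UNKNOWN"
    MASTER = "MASTER"
    SLAVE = "SLAVE"


@dataclass
class ServerState:
    """Hypothetical snapshot collected by the monitor after a restart."""
    was_semi_sync_master: bool    # server was a semi-sync master when it crashed
    has_extra_trailing_trx: bool  # binlog holds a transaction the new master lacks (assumption)
    trailing_trx_committed: bool  # that trailing transaction is already committed


def decide_recovery(state: ServerState, new_role: Role) -> str:
    """Return "wait", "keep" or "truncate" for the server's binary log."""
    if new_role is Role.UNKNOWN:
        # No decision yet: leave the binlog and pending transactions untouched.
        return "wait"
    if new_role is Role.MASTER:
        # A server that continues as master never truncates.
        return "keep"
    # new_role is SLAVE: truncate only for a demoted semi-sync master whose
    # trailing transaction is unknown to the new master and not committed.
    if (state.was_semi_sync_master
            and state.has_extra_trailing_trx
            and not state.trailing_trx_committed):
        return "truncate"
    return "keep"
```

The point of the sketch is that the truncate decision needs inputs the restarted server does not have on its own (its previous role and the new master's position), which is why the role alone does not seem enough.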
There is already such an option, init-rpl-role. It can be empty, MASTER, or SLAVE. Let's reuse it and enable slave-side recovery on --init-rpl-role=SLAVE.