[MDEV-15605] Power outage could corrupt binlogs on GTID slave Created: 2018-03-20 Updated: 2018-03-20 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major |
| Reporter: | Michaël de groot | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Hi,

From discussing a customer case with Elkin, we found that there is a missing feature in replication.

With the GTID replication feature _gtid_slave_pos_, the slave is now crash-safe. But what about the slaves of that slave? And what if the binary log is used for point-in-time recovery? For example, Zmanda uses point-in-time recovery for incremental backups.

Imagine a slave with sync_binlog=0 and innodb_flush_log_at_trx_commit=0, and a power outage. It is possible that gtid_slave_pos says position 12, but the binlog says seqno 10. If the server is restarted, the slave will log in to the master and start the stream from seqno 12. This means transaction 11 is never written to the binlog.

The other way around is also possible: the binlog could have seqno 12 while gtid_slave_pos says position 10. This means seqno 11 is written to the binlog twice.

This can be avoided by two features.

Feature 1:

Feature 2:

It may be useful to make these two features user-initiatable as well, for example in the same way as sql_slave_skip_counter works.

Thank you! |
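
To illustrate the two failure cases described above, here is a minimal sketch that compares the two sequence numbers after a restart. It is not MariaDB code and does not query a server: the function name, the enum, and the idea of passing both seqnos in as plain integers are assumptions made for illustration only.

```python
from enum import Enum


class BinlogState(Enum):
    """Hypothetical labels for the cases described in this issue."""
    CONSISTENT = "consistent"        # binlog and gtid_slave_pos agree
    BINLOG_BEHIND = "binlog_behind"  # binlog lost events; transactions may go missing
    BINLOG_AHEAD = "binlog_ahead"    # binlog is ahead; transactions may be written twice


def compare_positions(slave_pos_seqno: int, binlog_seqno: int) -> BinlogState:
    """Classify the relation between the seqno recorded in gtid_slave_pos
    and the last seqno found in the slave's own binlog.

    Both arguments are assumed to belong to the same GTID domain and to
    have been read by the caller; this sketch only compares them.
    """
    if binlog_seqno < slave_pos_seqno:
        # Example from the description: gtid_slave_pos = 12, binlog = 10.
        return BinlogState.BINLOG_BEHIND
    if binlog_seqno > slave_pos_seqno:
        # Example from the description: gtid_slave_pos = 10, binlog = 12.
        return BinlogState.BINLOG_AHEAD
    return BinlogState.CONSISTENT
```

A check of this kind at slave startup could be the point at which the server decides whether it needs to re-fetch missing events or skip events already present in the binlog.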