Details
New Feature
Status: In Progress
Critical
Resolution: Unresolved
None
Q1/2026 Server Development
Description
In the feature, we should improve this routine in order to avoid throwing
away logs that are safely stored in the disk. Note also that this recovery
routine relies on the correctness of the relay-log.info and only tolerates
coordinate problems in master.info.
⸺ Code Documentation of `init_recovery()` of `slave.cc`
During startup (relay log initialization), a recovery process can keep the safe portion of the log intact and only truncate the trailing corruption.
The corruption may be a partial event, an incomplete event group, or (failing both) the last file written to.
This lightweight procedure could even be activated automatically, superseding (if not deprecating) @@relay_log_recovery, which invalidates the entire relay log.
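The "keep the safe portion, truncate only the tail" idea can be sketched as a forward scan that stops at the first event that does not fit in the file. This is a minimal sketch under a deliberately simplified, hypothetical framing (a 4-byte little-endian length prefix per event); it is not the real relay log event format, and `last_whole_event_end` is an illustrative name, not a function in `slave.cc`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy framing, NOT the real relay log format:
// every event starts with a 4-byte little-endian total length
// (header included), followed by its payload.
constexpr std::size_t kHeaderSize = 4;

// Returns the offset just past the last whole event in `log`.
// Bytes from that offset onward form a partial or malformed event,
// which is the only part a light recovery would have to truncate.
std::size_t last_whole_event_end(const std::vector<std::uint8_t>& log) {
  std::size_t pos = 0;
  // pos never exceeds log.size(), so the subtraction cannot underflow.
  while (log.size() - pos >= kHeaderSize) {
    const std::uint32_t len =
        static_cast<std::uint32_t>(log[pos]) |
        static_cast<std::uint32_t>(log[pos + 1]) << 8 |
        static_cast<std::uint32_t>(log[pos + 2]) << 16 |
        static_cast<std::uint32_t>(log[pos + 3]) << 24;
    // A length shorter than the header is malformed; a length that
    // runs past the end of the file is a partial write. Stop at either.
    if (len < kHeaderSize || len > log.size() - pos) break;
    pos += len;
  }
  return pos;
}
```

Truncating the file to the returned offset keeps every whole event; it does not yet say whether the last event group is complete.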
Truncating to the last whole event, leaving the group incomplete, will require leveraging the IO Thread's "ability" to resume mid-group.
But truncating to the last complete group has advantages:
- MDEV-38906 wants to get rid of this error-prone ability.
- Binlog Recovery (TC_LOG_BINLOG::recover()) already guarantees this result, with XA rollback in addition.
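The difference between the two truncation targets can be illustrated with a small sketch. It assumes a simplified, hypothetical event model (`Kind`, `Event`, and `truncation_points` are illustrative names, not MariaDB's `Log_event` classes) in which groups are delimited by explicit begin and end events, and any event outside a group counts as a safe boundary on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified event model for illustration only.
enum class Kind { GroupBegin, Other, GroupEnd };

struct Event {
  Kind kind;
  std::size_t end_offset;  // offset just past this event in the log
};

struct TruncationPoints {
  std::size_t last_whole_event;     // may leave the trailing group incomplete
  std::size_t last_complete_group;  // drops an incomplete trailing group
};

// `events` holds only whole events, in log order; `start` is the
// offset of the first event (used when nothing can be kept).
TruncationPoints truncation_points(const std::vector<Event>& events,
                                   std::size_t start) {
  TruncationPoints tp{start, start};
  bool in_group = false;
  for (const Event& ev : events) {
    tp.last_whole_event = ev.end_offset;
    if (ev.kind == Kind::GroupBegin) {
      in_group = true;
    } else if (ev.kind == Kind::GroupEnd) {
      in_group = false;
    }
    // Outside a group, the end of this event is a safe boundary.
    if (!in_group) tp.last_complete_group = ev.end_offset;
  }
  return tp;
}
```

When the log ends mid-group, the two offsets differ: truncating at `last_whole_event` relies on resuming mid-group, while truncating at `last_complete_group` does not.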
Issue Links
- blocks
  - MDEV-4698 With GTID replication, relay logs cannot be relied upon while purging binary logs on master (In Progress)
- relates to
  - MDEV-6811 Try to recovery from relay log read problem automatically (Open)
  - MDEV-8946 Add replication crash-safety for non-GTID slave. (Open)
  - MDEV-38192 Extend Binlog-in-Engine to Replicate XA Prepare (Open)
  - MDEV-38909 GTID State for Relay Log (Open)
- split to
  - MDEV-38906 Do not pause IO Threads in the middle of an event group (Open)