Details
- New Feature
- Status: Open
- Major
- Resolution: Unresolved
Description
Background: MariaDB uses a Two-Phase Commit (2PC) protocol to keep the InnoDB Storage Engine and the Binary Log, which acts as the Transaction Coordinator (TC), in sync. During a snapshot-based backup, if BACKUP STAGE BLOCK_COMMIT is used, transactions can be left in the PREPARED state in InnoDB without a corresponding COMMIT entry in the Binary Log.
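To make the in-doubt state concrete, here is a toy Python model of the two-phase sequence. It is not MariaDB code: the names `Engine`, `BinaryLog`, `two_phase_commit`, and `in_doubt` are invented for illustration, and the snapshot is simulated by simply stopping after the prepare phase.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    """Toy stand-in for the storage engine: maps transaction id -> state."""
    states: dict = field(default_factory=dict)

    def prepare(self, xid):
        self.states[xid] = "PREPARED"

    def commit(self, xid):
        self.states[xid] = "COMMITTED"


@dataclass
class BinaryLog:
    """Toy stand-in for the transaction coordinator log."""
    committed: set = field(default_factory=set)

    def write_commit(self, xid):
        self.committed.add(xid)


def two_phase_commit(engine, binlog, xid, stop_after_prepare=False):
    engine.prepare(xid)        # phase 1: engine persists the transaction, outcome undecided
    if stop_after_prepare:     # simulates a snapshot taken between the two phases
        return
    binlog.write_commit(xid)   # coordinator records the commit decision
    engine.commit(xid)         # phase 2: engine finalizes the transaction


def in_doubt(engine, binlog):
    """Transactions prepared in the engine with no commit record in the log.

    Passing binlog=None models a restored snapshot whose log is missing.
    """
    committed = binlog.committed if binlog is not None else set()
    return sorted(xid for xid, state in engine.states.items()
                  if state == "PREPARED" and xid not in committed)


engine, binlog = Engine(), BinaryLog()
two_phase_commit(engine, binlog, xid=1)
two_phase_commit(engine, binlog, xid=2, stop_after_prepare=True)
```

In this model, transaction 2 is the one the restored server would report as prepared, whether or not the log is present.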
Current Behavior: When such a snapshot is restored without its Binary Log, the server detects the in-doubt transactions in InnoDB at startup. To prevent the storage engine and the coordinator from diverging, it refuses to start:
- It logs: [ERROR] Found X prepared transactions! ... You have to start server with --tc-heuristic-recover switch.
- The process aborts. In Kubernetes and other automated environments, every restart fails the same way, so the pod stays in CrashLoopBackOff until someone intervenes manually.
Requirement: The server should be capable of an "Automated Self-Heal" when it detects that the Transaction Coordinator (Binary Log) is unavailable or incomplete.
Proposed Logic Change: Introduce a new configuration variable (e.g., --tc-auto-heuristic-recover=ROLLBACK|COMMIT|OFF) or modify the existing startup logic to handle the following:
- Detection: The server finds PREPARED transactions in InnoDB, but the configured TC-log (Binary Log) is missing or cannot be initialized.
- Action: Instead of aborting, the server should apply a pre-configured heuristic decision (defaulting to ROLLBACK to ensure consistency with the missing logs).
- Execution: The server should resolve the transactions, log a [WARNING] instead of an [ERROR], and continue startup until it is ready to accept connections.
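The detection, action, and execution steps above can be sketched as a small decision function. This is a Python model of the requested behavior under stated assumptions, not server code: `resolve_prepared`, `Heuristic`, and `StartupAborted` are hypothetical names, the log messages are paraphrased, and the `USE_TC_LOG` outcome only marks the normal recovery path, which is out of scope here. The `auto_heuristic` parameter stands in for the proposed `--tc-auto-heuristic-recover` variable, which does not exist today.

```python
import logging
from enum import Enum

log = logging.getLogger("recovery_sketch")


class Heuristic(Enum):
    """Values of the proposed (hypothetical) --tc-auto-heuristic-recover variable."""
    OFF = "OFF"
    ROLLBACK = "ROLLBACK"
    COMMIT = "COMMIT"


class StartupAborted(Exception):
    """Models today's behavior: the server refuses to start."""


def resolve_prepared(prepared_xids, tc_log_available,
                     auto_heuristic=Heuristic.ROLLBACK):
    """Return {xid: outcome} for each prepared transaction, or raise StartupAborted."""
    if not prepared_xids:
        return {}  # nothing in doubt, normal startup

    if tc_log_available:
        # Normal path: the coordinator log decides each transaction.
        return {xid: "USE_TC_LOG" for xid in prepared_xids}

    # Detection: prepared transactions exist but the TC log is unusable.
    if auto_heuristic is Heuristic.OFF:
        # Current behavior: report an error and abort startup.
        log.error("Found %d prepared transactions; manual recovery required",
                  len(prepared_xids))
        raise StartupAborted()

    # Action and execution: apply the configured decision, warn, keep starting.
    log.warning("TC log unavailable; applying %s to %d prepared transactions",
                auto_heuristic.value, len(prepared_xids))
    return {xid: auto_heuristic.value for xid in prepared_xids}
```

Keeping `OFF` as an explicit value preserves the existing abort-on-doubt behavior for deployments that prefer a manual decision, while `ROLLBACK` is used as the default here only because the proposal suggests it.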
Business Justification: In modern cloud-native environments (Kubernetes/OpenShift), manual intervention to provide startup flags is a significant blocker for High Availability (HA) and Disaster Recovery (DR). Automating this recovery ensures that standby nodes can recover from filesystem-level snapshots without administrative overhead.
Issue Links
- relates to MDEV-34705 Improving performance of binary logging by removing the need of syncing it (In Testing)