Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
10.0.14
Description
During parallel replication, if a deadlock or other temporary error occurs,
the code needs to re-read and re-execute the events for that transaction.
A bug in this retry code causes the relay log position of the SQL thread to be
set incorrectly. As a result, if the SQL thread is stopped immediately after
the successful retry of a transaction, the relay log position will be wrong
(too large by the size of the last event executed). When the SQL thread is
then restarted, some kind of corruption occurs (usually an error reading the
event).
Note that the problem only occurs if the error that causes the retry happens
in the last event of the event group (COMMIT/XID). Otherwise that event is
executed by the normal apply code, which does not have the position bug.
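
To make the symptom concrete, here is a toy model in Python. It is not the server code: the event names, the sizes, the starting offset of 4, and the `buggy_retry` flag are all assumptions chosen for illustration. The only thing taken from the report is the symptom itself, namely that after a retry of the last event the recorded position is too large by the size of that event.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    size: int  # bytes the event occupies in the relay log (hypothetical)


def event_offsets(events, start=4):
    """Return the start offset of each event, plus the offset after the last one."""
    offsets = [start]
    for ev in events:
        offsets.append(offsets[-1] + ev.size)
    return offsets


def position_after_group(events, start=4, buggy_retry=False):
    """Return the relay log position recorded after the event group is applied.

    With buggy_retry=True the model adds the size of the last event one extra
    time, which reproduces the symptom described in the report. How the real
    retry code arrives at that value is not modelled here.
    """
    pos = start
    for ev in events:
        pos += ev.size
    if buggy_retry:
        pos += events[-1].size
    return pos


# One event group ending in XID; sizes are made up for the example.
group = [Event("GTID", 38), Event("QUERY", 120), Event("XID", 27)]

valid = event_offsets(group)
good = position_after_group(group)
bad = position_after_group(group, buggy_retry=True)

print("valid event boundaries:", valid)
print("position after normal apply:", good)
print("position after buggy retry: ", bad)
```

In this model the normal position lands on an event boundary, while the position after the buggy retry overshoots it by the size of the XID event. A reader that resumes from the overshot offset would generally start in the middle of the next event, which matches the read error seen after restarting the SQL thread.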