Details
Type: Task
Status: Closed
Priority: Critical
Resolution: Fixed
None
Description
When run after a master server crash, --tc-heuristic-recover=rollback leaves the server in an inconsistent state: the binlog still contains the transactions that the option rolled back.
A server recovered this way cannot safely be used for replication. For example, when the recovered ex-master is demoted to a slave, its binlog state needs further correction to subtract the rolled-back transactions. Otherwise the "new" slave might claim those transactions as locally present in the GTID master-slave connection protocol, while the actual "new" master may never have seen them (because of the crash, they never reached it when it was still a slave).
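The divergence described above can be illustrated with a toy model. This is only a sketch: the GTID values, the list-based "binlogs", and the helper names `parse_gtid` and `extra_on_slave` are hypothetical and are not part of the server. It assumes MariaDB-style `domain-server-sequence` GTID strings.

```python
def parse_gtid(gtid):
    """Split a 'domain-server-seq' GTID string into a tuple of ints."""
    domain, server, seq = (int(part) for part in gtid.split("-"))
    return domain, server, seq


def extra_on_slave(slave_gtids, master_gtids):
    """Return GTIDs the reconnecting slave claims that the master never logged."""
    return sorted(set(slave_gtids) - set(master_gtids), key=parse_gtid)


# Hypothetical state after the crash: 0-1-3 was rolled back in the storage
# engine by --tc-heuristic-recover=rollback, but it is still in the binlog.
ex_master_binlog = ["0-1-1", "0-1-2", "0-1-3"]

# The promoted slave only ever received the first two transactions.
new_master_binlog = ["0-1-1", "0-1-2"]

print(extra_on_slave(ex_master_binlog, new_master_binlog))  # ['0-1-3']
```

In this model the demoted ex-master reports 0-1-3 as present even though neither its own storage engine nor the new master contains that transaction.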
This issue should be fixed by refining the recovery procedure so that it truncates the binlog, cutting off the prepared transactions that were rolled back. The method is known to have been pioneered by FB:
https://percona.community/blog/2018/08/23/question-about-semi-synchronous-replication-answer-with-all-the-details/
Once a transaction reaches the binary logs it should roll forward.
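The truncation idea can be sketched as follows. This is a simplified illustration, not the server's actual recovery code: `BinlogTrx`, `plan_truncation`, the byte offsets, and the rule of refusing to truncate when a committed transaction follows an uncommitted one are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class BinlogTrx:
    gtid: str
    start_offset: int  # byte offset where the transaction begins in the binlog
    end_offset: int    # byte offset just past the transaction


def plan_truncation(binlog, committed_in_engine):
    """Return (truncate_offset, dropped_gtids), or (None, []) if nothing to cut.

    The binlog is scanned in order. The first transaction that is not
    committed in the engine marks the cut point; everything from there on
    is dropped. In this sketch a committed transaction after the cut point
    is treated as an error, since truncation would lose committed data.
    """
    cut_index = None
    for i, trx in enumerate(binlog):
        if trx.gtid in committed_in_engine:
            if cut_index is not None:
                raise ValueError(
                    f"{trx.gtid} is committed after uncommitted "
                    f"{binlog[cut_index].gtid}; refusing to truncate"
                )
        elif cut_index is None:
            cut_index = i
    if cut_index is None:
        return None, []
    dropped = [trx.gtid for trx in binlog[cut_index:]]
    return binlog[cut_index].start_offset, dropped


# Hypothetical binlog contents at crash time.
binlog = [
    BinlogTrx("0-1-1", 256, 512),
    BinlogTrx("0-1-2", 512, 800),
    BinlogTrx("0-1-3", 800, 1100),  # prepared only, rolled back on recovery
]

offset, dropped = plan_truncation(binlog, committed_in_engine={"0-1-1", "0-1-2"})
print(offset, dropped)  # 800 ['0-1-3']
```

After truncating at the returned offset, the binlog in this model again matches the engine state, so the recovered server no longer advertises the rolled-back transaction.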
Attachments
Issue Links
- blocks
  - MDEV-11855 Make semisync crash safe with the cluster (Open)
- causes
  - MDEV-33465 an option to enable semisync recovery (Closed)
- includes
  - MDEV-26652 xa transactions binlogged in wrong order (Closed)
- is duplicated by
  - MDEV-20996 Maxscale auto-failover with semi-sync replication is not providing a true HA solution (Closed)
- relates to
  - MDEV-21168 Active XA transactions stop slave from working after backup was restored. (Closed)
  - MDEV-25395 server recovery hits replication event checksum error (Stalled)
  - MDEV-33424 when both rpl_semi_sync_MASTER,SLAVE_enabled set the server should recover as master (Closed)
  - MDEV-18959 Engine transaction recovery through persistent binlog (Stalled)
- links to
Activity
Field | Original Value | New Value |
---|---|---|
Link | This issue relates to |
Link | This issue relates to MENT-203 [ MENT-203 ] |
Description | When run after master server crash {{--tc-heuristic-recover=rollback}} produces inconsistent server state with binlog still containing transactions that were rolled back by the option. Such way recovered server may not be used for replication. E.g when such way recovered ex-master is demoted into slave its binlog state needs further correction to subtract the rolled back transactions from its binlog status. Otherwise the "new" slave might claim those transactions as locally present in the master-slave gtid connection protocol. At the same time the actual "new" master may never have seen those transactions (because they never arrived at it when it was formerly slave, due to the crash). This issue should be fixed with refining the recovery procedure with truncating binlog to cut off the prepared rolled back transactions. The method is also known as pioneered by FB https://www.percona.com/community-blog/2018/08/23/question-about-semi-synchronous-replication-answer-with-all-the-details/. | When run after master server crash {{--tc-heuristic-recover=rollback}} produces inconsistent server state with binlog still containing transactions that were rolled back by the option. Such way recovered server may not be used for replication. E.g when such way recovered ex-master is demoted into slave its binlog state needs further correction to subtract the rolled back transactions from its binlog status. Otherwise the "new" slave might claim those transactions as locally present at the (gtid) master-slave connection protocol. At the same time the actual "new" master may never have seen those transactions (because they never arrived at it when it was formerly slave, due to the crash). This issue should be fixed with refining the recovery procedure with truncating binlog to cut off the prepared rolled back transactions. The method is also known as pioneered by FB https://www.percona.com/community-blog/2018/08/23/question-about-semi-synchronous-replication-answer-with-all-the-details/. |
Remote Link | This issue links to "MySQL WL#5493: Binlog crash-safe when master crashed (Web Link)" [ 29310 ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Link | This issue relates to |
Priority | Major [ 3 ] | Critical [ 2 ] |
Priority | Critical [ 2 ] | Blocker [ 1 ] |
Labels | need_feedback |
Labels | need_feedback |
Fix Version/s | 10.5 [ 23123 ] |
Affects Version/s | 10.5 [ 23123 ] |
Assignee | Sujatha Sivakumar [ sujatha.sivakumar ] | Andrei Elkin [ elkin ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Assignee | Andrei Elkin [ elkin ] | Sujatha Sivakumar [ sujatha.sivakumar ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Link | This issue is blocked by |
Assignee | Sujatha Sivakumar [ sujatha.sivakumar ] | Sergei Golubchik [ serg ] |
Status | Stalled [ 10000 ] | In Review [ 10002 ] |
Priority | Blocker [ 1 ] | Critical [ 2 ] |
Link | This issue is blocked by |
Assignee | Sergei Golubchik [ serg ] | Andrei Elkin [ elkin ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Status | Stalled [ 10000 ] | In Progress [ 3 ] |
Assignee | Andrei Elkin [ elkin ] | Sergei Golubchik [ serg ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Fix Version/s | 10.1 [ 16100 ] |
Fix Version/s | 10.5 [ 23123 ] |
Link | This issue blocks MENT-203 [ MENT-203 ] |
Link | This issue relates to MENT-203 [ MENT-203 ] |
Assignee | Sergei Golubchik [ serg ] | Andrei Elkin [ elkin ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Status | Stalled [ 10000 ] | In Progress [ 3 ] |
Link | This issue relates to MDEV-24654 [ MDEV-24654 ] |
Assignee | Andrei Elkin [ elkin ] | Sergei Golubchik [ serg ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Link | This issue is blocked by MDEV-24654 [ MDEV-24654 ] |
Link | This issue relates to MDEV-24654 [ MDEV-24654 ] |
Attachment | recovery_design.txt [ 55817 ] |
Summary | --tc-heuristic-recover=rollback is not replication safe | recovery for --rpl-semi-sync-slave-enabled server |
Attachment | recovery_design.txt [ 55817 ] |
Attachment | recovery_design.txt [ 55820 ] |
Link | This issue is blocked by MDEV-24654 [ MDEV-24654 ] |
Attachment | recovery_design.txt [ 55892 ] |
Attachment | recovery_design.txt [ 55820 ] |
Assignee | Sergei Golubchik [ serg ] | Sujatha Sivakumar [ sujatha.sivakumar ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Assignee | Sujatha Sivakumar [ sujatha.sivakumar ] | Andrei Elkin [ elkin ] |
Link | This issue is duplicated by |
Link | This issue relates to |
Link | This issue relates to MDEV-25395 [ MDEV-25395 ] |
Assignee | Andrei Elkin [ elkin ] | Sergei Golubchik [ serg ] |
Status | Stalled [ 10000 ] | In Review [ 10002 ] |
Summary | recovery for --rpl-semi-sync-slave-enabled server | refine the server binlog-based recovery for semisync |
Assignee | Sergei Golubchik [ serg ] | Andrei Elkin [ elkin ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Link | This issue blocks MDEV-11855 [ MDEV-11855 ] |
Affects Version/s | 10.2 [ 14601 ] | |
Affects Version/s | 10.1 [ 16100 ] | |
Affects Version/s | 10.3 [ 22126 ] | |
Affects Version/s | 10.4 [ 22408 ] | |
Issue Type | Bug [ 1 ] | Task [ 3 ] |
Issue Type | Task [ 3 ] | Bug [ 1 ] |
Affects Version/s | 10.1 [ 16100 ] | |
Affects Version/s | 10.2 [ 14601 ] | |
Affects Version/s | 10.3 [ 22126 ] | |
Affects Version/s | 10.4 [ 22408 ] | |
Affects Version/s | 10.5 [ 23123 ] |
Affects Version/s | 10.2 [ 14601 ] | |
Affects Version/s | 10.1 [ 16100 ] | |
Affects Version/s | 10.3 [ 22126 ] | |
Affects Version/s | 10.4 [ 22408 ] | |
Affects Version/s | 10.5 [ 23123 ] | |
Issue Type | Bug [ 1 ] | Task [ 3 ] |
Fix Version/s | 10.6 [ 24028 ] | |
Fix Version/s | 10.2 [ 14601 ] | |
Fix Version/s | 10.3 [ 22126 ] | |
Fix Version/s | 10.4 [ 22408 ] | |
Fix Version/s | 10.5 [ 23123 ] |
Link | This issue blocks MENT-1187 [ MENT-1187 ] |
Comment | [ A comment with security level 'Developers' was removed. ] |
Status | Stalled [ 10000 ] | In Progress [ 3 ] |
Assignee | Andrei Elkin [ elkin ] | Sergei Golubchik [ serg ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Assignee | Sergei Golubchik [ serg ] | Andrei Elkin [ elkin ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Assignee | Andrei Elkin [ elkin ] | Sergei Golubchik [ serg ] |
Status | Stalled [ 10000 ] | In Review [ 10002 ] |
Assignee | Sergei Golubchik [ serg ] | Andrei Elkin [ elkin ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Fix Version/s | 10.6.2 [ 25800 ] | |
Fix Version/s | 10.6 [ 24028 ] | |
Resolution | Fixed [ 1 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Assignee | Andrei Elkin [ elkin ] | Ian Gilfillan [ greenman ] |
Labels | need_feedback |
Labels | need_feedback |
Link | This issue relates to |
Link | This issue includes |
Link | This issue relates to |
Workflow | MariaDB v3 [ 101333 ] | MariaDB v4 [ 134141 ] |
Description | When run after master server crash {{--tc-heuristic-recover=rollback}} produces inconsistent server state with binlog still containing transactions that were rolled back by the option. Such way recovered server may not be used for replication. E.g when such way recovered ex-master is demoted into slave its binlog state needs further correction to subtract the rolled back transactions from its binlog status. Otherwise the "new" slave might claim those transactions as locally present at the (gtid) master-slave connection protocol. At the same time the actual "new" master may never have seen those transactions (because they never arrived at it when it was formerly slave, due to the crash). This issue should be fixed with refining the recovery procedure with truncating binlog to cut off the prepared rolled back transactions. The method is also known as pioneered by FB https://www.percona.com/community-blog/2018/08/23/question-about-semi-synchronous-replication-answer-with-all-the-details/. | When run after master server crash {{--tc-heuristic-recover=rollback}} produces inconsistent server state with binlog still containing transactions that were rolled back by the option. Such way recovered server may not be used for replication. E.g when such way recovered ex-master is demoted into slave its binlog state needs further correction to subtract the rolled back transactions from its binlog status. Otherwise the "new" slave might claim those transactions as locally present at the (gtid) master-slave connection protocol. At the same time the actual "new" master may never have seen those transactions (because they never arrived at it when it was formerly slave, due to the crash). This issue should be fixed with refining the recovery procedure with truncating binlog to cut off the prepared rolled back transactions. The method is also known as pioneered by FB https://percona.community/blog/2018/08/23/question-about-semi-synchronous-replication-answer-with-all-the-details/ |
Description | When run after master server crash {{--tc-heuristic-recover=rollback}} produces inconsistent server state with binlog still containing transactions that were rolled back by the option. Such way recovered server may not be used for replication. E.g when such way recovered ex-master is demoted into slave its binlog state needs further correction to subtract the rolled back transactions from its binlog status. Otherwise the "new" slave might claim those transactions as locally present at the (gtid) master-slave connection protocol. At the same time the actual "new" master may never have seen those transactions (because they never arrived at it when it was formerly slave, due to the crash). This issue should be fixed with refining the recovery procedure with truncating binlog to cut off the prepared rolled back transactions. The method is also known as pioneered by FB https://percona.community/blog/2018/08/23/question-about-semi-synchronous-replication-answer-with-all-the-details/ | When run after master server crash {{--tc-heuristic-recover=rollback}} produces inconsistent server state with binlog still containing transactions that were rolled back by the option. Such way recovered server may not be used for replication. E.g when such way recovered ex-master is demoted into slave its binlog state needs further correction to subtract the rolled back transactions from its binlog status. Otherwise the "new" slave might claim those transactions as locally present at the (gtid) master-slave connection protocol. At the same time the actual "new" master may never have seen those transactions (because they never arrived at it when it was formerly slave, due to the crash). This issue should be fixed with refining the recovery procedure with truncating binlog to cut off the prepared rolled back transactions. The method is also known as pioneered by FB https://percona.community/blog/2018/08/23/question-about-semi-synchronous-replication-answer-with-all-the-details/ Once a transaction reaches the binary logs it should roll forward. |
Assignee | Ian Gilfillan [ greenman ] | Andrei Elkin [ elkin ] |
Resolution | Fixed [ 1 ] | |
Status | Closed [ 6 ] | Stalled [ 10000 ] |
Assignee | Andrei Elkin [ elkin ] | Brandon Nesterenko [ JIRAUSER48702 ] |
Resolution | Fixed [ 1 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Link | This issue relates to |
Resolution | Fixed [ 1 ] | |
Status | Closed [ 6 ] | Stalled [ 10000 ] |
Link | This issue causes |
Resolution | Fixed [ 1 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Zendesk Related Tickets | 125800 172110 134539 |
Link | This issue relates to MDEV-18959 [ MDEV-18959 ] |