MariaDB Server / MDEV-11855

Make semisync crash safe with the cluster

Details

    Description

      Semisync states and documents that the slave is up to date with the master within some predefined delay, but this is not physically true, because the ACK is done after SYNC or after COMMIT: extra transactions that the client has never seen can already be in the binlog. What is true is that, from the client's point of view, no transaction has been committed until it reaches a slave, but a crash can leave the old master in a state where it needs to be restored from the cluster.

      PERSISTENT ACK

      The sync would be prepared with the group commit: the expected GTID is stored inside an InnoDB master system table, and the ACK acknowledges receipt of that system table entry before commit. During crash recovery, every transaction missing the acknowledgment would be rolled back.
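
The recovery rule above can be sketched as follows. This is only an illustration of the proposal, not server code: `PreparedTrx`, its `acked` flag, and `recover` are hypothetical names standing in for rows of the proposed master system table and the crash-recovery pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreparedTrx:
    """Hypothetical row of the proposed master system table."""
    gtid: str    # GTID expected by the group commit, e.g. "0-5054-4185"
    acked: bool  # True once a slave has acknowledged receipt


def recover(prepared):
    """Split prepared transactions into (commit, rollback) GTID lists.

    Assumption taken from the proposal: a prepared transaction without
    a persisted acknowledgment is rolled back during crash recovery.
    """
    commit, rollback = [], []
    for trx in prepared:
        if trx.acked:
            commit.append(trx.gtid)
        else:
            rollback.append(trx.gtid)
    return commit, rollback


to_commit, to_rollback = recover([
    PreparedTrx("0-5054-4184", acked=True),
    PreparedTrx("0-5054-4185", acked=False),
])
```

Because the acknowledgment is persisted together with the prepared state, the decision after a restart depends only on what is on disk, not on what the crashed master had in memory.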

      PUSH MODEL
      Inter-storage-engine 2PC can be used to push the binlog to the slaves. Luckily, the Spider engine can do this in 2PC with all slaves or with some preselected slaves, and report the failure of the transaction back via its 2PC monitoring feature.

      The semisync master plugin would inject into a Spider table linked to a relay log system table on every node of the cluster and, based on the number of successes, assign the in-sync status to the replication. This loses the first-ACK-wins behavior but brings back true crash-safe capabilities.

      The semisync slave plugin would select which queue to apply: in sync mode it reads the relay log from the system table, in async mode from the binlog.

      A remote failure would commit anyway, as there is no reason for the local Spider system table not to succeed. Only Spider monitoring would tell us that remote slaves are down and that replication should be switched to unsync. Coming back to the sync state would reset the optimistic Spider table status to make another attempt, by resetting the state of the slave inside the local Spider table.
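
A minimal sketch of the sync / unsync switching described above, assuming the in-sync status is derived from the number of slaves whose remote push succeeded. `SyncState`, `required_acks`, `record_push`, and `reset` are hypothetical names; the real Spider monitoring interface is not modelled here.

```python
class SyncState:
    """Tracks whether replication is considered in sync (illustrative only)."""

    def __init__(self, required_acks):
        self.required_acks = required_acks  # successes needed to stay in sync
        self.in_sync = True
        self.slave_ok = {}                  # slave name -> last known status

    def record_push(self, results):
        """Record one push; results maps slave name -> remote success flag.

        The local commit is assumed to succeed regardless of the results;
        only the replication status changes.
        """
        self.slave_ok.update(results)
        successes = sum(1 for ok in results.values() if ok)
        self.in_sync = successes >= self.required_acks
        return self.in_sync

    def reset(self):
        """Optimistically mark every known slave healthy to retry sync mode."""
        for name in self.slave_ok:
            self.slave_ok[name] = True
        self.in_sync = True


state = SyncState(required_acks=1)
first = state.record_push({"slave1": True, "slave2": False})
second = state.record_push({"slave1": False, "slave2": False})
state.reset()
```

With `required_acks=1` the first push keeps replication in sync because one slave succeeded, the second switches it to unsync, and `reset` restores the optimistic status so the next push can try sync mode again.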

      mysql_sandbox5012-bin.000001 648803 Gtid 5054 648841 BEGIN GTID 0-5054-4185
      mysql_sandbox5012-bin.000001 648841 Table_map 5054 648889 table_id: 60 (test.test119)
      mysql_sandbox5012-bin.000001 648889 Write_rows_v1 5054 648931 table_id: 60 flags: STMT_END_F
      AFTER SYNC ACK (crash 1)
      mysql_sandbox5012-bin.000001 648931 Xid 5054 648958 COMMIT /* xid=1400239 */
      (crash 2)
      AFTER COMMIT ACK

      Crash 1, followed by crash recovery, would roll back the unfinished transaction on master restart.
      How does the slave manage such a case?

      Crash 2 with AFTER SYNC: the client has received OK for the transaction, but the slave may not have received the Xid COMMIT event.
      If a slave is elected here, we may miss the transaction.

      Crash 2 with AFTER COMMIT ACK: may leave the master with an extra transaction; the elected slave will miss one transaction.
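
To make the "extra transaction" case concrete, the sketch below compares the GTIDs in the old master's binlog with those received by the elected slave. `extra_on_old_master` is a hypothetical helper, and the GTID lists are made-up inputs in MariaDB's domain-server-sequence format, not data from this report.

```python
def extra_on_old_master(master_binlog, slave_received):
    """Return GTIDs in the old master's binlog that the elected slave lacks.

    Any GTID returned here is a transaction the old master would have to
    undo (or be restored from the cluster) before it can rejoin.
    """
    received = set(slave_received)
    return [gtid for gtid in master_binlog if gtid not in received]


# Illustrative input: the master crashed after writing 0-5054-4185 to its
# binlog, but the slave only received up to 0-5054-4184.
missing = extra_on_old_master(
    ["0-5054-4183", "0-5054-4184", "0-5054-4185"],
    ["0-5054-4183", "0-5054-4184"],
)
```

An empty result means the old master can rejoin as a slave as-is; a non-empty one is the situation the crash scenarios above describe.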

          Activity

            stephane@skysql.com VAROQUI Stephane created issue -
            stephane@skysql.com VAROQUI Stephane made changes -
            Summary: "Make semisync work as expected" changed to "Make semisync crash safe with the cluster"
            elenst Elena Stepanova made changes -
            Affects Version/s 10.1.21 [ 22113 ]
            Affects Version/s 10.2.3 [ 22115 ]
            Issue Type Bug [ 1 ] Task [ 3 ]
            elenst Elena Stepanova made changes -
            Assignee Lixun Peng [ plinux ]
            Elkin Andrei Elkin made changes -
            Assignee Lixun Peng [ plinux ] Andrei Elkin [ elkin ]
            Elkin Andrei Elkin made changes -
            Labels semisync

            valerii Valerii Kravchuk added a comment - See also this recent upstream MySQL bug report: https://bugs.mysql.com/bug.php?id=99370
            valerii Valerii Kravchuk made changes -
            Labels semisync semisync upstream
            julien.fritsch Julien Fritsch made changes -
            Assignee Andrei Elkin [ elkin ] Max Mether [ maxmether ]
            julien.fritsch Julien Fritsch made changes -
            Assignee Max Mether [ maxmether ] Andrei Elkin [ elkin ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 79296 ] MariaDB v4 [ 130584 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 150484

            People

              Elkin Andrei Elkin
              stephane@skysql.com VAROQUI Stephane