  1. MariaDB Server
  2. MDEV-33424

When both rpl_semi_sync_master_enabled and rpl_semi_sync_slave_enabled are set, the server should recover as master

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 10.6, 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
    • Fix Version/s: N/A
    • Component/s: Replication
    • Labels: None

    Description

      When both rpl_semi_sync_master_enabled and rpl_semi_sync_slave_enabled are ON at recovery,
      the server recovers as a semisync slave and performs binlog truncation as implemented in MDEV-21117.

      Which of the two roles the user intends with these settings is unclear. However, since switchover is presumably in less demand than a "normal" master crash recovery, it is better to presume that the intended role is master.

      That is, when both variables are ON, a server restarting after a crash would execute the normal recovery.
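
      The rule described above can be sketched as a small decision function. This is only an illustration of the ticket's wording, not server code: the names RecoveryMode, current_mode and proposed_mode are hypothetical, and the "current" behaviour is modelled purely from the description here (only the slave flag is consulted).

      ```python
      from enum import Enum


      class RecoveryMode(Enum):
          NORMAL = "normal"                  # regular master crash recovery
          SEMISYNC_SLAVE = "semisync_slave"  # binlog truncation per MDEV-21117


      def current_mode(master_enabled: bool, slave_enabled: bool) -> RecoveryMode:
          # As described in this ticket: the master flag is ignored,
          # so only the slave flag selects the recovery mode.
          return RecoveryMode.SEMISYNC_SLAVE if slave_enabled else RecoveryMode.NORMAL


      def proposed_mode(master_enabled: bool, slave_enabled: bool) -> RecoveryMode:
          # Proposal: semisync slave recovery only when the slave role is
          # unambiguous, i.e. the slave flag is ON and the master flag is OFF.
          if slave_enabled and not master_enabled:
              return RecoveryMode.SEMISYNC_SLAVE
          return RecoveryMode.NORMAL


      if __name__ == "__main__":
          # Print all four flag combinations side by side.
          for master in (False, True):
              for slave in (False, True):
                  print(master, slave,
                        current_mode(master, slave).value,
                        proposed_mode(master, slave).value)
      ```

      The two functions differ only in the case where both flags are ON, which is exactly the case this ticket is about.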

          Activity

            Andrei Elkin (Elkin) added a comment (edited)

            As background, the semisync slave recovery mode of MDEV-21117 was introduced to support failover to a new master, where the old master is repurposed as a slave of the new one. The history behind this method and its practical side are discussed in Jean-François Gagné's blog. The old master is required to have rpl_semi_sync_slave_enabled = ON.
            In essence, the old master rolls back in-doubt transactions, which ensures that it cannot be ahead of the new master (e.g. in GTID terms).

            Currently, a possible rpl_semi_sync_master_enabled = ON (which conflicts with the slave setting in a "normal" two-server master-slave setup) is ignored when the recovery mode is computed.
            This ticket proposes that the server avoid making any dubious decision: it goes through semisync slave recovery only when it is configured unambiguously as a semisync slave.
            Otherwise, recovery is normal.
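
            The difference between the two recovery outcomes discussed in this comment can be illustrated with a toy model. It is a deliberately simplified sketch: BinlogTrx, normal_recovery and semisync_slave_recovery are hypothetical names, the GTIDs are made up, and it assumes that normal recovery keeps (commits) in-doubt transactions found in the binlog while semisync slave recovery rolls them back and truncates the binlog at the first one.

            ```python
            from dataclasses import dataclass
            from typing import List


            @dataclass(frozen=True)
            class BinlogTrx:
                gtid: str
                committed_in_engine: bool  # False means prepared only, i.e. "in doubt"


            def normal_recovery(binlog: List[BinlogTrx]) -> List[str]:
                # Assumption: in-doubt transactions present in the binlog are
                # committed, so the binlog is kept whole.
                return [trx.gtid for trx in binlog]


            def semisync_slave_recovery(binlog: List[BinlogTrx]) -> List[str]:
                # Assumption: in-doubt transactions are rolled back and the binlog
                # is truncated starting from the first in-doubt transaction.
                kept = []
                for trx in binlog:
                    if not trx.committed_in_engine:
                        break
                    kept.append(trx.gtid)
                return kept


            # Made-up binlog tail of a crashed old master.
            binlog = [
                BinlogTrx("0-1-1", True),
                BinlogTrx("0-1-2", True),
                BinlogTrx("0-1-3", False),  # possibly never received by any slave
            ]

            print(normal_recovery(binlog))
            print(semisync_slave_recovery(binlog))
            ```

            In this model the truncated server ends at 0-1-2, so it cannot hold a transaction that a newly promoted master never received.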

            Michael Widenius (monty) added a comment

            This solution is a no-go for the following reasons:

            • It would be very hard to document and understand.
            • There is no relationship between the two variables, and there should not be. One is used when the machine is a master, the other when it is a slave. In many master-slave environments a machine can be a master, a slave, or both. It should be safe to ALWAYS have both on. This should even be the default for anyone who wants semi-sync always on for all machines in a replication setup.
            • How recovery is done should not depend on these variables, but on some other recovery-related variable that is easy to document and understand.
            • The real problem we have in MDEV-21117 is how a slave continues when a transaction GTID it has seen part of no longer exists. We have to fix this case anyway!

            People

              Assignee: Brandon Nesterenko (bnestere)
              Reporter: Andrei Elkin (Elkin)