Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Cannot Reproduce
    • 2.3.6
    • N/A
    • failover
    • None

    Description

      When a master has two slaves on delayed replication and the master goes down, MaxScale cannot promote a slave until the replication delay has expired, because of the replication lag.

      However, if the original master becomes available again before the slave delay expires, the master should automatically resume its previous role.

      Attachments

        Activity

          markus makela added a comment (edited)

          This doesn't seem to be a problem anymore:

          2021-08-25 13:08:56   error  : Monitor timed out when connecting to server server1[127.0.0.1:3000] : 'Lost connection to server at 'handshake: reading initial communication packet', system error: 110'
          2021-08-25 13:08:56   notice : Server changed state: server1[127.0.0.1:3000]: master_down. [Master, Running] -> [Down]
          2021-08-25 13:08:56   warning: [mariadbmon] Master has failed. If master does not return in 4 monitor tick(s), failover begins.
          2021-08-25 13:09:14   notice : [mariadbmon] Selecting a server to promote and replace 'server1'. Candidates are: 'server2', 'server3', 'server4'.
          2021-08-25 13:09:14   warning: [mariadbmon] Slave 'server2' has gtid_strict_mode disabled. Enabling this setting is recommended. For more information, see https://mariadb.com/kb/en/library/gtid/#gtid_strict_mode
          2021-08-25 13:09:14   warning: [mariadbmon] Slave 'server3' has gtid_strict_mode disabled. Enabling this setting is recommended. For more information, see https://mariadb.com/kb/en/library/gtid/#gtid_strict_mode
          2021-08-25 13:09:14   warning: [mariadbmon] Slave 'server4' has gtid_strict_mode disabled. Enabling this setting is recommended. For more information, see https://mariadb.com/kb/en/library/gtid/#gtid_strict_mode
          2021-08-25 13:09:14   notice : [mariadbmon] Selected 'server2'.
          2021-08-25 13:09:14   warning: [mariadbmon] The relay log of 'server2' has 5 unprocessed events (Gtid_IO_Pos: 0-3000-34, Gtid_Current_Pos: 0-3000-29). To avoid data loss, failover is postponed until the log has been processed.
          2021-08-25 13:09:14   warning: [mariadbmon] Not performing automatic failover. Will keep retrying with most error messages suppressed.
          2021-08-25 13:09:47   notice : Server changed state: server1[127.0.0.1:3000]: master_up. [Down] -> [Master, Running]
          

          This was tested with a replication delay of 3000 seconds to make sure the servers never catch up.
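
The "5 unprocessed events" figure in the relay log warning above can be reproduced from the two GTID positions it prints. The following is a minimal sketch, not MaxScale's actual implementation: the names `Gtid`, `parse_gtid`, and `unprocessed_events` are hypothetical, and it assumes both positions are single GTIDs in the same replication domain with contiguous sequence numbers.

```python
import re
from typing import NamedTuple


class Gtid(NamedTuple):
    """A single MariaDB GTID: domain-server_id-sequence."""
    domain: int
    server_id: int
    sequence: int


def parse_gtid(text: str) -> Gtid:
    """Parse a single GTID such as '0-3000-34' (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)-(\d+)-(\d+)", text.strip())
    if match is None:
        raise ValueError(f"not a single GTID: {text!r}")
    return Gtid(*(int(part) for part in match.groups()))


def unprocessed_events(io_pos: str, current_pos: str) -> int:
    """Estimate how many received events the slave has not applied yet.

    Assumption: within one domain, sequence numbers increase by one per
    event, so the difference approximates the relay log backlog.
    """
    received = parse_gtid(io_pos)
    applied = parse_gtid(current_pos)
    if received.domain != applied.domain:
        raise ValueError("positions are in different replication domains")
    return max(received.sequence - applied.sequence, 0)


# Values taken from the log line for 'server2' above.
print(unprocessed_events("0-3000-34", "0-3000-29"))  # prints 5
```

With a replication delay of 3000 seconds this backlog does not drain within the test window, which is consistent with the monitor postponing failover until the original master returned.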


          People

            Unassigned
            toddstoffel Todd Stoffel (Inactive)
            Votes: 0
            Watchers: 2
