  1. MariaDB MaxScale
  2. MXS-2884

MaxScale secondary master failover not working

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.4.6
    • Fix Version/s: N/A
    • Component/s: binlogrouter
    • Labels:
      None

      Description

      I've seen earlier tickets on this topic, but the problem remains unresolved, hence this new one. Even if the binlog router is going to be replaced, the replication slave in MaxScale still needs to be able to fail over to a secondary master.

      3-node Galera cluster with MariaDB 10.4.12. MaxScale 2.4.6 is configured as a binlog slave to the first Galera node, with the remaining two Galera nodes configured as secondary masters. The setup is configured to use GTID.

      Shutting down the primary master (the first Galera node) causes MaxScale to fail to continue replication from a secondary master, with the following error:

      Last_Error: Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.

      The GTID is, of course, there - all three Galera nodes (or two, after the one we use for replication is shut down) report the same set of two GTIDs.

      Manually stopping the slave, setting a secondary master as the primary one, and starting the slave again resumes replication - but the whole purpose of secondary masters is to provide transparent failover.

      Relevant part of Galera configuration:

      log-bin
      log_slave_updates=ON
      server_id=10
      wsrep_gtid_mode=ON
      wsrep_gtid_domain_id=20
      gtid_domain_id=21

      Note: gtid_domain_id is set to 22 and 23 on the second and third Galera members, respectively.

      MaxScale binlog slave configuration:

      [Replication]
      type=service
      router=binlogrouter
      user=repl
      password=pass1234
      server_id=21
      binlogdir=/var/lib/maxscale/
      mariadb10-compatibility=1
      mariadb10_master_gtid=1

      [Replication-Listener]
      type=listener
      service=Replication
      protocol=MariaDBClient
      port=3307

      Sequence of commands on MaxScale binlog slave prior to turning off the first Galera node:

      SET @@global.gtid_slave_pos = "20-10-7";
      CHANGE MASTER TO MASTER_HOST="172.20.104.17", MASTER_PORT=3306, MASTER_USER="repl", MASTER_PASSWORD="pass1234", MASTER_USE_GTID=slave_pos;
      START SLAVE;
      SHOW SLAVE STATUS\G

      STOP SLAVE;
      CHANGE MASTER ':2' TO MASTER_HOST="172.20.104.18", MASTER_PORT=3306, MASTER_USER="repl", MASTER_PASSWORD="pass1234", MASTER_USE_GTID=slave_pos;
      CHANGE MASTER ':3' TO MASTER_HOST="172.20.104.19", MASTER_PORT=3306, MASTER_USER="repl", MASTER_PASSWORD="pass1234", MASTER_USE_GTID=slave_pos;
      START SLAVE;
      SHOW SLAVE STATUS\G
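
The two secondary-master statements above follow a fixed pattern: a connection name of the form ':N', numbered from 2, with otherwise identical options. As a rough illustration only, the Python helper below (hypothetical, not part of MaxScale or of the original report) builds the same statements from a list of hosts. It does no quoting or escaping and is only meant to make the pattern explicit.

```python
from typing import List


def secondary_master_statements(
    hosts: List[str], user: str, password: str, port: int = 3306
) -> List[str]:
    """Build CHANGE MASTER ':N' statements for secondary masters.

    Hypothetical helper: N starts at 2, mirroring the ':2' and ':3'
    connection names used in the report. Values are inserted verbatim,
    so this is not safe for untrusted input.
    """
    statements = []
    for index, host in enumerate(hosts, start=2):
        statements.append(
            f"CHANGE MASTER ':{index}' TO MASTER_HOST=\"{host}\", "
            f"MASTER_PORT={port}, MASTER_USER=\"{user}\", "
            f"MASTER_PASSWORD=\"{password}\", MASTER_USE_GTID=slave_pos;"
        )
    return statements


# Hosts and credentials as they appear in the report.
for statement in secondary_master_statements(
    ["172.20.104.18", "172.20.104.19"], "repl", "pass1234"
):
    print(statement)
```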

      MaxScale report on Galera cluster status after the primary master was turned off (note that the GTIDs are there and sync'ed between the remaining two Galera nodes):

      [root@mariadb-59f24c1f-1007-0 ~]# maxctrl list servers
      ┌───────────────────────────┬───────────────┬──────┬─────────────┬─────────────────────────┬───────────────────┐
      │ Server                    │ Address       │ Port │ Connections │ State                   │ GTID              │
      ├───────────────────────────┼───────────────┼──────┼─────────────┼─────────────────────────┼───────────────────┤
      │ binlog_router_master_host │ none          │ 3306 │ 0           │ Running                 │                   │
      ├───────────────────────────┼───────────────┼──────┼─────────────┼─────────────────────────┼───────────────────┤
      │ 172.20.104.17             │ 172.20.104.17 │ 3306 │ 0           │ Down                    │                   │
      ├───────────────────────────┼───────────────┼──────┼─────────────┼─────────────────────────┼───────────────────┤
      │ 172.20.104.18             │ 172.20.104.18 │ 3306 │ 0           │ Master, Synced, Running │ 20-10-10,21-10-39 │
      ├───────────────────────────┼───────────────┼──────┼─────────────┼─────────────────────────┼───────────────────┤
      │ 172.20.104.19             │ 172.20.104.19 │ 3306 │ 0           │ Slave, Synced, Running  │ 20-10-10,21-10-39 │
      └───────────────────────────┴───────────────┴──────┴─────────────┴─────────────────────────┴───────────────────┘
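
The GTID values in the table can be compared mechanically. The Python sketch below is illustrative only and not part of the original report. It assumes the standard MariaDB GTID form domain_id-server_id-sequence, parses the two GTID lists, checks that the surviving nodes agree, and compares the initial slave position from the command sequence ("20-10-7") against them. The slave position at the moment of failure is not given in the report, so the initial value is used as a stand-in. The function names are hypothetical. The check compares sequence numbers only and cannot tell whether a node's binlog files actually contain the requested event, which is what the error message is about.

```python
from typing import Dict, Tuple


def parse_gtid_list(text: str) -> Dict[int, Tuple[int, int]]:
    """Parse a GTID list such as '20-10-10,21-10-39'.

    Returns {domain_id: (server_id, sequence)}.
    """
    result = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        domain, server, sequence = (int(part) for part in item.split("-"))
        result[domain] = (server, sequence)
    return result


def compare_position(slave_pos: str, master_pos: str) -> Dict[int, str]:
    """Describe, per domain in slave_pos, how it relates to master_pos."""
    slave = parse_gtid_list(slave_pos)
    master = parse_gtid_list(master_pos)
    report = {}
    for domain, (_, sequence) in slave.items():
        if domain not in master:
            report[domain] = "domain missing on master"
        elif sequence <= master[domain][1]:
            report[domain] = "at or behind master"
        else:
            report[domain] = "ahead of master"
    return report


# Values copied from the report.
node_18 = "20-10-10,21-10-39"
node_19 = "20-10-10,21-10-39"
initial_slave_pos = "20-10-7"

# Do the two surviving nodes report the same GTID set?
print(parse_gtid_list(node_18) == parse_gtid_list(node_19))
# How does the initial slave position relate to node 172.20.104.18?
print(compare_position(initial_slave_pos, node_18))
```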

            People

            Assignee: Unassigned
            Reporter: Assen Totin (assen.totin, inactive)
            Votes: 0
            Watchers: 3
