[MDEV-5566] Change master to using relay log Created: 2014-01-25  Updated: 2021-12-28  Resolved: 2021-12-28

Status: Closed
Project: MariaDB Server
Component/s: Replication, Storage Engine - Spider
Fix Version/s: N/A

Type: Task Priority: Major
Reporter: VAROQUI Stephane Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: None

Issue Links:
Blocks
blocks MDEV-5562 LOR Cluster Closed
blocks MDEV-5570 Spider to promote slave on Master Fai... Closed

 Description   

When promoting a slave to master, we need to ensure that the slave gets the most up-to-date binlog events from all slaves in the cluster. Using SQL commands only, we must ensure that all slaves start from the same state.



 Comments   
Comment by Sergei Golubchik [ 2015-02-18 ]

could you elaborate on this, please?

Comment by VAROQUI Stephane [ 2015-02-18 ]

We need to implement a way to make each cluster node topology-aware, tracking for each node:

  • HA group
  • DSN
  • Role: Slave, Master, Standalone
  • Replication type (async, semi-sync, sync)
  • Candidate master

Nodes in a cluster could be instrumented in various ways, from plugins to external tools, so a system table is probably the best option here.
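As an illustration, the per-node properties listed above could be modeled like this (all names are hypothetical; this is not an existing MariaDB table or API):

```python
from dataclasses import dataclass

# Hypothetical per-node topology record, mirroring the properties above.
# Field names are illustrative only.
@dataclass
class NodeTopology:
    ha_group: str           # cluster / HA group name
    dsn: str                # how to reach the node
    role: str               # "master", "slave", or "standalone"
    repl_type: str          # "async", "semi-sync", or "sync"
    candidate_master: bool  # eligible for promotion on failover

node = NodeTopology("mycluster1", "node2:3306", "slave", "async", True)
```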

We should define a cluster manager plugin API that can provide the status of the nodes.

Some possible plugin implementations:

  • REST external cluster manager
  • Corosync plugin in each node
  • Wsrep plugin in each node
  • Spider XA monitoring

When the topology in the system table changes, we need to reconfigure replication automatically.

This requires finding the most up-to-date GTID in the cluster, fetching all subsequent GTIDs from that node, instrumenting the new master defined in the new topology, and waiting until that master has caught up to that GTID.
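As a sketch of the first step, assuming MariaDB's domain-server-sequence GTID format and a single replication domain, selecting the most up-to-date node could look like this (both helper functions are hypothetical, not MariaDB APIs):

```python
# Sketch: choose the most up-to-date node by MariaDB GTID position
# (domain-server-sequence). parse_gtid_pos and most_up_to_date are
# illustrative helpers, not actual server functions.

def parse_gtid_pos(pos):
    """Parse a GTID position like '0-1-42,1-2-7' into {domain: sequence}."""
    out = {}
    for gtid in pos.split(","):
        domain, _server_id, seq = (int(x) for x in gtid.split("-"))
        out[domain] = seq
    return out

def most_up_to_date(nodes):
    """nodes: {name: gtid_pos}. Single-domain simplification:
    the highest sequence number in domain 0 wins."""
    return max(nodes, key=lambda n: parse_gtid_pos(nodes[n]).get(0, 0))

best = most_up_to_date({"node1": "0-1-40", "node2": "0-1-42", "node3": "0-1-41"})
```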

I propose to start the task with an SQL command that implements replication failover based on the system table (extending the existing "server" table with a cluster name or HA group to mimic the Fabric concept, plus additional per-node status properties).

It is up to the cluster manager plugin or external tools to populate that table before the command is used.

MHA, or the MariaDB replication tools from Guillaume, currently have to do this manually: discover the replication topology, find the most up-to-date slave, and wait until each slave catches up with the promoted master.

MaxScale can populate such tables based on its monitoring plugin, and later trigger failover by invoking the command.

A first cluster manager plugin can be demonstrated on 3 nodes using the Spider storage engine.

One of the nodes is instrumented with:
cluster_manager_nodes=node1,node2,node3
cluster_manager_ha_group=mycluster1
cluster_manager_ha_group_mode=master-slaves-assync
cluster_manager_director=on
cluster_manager_candidate_master=off

All other nodes:
cluster_manager_nodes=node1,node2,node3
cluster_manager_ha_group=mycluster1
cluster_manager_ha_group_mode=master-slaves-assync
cluster_manager_director=off
cluster_manager_candidate_master=on

All nodes:
Create a dummy single-record heartbeat system table when the plugin is loaded.

The director creates a heartbeat_spider table linking to the heartbeat tables of every cluster node, replicated to all nodes in XA.

  • Spider starts to monitor the status of the heartbeat table
  • If the state of the connection to a heartbeat table changes, Spider changes its own spider_table, and the plugin can update the failover server system table on every remaining node and trigger failover on each of those nodes
  • Spider constantly changes the status of the spider_table entry for the old master, to check whether the old master is coming back to life (this should be improved as a native Spider feature, with an extra status value)
  • When the old master comes back to life, mark it as not accepting connections somehow; using a connection pool, we can set the pool size to 0
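The monitoring step described above can be sketched as a minimal check loop (illustrative only; read_heartbeat stands in for a SELECT through the heartbeat_spider link and is not an actual Spider API):

```python
# Sketch of the director's heartbeat check: probe each node's heartbeat
# table and report the ones that are unreachable. Names are hypothetical.

def check_heartbeats(read_heartbeat, nodes):
    """Return the set of nodes whose heartbeat table is unreachable."""
    failed = set()
    for node in nodes:
        try:
            read_heartbeat(node)  # e.g. a SELECT via the heartbeat_spider link
        except ConnectionError:
            failed.add(node)
    return failed

def reader(node):
    # Stub: simulate node1 being down.
    if node == "node1":
        raise ConnectionError
    return 1

failed = check_heartbeats(reader, ["node1", "node2", "node3"])
```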

We can later improve this with a rollback-from-GTID feature that rolls back the transactions following a given GTID by reversing the binlog row events based on the before image. That would enable reintroducing the old master, copying all rolled-back events to bin-log.lost files.
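A minimal sketch of what reversing row events from the before image could mean (the dict-based event representation is invented for illustration; real binlog row events are binary):

```python
# Sketch: invert a row event using its before/after images.
# INSERT becomes DELETE, DELETE becomes INSERT, UPDATE swaps its images.

def reverse_event(event):
    kind = event["type"]
    before, after = event.get("before"), event.get("after")
    if kind == "INSERT":
        return {"type": "DELETE", "before": after, "after": None}   # undo the insert
    if kind == "DELETE":
        return {"type": "INSERT", "before": None, "after": before}  # re-insert old row
    if kind == "UPDATE":
        return {"type": "UPDATE", "before": after, "after": before} # swap images
    raise ValueError(f"unknown event type: {kind}")

undo = reverse_event({"type": "UPDATE", "before": {"v": 1}, "after": {"v": 2}})
```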

Generated at Thu Feb 08 07:05:22 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.