Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.0.28, 10.0.29, 10.1.21
-
debian
-
10.2.11
Description
Hello,
I have the following scenario with a slave running multisource and no GTID and at some point I want to enable GTID on that existing slave. So I start setting the gtid_domain_id flags on both masters and it slaves as follows:
master 1:
gtid_domain_id=12
version: 10.0.28
master 2:
gtid_domain_id=14
version: 10.0.28
slave1 which replicates from both masters with multi source
gtid_domain_id=10
version: 10.1.21
gtid disabled
Current status after setting gtid_domain_id on the three hosts:
MariaDB [gtid_10]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 12
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.12
|
Master_User: slave_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000061
|
Read_Master_Log_Pos: 2510
|
Relay_Log_File: mysqld-relay-bin-12.000002
|
Relay_Log_Pos: 538
|
Relay_Master_Log_File: mariadb-bin.000061
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 2510
|
Relay_Log_Space: 840
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 12
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos: 0-2-210669,12-12-24,14-14-1
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 29
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-1
|
*************************** 2. row ***************************
|
Connection_name: 14
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.14
|
Master_User: slave_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mysqld-bin.000063
|
Read_Master_Log_Pos: 500
|
Relay_Log_File: mysqld-relay-bin-14.000002
|
Relay_Log_Pos: 537
|
Relay_Master_Log_File: mysqld-bin.000063
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 500
|
Relay_Log_Space: 839
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 14
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos: 0-2-210669,12-12-24,14-14-1
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 15
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-1
|
2 rows in set (0.00 sec)
|
I keep inserting data and at some point I want to enable GTID, (slave_pos) and they crash replication:
MariaDB [gtid_10]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.00 sec)
|
|
MariaDB [gtid_10]> change master '12' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [gtid_10]> change master '14' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [gtid_10]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.02 sec)
|
|
MariaDB [gtid_10]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 12
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.12
|
Master_User: slave_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000061
|
Read_Master_Log_Pos: 2631
|
Relay_Log_File: mysqld-relay-bin-12.000002
|
Relay_Log_Pos: 762
|
Relay_Master_Log_File: mariadb-bin.000061
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 2631
|
Relay_Log_Space: 1064
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 12
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 0-2-210669,12-12-25,14-14-2
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 40
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-25,14-14-2
|
*************************** 2. row ***************************
|
Connection_name: 14
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.14
|
Master_User: slave_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mysqld-bin.000063
|
Read_Master_Log_Pos: 621
|
Relay_Log_File: mysqld-relay-bin-14.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mysqld-bin.000063
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 621
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 12-12-25, which is not in the master's binlog'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 14
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 0-2-210669,12-12-25,14-14-2
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 18
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-25,14-14-2
|
2 rows in set (0.00 sec)
|
This happens with both, current_pos and slave_pos.
If I go back to non GTID replication gets fixed and I can keep inserting data.
MariaDB [gtid_10]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.01 sec)
|
|
MariaDB [gtid_10]> change master '12' to master_use_gtid=no;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [gtid_10]> change master '14' to master_use_gtid=no;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [gtid_10]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.00 sec)
|
|
MariaDB [gtid_10]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 12
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.12
|
Master_User: slave_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000061
|
Read_Master_Log_Pos: 2631
|
Relay_Log_File: mysqld-relay-bin-12.000002
|
Relay_Log_Pos: 538
|
Relay_Master_Log_File: mariadb-bin.000061
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 2631
|
Relay_Log_Space: 840
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 12
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 51
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-2
|
*************************** 2. row ***************************
|
Connection_name: 14
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.14
|
Master_User: slave_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mysqld-bin.000063
|
Read_Master_Log_Pos: 621
|
Relay_Log_File: mysqld-relay-bin-14.000002
|
Relay_Log_Pos: 537
|
Relay_Master_Log_File: mysqld-bin.000063
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 621
|
Relay_Log_Space: 839
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 14
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 29
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-2
|
2 rows in set (0.00 sec)
|
And I insert data on the masters and they get replicated:
MariaDB [gtid_10]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 12
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.12
|
Master_User: slave_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000061
|
Read_Master_Log_Pos: 2748
|
Relay_Log_File: mysqld-relay-bin-12.000002
|
Relay_Log_Pos: 655
|
Relay_Master_Log_File: mariadb-bin.000061
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 2748
|
Relay_Log_Space: 957
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 12
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 53
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-26,14-14-3
|
*************************** 2. row ***************************
|
Connection_name: 14
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.14
|
Master_User: slave_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mysqld-bin.000063
|
Read_Master_Log_Pos: 748
|
Relay_Log_File: mysqld-relay-bin-14.000002
|
Relay_Log_Pos: 664
|
Relay_Master_Log_File: mysqld-bin.000063
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 748
|
Relay_Log_Space: 966
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 14
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 31
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-26,14-14-3
|
Can this be related?: https://mariadb.atlassian.net/browse/MDEV-4485
Attachments
Issue Links
- duplicates
-
MDEV-11969 Can't remove GTIDs for a stale GTID Domain ID
-
- Closed
-
- relates to
-
MDEV-9108 "GTID not in master's binlog" error with {ignore|do}_domain_ids
-
- Open
-
Activity
Field | Original Value | New Value |
---|---|---|
Description |
Hello,
I have the following scenario with a slave running multisource and no GTID and at some point I want to enable GTID on that existing slave. So I start setting the gtid_domain_id flags on both masters and it slaves as follows: master 1: gtid_domain_id=12 version: 10.0.28 master 2: gtid_domain_id=14 version: 10.0.28 slave1 which replicates from both masters with multi source gtid_domain_id=10 version: 10.0.21 gtid disabled Current status after setting gtid_domain_id on the three hosts: {code} MariaDB [gtid_10]> show all slaves status\G *************************** 1. row *************************** Connection_name: 12 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.12 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000061 Read_Master_Log_Pos: 2510 Relay_Log_File: mysqld-relay-bin-12.000002 Relay_Log_Pos: 538 Relay_Master_Log_File: mariadb-bin.000061 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2510 Relay_Log_Space: 840 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 12 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-2-210669,12-12-24,14-14-1 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 29 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-1 *************************** 2. row *************************** Connection_name: 14 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.14 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysqld-bin.000063 Read_Master_Log_Pos: 500 Relay_Log_File: mysqld-relay-bin-14.000002 Relay_Log_Pos: 537 Relay_Master_Log_File: mysqld-bin.000063 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 500 Relay_Log_Space: 839 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 14 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-2-210669,12-12-24,14-14-1 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 15 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-1 2 rows in set (0.00 sec) {code} I keep inserting data and at some point I want to enable GTID, (slave_pos) and they crash replication: {code} MariaDB [gtid_10]> stop all slaves; Query OK, 0 rows affected, 2 warnings (0.00 sec) MariaDB [gtid_10]> change master '12' to master_use_gtid=slave_pos; Query OK, 0 rows affected (0.01 sec) MariaDB [gtid_10]> change master '14' to master_use_gtid=slave_pos; Query OK, 0 rows affected (0.01 sec) MariaDB [gtid_10]> start all slaves; Query OK, 0 rows affected, 2 warnings (0.02 sec) MariaDB [gtid_10]> show all slaves status\G *************************** 1. row *************************** Connection_name: 12 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.12 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000061 Read_Master_Log_Pos: 2631 Relay_Log_File: mysqld-relay-bin-12.000002 Relay_Log_Pos: 762 Relay_Master_Log_File: mariadb-bin.000061 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2631 Relay_Log_Space: 1064 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 12 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: Slave_Pos Gtid_IO_Pos: 0-2-210669,12-12-25,14-14-2 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 40 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-25,14-14-2 *************************** 2. row *************************** Connection_name: 14 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Master_Host: 192.168.56.14 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysqld-bin.000063 Read_Master_Log_Pos: 621 Relay_Log_File: mysqld-relay-bin-14.000001 Relay_Log_Pos: 4 Relay_Master_Log_File: mysqld-bin.000063 Slave_IO_Running: No Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 621 Relay_Log_Space: 249 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 1236 Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 12-12-25, which is not in the master's binlog' Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 14 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: Slave_Pos Gtid_IO_Pos: 0-2-210669,12-12-25,14-14-2 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 18 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-25,14-14-2 2 rows in set (0.00 sec) {code} This happens with both, current_pos and slave_pos. If I go back to non GTID replication gets fixed and I can keep inserting data. {code} MariaDB [gtid_10]> stop all slaves; Query OK, 0 rows affected, 2 warnings (0.01 sec) MariaDB [gtid_10]> change master '12' to master_use_gtid=no; Query OK, 0 rows affected (0.00 sec) MariaDB [gtid_10]> change master '14' to master_use_gtid=no; Query OK, 0 rows affected (0.01 sec) MariaDB [gtid_10]> start all slaves; Query OK, 0 rows affected, 2 warnings (0.00 sec) MariaDB [gtid_10]> show all slaves status\G *************************** 1. row *************************** Connection_name: 12 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.12 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000061 Read_Master_Log_Pos: 2631 Relay_Log_File: mysqld-relay-bin-12.000002 Relay_Log_Pos: 538 Relay_Master_Log_File: mariadb-bin.000061 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2631 Relay_Log_Space: 840 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 12 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 51 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-2 *************************** 2. row *************************** Connection_name: 14 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.14 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysqld-bin.000063 Read_Master_Log_Pos: 621 Relay_Log_File: mysqld-relay-bin-14.000002 Relay_Log_Pos: 537 Relay_Master_Log_File: mysqld-bin.000063 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 621 Relay_Log_Space: 839 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 14 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 29 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-2 2 rows in set (0.00 sec) {code} And I insert data on the masters and they get replicated: {code} MariaDB [gtid_10]> show all slaves status\G *************************** 1. row *************************** Connection_name: 12 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.12 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000061 Read_Master_Log_Pos: 2748 Relay_Log_File: mysqld-relay-bin-12.000002 Relay_Log_Pos: 655 Relay_Master_Log_File: mariadb-bin.000061 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2748 Relay_Log_Space: 957 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 12 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 53 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-26,14-14-3 *************************** 2. row *************************** Connection_name: 14 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.14 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysqld-bin.000063 Read_Master_Log_Pos: 748 Relay_Log_File: mysqld-relay-bin-14.000002 Relay_Log_Pos: 664 Relay_Master_Log_File: mysqld-bin.000063 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 748 Relay_Log_Space: 966 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 14 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 31 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-26,14-14-3 {code} Can this be related?: https://mariadb.atlassian.net/browse/MDEV-4485 |
Hello,
I have the following scenario with a slave running multisource and no GTID and at some point I want to enable GTID on that existing slave. So I start setting the gtid_domain_id flags on both masters and it slaves as follows: master 1: gtid_domain_id=12 version: 10.0.28 master 2: gtid_domain_id=14 version: 10.0.28 slave1 which replicates from both masters with multi source gtid_domain_id=10 version: 10.1.21 gtid disabled Current status after setting gtid_domain_id on the three hosts: {code} MariaDB [gtid_10]> show all slaves status\G *************************** 1. row *************************** Connection_name: 12 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.12 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000061 Read_Master_Log_Pos: 2510 Relay_Log_File: mysqld-relay-bin-12.000002 Relay_Log_Pos: 538 Relay_Master_Log_File: mariadb-bin.000061 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2510 Relay_Log_Space: 840 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 12 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-2-210669,12-12-24,14-14-1 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 29 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-1 *************************** 2. row *************************** Connection_name: 14 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.14 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysqld-bin.000063 Read_Master_Log_Pos: 500 Relay_Log_File: mysqld-relay-bin-14.000002 Relay_Log_Pos: 537 Relay_Master_Log_File: mysqld-bin.000063 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 500 Relay_Log_Space: 839 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 14 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-2-210669,12-12-24,14-14-1 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 15 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-1 2 rows in set (0.00 sec) {code} I keep inserting data and at some point I want to enable GTID, (slave_pos) and they crash replication: {code} MariaDB [gtid_10]> stop all slaves; Query OK, 0 rows affected, 2 warnings (0.00 sec) MariaDB [gtid_10]> change master '12' to master_use_gtid=slave_pos; Query OK, 0 rows affected (0.01 sec) MariaDB [gtid_10]> change master '14' to master_use_gtid=slave_pos; Query OK, 0 rows affected (0.01 sec) MariaDB [gtid_10]> start all slaves; Query OK, 0 rows affected, 2 warnings (0.02 sec) MariaDB [gtid_10]> show all slaves status\G *************************** 1. row *************************** Connection_name: 12 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.12 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000061 Read_Master_Log_Pos: 2631 Relay_Log_File: mysqld-relay-bin-12.000002 Relay_Log_Pos: 762 Relay_Master_Log_File: mariadb-bin.000061 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2631 Relay_Log_Space: 1064 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 12 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: Slave_Pos Gtid_IO_Pos: 0-2-210669,12-12-25,14-14-2 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 40 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-25,14-14-2 *************************** 2. row *************************** Connection_name: 14 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Master_Host: 192.168.56.14 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysqld-bin.000063 Read_Master_Log_Pos: 621 Relay_Log_File: mysqld-relay-bin-14.000001 Relay_Log_Pos: 4 Relay_Master_Log_File: mysqld-bin.000063 Slave_IO_Running: No Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 621 Relay_Log_Space: 249 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 1236 Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 12-12-25, which is not in the master's binlog' Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 14 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: Slave_Pos Gtid_IO_Pos: 0-2-210669,12-12-25,14-14-2 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 18 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-25,14-14-2 2 rows in set (0.00 sec) {code} This happens with both, current_pos and slave_pos. If I go back to non GTID replication gets fixed and I can keep inserting data. {code} MariaDB [gtid_10]> stop all slaves; Query OK, 0 rows affected, 2 warnings (0.01 sec) MariaDB [gtid_10]> change master '12' to master_use_gtid=no; Query OK, 0 rows affected (0.00 sec) MariaDB [gtid_10]> change master '14' to master_use_gtid=no; Query OK, 0 rows affected (0.01 sec) MariaDB [gtid_10]> start all slaves; Query OK, 0 rows affected, 2 warnings (0.00 sec) MariaDB [gtid_10]> show all slaves status\G *************************** 1. row *************************** Connection_name: 12 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.12 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000061 Read_Master_Log_Pos: 2631 Relay_Log_File: mysqld-relay-bin-12.000002 Relay_Log_Pos: 538 Relay_Master_Log_File: mariadb-bin.000061 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2631 Relay_Log_Space: 840 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 12 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 51 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-2 *************************** 2. row *************************** Connection_name: 14 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.14 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysqld-bin.000063 Read_Master_Log_Pos: 621 Relay_Log_File: mysqld-relay-bin-14.000002 Relay_Log_Pos: 537 Relay_Master_Log_File: mysqld-bin.000063 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 621 Relay_Log_Space: 839 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 14 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 29 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-2 2 rows in set (0.00 sec) {code} And I insert data on the masters and they get replicated: {code} MariaDB [gtid_10]> show all slaves status\G *************************** 1. row *************************** Connection_name: 12 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.12 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000061 Read_Master_Log_Pos: 2748 Relay_Log_File: mysqld-relay-bin-12.000002 Relay_Log_Pos: 655 Relay_Master_Log_File: mariadb-bin.000061 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2748 Relay_Log_Space: 957 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 12 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 53 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-26,14-14-3 *************************** 2. row *************************** Connection_name: 14 Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it Slave_IO_State: Waiting for master to send event Master_Host: 192.168.56.14 Master_User: slave_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysqld-bin.000063 Read_Master_Log_Pos: 748 Relay_Log_File: mysqld-relay-bin-14.000002 Relay_Log_Pos: 664 Relay_Master_Log_File: mysqld-bin.000063 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 748 Relay_Log_Space: 966 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 14 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 0-10-210671,12-2-23,14-14-2,10-10-17 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative Retried_transactions: 0 Max_relay_log_size: 104857600 Executed_log_entries: 31 Slave_received_heartbeats: 0 Slave_heartbeat_period: 1800.000 Gtid_Slave_Pos: 0-10-210671,10-10-17,12-12-26,14-14-3 {code} Can this be related?: https://mariadb.atlassian.net/browse/MDEV-4485 |
I have done the same test with:
Master1 - 10.1.21
Master2 - 10.1.19
Slave - 10.1.19
- Three servers without gtid_domain_id set
- No GTID enabled on the slave
- Replication works without any issue
- Enable gtid_domain_id on the three hosts
- Replication keeps flowing fine
- Slave:
stop all slaves;
- Slave:
change master '1' to master_use_gtid=slave_pos; change master '2' to master_user_gtid=slave_pos; start all slaves;
Replication breaks
The setup in the initial report shows that there was more history there.
The first paste "Current status after setting gtid_domain_id on the three hosts" is supposed to have been taken before GTID replication had been enabled; yet, it says
Gtid_IO_Pos: 0-2-210669,12-12-24,14-14-1
|
How would it happen? It shouldn't have Gtid_IO_Pos if it has never been enabled, right?
Also, the same paste shows
Gtid_Slave_Pos: 0-10-210671,10-10-17,12-2-23,14-14-1
|
So, there was at some point a server with gtid_domain_id=12 and server_id=2.
I can't say anything about the test in the last comment, as there are no logs or output. Did you perform it on the same database, just after upgrading servers, or was it an entirely new setup?
We might be missing something important, as I couldn't so far reproduce it on a clean setup.
Could you please take any of the setups where you observe the failure, and in addition to SHOW ALL SLAVES STATUS also paste the output of
select * from mysql.gtid_slave_pos; |
show variables like '%gtid%'; |
from all servers, the slave and both masters?
Labels | gtid mariadb multisource | gtid mariadb multisource need_feedback |
Hi Elena!
Thanks for your answer.
So these hosts, some of them were upgraded and some others were fresh, just like it'd happen with a production environment. So it is possible that I could've had a master with gtid_domain_id=12 and server_id=2 indeed.
The current status is that the slave (as you will see below) is using server_id=2 for the master that actually has server_id=14)
I believe the reason why Gtid_IO_Pos is showing there is just because I am able to enable one of the replication threads with GTID, but as soon as the second one comes up, it breaks. Sometimes both break, sometimes one of them break.
This is the output positions might not be the same as the initial post as I have been trying different things (and GTID is now disable - I can try to enable them back if needed).
Again, this is a test with pretty much what can be found in production.
- slaves being upgraded through different versions
- slaves and masters not having gtid_domain_id enabled from the start
- GTID not being enabled from start on all the hosts
This is the output
slave:
root@localhost[(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 12
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.12
|
Master_User: slave_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000063
|
Read_Master_Log_Pos: 566
|
Relay_Log_File: mysqld-relay-bin-12.000012
|
Relay_Log_Pos: 538
|
Relay_Master_Log_File: mariadb-bin.000063
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 566
|
Relay_Log_Space: 1447
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 12
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 32
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-10-210671,10-10-20,12-2-23,14-2-6
|
*************************** 2. row ***************************
|
Connection_name: 14
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.14
|
Master_User: slave_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mysqld-bin.000066
|
Read_Master_Log_Pos: 836
|
Relay_Log_File: mysqld-relay-bin-14.000015
|
Relay_Log_Pos: 680
|
Relay_Master_Log_File: mysqld-bin.000066
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 836
|
Relay_Log_Space: 1270
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 38
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-10-210671,10-10-20,12-2-23,14-2-6
|
2 rows in set (0.00 sec)
|
|
root@localhost[(none)]> select * from mysql.gtid_slave_pos;
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 66 | 2 | 210669 |
|
| 0 | 68 | 2 | 210669 |
|
| 12 | 67 | 12 | 28 |
|
| 12 | 70 | 2 | 23 |
|
| 14 | 69 | 2 | 5 |
|
| 14 | 71 | 2 | 6 |
|
+-----------+--------+-----------+--------+
|
6 rows in set (0.00 sec)
|
|
root@localhost[(none)]> show variables like '%gtid%';
|
+------------------------+-------------------------------------------------------------------------+
|
| Variable_name | Value |
|
+------------------------+-------------------------------------------------------------------------+
|
| gtid_binlog_pos | 0-10-210671,10-10-20,12-12-28,14-2-6 |
|
| gtid_binlog_state | 0-1-788,0-2-210669,0-10-210671,10-10-20,12-2-27,12-12-28,14-14-4,14-2-6 |
|
| gtid_current_pos | 0-10-210671,10-10-20,12-2-23,14-2-6 |
|
| gtid_domain_id | 10 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | 0-2-210669,12-2-23,14-2-6 |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------------------------------------------------------------------------+
|
11 rows in set (0.00 sec)
|
master1:
MariaDB [(none)]> show all slaves status\G
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 143 | 1 | 3426 |
|
| 0 | 144 | 1 | 3427 |
|
+-----------+--------+-----------+--------+
|
2 rows in set (0.00 sec)
|
|
MariaDB [(none)]> show variables like '%gtid%';
|
+------------------------+-----------------------------------------------+
|
| Variable_name | Value |
|
+------------------------+-----------------------------------------------+
|
| gtid_binlog_pos | 0-2-210669,12-12-28 |
|
| gtid_binlog_state | 0-1-3427,0-3-3426,0-2-210669,12-2-27,12-12-28 |
|
| gtid_current_pos | 0-1-3427,12-12-28 |
|
| gtid_domain_id | 12 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | 0-1-3427 |
|
| gtid_strict_mode | OFF |
|
| last_gtid | 12-12-28 |
|
+------------------------+-----------------------------------------------+
|
9 rows in set (0.00 sec)
|
MariaDB [(none)]> show slave hosts;
|
+-----------+------+------+-----------+
|
| Server_id | Host | Port | Master_id |
|
+-----------+------+------+-----------+
|
| 10 | | 3306 | 12 |
|
+-----------+------+------+-----------+
|
1 row in set (0.00 sec)
|
master2:
MariaDB [(none)]> show all slaves status\G
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 16 | 2 | 210668 |
|
| 0 | 17 | 2 | 210669 |
|
| 12 | 39 | 2 | 22 |
|
| 12 | 40 | 2 | 23 |
|
+-----------+--------+-----------+--------+
|
4 rows in set (0.00 sec)
|
|
MariaDB [(none)]> show variables like '%gtid%';
|
+------------------------+--------------------------------------------------------+
|
| Variable_name | Value |
|
+------------------------+--------------------------------------------------------+
|
| gtid_binlog_pos | 0-2-210669,12-2-23,14-2-6 |
|
| gtid_binlog_state | 0-1-4922,0-14-210663,0-2-210669,12-2-23,14-14-4,14-2-6 |
|
| gtid_current_pos | 0-2-210669,12-2-23,14-2-6 |
|
| gtid_domain_id | 14 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | 0-2-210669,12-2-23 |
|
| gtid_strict_mode | OFF |
|
| last_gtid | 14-2-6 |
|
+------------------------+--------------------------------------------------------+
|
9 rows in set (0.00 sec)
|
|
MariaDB [(none)]> show slave hosts;
|
+-----------+------+------+-----------+
|
| Server_id | Host | Port | Master_id |
|
+-----------+------+------+-----------+
|
| 10 | | 3306 | 2 |
|
+-----------+------+------+-----------+
|
1 row in set (0.00 sec)
|
If you believe the server_id on the replication thread being different from what it actually is can play a role on the GTID mess up can try to reset that replication thread and see what happens after enabling GTID=slave_pos
Thanks for taking a look into this.
I noticed the whole thing is a bit messed up with all the tests I have been doing. So I am going to set up three servers completely fresh, insert a few things without gtid_domain_id, and then enable it and after that enable GTID just to make sure all the tests are not making this even harder to debug.
This is the new setup completely fresh
master1
MariaDB [(none)]> select @@version;
|
+--------------------------+
|
| @@version |
|
+--------------------------+
|
| 10.0.29-MariaDB-1~jessie |
|
+--------------------------+
|
MariaDB [(none)]> show global variables like 'server_id'; show global variables like 'gtid_domain_id';
|
+---------------+-------+
|
| Variable_name | Value |
|
+---------------+-------+
|
| server_id | 1 |
|
+---------------+-------+
|
1 row in set (0.00 sec)
|
|
+----------------+-------+
|
| Variable_name | Value |
|
+----------------+-------+
|
| gtid_domain_id | 0 |
|
+----------------+-------+
|
1 row in set (0.00 sec)
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> show variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
master2
MariaDB [(none)]> select @@version;
|
+--------------------------+
|
| @@version |
|
+--------------------------+
|
| 10.0.29-MariaDB-1~jessie |
|
+--------------------------+
|
MariaDB [(none)]> show global variables like 'server_id'; show global variables like 'gtid_domain_id';
|
+---------------+-------+
|
| Variable_name | Value |
|
+---------------+-------+
|
| server_id | 2 |
|
+---------------+-------+
|
1 row in set (0.00 sec)
|
|
+----------------+-------+
|
| Variable_name | Value |
|
+----------------+-------+
|
| gtid_domain_id | 0 |
|
+----------------+-------+
|
1 row in set (0.00 sec)
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> show variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
slave1
MariaDB [(none)]> select @@version;
|
+--------------------------+
|
| @@version |
|
+--------------------------+
|
| 10.1.21-MariaDB-1~jessie |
|
+--------------------------+
|
1 row in set (0.00 sec)
|
MariaDB [(none)]> show global variables like 'server_id'; show global variables like 'gtid_domain_id';
|
+---------------+-------+
|
| Variable_name | Value |
|
+---------------+-------+
|
| server_id | 3 |
|
+---------------+-------+
|
1 row in set (0.00 sec)
|
|
+----------------+-------+
|
| Variable_name | Value |
|
+----------------+-------+
|
| gtid_domain_id | 0 |
|
+----------------+-------+
|
1 row in set (0.01 sec)
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> show variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
11 rows in set (0.00 sec)
|
Replication configured without inserting anything, gtid_domain_id remains unset (0) and GTID disbled:
master1
MariaDB [(none)]> show all slaves status\G
|
Empty set (0.00 sec)
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> show variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | 0-1-2 |
|
| gtid_binlog_state | 0-1-2 |
|
| gtid_current_pos | 0-1-2 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
master2
MariaDB [(none)]> show all slaves status\G
|
Empty set (0.00 sec)
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> show variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | 0-2-2 |
|
| gtid_binlog_state | 0-2-2 |
|
| gtid_current_pos | 0-2-2 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
slave1
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 629
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 538
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 629
|
Relay_Log_Space: 839
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 5
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-2
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000003
|
Read_Master_Log_Pos: 328
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 538
|
Relay_Master_Log_File: mariadb-bin.000003
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 328
|
Relay_Log_Space: 839
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 5
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-2
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 1 | 1 | 2 |
|
| 0 | 2 | 2 | 2 |
|
+-----------+--------+-----------+--------+
|
2 rows in set (0.00 sec)
|
|
MariaDB [(none)]> show variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | 0-2-2 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | 0-2-2 |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
11 rows in set (0.00 sec)
|
Now we proceed to insert some data on the masters and replication works fine:
master1
MariaDB [(none)]> show all slaves status\G select * from mysql.gtid_slave_pos; show variables like '%gtid%';
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+--------+
|
| Variable_name | Value |
|
+------------------------+--------+
|
| gtid_binlog_pos | 0-1-10 |
|
| gtid_binlog_state | 0-1-10 |
|
| gtid_current_pos | 0-1-10 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
+------------------------+--------+
|
9 rows in set (0.00 sec)
|
master2
MariaDB [(none)]> show all slaves status\G select * from mysql.gtid_slave_pos; show variables like '%gtid%';
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | 0-2-7 |
|
| gtid_binlog_state | 0-2-7 |
|
| gtid_current_pos | 0-2-7 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
slave
MariaDB [(none)]> show all slaves status\G select * from mysql.gtid_slave_pos; show variables like '%gtid%';
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 1685
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 1594
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 1685
|
Relay_Log_Space: 1895
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 21
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-7
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000003
|
Read_Master_Log_Pos: 977
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 1187
|
Relay_Master_Log_File: mariadb-bin.000003
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 977
|
Relay_Log_Space: 1488
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 15
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-7
|
2 rows in set (0.00 sec)
|
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 14 | 2 | 6 |
|
| 0 | 15 | 2 | 7 |
|
+-----------+--------+-----------+--------+
|
2 rows in set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | 0-2-7 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | 0-2-7 |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
11 rows in set (0.00 sec)
|
Now I enable gtid_domain_id as follows:
slave
MariaDB [(none)]> show global variables like 'server_id';
|
+---------------+-------+
|
| Variable_name | Value |
|
+---------------+-------+
|
| server_id | 3 |
|
+---------------+-------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> set global gtid_domain_id=3;
|
Query OK, 0 rows affected (0.00 sec)
|
MariaDB [(none)]> select @@hostname;
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
insert data on masters - replication keeps working fine:
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 1822
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 1731
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 1822
|
Relay_Log_Space: 2032
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 23
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-8
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000003
|
Read_Master_Log_Pos: 1124
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 1334
|
Relay_Master_Log_File: mariadb-bin.000003
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 1124
|
Relay_Log_Space: 1635
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 17
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-8
|
2 rows in set (0.00 sec)
|
master1
MariaDB [(none)]> select @@hostname;
|
+------------+
|
| @@hostname |
|
+------------+
|
| master1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> show global variables like 'server_id';
|
+---------------+-------+
|
| Variable_name | Value |
|
+---------------+-------+
|
| server_id | 1 |
|
+---------------+-------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> set global gtid_domain_id=1;
|
Query OK, 0 rows affected (0.00 sec)
|
insert data on masters - replication keeps working fine and we can see that the Gitd_slave_pos has changed for both replication threads with the new gtid_domain_id = 1
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 2072
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 1981
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 2072
|
Relay_Log_Space: 2282
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 27
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-9,1-1-2
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000003
|
Read_Master_Log_Pos: 1247
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 1457
|
Relay_Master_Log_File: mariadb-bin.000003
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 1247
|
Relay_Log_Space: 1758
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 19
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-9,1-1-2
|
2 rows in set (0.00 sec)
|
master1
MariaDB [(none)]> select * from mysql.gtid_slave_pos; show variables like '%gtid%';
|
Empty set (0.00 sec)
|
|
+------------------------+--------------+
|
| Variable_name | Value |
|
+------------------------+--------------+
|
| gtid_binlog_pos | 1-1-2,0-1-11 |
|
| gtid_binlog_state | 1-1-2,0-1-11 |
|
| gtid_current_pos | 1-1-2,0-1-11 |
|
| gtid_domain_id | 1 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
+------------------------+--------------+
|
9 rows in set (0.00 sec)
|
master2
MariaDB [(none)]> select * from mysql.gtid_slave_pos; show variables like '%gtid%';
|
Empty set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | 0-2-9 |
|
| gtid_binlog_state | 0-2-9 |
|
| gtid_current_pos | 0-2-9 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
slave
MariaDB [(none)]> select * from mysql.gtid_slave_pos; show variables like '%gtid%';
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 17 | 2 | 8 |
|
| 0 | 20 | 2 | 9 |
|
| 1 | 18 | 1 | 1 |
|
| 1 | 19 | 1 | 2 |
|
+-----------+--------+-----------+--------+
|
4 rows in set (0.00 sec)
|
|
+------------------------+-------------+
|
| Variable_name | Value |
|
+------------------------+-------------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | 0-2-9,1-1-2 |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | 0-2-9,1-1-2 |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------------+
|
11 rows in set (0.00 sec)
|
|
And now we change gtid_domain_id on the master2
master 2
MariaDB [(none)]> select @@hostname;
|
+------------+
|
| @@hostname |
|
+------------+
|
| master2 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> show global variables like 'server_id';
|
+---------------+-------+
|
| Variable_name | Value |
|
+---------------+-------+
|
| server_id | 2 |
|
+---------------+-------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> set global gtid_domain_id=2;
|
Query OK, 0 rows affected (0.00 sec)
|
insert data on masters and replication is fine:
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 2336
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 2245
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 2336
|
Relay_Log_Space: 2546
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 31
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-10,1-1-4,2-2-1
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000003
|
Read_Master_Log_Pos: 1549
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 1759
|
Relay_Master_Log_File: mariadb-bin.000003
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 1549
|
Relay_Log_Space: 2060
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 23
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-10,1-1-4,2-2-1
|
2 rows in set (0.00 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos; show variables like '%gtid%';
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 20 | 2 | 9 |
|
| 0 | 21 | 2 | 10 |
|
| 1 | 22 | 1 | 3 |
|
| 1 | 24 | 1 | 4 |
|
| 2 | 23 | 2 | 1 |
|
+-----------+--------+-----------+--------+
|
5 rows in set (0.00 sec)
|
|
+------------------------+--------------------+
|
| Variable_name | Value |
|
+------------------------+--------------------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | 0-2-10,1-1-4,2-2-1 |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | 0-2-10,1-1-4,2-2-1 |
|
| gtid_strict_mode | OFF |
|
| last_gtid | |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+--------------------+
|
11 rows in set (0.00 sec)
|
master1
MariaDB [(none)]> select * from mysql.gtid_slave_pos; show variables like '%gtid%';
|
Empty set (0.00 sec)
|
|
+------------------------+--------------+
|
| Variable_name | Value |
|
+------------------------+--------------+
|
| gtid_binlog_pos | 1-1-4,0-1-11 |
|
| gtid_binlog_state | 1-1-4,0-1-11 |
|
| gtid_current_pos | 1-1-4,0-1-11 |
|
| gtid_domain_id | 1 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | 1-1-4 |
|
+------------------------+--------------+
|
9 rows in set (0.00 sec)
|
master2
MariaDB [(none)]> select * from mysql.gtid_slave_pos; show variables like '%gtid%';
|
Empty set (0.00 sec)
|
|
+------------------------+--------------+
|
| Variable_name | Value |
|
+------------------------+--------------+
|
| gtid_binlog_pos | 0-2-10,2-2-1 |
|
| gtid_binlog_state | 0-2-10,2-2-1 |
|
| gtid_current_pos | 0-2-10,2-2-1 |
|
| gtid_domain_id | 2 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_seq_no | 0 |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| last_gtid | 2-2-1 |
|
+------------------------+--------------+
|
9 rows in set (0.00 sec)
|
And now, let's stop all the slaves and enable GTID and replication breaks without even inserting data
MariaDB [(none)]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.00 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.01 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 2336
|
Relay_Log_File: mysqld-relay-bin-1.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 2336
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 0-2-10, which is not in the master's binlog. Since the master's binlog contains GTIDs with higher sequence numbers, it probably means that the slave has diverged due to executing extra erroneous transactions'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-4,0-2-10,2-2-1
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 32
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-10,1-1-4,2-2-1
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000003
|
Read_Master_Log_Pos: 1549
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 712
|
Relay_Master_Log_File: mariadb-bin.000003
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 1549
|
Relay_Log_Space: 1013
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-4,0-2-10,2-2-1
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 32
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-10,1-1-4,2-2-1
|
2 rows in set (0.00 sec)
|
As a side note, the order I chose to enable gtid_domain_id was:
slave
master1
master2
The reason for that is because in a big environment, using Puppet, you cannot be sure in which order it will be enabled and thus that was a random choice.
Labels | gtid mariadb multisource need_feedback | gtid mariadb multisource |
Thanks a lot for this thorough demonstration, now it's quite clear what is happening (assuming that I interpret it correctly).
At the beginning, since both masters are running with domain_id 0, the slave considers it a single replication stream.
We can see how the slave only has 0-2-x, although master1 has 0-1-x and master2 has 0-2-x.
When the slave is switched to using GTID replication, it sends the position to each master. Although domain IDs on masters are now set to unique values, they of course stay aware of the old 0 domain they used to use, so they are trying to find the position 0-2-x which corresponds to this domain. But only one of the servers has it, the other one returns an error.
I think one way to overcome the problem is to adjust the slave position manually before switching to using GTID-based replication.
That is, assuming that you've ensured that all old-zero-domain events from both servers have been replicated by this time, after you stopped slaves and before you changed them to using GTID, you delete zero domain from mysql.gtid_slave_pos and from @@global.gtid_slave_pos:
stop all slaves; |
|
delete from mysql.gtid_slave_pos where domain_id=0; |
|
select @@global.gtid_slave_pos; |
# Let's say it's 0-2-10,1-1-4,2-2-1 as in the data above |
set global gtid_slave_pos= '1-1-4,2-2-1'; |
|
change master '1' to master_use_gtid=slave_pos; |
change master '2' to master_use_gtid=slave_pos; |
start all slaves; |
It might produce a warning about missing domain "3", but it will be adjusted automatically.
I think it should help to get it up and running, although it's a bit cumbersome.
I'm not sure whether it's a bug.
knielsen, what do you think?
Also, maybe you can suggest a more elegant way to perform this transition?
Labels | gtid mariadb multisource | gtid mariadb multisource need_feedback |
I agree with Elena. When replicating with GTID, the master must have either all of the domain in the binlog, or none of the domain. Changing gtid_domain_id on a master at some point does not change the binlog of the past.
RESET MASTER on the masters will remove the old events with wrong domain id, though that may be inconvenient to do on a production setup as it loses all binlogs on the masters.
Thanks Elena for your answer. I appreciate it.
First of all, I have tried your approach with two different scenarios.
Scenario #1 in which we have already broken the slave (as per above) and try to fix it with your workaround and scenario #2 in which we are aware of this limitation and we try to switch to GTID before it breaks
Scenario #1
Let's assume we have the slave already broken from my previous test and we try to fix it by changing the slave_pos as you suggested. Obviously, the masters have kept inserting data meanwhile:
MariaDB [(none)]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.00 sec)
|
|
MariaDB [(none)]> select @@global.gtid_slave_pos;
|
+-------------------------+
|
| @@global.gtid_slave_pos |
|
+-------------------------+
|
| 0-2-10,1-1-4,2-2-6 |
|
+-------------------------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> delete from mysql.gtid_slave_pos where domain_id=0;
|
Query OK, 2 rows affected (0.00 sec)
|
|
MariaDB [(none)]> set global gtid_slave_pos='1-1-4,2-2-6';
|
Query OK, 0 rows affected, 1 warning (0.04 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.02 sec)
|
And replicaiton breaks because the slaves tries to execute veeeeeeery old stuff. From the start of the binlogs.
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State:
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 2953
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 919
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 1007
|
Last_Error: Error 'Can't create database 'lala'; database exists' on query. Default database: 'lala'. Query: 'create database lala'
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 629
|
Relay_Log_Space: 3085
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 1007
|
Last_SQL_Error: Error 'Can't create database 'lala'; database exists' on query. Default database: 'lala'. Query: 'create database lala'
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-9,0-1-11,2-2-6
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 46
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-2,1-1-4,2-2-6,3-3-1
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State:
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000003
|
Read_Master_Log_Pos: 2164
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 604
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 1396
|
Last_Error: Error 'Operation CREATE USER failed for 'replication'@'%'' on query. Default database: ''. Query: 'CREATE USER 'replication'@'%' IDENTIFIED BY 'password''
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 314
|
Relay_Log_Space: 3011
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 1396
|
Last_SQL_Error: Error 'Operation CREATE USER failed for 'replication'@'%'' on query. Default database: ''. Query: 'CREATE USER 'replication'@'%' IDENTIFIED BY 'password''
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-4,0-2-10,2-2-6
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 60
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-2,1-1-4,2-2-6,3-3-1
|
2 rows in set (0.00 sec)
|
|
Scenario #2
Let's assume the slave was up-to-date so the gtid_domain_id was changed in the two masters and the slave has replicated all events coming from them after the gtid_domain_id was set in all the servers.
So:
slave1;
MariaDB [(none)]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.01 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State:
|
Slave_IO_State:
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 1375
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 1599
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: No
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 1375
|
Relay_Log_Space: 1900
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 23
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-7,1-1-5,2-2-3
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State:
|
Slave_IO_State:
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 1496
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 1720
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: No
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 1496
|
Relay_Log_Space: 2021
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 25
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-7,1-1-5,2-2-3
|
2 rows in set (0.00 sec)
|
Meanwhile, the masters keep inserting data as they'd normally do.
Let's try the workaround you suggested:
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 10 | 2 | 6 |
|
| 0 | 12 | 2 | 7 |
|
| 1 | 17 | 1 | 4 |
|
| 1 | 18 | 1 | 5 |
|
| 2 | 15 | 2 | 2 |
|
| 2 | 19 | 2 | 3 |
|
+-----------+--------+-----------+--------+
|
6 rows in set (0.00 sec)
|
MariaDB [(none)]> delete from mysql.gtid_slave_pos where domain_id=0;
|
Query OK, 2 rows affected (0.01 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 1 | 17 | 1 | 4 |
|
| 1 | 18 | 1 | 5 |
|
| 2 | 15 | 2 | 2 |
|
| 2 | 19 | 2 | 3 |
|
+-----------+--------+-----------+--------+
|
4 rows in set (0.00 sec)
|
MariaDB [(none)]> select @@global.gtid_slave_pos;
|
+-------------------------+
|
| @@global.gtid_slave_pos |
|
+-------------------------+
|
| 0-2-7,1-1-5,2-2-3 |
|
+-------------------------+
|
1 row in set (0.00 sec)
|
MariaDB [(none)]> set global gtid_slave_pos='1-1-5,2-2-3';
|
Query OK, 0 rows affected, 1 warning (0.06 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.00 sec)
|
And after starting it it breaks horribly because it even tries to go thru the binlog again and tries to write stuff that was already in there as it can it be seen below. Same issue than with scenario #1
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.01 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State:
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 1732
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 604
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 1007
|
Last_Error: Error 'Can't create database 'd1'; database exists' on query. Default database: 'd1'. Query: 'create database d1'
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 314
|
Relay_Log_Space: 1785
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 1007
|
Last_SQL_Error: Error 'Can't create database 'd1'; database exists' on query. Default database: 'd1'. Query: 'create database d1'
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-8,0-1-4,2-2-3
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 32
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-3-8,1-1-5,2-2-3
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State:
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 1853
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 604
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 1007
|
Last_Error: Error 'Can't create database 'd2'; database exists' on query. Default database: 'd2'. Query: 'create database d2'
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 314
|
Relay_Log_Space: 2142
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 1007
|
Last_SQL_Error: Error 'Can't create database 'd2'; database exists' on query. Default database: 'd2'. Query: 'create database d2'
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-5,0-2-7,2-2-6
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 34
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-3-8,1-1-5,2-2-3
|
2 rows in set (0.00 sec)
|
To be honest, I do consider this a bug, or at least a serious limitation as it doesn't work as expected (specially keeping in mind that the documentation doesn't talk about existing multisource slaves breaking when trying to set up gtid_domain_id+GTID). If this is the expected behaviour it should be stated there as a big warning:
I consider this (if it is the expected behaviour) a serious limitation because it forces you to either:
- not use GTID with multisource slaves
or - Making sure that you set up all the stuff from the start (gtid_domain_id) before setting up any sort of multisource replication, otherwise it will break. Which again, is a massive stop for existing environments where you'd like to take advantage of multisource (ie: reduce hardware) and then the safety of GTID on top of it.
Even if the suggested workaround would work (maybe we are doing something wrong and it really works?) it requires to stop the slave, which might be a no-go for some production environments. And it should be specified on the documentation.
But doing a RESET MASTER is probably a no-go for the 95% of the production environments I would say.
Whatever the fix is, if it cannot be done live (the documentation suggests that gtid_domain_id can be changed on the fly, which is true, but swiching to GTID will break anyways so the whole point of gtid_domain_id is gone), it needs to be specified on the documentation.
I am happy to keep debugging this issue to see if we can reach a workaround can work
Thanks again!
Manuel.
Labels | gtid mariadb multisource need_feedback | gtid mariadb multisource |
Hello,
I have done another test.
- Multisource replication without gtid_domain_id on the masters and without GTID
- data being written
- enabled gtid_domain_id on the masters
- data being written.
- I flush the logs on the masters to get rid of the old binlogs which still have 0 as a domain_id and then purge then
MariaDB [(none)]> show master status\G
*************************** 1. row ***************************
File: mariadb-bin.000001
Position: 1760
Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)
MariaDB [(none)]> flush binary logs;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> show master status\G
*************************** 1. row ***************************
File: mariadb-bin.000002
Position: 544
Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)
MariaDB [(none)]> show binary logs;
+--------------------+-----------+
| Log_name | File_size |
+--------------------+-----------+
| mariadb-bin.000002 | 544 |
+--------------------+-----------+
1 row in set (0.00 sec)
Same thing done on both masters
Data keeps flowing and the slave looks good and using the new binlogs
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 705
|
Relay_Log_File: mysqld-relay-bin-1.000005
|
Relay_Log_Pos: 995
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 705
|
Relay_Log_Space: 1341
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 45
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-3,1-1-9,2-2-9
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 709
|
Relay_Log_File: mysqld-relay-bin-2.000005
|
Relay_Log_Pos: 999
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 709
|
Relay_Log_Space: 1345
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 43
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-3,1-1-9,2-2-9
|
2 rows in set (0.00 sec)
|
So let's try to change to GTID:
MariaDB [(none)]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.00 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State:
|
Slave_IO_State:
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 705
|
Relay_Log_File: mysqld-relay-bin-1.000005
|
Relay_Log_Pos: 995
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: No
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 705
|
Relay_Log_Space: 1341
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 45
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-3,1-1-9,2-2-9
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State:
|
Slave_IO_State:
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 709
|
Relay_Log_File: mysqld-relay-bin-2.000005
|
Relay_Log_Pos: 999
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: No
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 709
|
Relay_Log_Space: 1345
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 43
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-3,1-1-9,2-2-9
|
2 rows in set (0.00 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 15 | 2 | 3 |
|
| 0 | 16 | 1 | 3 |
|
| 1 | 26 | 1 | 8 |
|
| 1 | 28 | 1 | 9 |
|
| 2 | 27 | 2 | 8 |
|
| 2 | 29 | 2 | 9 |
|
+-----------+--------+-----------+--------+
|
6 rows in set (0.00 sec)
|
|
MariaDB [(none)]> select @@gtid_slave_pos;
|
+-------------------+
|
| @@gtid_slave_pos |
|
+-------------------+
|
| 0-1-3,1-1-9,2-2-9 |
|
+-------------------+
|
1 row in set (0.00 sec)
|
|
|
Now let's try the work around you suggested:
MariaDB [(none)]> delete from mysql.gtid_slave_pos where domain_id=0;
|
Query OK, 2 rows affected (0.00 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 1 | 26 | 1 | 8 |
|
| 1 | 28 | 1 | 9 |
|
| 2 | 27 | 2 | 8 |
|
| 2 | 29 | 2 | 9 |
|
+-----------+--------+-----------+--------+
|
4 rows in set (0.00 sec)
|
|
MariaDB [(none)]> set global gtid_slave_pos = '1-1-9,2-2-9';
|
Query OK, 0 rows affected, 1 warning (0.06 sec)
|
|
MariaDB [(none)]> show warnings;
|
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
| Level | Code | Message |
|
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
| Warning | 1948 | Specified value for @@gtid_slave_pos contains no value for replication domain 0. This conflicts with the binary log which contains GTID 0-3-5. If MASTER_GTID_POS=CURRENT_POS is used, the binlog position will override the new value of @@gtid_slave_pos. |
|
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.03 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 705
|
Relay_Log_File: mysqld-relay-bin-1.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 705
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-9,2-2-9
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 46
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-3-4,1-1-9,2-2-9
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 709
|
Relay_Log_File: mysqld-relay-bin-2.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 709
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-9,2-2-9
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 44
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-3-4,1-1-9,2-2-9
|
2 rows in set (0.00 sec)
|
So it looks like it is trying to download other binlogs.
Let's try to reset the slave all and reconfigure it:
MariaDB [(none)]> reset slave '1' all;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> reset slave '2' all;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 1 | 32 | 1 | 9 |
|
| 2 | 33 | 2 | 9 |
|
+-----------+--------+-----------+--------+
|
2 rows in set (0.00 sec)
|
|
MariaDB [(none)]> delete from mysql.gtid_slave_pos;
|
Query OK, 2 rows affected (0.01 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> change master '2' to master_host='192.168.56.22', master_user='replication', master_password='xx', master_log_pos=709, master_log_file='mariadb-bin.000002';
|
Query OK, 0 rows affected (0.02 sec)
|
|
MariaDB [(none)]> change master '1' to master_host='192.168.56.21', master_user='replication', master_password='xx', master_log_pos=705, master_log_file='mariadb-bin.000002';
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=current_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=current_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.02 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 705
|
Relay_Log_File: mysqld-relay-bin-1.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 705
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 0-3-6, which is not in the master's binlog'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Current_Pos
|
Gtid_IO_Pos: 1-1-9,0-3-6,2-2-9
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 1
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-3-6,1-1-9,2-2-9
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 709
|
Relay_Log_File: mysqld-relay-bin-2.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 709
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 0-3-6, which is not in the master's binlog'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Current_Pos
|
Gtid_IO_Pos: 1-1-9,0-3-6,2-2-9
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 1
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-3-6,1-1-9,2-2-9
|
2 rows in set (0.00 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
MariaDB [(none)]> show global variables like '%gtid%';
|
+------------------------+-------------------+
|
| Variable_name | Value |
|
+------------------------+-------------------+
|
| gtid_binlog_pos | 0-3-6 |
|
| gtid_binlog_state | 0-3-6 |
|
| gtid_current_pos | 0-3-6,1-1-9,2-2-9 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | 1-1-9,2-2-9 |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------------------+
|
9 rows in set (0.00 sec)
|
|
After that, let's try a step further:
MariaDB [(none)]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.00 sec)
|
|
MariaDB [(none)]> reset slave '1' all;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> reset slave '2' all;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> delete from mysql.gtid_slave_pos;
|
Query OK, 0 rows affected (0.00 sec)
|
|
root@slave1:/var/log/mysql# /etc/init.d/mysql stop
|
[ ok ] Stopping mysql (via systemctl): mysql.service.
|
|
root@slave1:/var/log/mysql# ls
|
mariadb-bin.000001 mariadb-bin.000002 mariadb-bin.index mariadb-bin.state
|
root@slave1:/var/log/mysql# rm -fr *
|
root@slave1:/var/log/mysql# rm -fr /var/lib/mysql/multi-master.info
|
|
MariaDB [(none)]> show global variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> change master '2' to master_host='192.168.56.22', master_user='replication', master_password='xx', master_log_pos=709, master_log_file='mariadb-bin.000002';
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '1' to master_host='192.168.56.21', master_user='replication', master_password='xx', master_log_pos=705, master_log_file='mariadb-bin.000002';
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
|
MariaDB [(none)]> show global variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.02 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 705
|
Relay_Log_File: mysqld-relay-bin-1.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 705
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 2
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos:
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 709
|
Relay_Log_File: mysqld-relay-bin-2.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 709
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 2
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos:
|
2 rows in set (0.00 sec)
|
Obviously the masters binlogs are there:
MariaDB [(none)]> select @@hostname;
|
+------------+
|
| @@hostname |
|
+------------+
|
| master1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> show master status\G
|
*************************** 1. row ***************************
|
File: mariadb-bin.000002
|
Position: 705
|
Binlog_Do_DB:
|
Binlog_Ignore_DB:
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]>
|
MariaDB [(none)]> select @@hostname;
|
+------------+
|
| @@hostname |
|
+------------+
|
| master2 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> show master status\G
|
*************************** 1. row ***************************
|
File: mariadb-bin.000002
|
Position: 709
|
Binlog_Do_DB:
|
Binlog_Ignore_DB:
|
1 row in set (0.00 sec)
|
|
If I start non GTID replication it works
MariaDB [(none)]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.00 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=no;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=no;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.00 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 705
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 538
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 705
|
Relay_Log_Space: 839
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 7
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-3,1-1-9,2-2-9
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000002
|
Read_Master_Log_Pos: 709
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 538
|
Relay_Master_Log_File: mariadb-bin.000002
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 709
|
Relay_Log_Space: 839
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 7
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-3,1-1-9,2-2-9
|
Hope this helps
I tried the workaround with manually adjusting gtid_slave_pos before suggesting it and it worked, but I absolutely believe that it only helps in very specific (maybe overly-simplified) cases and not in real-life situations. I also do agree that it's a real problem, when I expressed my doubt whether it's a bug, I didn't mean it's not important, only that it's probably not expected to work in scope of current design.
I'll assign it to monty, maybe he can think of a way to either make it work, or at least to facilitate the transition from pre-GTID multi-source to GTID-based multi-source.
Fix Version/s | 10.1 [ 16100 ] | |
Assignee | Michael Widenius [ monty ] | |
Priority | Blocker [ 1 ] | Critical [ 2 ] |
Thanks Elena! If there is something else I can do (or test) just let me know. Happy to help here
Thanks again!
It looks to me like there is a flaw in Multi-Source + GTID design:
- Without GTID, each source maintains its own position in master.info and relay-log.info.
- With GTID, all those are merged in the mysql.gtid_slave_pos table loosing the connection_name data.
So when a multi-source slave connects to a master with master_use_gtid=slave_pos, it sends the GTIDs from all the sources to the master when it should only have sent the GTIDs of the current source.
To me, it also looks like a muti-source slave connecting to a master with master_use_gtid=current_pos does not make sense. This feature is made for reconnecting an old (failed) master to a promoted slave (new master). This makes little sense in a multi-source deployment (a multi-source slave should never be promoted as a new master).
Hello,
I have been doing more investigations and this might be useful:
Before setting up the replication this is the situation on all the hosts:
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show slave status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| master1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
+------------------------+-------+
|
7 rows in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
|
|
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show slave status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| master2 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
+------------------------+-------+
|
7 rows in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
|
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show slave status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
Let's set up multisource replication:
MariaDB [(none)]> change master '1' to master_host='192.168.56.21', master_user='replication', master_password='password', master_log_pos=314, master_log_file='mariadb-bin.000001';
|
Query OK, 0 rows affected (0.03 sec)
|
|
MariaDB [(none)]> change master '2' to master_host='192.168.56.22', master_user='replication', master_password='password', master_log_pos=314, master_log_file='mariadb-bin.000001';
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.00 sec)
|
|
MariaDB [(none)]> show warnings;
|
+-------+------+-------------------+
|
| Level | Code | Message |
|
+-------+------+-------------------+
|
| Note | 1937 | SLAVE '1' started |
|
| Note | 1937 | SLAVE '2' started |
|
+-------+------+-------------------+
|
2 rows in set (0.00 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 314
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 538
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 314
|
Relay_Log_Space: 839
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 5
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos:
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 314
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 538
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 314
|
Relay_Log_Space: 839
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 5
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos:
|
2 rows in set (0.00 sec)
|
Now I generate some writes on each master and rotating logs on the maters too.
We can see that we now have some more coordenates on the slave:
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show all slaves status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+--------+
|
| Variable_name | Value |
|
+------------------------+--------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | 0-2-35 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | 0-2-35 |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+--------+
|
9 rows in set (0.00 sec)
|
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 63 | 2 | 34 |
|
| 0 | 64 | 2 | 35 |
|
+-----------+--------+-----------+--------+
|
2 rows in set (0.00 sec)
|
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000003
|
Read_Master_Log_Pos: 3649
|
Relay_Log_File: mysqld-relay-bin-1.000006
|
Relay_Log_Pos: 3939
|
Relay_Master_Log_File: mariadb-bin.000003
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 3649
|
Relay_Log_Space: 5111
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 83
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-35
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000006
|
Read_Master_Log_Pos: 3310
|
Relay_Log_File: mysqld-relay-bin-2.000012
|
Relay_Log_Pos: 3600
|
Relay_Master_Log_File: mariadb-bin.000006
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 3310
|
Relay_Log_Space: 5238
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 125
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-35
|
2 rows in set (0.00 sec)
|
And of course on the masters:
MariaDB [(none)]> select @@hostname; show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show slave status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| master1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
+------------------------+--------+
|
| Variable_name | Value |
|
+------------------------+--------+
|
| gtid_binlog_pos | 0-1-29 |
|
| gtid_binlog_state | 0-1-29 |
|
| gtid_current_pos | 0-1-29 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
+------------------------+--------+
|
7 rows in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
|
|
MariaDB [(none)]> select @@hostname; show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show slave status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| master2 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
+------------------------+--------+
|
| Variable_name | Value |
|
+------------------------+--------+
|
| gtid_binlog_pos | 0-2-35 |
|
| gtid_binlog_state | 0-2-35 |
|
| gtid_current_pos | 0-2-35 |
|
| gtid_domain_id | 0 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
+------------------------+--------+
|
7 rows in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
Let's go ahead and enable gtid_domain_id on all the hosts:
MariaDB [(none)]> select @@hostname;
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> show global variables like 'server_id';
|
+---------------+-------+
|
| Variable_name | Value |
|
+---------------+-------+
|
| server_id | 3 |
|
+---------------+-------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> set global gtid_domain_id=3;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> show global variables like 'gtid_domain_id';
|
+----------------+-------+
|
| Variable_name | Value |
|
+----------------+-------+
|
| gtid_domain_id | 3 |
|
+----------------+-------+
|
1 row in set (0.00 sec)
|
|
|
MariaDB [(none)]> select @@hostname;
|
+------------+
|
| @@hostname |
|
+------------+
|
| master1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> show global variables like 'server_id';
|
+---------------+-------+
|
| Variable_name | Value |
|
+---------------+-------+
|
| server_id | 1 |
|
+---------------+-------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> set global gtid_domain_id=1;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> show global variables like 'gtid_domain_id';
|
+----------------+-------+
|
| Variable_name | Value |
|
+----------------+-------+
|
| gtid_domain_id | 1 |
|
+----------------+-------+
|
1 row in set (0.00 sec)
|
|
|
|
MariaDB [(none)]> select @@hostname;
|
+------------+
|
| @@hostname |
|
+------------+
|
| master2 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> show global variables like 'server_id';
|
+---------------+-------+
|
| Variable_name | Value |
|
+---------------+-------+
|
| server_id | 2 |
|
+---------------+-------+
|
1 row in set (0.00 sec)
|
|
MariaDB [(none)]> set global gtid_domain_id=2;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> show global variables like 'gtid_domain_id';
|
+----------------+-------+
|
| Variable_name | Value |
|
+----------------+-------+
|
| gtid_domain_id | 2 |
|
+----------------+-------+
|
1 row in set (0.00 sec)
|
`
Let's generate more writes, flush logs and so forth and we can see the slaves has picked up the new coordinates:
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show all slaves status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+----------------------+
|
| Variable_name | Value |
|
+------------------------+----------------------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | 0-2-20,1-1-18,2-2-16 |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | 0-2-20,1-1-18,2-2-16 |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+----------------------+
|
9 rows in set (0.00 sec)
|
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 43 | 2 | 19 |
|
| 0 | 44 | 2 | 20 |
|
| 1 | 77 | 1 | 17 |
|
| 1 | 78 | 1 | 18 |
|
| 2 | 71 | 2 | 15 |
|
| 2 | 72 | 2 | 16 |
|
+-----------+--------+-----------+--------+
|
6 rows in set (0.00 sec)
|
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000019
|
Read_Master_Log_Pos: 887
|
Relay_Log_File: mysqld-relay-bin-1.000038
|
Relay_Log_Pos: 1177
|
Relay_Master_Log_File: mariadb-bin.000019
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 887
|
Relay_Log_Space: 2375
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 337
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-20,1-1-18,2-2-16
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000012
|
Read_Master_Log_Pos: 1014
|
Relay_Log_File: mysqld-relay-bin-2.000024
|
Relay_Log_Pos: 1304
|
Relay_Master_Log_File: mariadb-bin.000012
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 1014
|
Relay_Log_Space: 2502
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 231
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-20,1-1-18,2-2-16
|
2 rows in set (0.00 sec)
|
Now, let's enable and see how it breaks:
MariaDB [(none)]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.00 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.01 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000019
|
Read_Master_Log_Pos: 887
|
Relay_Log_File: mysqld-relay-bin-1.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mariadb-bin.000019
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 887
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 0-2-20, which is not in the master's binlog. Since the master's binlog contains GTIDs with higher sequence numbers, it probably means that the slave has diverged due to executing extra erroneous transactions'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-18,0-2-20,2-2-16
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 338
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-20,1-1-18,2-2-16
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000012
|
Read_Master_Log_Pos: 1014
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 714
|
Relay_Master_Log_File: mariadb-bin.000012
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 1014
|
Relay_Log_Space: 1015
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-18,0-2-20,2-2-16
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 240
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-20,1-1-18,2-2-16
|
Let's see the error coming from the channel '1'
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 0-2-20, which is not in the master's binlog. Since the master's binlog contains GTIDs with higher sequence numbers, it probably means that the slave has diverged due to executing extra erroneous transactions'
|
The error makes sense, as there is no way that that binlog contains position "0-2-20" as it is a position from the master2 when it had gtid_domain_id=0.
As we can see in the earlier outputs (server_id=2 and seq_no=20):
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 43 | 2 | 19 |
|
| 0 | 44 | 2 | 20 |
|
| 1 | 77 | 1 | 17 |
|
| 1 | 78 | 1 | 18 |
|
| 2 | 71 | 2 | 15 |
|
| 2 | 72 | 2 | 16 |
|
+-----------+--------+-----------+--------+
|
Unsuccessful first attempt to fix replication, resetting masters and slaves and starting replication again:
MariaDB [mysql]> reset master;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [mysql]> show master status\G
|
*************************** 1. row ***************************
|
File: mariadb-bin.000001
|
Position: 314
|
Binlog_Do_DB:
|
Binlog_Ignore_DB:
|
1 row in set (0.00 sec)
|
|
MariaDB [mysql]> show global variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 1 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
+------------------------+-------+
|
7 rows in set (0.00 sec)
|
|
|
MariaDB [(none)]> reset master;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> show global variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 2 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
+------------------------+-------+
|
7 rows in set (0.00 sec)
|
Now let's reset the slave:
`
|
MariaDB [(none)]> reset slave '1' all;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> reset slave '2' all;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> delete from mysql.gtid_slave_pos;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
Now let's configure replication again directly with gtid:
MariaDB [(none)]> change master '2' to master_host='192.168.56.22', master_user='replication', master_password='password', master_log_pos=314, master_log_file='mariadb-bin.000001';
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '1' to master_host='192.168.56.21', master_user='replication', master_password='password', master_log_pos=314, master_log_file='mariadb-bin.000001';
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.02 sec)
|
|
MariaDB [(none)]> select * from mysql.gtid_slave_pos;
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 314
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 604
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 314
|
Relay_Log_Space: 905
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-18,0-2-20,2-2-16
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 7
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-20,1-1-18,2-2-16,3-3-2
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 314
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 604
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 314
|
Relay_Log_Space: 905
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-18,0-2-20,2-2-16
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 7
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-20,1-1-18,2-2-16,3-3-2
|
2 rows in set (0.00 sec)
|
If we check the `Gtid_Slave_Pos` it is still pointing to the OLD transactions. Even though it has been reset, the mysql.gtid_slave_pos table is empty and the multi-master.info is empty.
Let's write on both masters just one transaction on each:
MariaDB [mysql]> show global variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | 1-1-1 |
|
| gtid_binlog_state | 1-1-1 |
|
| gtid_current_pos | 1-1-1 |
|
| gtid_domain_id | 1 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
+------------------------+-------+
|
7 rows in set (0.00 sec)
|
|
|
|
|
MariaDB [(none)]> show global variables like '%gtid%';
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | 2-2-1 |
|
| gtid_binlog_state | 2-2-1 |
|
| gtid_current_pos | 2-2-1 |
|
| gtid_domain_id | 2 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
+------------------------+-------+
|
7 rows in set (0.00 sec)
|
|
|
And replication breaks on both slaves, because they are attempting to look for OLD transactions
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 314
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 604
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 314
|
Relay_Log_Space: 905
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 1-1-18, which is not in the master's binlog'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-18,0-2-20,2-2-16
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 7
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-20,1-1-18,2-2-16,3-3-2
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State:
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 314
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 604
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 314
|
Relay_Log_Space: 905
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 2-2-16, which is not in the master's binlog'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-18,0-2-20,2-2-16
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 7
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-2-20,1-1-18,2-2-16,3-3-2
|
Looks like there is some sort of cache (or something) that requires a restart to get cleaned so when we set up the replication threads after resetting them we do not get the same old position? I am able to only get a completely first start if I delete the file: mariadb.state.
Otherwise I cannot:
root@slave1:/var/log/mysql# /etc/init.d/mysql stop
|
[ ok ] Stopping mysql (via systemctl): mysql.service.
|
root@slave1:/var/log/mysql# ls
|
mariadb-bin.000001 mariadb-bin.000002 mariadb-bin.index mariadb-bin.state
|
root@slave1:/var/log/mysql# rm mariadb-bin.00000* mariadb-bin.index
|
root@slave1:/var/log/mysql# /etc/init.d/mysql start
|
[ ok ] Starting mysql (via systemctl): mysql.service.
|
root@slave1:/var/log/mysql# mysql
|
Welcome to the MariaDB monitor. Commands end with ; or \g.
|
Your MariaDB connection id is 3
|
Server version: 10.1.21-MariaDB-1~jessie mariadb.org binary distribution
|
|
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
|
|
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
|
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show all slaves status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | 3-3-1 |
|
| gtid_binlog_state | 3-3-1 |
|
| gtid_current_pos | 3-3-1 |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> Ctrl-C -- exit!
|
Aborted
|
root@slave1:/var/log/mysql# mysql
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show all slaves status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | 3-3-1 |
|
| gtid_binlog_state | 3-3-1 |
|
| gtid_current_pos | 3-3-1 |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
Now let's remove it:
root@slave1:/var/log/mysql# /etc/init.d/mysql stop
|
[ ok ] Stopping mysql (via systemctl): mysql.service.
|
root@slave1:/var/log/mysql# ls
|
mariadb-bin.000001 mariadb-bin.index mariadb-bin.state
|
root@slave1:/var/log/mysql# rm mariadb-bin.state
|
root@slave1:/var/log/mysql# /etc/init.d/mysql start
|
[ ok ] Starting mysql (via systemctl): mysql.service.
|
root@slave1:/var/log/mysql# mysql
|
Welcome to the MariaDB monitor. Commands end with ; or \g.
|
Your MariaDB connection id is 3
|
Server version: 10.1.21-MariaDB-1~jessie mariadb.org binary distribution
|
|
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
|
|
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
|
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show all slaves status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | 3-3-1 |
|
| gtid_binlog_state | 3-3-1 |
|
| gtid_current_pos | 3-3-1 |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
So let's go back from a fresh start and remove all the replication related files, including mariadb.state file that file just before we reenable GTID, so BEFORE it crashes:
So basically I reset all the masters, slave, delete all the logs, relay logs and set gtid_domain_id to 0 on all the servers, disable gtid, stop the servers etc to go back to the original issue.
So we are at the point where gtid_domain_id has been enabled on all hosts, but GTID remains disabled. Instead of setting up GTID (as it will break) let's clean up all the stuff first
So this is what I do:
Let's first clean up all the replication threads
MariaDB [(none)]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.01 sec)
|
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State:
|
Slave_IO_State:
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000014
|
Read_Master_Log_Pos: 4639
|
Relay_Log_File: mysqld-relay-bin-1.000028
|
Relay_Log_Pos: 4929
|
Relay_Master_Log_File: mariadb-bin.000014
|
Slave_IO_Running: No
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 4639
|
Relay_Log_Space: 6002
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 335
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-9,1-1-42,2-2-66
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State:
|
Slave_IO_State:
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000006
|
Read_Master_Log_Pos: 4887
|
Relay_Log_File: mysqld-relay-bin-2.000012
|
Relay_Log_Pos: 5177
|
Relay_Master_Log_File: mariadb-bin.000006
|
Slave_IO_Running: No
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 4887
|
Relay_Log_Space: 10752
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 339
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-9,1-1-42,2-2-66
|
2 rows in set (0.00 sec)
|
|
MariaDB [(none)]> reset slave '1' all;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> reset slave '2' all;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
Empty set (0.00 sec)
|
|
Even though we have reseted the slaves, the old position still remains there, even if we delete it from the table (as we have seen previously)
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show all slaves status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+---------------------+
|
| Variable_name | Value |
|
+------------------------+---------------------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | 0-1-9,1-1-42,2-2-66 |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | 0-1-9,1-1-42,2-2-66 |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+---------------------+
|
9 rows in set (0.00 sec)
|
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 14 | 1 | 8 |
|
| 0 | 15 | 1 | 9 |
|
| 1 | 122 | 1 | 41 |
|
| 1 | 123 | 1 | 42 |
|
| 2 | 101 | 2 | 65 |
|
| 2 | 102 | 2 | 66 |
|
+-----------+--------+-----------+--------+
|
6 rows in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> delete from mysql.gtid_slave_pos;
|
Query OK, 6 rows affected (0.00 sec)
|
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show all slaves status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+---------------------------+
|
| Variable_name | Value |
|
+------------------------+---------------------------+
|
| gtid_binlog_pos | 3-3-1 |
|
| gtid_binlog_state | 3-3-1 |
|
| gtid_current_pos | 0-1-9,1-1-42,2-2-66,3-3-1 |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | 0-1-9,1-1-42,2-2-66 |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+---------------------------+
|
9 rows in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> Ctrl-C -- exit!
|
Aborted
|
|
So let's try the hard way:
|
root@slave1:/var/lib/mysql# /etc/init.d/mysql stop
|
[ ok ] Stopping mysql (via systemctl): mysql.service.
|
root@slave1:/var/lib/mysql# rm multi-master.info
|
root@slave1:/var/lib/mysql# cd /var/log/mysql/
|
root@slave1:/var/log/mysql# ls
|
mariadb-bin.000001 mariadb-bin.index mariadb-bin.state
|
root@slave1:/var/log/mysql# rm -fr *
|
root@slave1:/var/log/mysql# /etc/init.d/mysql restart
|
[ ok ] Restarting mysql (via systemctl): mysql.service.
|
root@slave1:/var/log/mysql# mysql
|
Welcome to the MariaDB monitor. Commands end with ; or \g.
|
Your MariaDB connection id is 3
|
Server version: 10.1.21-MariaDB-1~jessie mariadb.org binary distribution
|
|
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
|
|
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
|
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show all slaves status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
As it looks good now, let's configure replication in the point where it was left and enable GTID all at once:
|
MariaDB [(none)]> change master '1' to master_host='192.168.56.21', master_user='replication', master_password='password', master_log_pos=4639, master_log_file='mariadb-bin.000014';
|
Query OK, 0 rows affected (0.02 sec)
|
|
MariaDB [(none)]> change master '2' to master_host='192.168.56.22', master_user='replication', master_password='password', master_log_pos=4887, master_log_file='mariadb-bin.000006';
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State:
|
Slave_IO_State:
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000014
|
Read_Master_Log_Pos: 4639
|
Relay_Log_File: mysqld-relay-bin-1.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mariadb-bin.000014
|
Slave_IO_Running: No
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 4639
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 0
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 0
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos:
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State:
|
Slave_IO_State:
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000006
|
Read_Master_Log_Pos: 4887
|
Relay_Log_File: mysqld-relay-bin-2.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: mariadb-bin.000006
|
Slave_IO_Running: No
|
Slave_SQL_Running: No
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 4887
|
Relay_Log_Space: 249
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 0
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 0
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos:
|
2 rows in set (0.00 sec)
|
|
Let's start the slaves and see if they catch up
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.01 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000014
|
Read_Master_Log_Pos: 8893
|
Relay_Log_File: mysqld-relay-bin-1.000020
|
Relay_Log_Pos: 9183
|
Relay_Master_Log_File: mariadb-bin.000014
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 8893
|
Relay_Log_Space: 10256
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-63,0-1-9
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 382
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-9,1-1-63,2-2-88
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000006
|
Read_Master_Log_Pos: 9389
|
Relay_Log_File: mysqld-relay-bin-2.000012
|
Relay_Log_Pos: 9679
|
Relay_Master_Log_File: mariadb-bin.000006
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 9389
|
Relay_Log_Space: 15254
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 0-2-6,2-2-88
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 429
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-9,1-1-63,2-2-88
|
2 rows in set (0.00 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000014
|
Read_Master_Log_Pos: 13147
|
Relay_Log_File: mysqld-relay-bin-1.000020
|
Relay_Log_Pos: 13437
|
Relay_Master_Log_File: mariadb-bin.000014
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 13147
|
Relay_Log_Space: 14510
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-84,0-1-9
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 466
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-9,1-1-84,2-2-110
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000006
|
Read_Master_Log_Pos: 13891
|
Relay_Log_File: mysqld-relay-bin-2.000012
|
Relay_Log_Pos: 14181
|
Relay_Master_Log_File: mariadb-bin.000006
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 13891
|
Relay_Log_Space: 19756
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 0-2-6,2-2-110
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 517
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-9,1-1-84,2-2-110
|
2 rows in set (0.00 sec)
|
|
MariaDB [(none)]> show all slaves status\G
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000014
|
Read_Master_Log_Pos: 13147
|
Relay_Log_File: mysqld-relay-bin-1.000020
|
Relay_Log_Pos: 13437
|
Relay_Master_Log_File: mariadb-bin.000014
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 13147
|
Relay_Log_Space: 14510
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-84,0-1-9
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 466
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-9,1-1-84,2-2-110
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000006
|
Read_Master_Log_Pos: 13891
|
Relay_Log_File: mysqld-relay-bin-2.000012
|
Relay_Log_Pos: 14181
|
Relay_Master_Log_File: mariadb-bin.000006
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 13891
|
Relay_Log_Space: 19756
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 0-2-6,2-2-110
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 517
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-9,1-1-84,2-2-110
|
They did!
So looks like we HAVE TO delete that mariadb.state.
However, as soon as we restart MySQL, it crashes again as it keeps trying to look for a position in the binlog of the other channel:
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 0-1-22, which is not in the master's binlog. Since the master's binlog contains GTIDs with higher sequence numbers, it probably means that the slave has diverged due to executing extra erroneous transactions'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-189,0-1-22,2-2-199
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 1
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 0-1-22,1-1-189,2-2-199
|
The only way I have found to get it enabled is as follows
The main issue seems to come from the masters where there is no way to clean up the coordinates when they used to have gtid_domain_id=0 unless you reset master AFTER you've set up the desired value for gtid_domain_id on it AND the slave, including the removal of all the relay logs:
+------------------------+---------------+
|
| Variable_name | Value |
|
+------------------------+---------------+
|
| gtid_binlog_pos | 0-2-44,2-2-88 |
|
| gtid_binlog_state | 0-2-44,2-2-88 |
|
| gtid_current_pos | 0-2-44,2-2-88 |
|
| gtid_domain_id | 2 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
|
|
And then on the first transaction it gets the new coordinates and the old 0 dissapears, which is what confuses the slave.
|
| Variable_name | Value |
|
+------------------------+--------+
|
| gtid_binlog_pos | 2-2-22 |
|
| gtid_binlog_state | 2-2-22 |
|
| gtid_current_pos | 2-2-22 |
|
| gtid_domain_id | 2 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
|
This only works if the slave gets completely reseted as well, otherwise it will keep looking for "0-XXX"
So the procedure would also involve resetting the master, then the slave (once all the transactions with the old domain_id=0 have been executed, you also need to reset and reconfigure the slaves with):
MariaDB [(none)]> stop all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.01 sec)
|
|
MariaDB [(none)]> reset slave all ;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> reset slave '2' all;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> reset slave '1' all;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> delete from mysql.gtid_slave_pos;
|
Query OK, 2 rows affected (0.01 sec)
|
|
MariaDB [(none)]> Ctrl-C -- exit!
|
Aborted
|
root@slave1:/var/log/mysql# /etc/init.d/mysql stop
|
[ ok ] Stopping mysql (via systemctl): mysql.service.
|
root@slave1:/var/log/mysql# ls
|
mariadb-bin.000001 mariadb-bin.index mariadb-bin.state
|
root@slave1:/var/log/mysql# rm -fr *
|
root@slave1:/var/log/mysql# cd /var/lib/mysql/
|
root@slave1:/var/lib/mysql# rm -fr multi-master.info
|
Once that is done and the slave is "fresh", you need to reenable replication:
root@slave1:/var/lib/mysql# /etc/init.d/mysql restart
|
[ ok ] Restarting mysql (via systemctl): mysql.service.
|
root@slave1:/var/lib/mysql# mysql
|
Welcome to the MariaDB monitor. Commands end with ; or \g.
|
Your MariaDB connection id is 3
|
Server version: 10.1.21-MariaDB-1~jessie mariadb.org binary distribution
|
|
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
|
|
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
|
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show all slaves status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+-------+
|
| Variable_name | Value |
|
+------------------------+-------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+-------+
|
9 rows in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
MariaDB [(none)]> change master '1' to master_host='192.168.56.21', master_user='replication', master_password='password', master_log_pos=314, master_log_file='mariadb-bin.000001';
|
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> change master '2' to master_host='192.168.56.22', master_user='replication', master_password='password', master_log_pos=314, master_log_file='mariadb-bin.000001';
|
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '2' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.01 sec)
|
|
MariaDB [(none)]> change master '1' to master_use_gtid=slave_pos;
|
Query OK, 0 rows affected (0.00 sec)
|
|
MariaDB [(none)]> start all slaves;
|
Query OK, 0 rows affected, 2 warnings (0.01 sec)
|
|
MariaDB [(none)]> select @@hostname; show slave hosts\G show global variables like '%gtid%'; select * from mysql.gtid_slave_pos; show all slaves status\G
|
+------------+
|
| @@hostname |
|
+------------+
|
| slave1 |
|
+------------+
|
1 row in set (0.00 sec)
|
|
Empty set (0.00 sec)
|
|
+------------------------+---------------+
|
| Variable_name | Value |
|
+------------------------+---------------+
|
| gtid_binlog_pos | |
|
| gtid_binlog_state | |
|
| gtid_current_pos | 1-1-21,2-2-22 |
|
| gtid_domain_id | 3 |
|
| gtid_ignore_duplicates | OFF |
|
| gtid_slave_pos | 1-1-21,2-2-22 |
|
| gtid_strict_mode | OFF |
|
| wsrep_gtid_domain_id | 0 |
|
| wsrep_gtid_mode | OFF |
|
+------------------------+---------------+
|
9 rows in set (0.00 sec)
|
|
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 1 | 16 | 1 | 16 |
|
| 1 | 17 | 1 | 17 |
|
| 1 | 18 | 1 | 18 |
|
| 1 | 19 | 1 | 19 |
|
| 1 | 20 | 1 | 20 |
|
| 1 | 21 | 1 | 21 |
|
| 2 | 41 | 2 | 20 |
|
| 2 | 42 | 2 | 21 |
|
| 2 | 43 | 2 | 22 |
|
+-----------+--------+-----------+--------+
|
9 rows in set (0.00 sec)
|
|
*************************** 1. row ***************************
|
Connection_name: 1
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.21
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000001
|
Read_Master_Log_Pos: 4568
|
Relay_Log_File: mysqld-relay-bin-1.000002
|
Relay_Log_Pos: 4858
|
Relay_Master_Log_File: mariadb-bin.000001
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 4568
|
Relay_Log_Space: 5159
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-1-21
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 91
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 1-1-21,2-2-22
|
*************************** 2. row ***************************
|
Connection_name: 2
|
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 192.168.56.22
|
Master_User: replication
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000003
|
Read_Master_Log_Pos: 4816
|
Relay_Log_File: mysqld-relay-bin-2.000002
|
Relay_Log_Pos: 5106
|
Relay_Master_Log_File: mariadb-bin.000003
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 4816
|
Relay_Log_Space: 5407
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 2-2-22
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
Retried_transactions: 0
|
Max_relay_log_size: 104857600
|
Executed_log_entries: 95
|
Slave_received_heartbeats: 0
|
Slave_heartbeat_period: 1800.000
|
Gtid_Slave_Pos: 1-1-21,2-2-22
|
Stopping and starting the slave seems to work fine and the slave processes all the stuff coming from the masters. Rotating logs on the master, flushing them and all that produces no issue on the master.
So, to be able to enable multisource+gtid looks like you have to completely reset the master AND the slave while there are no writes happening. Which is not what the documentation says (at all) as apparently as per the documentation enabling just gtid_domain_id would do the trick.
Assignee | Michael Widenius [ monty ] | Andrei Elkin [ elkin ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
After having read through the description and discussion I think the following
should be summarized.
1. At connecting to a master that has changed domain-id from the default 0 to N
the slave indeed can't fetch the 0 domain-id groups generated by the 2nd master
due to the current master was never connected to the 2nd one.
2. While the situation is a clear error in gtid_strict_mode = 1, yet it is arguable whether the non-strict mode must error out as well the same applies to the non-strict one.
I thought (think) the error reaction by the non-strict mode is actually excessive.
At any rate it is jerky 'cos the slave is yet to receive any event group signed by the 0 domain-id.
In the reported case they would never arrive.
However I would stand for the excessive and relax the non-strict mode to allow connection
to the master that is outdated (in terms of gtid sequence numbers) compared to the slave.
After all the mode allows manual change of the slave server gtid state through
manipulation with @@session gtid_seq_no, gtid_domain_id and server_id to create gaps.
If there won't be any objection I am going to implementation the relaxation very soon.
Yet, after receiving feedback I am not going to relax the non-strict mode. There seems to be
consensus about basic correctness of the non-strict mode.
Suppose the master has GTID 0-1-10. But the slave has 0-1-20.
IIUC, You want to allow the slave to connect to the master in this case.
But the slave cannot distinguish this from the case where the master is
simply behind the slave. What happens when the slave receives later GTID
0-1-11, which it probably already applied once? Or if it receives 0-2-30?
Remember, GTID replication is supposed to work even if sequence numbers are
not in order and containing holes (eg. a sequence 0-1-10, 0-2-30, 0-1-20).
Another problem is if the GTID 0-1-20 is missing on the master and never
arrives - the slave will end up silently skipping everything in domain 0,
thinking it is still ahead.
Thus the error is necessary to avoid silent replication failures.
In my opinion, the real solution to this problem is to allow to delete
no longer used domains from a master. Something like PURGE BINARY LOGS TO
'master-bin.000100' DELETE DOMAIN 0;. This would be allowed provided that
after purge, no binary logs contain any GTIDs with domain 0.
Once the domain 0 is eliminated on all servers in the replication hierarchy,
it is safe to remove it and switch to GTID replication.
The ability to delete domains is an often requested feature, useful in many
cases.
Kristian, thanks for the comments!
I actually agree with the PURGE idea. But let me be brief in this reply, I am giving an extended one latter in couple of hours.
Anyway, with my suggestion we can at least defer the error until out-of-order GTID shows up.
To repeat, in the description case this does not happen. Informally that means the master ceased
to be the old domain-id relay or source, and that is without PURGE.
PURGE or something else (to be conjured yet) could be a follow-up step upon the error.
Andrei
Even if it is not a direct solution, I like a lot being able to PURGE/forget about these events/"I know what I am doing, let me delete it in a hot way"- that may be helpful for other reasons, too. After working with GTID for a while, you end up with a large string of GTID positions, of which only the last or 2 last ibes are really relevant (topology, masters hosts and out-of-band changes by mistakes find its way into production many times).
The PURGE idea is pretty good - as long as you really know what you are doing. So it should be carefully documented I believe too. Otherwise you risk breaking the whole topology if used incorrectly.
Regarding different strict modes, I do think it should keep breaking even if non-strict mode is set, as Kristian said, it can badly break slaves and you won't notice it.
As long as you have an option (PURGE) just to force it to actually delete the domain_id that isn't present anymore, you should be fine and should be able to fix that error without any topology change.
Thanks for working on this Andrei!
Given your valuable critique, dear colleagues, I've adjusted my overly-liberal stand to the non-strict mode.
I did agree extended PURGE could be a solution, and as you point out, even to a bigger issue domain.
It could be a solution of the master side binlog state manipulation. And while @@global.gtid_binlog_state allows to forget an old domain as well it requires total purge with RESET MASTER.
While such PURGE which clears the binlog state out of obsolete domains is virtue, some efforts are
necessary still to describe its properties. E.g should it do purge a file partially (sounds like it should)? Should it erase traces of the removed domain out of remained files' Gtid_list (sounds like it should)?
Anyway as elenst mentions in comments at 2017-02-14 23:44 that @@global.gtid_slave_pos can make the slave, not master, oblivious about an old domain.
And it could perhaps be recommended as a chip and general method when it is ensured there are no gtids from the old domain inside other streams defined by the new value of @@global.gtid_slave_pos.
Specifically to the refined description provided by marostegui at 2017-03-03 13:00, a corrected @@global.gtid_slave_pos should drop GTID 0-2-20 term.
When the not-interleaving guarantee is absent for some reason, MDEV-6593 must fit to the task.
Actually, I meant FLUSH BINARY LOGS DELETE DOMAIN d1, d2, ... Not PURGE.
This command would check that the current binlog state is equal to the GTID_LIST_LOG_EVENT at the start of the first binary log file, within the specified domains. If not, it is an error. But if so, the new binary log file created by FLUSH will be written with the specified domains omitted from GTID_LIST_LOG_EVENT (and the current binlog state updated accordingly).
Existing binlog files are not modified. Just a new binlog file is written with an initial binlog state that does not contain the deleted domains. This will make the master effectively forget these domains with respect to a connecting slave.
This way of removing a domain should be relatively safe, since the check ensures there are no GTIDs anywhere with that domain. It will however be up to the user to ensure that no slave tries later to request the now-forgotten domains.
I think there should be a description of this feature somewhere, it was discussed several years back already.
Andrei, if you want comments on stuff like this, you need to use the mailing list (maria-developer@). There are a number of people on that list that will often be interesting in and respond to GTID design issues (and other stuff). But if you hide the discussion in a random jira entry, most people are unlikely to ever see it.
Kristian, Manuel, Jaime, thanks for your response!
Thanks, Kristian, for also explaining how to conduct design discussion matter, noted.
After looking around and orienting myself in the mariadb gtid "universe" I see it's pretty fair to
regard the current issue as the master side gtid-domain delete/garbage-collect.
I am trying to dig out earlier discussion to possibly reuse results though superficially it's all rather direct.
I'll mail that specifically maria-developer@ anyway.
Andrei.
Link |
This issue duplicates |
From: andrei.elkin@pp.inet.fi
Subject: [Maria-developers] Obsolete GTID domain delete on master
(MDEV-12012, MDEV-11969)
To: maria-developers@lists.launchpad.net
Cc: knielsen@knielsen-hq.org
Date: Wed, 06 Sep 2017 19:07:45 +0300 (1 hour, 27 minutes, 46 seconds ago)
Organization: Home sweet home
Reply-To: andrei.elkin@mariadb.com
Hello.
Let me propose methods to clean master off unused gtid domains.
The issue is quite practical, as a couple of references on the subject
line tell. Either of them narrates a scenario of two default domain-id
masters
serving to one slave initially by "legacy" non-gtid protocol.
When later the masters have changed their common domain to different
private ones and the slave turns gtid replication ON, it can't
connect to
a master due to the gtid protocol.
And the reason is the prior-to-gtid-connect gained slave's gtid
position
consisting of gtids generated by the other master obviously (the two
masters never replicated each other) does not fit to the current
master
binlog state.
The past default domain-id is actually permanent past from the user
perspective in these cases. Its events have been already replicated
and
none new will be generated and replicated.
Therefore such domain conceptually may be cleaned away from either the
masters and slave states.
Once it's done, the gtid-enabled slave will successfully connect to
any
master.
The slave state purge is simple SET @@global.gtid_slave_pos to a
sequence free of the purged domain.
The master side binlog state one requires a new "command" unless the
user
is happy with RESET MASTER.
While setting the new gtid binlog state to be old-domain-free
we would like for the new "command" to preserve the existing binlog
files. This could be accomplished as Kristian suggests in MDEV-12012
(I could not find any earlier references to the idea) as a new option
to
FLUSH BINARY LOGS DELETE DOMAIN d1, d2
KN> This command would check that the current binlog state is equal
KN> to
KN> the GTID_LIST_LOG_EVENT at the start of the first binary log
KN> file,
KN> within the specified domains. If not, it is an error. But if so,
KN> the new binary log file created by FLUSH will be written with the
KN> specified domains omitted from GTID_LIST_LOG_EVENT (and the
KN> current
KN> binlog state updated accordingly).
The idea looks quite sane, I only could not grasp why presence of being
deleted domains in the very first binlog's GTID_LIST_LOG_EVENT list is
warrant for throwing error.
Maybe we should leave it out to the user, Kristian? That is to decide
what domain is garbage regardless of the binlog state history.
While the FLUSH way looks sufficient and robust I could not help to
think over an alternative.
Consider a scenario when a domain's sequence number
got run out of range. While deems unrealistic in practice we can
simulate it with
SET @@SESSION.gtid_domain_id=11;
SET @@SESSION.server_id=1;
SET @@SESSION.gtid_seq_no=18446744073709551615;
/* Exec some dummy loggable query, e.g */
CREATE TABLE IF NOT EXISTS `table_dummy`;
SHOW LOCAL VARIABLES LIKE '%gtid_binlog_pos%';
11-1-18446744073709551615
SET @@SESSION.gtid_seq_no=0
DROP TABLE `table_dummy`;
SHOW LOCAL VARIABLES LIKE '%gtid_binlog_pos%';
11-1-0
I've used two gtids to show domain overflow because I also liked
to read the zero of the last gtid as ... there's nothing in this
domain. So it's actually a new "namesake" one, replacing the old that is wrapped
around. The 1st group of events created in the new domain - 11-1-1 -
could shadow the old domain's 11-1-1 as well as all the rest of the old
domain from gtid replication protocol. And that means the old domain is actually deleted.
So if my reading of the zero is correct the binlog status would be displayed empty instead as above.
That's how we also can approach the master side old gtid domain purging:
1. Leave wrapping around an old domain to the user via running
the queries like above;
2. The binary logger would be made to react on the fact of wrap-around
with binary log rotation ("internal" FLUSH BINARY LOG). And the new
binlog file won't contain the wrapped "away" domain (because there
are no new event group in it of yet).
I would be glad to hear your opinions, dear colleagues.
Cheers,
Andrei
Hi, let me bring back my comment from 2017-02-20 on this subject.
When a slave (S) is a multi-source of 2 masters (A and B):
- without GTIDs, the file/position of S in A is independent of the position of S in B,
- with GTIDs, this position is "merged" in the mysql.gtid_slave_pos table.
To me, it looks like we are missing a column in mysql.gtid_slave_pos: the connection_name.
By adding a connection_name to the mysql.gtid_slave_pos table, the domain_id of master A and B do not need to be changed when building S. This way, the position of S in A becomes independent of the position of S in B. If the position of A is 0-a-10 and the position of B is 0-b-20, when S reconnects to A, it would only send 0-a-10 (and would ignore 0-b-20 as it is on a different connection).
To me, changing the domain_id of A and B when creating a multi-source slave S should not be needed. Adding a column to mysql.gtid_slave_pos could be the solution to that.
I think purging GTIDs from a master is still a useful feature, but I do not think it is needed in this case.
> When a slave (S) is a multi-source of 2 masters (A and B):
> without GTIDs, the file/position of S in A is independent of the position of S in B,
> with GTIDs, this position is "merged" in the mysql.gtid_slave_pos table.
This is not how GTID is designed.
GTID is global. The position is not tied to a particular connection.
Suppose you have a slave S1 replicating from A and B, and a deeper slave S2
replication from S1: A->S1, B->S1, S1->S2. Now we want to promote S2 to
replicate directly from A and B: A->S2, B->S2. This works because the GTID
position is independent of a particular master connection, or even of the
number of master connections. It can even work if A and B replicate amongst
each other with --gtid-ignore-duplicates.
In effect, the domain is the "connection name" in the GTID position.
A similar argument applies to ring topologies A->B->C->D->A. If B is
removed, C needs to be able to replicate from A instead.
The whole motivation for introducing the domain_id was to support ring
topologies and multi-source and allow moving slaves arbitrarily around the
topology.
Note that multi-source replication was introduced in 10.0, same as GTID. So
it is always possible to configure GTID domains correctly from the start
when setting up multi-source replication.
Of course, if there is a way to help fixing "incorrect" existing systems,
that is great, we should do it. But users employ replication in very many
different ways, some of them quite scary. It is enormously complex to
correctly support them all. It is important to have a consistent core design
to be able to manage that complexity. And the domain-server-seqno format of
GTID and GTID pos is an attempt - however imperfect - to achieve that.
Jean-Francois, Kristian, hello.
I actually should've followed up your Feb comments. But I only can
thank Kristian for helping out while I am getting firmer grasp
of Mariadb GTID design ideas, decisions etc.
While the GTID notion is to serve as *global* (throughout a particular
topology) replication state descriptor, in this
case, S
the slave can still maintain a view of two partitions of the global state.
No new domain-id flush/purge would be required practically^{*}
in this case when the A,B connections (from S) were configured to
IGNORE (IGNORE_DOMAIN_IDS) the zero-id domain, and DO only those
that A,B master generate respectfully.
And the single gtid_slave_pos table would still suffice because the domain-id
just matches the connection id in such a setup, as Kristian pointed out.
S could stop-resume replication from any partition.
^Footnotes:
{*} Actually there seems to exist a MDEV-9108 showstopper
Cheers,
Andrei
Hm, I think a possible work-around for the original problem is:
|
|
1. Bring the master's gtid_binlog_pos in sync by executing a dummy
|
transaction on one master with the current GTID of the other master in
|
domain 0:
|
|
SET SESSION gtid_domain_id=0;
|
SET SESSION server_id=2;
|
SET SESSION gtid_seq_no=10;
|
DROP TABLE IF EXISTS non_existing_table;
|
|
2. Set the slave's GTID position to that GTID in domain 0:
|
|
SET GLOBAL gtid_slave_pos='0-2-10,...<other domains>';
|
|
Assuming no new transactions will be made in domain 0 anywhere, this should
|
make the servers all think that they are up-to-date in domain 0. So the
|
slave should be able to connect to either master without complaints about
|
domain 0.
|
|
Not sure why I didn't think of this work-around before, I think it should
|
work. I guess it is too late to be of much use now, probably...
|
Kristian,
Indeed, the dummy group method works it out.
Thanks for pointing at it!
Andrei
PS. The last sentence's wondering applies here as well .
Sprint | 10.2.11 [ 203 ] |
Fix Version/s | 10.1.30 [ 22637 ] | |
Fix Version/s | 10.2.11 [ 22634 ] | |
Fix Version/s | 10.1 [ 16100 ] | |
Resolution | Fixed [ 1 ] | |
Status | In Progress [ 3 ] | Closed [ 6 ] |
The suggested workaround by Kristian have worked fine for me as well.
Thank you
I see lots of discussion here, but it's unclear to me exactly what was "fixed" (other than the suggested work around). Can someone please elaborate?
Ultimately I have a remnant zero domain pointer in the gtid_slave_pos of two of my slaves that I'd like to safely and permanently get rid of (there are no zero-domain GTID events in any binlogs in any upstream master). Trying to delete them from mysql.gtid_slave_pos and then manually resetting the gtid_slave_pos to the actual domains currently in use doesn't seem to make any difference (the next query against gtid_slave_pos after slave restart just has them back again).
dragle, thank you for your questions! To how/what is fixed, indeed a summary is missed here. It's provided in the commit
message https://github.com/mariadb/server/commit/aae4932775d5780fe575087b40779700eaa4fcbf though:
This patch introduces a method to discard obsolete gtid domains from
|
the server binlog state. The removal requires no event group from such
|
domain present in existing binlog files though. If there are any the
|
containing logs must be first PURGEd in order for
|
|
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(list-of-domains)
|
|
succeed. Otherwise the command returns an error.
|
The feature is largely about master.
To gtid_slave_pos purging, you don't have to touch mysql.gtid_slave_pos, as it's sufficient to SET @@global.gtid_slave_pos=val where val is free of the domains you dislike. The table reflects changes done to the variable.
Also see the docs pages.
Hmmmm... what did I do wrong? I've verified that the master has the position indicated (that master keeps five days of binary logs, and this position is within the last 3 hours):
MariaDB [(none)]> STOP SLAVE;
|
Query OK, 0 rows affected, 1 warning (0.000 sec)
|
|
MariaDB [(none)]> SHOW WARNINGS;
|
+-------+------+--------------------------------+
|
| Level | Code | Message |
|
+-------+------+--------------------------------+
|
| Note | 1255 | Slave already has been stopped |
|
+-------+------+--------------------------------+
|
1 row in set (0.000 sec)
|
|
MariaDB [(none)]> SELECT @@global.gtid_slave_pos;
|
+------------------------------------------------+
|
| @@global.gtid_slave_pos |
|
+------------------------------------------------+
|
| 0-303-67739600,1-301-1314447552,100-301-176774 |
|
+------------------------------------------------+
|
1 row in set (0.000 sec)
|
|
MariaDB [(none)]> SET @@global.gtid_slave_pos = '1-301-1314447552,100-301-176774';
|
Query OK, 0 rows affected (0.002 sec)
|
|
MariaDB [(none)]> SELECT @@global.gtid_slave_pos;
|
+---------------------------------+
|
| @@global.gtid_slave_pos |
|
+---------------------------------+
|
| 1-301-1314447552,100-301-176774 |
|
+---------------------------------+
|
1 row in set (0.000 sec)
|
|
MariaDB [(none)]> START SLAVE;
|
Query OK, 0 rows affected (0.646 sec)
|
|
MariaDB [(none)]> SHOW SLAVE STATUS \G
|
*************************** 1. row ***************************
|
Slave_IO_State:
|
Master_Host: 192.168.100.23
|
Master_User: repl
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: binary-log.011058
|
Read_Master_Log_Pos: 956169617
|
Relay_Log_File: mysql-relay-log.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File: binary-log.011058
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 956169617
|
Relay_Log_Space: 256
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 303
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-301-1314447552,100-301-176774
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
SQL_Delay: 3600
|
SQL_Remaining_Delay: NULL
|
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_DDL_Groups: 8003
|
Slave_Non_Transactional_Groups: 131
|
Slave_Transactional_Groups: 3329917
|
1 row in set (0.000 sec)
|
|
MariaDB [(none)]> SELECT @@global.gtid_slave_pos;
|
+---------------------------------+
|
| @@global.gtid_slave_pos |
|
+---------------------------------+
|
| 1-301-1314447552,100-301-176774 |
|
+---------------------------------+
|
1 row in set (0.000 sec)
|
|
MariaDB [(none)]> STOP SLAVE;
|
Query OK, 0 rows affected (0.001 sec)
|
|
MariaDB [(none)]> SELECT @@global.gtid_slave_pos;
|
+---------------------------------+
|
| @@global.gtid_slave_pos |
|
+---------------------------------+
|
| 1-301-1314447552,100-301-176774 |
|
+---------------------------------+
|
1 row in set (0.000 sec)
|
> Daniel Ragle added a comment - 2 days ago
> Hmmmm... what did I do wrong?
What went wrong here (IIUC) is that multi-source replication was set up
without correctly configuring the gtid_domain_id on the masters (they both
had domain 0). There is therefore no single GTID that can represent the
slave's current position in domain 0 - a different position is needed on
each master.
Setting the slave's position in domain 0 to empty just means that it should
start from the very first GTID in that domain - and that GTID (0-303-1
presumably) is long gone, of course.
One option to fix is to remove the record of domain on on the masters -
using the FLUSH BINARY LOGS DELETE_DOMAIN_ID=0 command. You can see the
effect of this command on the master in the @@gtid_binlog_state variable
which keeps a record of all domains that were ever used on the master.
Another option is to artificially make it look like one GTID is the latest
in domain 0 on both masters (say 0-303-67739600) and then inject a dummy
transaction with this GTID on the master that's missing it. This tells the
slave that it can start at this GTID on both masters without missing any
events.
(I am assuming here that the problem Daniel Ragle has is the same as the
original reporter).
Thanks, still missing something. Now on the master:
MariaDB [(none)]> SELECT @@global.gtid_binlog_state;
|
+----------------------------------------------------------------------------------------------------------------+
|
| @@global.gtid_binlog_state |
|
+----------------------------------------------------------------------------------------------------------------+
|
| 0-302-67690294,0-301-67719794,0-303-67739600,1-303-1381518211,1-302-1300356654,1-301-1381751266,100-301-266595 |
|
+----------------------------------------------------------------------------------------------------------------+
|
1 row in set (0.000 sec)
|
|
MariaDB [(none)]> SHOW VARIABLES LIKE 'version';
|
+---------------+---------------------+
|
| Variable_name | Value |
|
+---------------+---------------------+
|
| version | 10.3.14-MariaDB-log |
|
+---------------+---------------------+
|
1 row in set (0.001 sec)
|
|
MariaDB [(none)]> FLUSH LOCAL BINARY LOGS DELETE_DOMAIN_ID=0;
|
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '0' at line 1
|
If it makes any difference, the zero domain has been gone from all the servers in the chain for months, it shouldn't exist in any binlog, anywhere.
Labels | gtid mariadb multisource | DELETE_DOMAIN_ID gtid mariadb multisource |
Daniel, probably too late already, but leaving it here for posterity to avoid: https://xkcd.com/979/
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(0);
|
Workflow | MariaDB v3 [ 79550 ] | MariaDB v4 [ 151677 ] |
Zendesk Related Tickets | 149963 |
How to reproduce: