Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.2.26, 10.3.17, 10.4.7
Description
When gtid_slave_pos is set to the empty string, in my opinion, that should mean "start from the beginning of your current binary logs and give me everything that you have".
In reality, when gtid_slave_pos is set to the empty string, this currently means "start from the binary log position that has a Gtid_list event with the contents []".
To reproduce this, let's say that we have one server that will be our master, and one server that will be our slave.
On the server that will be our master, let's do some preparation to ensure that our GTID state is in a specific condition:
-- Clear current gtid_binlog_pos
|
RESET MASTER;
|
-- Generate some GTIDs in the binary logs
|
SET SESSION gtid_domain_id=1;
|
CREATE DATABASE repltest;
|
SET SESSION gtid_domain_id=2;
|
CREATE TABLE repltest.tab (
|
id int primary key,
|
str varchar(50)
|
);
|
SET SESSION gtid_domain_id=3;
|
INSERT INTO repltest.tab VALUES (1, 'str1');
|
Output:
MariaDB [(none)]> -- Clear current gtid_binlog_pos
|
MariaDB [(none)]> RESET MASTER;
|
Query OK, 0 rows affected (0.006 sec)
|
|
MariaDB [(none)]> -- Generate some GTIDs in the binary logs
|
MariaDB [(none)]> SET SESSION gtid_domain_id=1;
|
Query OK, 0 rows affected (0.000 sec)
|
|
MariaDB [(none)]> CREATE DATABASE repltest;
|
Query OK, 1 row affected (0.004 sec)
|
|
MariaDB [(none)]> SET SESSION gtid_domain_id=2;
|
Query OK, 0 rows affected (0.000 sec)
|
|
MariaDB [(none)]> CREATE TABLE repltest.tab (
|
-> id int primary key,
|
-> str varchar(50)
|
-> );
|
Query OK, 0 rows affected (0.016 sec)
|
|
MariaDB [(none)]> SET SESSION gtid_domain_id=3;
|
Query OK, 0 rows affected (0.000 sec)
|
|
MariaDB [(none)]> INSERT INTO repltest.tab VALUES (1, 'str1');
|
Query OK, 1 row affected (0.002 sec)
|
At this point, the server's only binary log has a Gtid_list event with the contents [], and it has 3 transactions with GTIDs in three different domains:
MariaDB [(none)]> SHOW BINLOG EVENTS;
|
+--------------------+-----+-------------------+-----------+-------------+-------------------------------------------------------------------------+
|
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
|
+--------------------+-----+-------------------+-----------+-------------+-------------------------------------------------------------------------+
|
| mariadb-bin.000001 | 4 | Format_desc | 2 | 256 | Server ver: 10.3.17-MariaDB-log, Binlog ver: 4 |
|
| mariadb-bin.000001 | 256 | Gtid_list | 2 | 285 | [] |
|
| mariadb-bin.000001 | 285 | Binlog_checkpoint | 2 | 330 | mariadb-bin.000001 |
|
| mariadb-bin.000001 | 330 | Gtid | 2 | 372 | GTID 1-2-1 |
|
| mariadb-bin.000001 | 372 | Query | 2 | 467 | CREATE DATABASE repltest |
|
| mariadb-bin.000001 | 467 | Gtid | 2 | 509 | GTID 2-2-1 |
|
| mariadb-bin.000001 | 509 | Query | 2 | 643 | CREATE TABLE repltest.tab (
|
id int primary key,
|
str varchar(50)
|
) |
|
| mariadb-bin.000001 | 643 | Gtid | 2 | 685 | BEGIN GTID 3-2-1 |
|
| mariadb-bin.000001 | 685 | Annotate_rows | 2 | 751 | INSERT INTO repltest.tab VALUES (1, 'str1') |
|
| mariadb-bin.000001 | 751 | Table_map | 2 | 804 | table_id: 25 (repltest.tab) |
|
| mariadb-bin.000001 | 804 | Write_rows_v1 | 2 | 847 | table_id: 25 flags: STMT_END_F |
|
| mariadb-bin.000001 | 847 | Xid | 2 | 878 | COMMIT /* xid=115 */ |
|
+--------------------+-----+-------------------+-----------+-------------+-------------------------------------------------------------------------+
|
12 rows in set (0.000 sec)
|
Now let's purge this binary log:
FLUSH LOGS;
|
PURGE BINARY LOGS TO 'mariadb-bin.000002';
|
Output:
MariaDB [(none)]> FLUSH LOGS;
|
Query OK, 0 rows affected (0.004 sec)
|
|
MariaDB [(none)]> PURGE BINARY LOGS TO 'mariadb-bin.000002';
|
Query OK, 0 rows affected (0.003 sec)
|
At this point, the server's only binary log has a Gtid_list event that contains the previous GTIDs in three different domains:
MariaDB [(none)]> SHOW BINLOG EVENTS;
|
+--------------------+-----+-------------------+-----------+-------------+------------------------------------------------+
|
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
|
+--------------------+-----+-------------------+-----------+-------------+------------------------------------------------+
|
| mariadb-bin.000002 | 4 | Format_desc | 2 | 256 | Server ver: 10.3.17-MariaDB-log, Binlog ver: 4 |
|
| mariadb-bin.000002 | 256 | Gtid_list | 2 | 331 | [1-2-1,2-2-1,3-2-1] |
|
| mariadb-bin.000002 | 331 | Binlog_checkpoint | 2 | 376 | mariadb-bin.000001 |
|
| mariadb-bin.000002 | 376 | Binlog_checkpoint | 2 | 421 | mariadb-bin.000002 |
|
| mariadb-bin.000002 | 421 | Rotate | 2 | 470 | mariadb-bin.000003;pos=4 |
|
+--------------------+-----+-------------------+-----------+-------------+------------------------------------------------+
|
5 rows in set (0.000 sec)
|
Now let's try to set up our second server as a slave to replicate from this server when it has gtid_slave_pos set to an empty string.
SET GLOBAL gtid_slave_pos = '';
|
CHANGE MASTER TO
|
MASTER_HOST='172.30.0.105',
|
MASTER_USER='maxscale',
|
MASTER_PASSWORD='password',
|
MASTER_USE_GTID=slave_pos;
|
START SLAVE;
|
Output:
MariaDB [(none)]> SET GLOBAL gtid_slave_pos = '';
|
Query OK, 0 rows affected (0.012 sec)
|
|
MariaDB [(none)]> CHANGE MASTER TO
|
-> MASTER_HOST='172.30.0.105',
|
-> MASTER_USER='maxscale',
|
-> MASTER_PASSWORD='password',
|
-> MASTER_USE_GTID=slave_pos;
|
Query OK, 0 rows affected (0.010 sec)
|
|
MariaDB [(none)]> START SLAVE;
|
Query OK, 0 rows affected (0.006 sec)
|
At this point, the slave's I/O thread will fail:
MariaDB [(none)]> SHOW SLAVE STATUS\G
|
*************************** 1. row ***************************
|
Slave_IO_State:
|
Master_Host: 172.30.0.105
|
Master_User: maxscale
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File:
|
Read_Master_Log_Pos: 4
|
Relay_Log_File: ip-172-30-0-123-relay-bin.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File:
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 4
|
Relay_Log_Space: 256
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids: 3
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
SQL_Delay: 0
|
SQL_Remaining_Delay: NULL
|
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_DDL_Groups: 0
|
Slave_Non_Transactional_Groups: 0
|
Slave_Transactional_Groups: 0
|
1 row in set (0.000 sec)
|
In order to get it to start replicating from the start of the master's binary logs, you have to explicitly set gtid_slave_pos to the value in the Gtid_List event:
STOP SLAVE;
|
SET GLOBAL gtid_slave_pos = '1-2-1,2-2-1,3-2-1';
|
CHANGE MASTER TO
|
MASTER_HOST='172.30.0.105',
|
MASTER_USER='maxscale',
|
MASTER_PASSWORD='password',
|
MASTER_USE_GTID=slave_pos;
|
START SLAVE;
|
Output:
MariaDB [(none)]> STOP SLAVE;
|
Query OK, 0 rows affected (0.002 sec)
|
|
MariaDB [(none)]> SET GLOBAL gtid_slave_pos = '1-2-1,2-2-1,3-2-1';
|
Query OK, 0 rows affected (0.015 sec)
|
|
MariaDB [(none)]> CHANGE MASTER TO
|
-> MASTER_HOST='172.30.0.105',
|
-> MASTER_USER='maxscale',
|
-> MASTER_PASSWORD='password',
|
-> MASTER_USE_GTID=slave_pos;
|
Query OK, 0 rows affected (0.006 sec)
|
|
MariaDB [(none)]> START SLAVE;
|
Query OK, 0 rows affected (0.006 sec)
|
At this point, the slave is properly replicating:
MariaDB [(none)]> SHOW SLAVE STATUS\G
|
*************************** 1. row ***************************
|
Slave_IO_State: Waiting for master to send event
|
Master_Host: 172.30.0.105
|
Master_User: maxscale
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mariadb-bin.000003
|
Read_Master_Log_Pos: 421
|
Relay_Log_File: ip-172-30-0-123-relay-bin.000002
|
Relay_Log_Pos: 722
|
Relay_Master_Log_File: mariadb-bin.000003
|
Slave_IO_Running: Yes
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 421
|
Relay_Log_Space: 1041
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: 0
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 0
|
Last_IO_Error:
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos: 1-2-1,2-2-1,3-2-1
|
Replicate_Do_Domain_Ids: 3
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
SQL_Delay: 0
|
SQL_Remaining_Delay: NULL
|
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_DDL_Groups: 0
|
Slave_Non_Transactional_Groups: 0
|
Slave_Transactional_Groups: 0
|
1 row in set (0.000 sec)
|
The current behavior is not necessarily "wrong". However, it would be useful in some cases if having an empty GTID value had a special meaning that meant "start from the beginning of your current binary logs and give me everything that you have".
This would be especially useful in environments that use a lot of different gtid_domain_id values, and you know that the server is consistent to start replicating from the first GTID of a specific domain, and the empty gtid_slave_pos value is combined with the DO_DOMAIN_IDS and/or IGNORE_DOMAIN_IDS options.
For example, let's say that the Gtid_List event in the master's first and only binary log is for the GTID string '1-2-1,2-2-1,3-2-1':
MariaDB [(none)]> SHOW BINLOG EVENTS;
|
+--------------------+-----+-------------------+-----------+-------------+------------------------------------------------+
|
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
|
+--------------------+-----+-------------------+-----------+-------------+------------------------------------------------+
|
| mariadb-bin.000002 | 4 | Format_desc | 2 | 256 | Server ver: 10.3.17-MariaDB-log, Binlog ver: 4 |
|
| mariadb-bin.000002 | 256 | Gtid_list | 2 | 331 | [1-2-1,2-2-1,3-2-1] |
|
| mariadb-bin.000002 | 331 | Binlog_checkpoint | 2 | 376 | mariadb-bin.000001 |
|
| mariadb-bin.000002 | 376 | Binlog_checkpoint | 2 | 421 | mariadb-bin.000002 |
|
| mariadb-bin.000002 | 421 | Rotate | 2 | 470 | mariadb-bin.000003;pos=4 |
|
+--------------------+-----+-------------------+-----------+-------------+------------------------------------------------+
|
5 rows in set (0.000 sec)
|
And let's say that the slave only wants to replicate transactions with gtid_domain_id=4.
In this case, it would make sense to set gtid_slave_pos to the empty string, because the slave should start replicating this specific domain from the very beginning, and it doesn't care about any other domains. I would think that the following would work:
-- Start replicating from the first event with gtid_domain_id=4,
|
-- and ignore all other gtid_domain_id values
|
SET GLOBAL gtid_slave_pos = '';
|
CHANGE MASTER TO
|
MASTER_HOST='172.30.0.105',
|
MASTER_USER='maxscale',
|
MASTER_PASSWORD='password',
|
MASTER_USE_GTID=slave_pos,
|
DO_DOMAIN_IDS=(4);
|
START SLAVE;
|
Output:
MariaDB [(none)]> SET GLOBAL gtid_slave_pos = '';
|
Query OK, 0 rows affected (0.012 sec)
|
|
MariaDB [(none)]> CHANGE MASTER TO
|
-> MASTER_HOST='172.30.0.105',
|
-> MASTER_USER='maxscale',
|
-> MASTER_PASSWORD='password',
|
-> MASTER_USE_GTID=slave_pos,
|
-> DO_DOMAIN_IDS=(4);
|
Query OK, 0 rows affected (0.010 sec)
|
|
MariaDB [(none)]> START SLAVE;
|
Query OK, 0 rows affected (0.007 sec)
|
But this does not work either:
MariaDB [(none)]> SHOW SLAVE STATUS\G
|
*************************** 1. row ***************************
|
Slave_IO_State:
|
Master_Host: 172.30.0.105
|
Master_User: maxscale
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File:
|
Read_Master_Log_Pos: 4
|
Relay_Log_File: ip-172-30-0-123-relay-bin.000001
|
Relay_Log_Pos: 4
|
Relay_Master_Log_File:
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB:
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 4
|
Relay_Log_Space: 256
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: No
|
Master_SSL_CA_File:
|
Master_SSL_CA_Path:
|
Master_SSL_Cert:
|
Master_SSL_Cipher:
|
Master_SSL_Key:
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl:
|
Master_SSL_Crlpath:
|
Using_Gtid: Slave_Pos
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids: 4
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
SQL_Delay: 0
|
SQL_Remaining_Delay: NULL
|
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
Slave_DDL_Groups: 0
|
Slave_Non_Transactional_Groups: 0
|
Slave_Transactional_Groups: 0
|
1 row in set (0.000 sec)
|
Attachments
Issue Links
- is part of
-
MDEV-9345 Replication to enable filtering on master
- Open