[MDEV-10279] gtid_current_pos is not updated with slave transactions from old master Created: 2016-06-23  Updated: 2019-07-23

Status: Open
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.1.14
Fix Version/s: 10.2

Type: Bug Priority: Major
Reporter: Michaël de groot Assignee: Andrei Elkin
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-17853 Document that gtid_binlog_pos can lag... Closed
relates to MDEV-16834 GTID current_pos easily breaks replic... Closed
relates to MDEV-17156 Local transactions on a Slave don't u... Closed
relates to MDEV-20122 Deprecate MASTER_USE_GTID=Current_Pos... Closed

 Description   

Hi everybody,

According to the documentation, the gtid_current_pos should be updated with transactions from the slave threads as well:
https://mariadb.com/kb/en/mariadb/gtid/#gtid_current_pos

While this is not a problem in traditional replication setups (we can use slave_pos instead of current_pos), it can be a problem in setups with intermediate masters where the top-level master runs an older version:
Server 1 (5.5) -> Server 2a (10.1) -> Server 2b (10.1)

If we want to switch roles between 2a and 2b, we stop replication from server 1 to 2a (to fail over afterwards). However, we cannot use gtid_slave_pos or gtid_current_pos because they are not updated:

MariaDB [mariadb]> select @@gtid_binlog_pos, @@gtid_slave_pos, @@gtid_current_pos;
+------------------------+--------------------------+--------------------------+
| @@gtid_binlog_pos      | @@gtid_slave_pos         | @@gtid_current_pos       |
+------------------------+--------------------------+--------------------------+
| 0-201-55312,1-1-113229 | 0-201-55312,1-102-110354 | 0-201-55312,1-102-110354 |
+------------------------+--------------------------+--------------------------+
1 row in set (0.00 sec)

If we perform a transaction on server 2a the current_pos at least is updated:

MariaDB [mariadb]> insert into test values(NULL);
Query OK, 1 row affected (0.01 sec)
 
MariaDB [mariadb]> select @@gtid_binlog_pos, @@gtid_slave_pos, @@gtid_current_pos;
+--------------------------+--------------------------+--------------------------+
| @@gtid_binlog_pos        | @@gtid_slave_pos         | @@gtid_current_pos       |
+--------------------------+--------------------------+--------------------------+
| 0-201-55312,1-101-113230 | 0-201-55312,1-102-110354 | 0-201-55312,1-101-113230 |
+--------------------------+--------------------------+--------------------------+
1 row in set (0.00 sec)

Please make it so that current_pos (and slave_pos?) is also updated by transactions that come in from an old master, as the documentation states.



 Comments   
Comment by Elena Stepanova [ 2016-06-24 ]

I failed to understand the above description, hopefully knielsen will make sense out of it.

Comment by Kristian Nielsen [ 2016-06-24 ]

If I understand correctly, the problem here is that a transaction originating on a MariaDB 5.5 master and replicated to a MariaDB 10.1 slave does not cause the gtid_slave_pos on the slave to be updated to the GTID of the transaction from the master?

The problem is that MariaDB 5.5 is too old to support GTID. So the transaction does not get a GTID assigned when it originates on a 5.5 master. Basically, to use GTID, all involved servers need to be of version 10.0 or greater...

Hm, but maybe the idea is to use GTID to connect server 2a as a slave to server 2b as a master, using GTID with the GTIDs assigned by server 2a to transactions originating from the 5.5 server 1. The reason those GTIDs do not get into gtid_current_pos is that they were neither replicated (since they had no GTID when they arrived at the slave), nor do they have the server_id of 2a (since they originated on server 1).

I can try to think if there is a simple way around this - I guess to assign the server_id of 2a to the GTIDs replicated from server 1? But generally, GTID switchover was not designed to work with servers that do not support GTID...
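The selection rule described in this comment can be sketched in a few lines of Python. This is a simplified model, not the server's implementation: it assumes that, per replication domain, the GTID from gtid_binlog_pos is used only when it carries the server's own server_id and a higher sequence number than the entry in gtid_slave_pos, and that gtid_slave_pos is used otherwise. The function names are hypothetical, and the server_id 101 for server 2a is an assumption inferred from the GTID 1-101-113230 in the description.

```python
def parse_gtid_list(text):
    """Parse 'domain-server-seq,...' into {domain_id: (server_id, seq_no)}."""
    positions = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        domain_id, server_id, seq_no = (int(part) for part in item.split("-"))
        positions[domain_id] = (server_id, seq_no)
    return positions


def gtid_current_pos(binlog_pos, slave_pos, own_server_id):
    """Simplified model of combining binlog and slave positions per domain."""
    binlog = parse_gtid_list(binlog_pos)
    slave = parse_gtid_list(slave_pos)
    combined = {}
    for domain_id in sorted(set(binlog) | set(slave)):
        chosen = slave.get(domain_id)
        candidate = binlog.get(domain_id)
        # The binlog GTID wins only if it carries our own server_id
        # and is ahead of the slave position in the same domain.
        if candidate is not None and candidate[0] == own_server_id:
            if chosen is None or candidate[1] > chosen[1]:
                chosen = candidate
        if chosen is not None:
            combined[domain_id] = chosen
    return ",".join(f"{d}-{s}-{n}" for d, (s, n) in combined.items())


SERVER_2A_ID = 101  # assumption: inferred from GTID 1-101-113230 in the report

# Before the local insert: 1-1-113229 carries server 1's id, so it is ignored.
print(gtid_current_pos("0-201-55312,1-1-113229",
                       "0-201-55312,1-102-110354", SERVER_2A_ID))
# After the local insert: 1-101-113230 carries server 2a's id, so it is used.
print(gtid_current_pos("0-201-55312,1-101-113230",
                       "0-201-55312,1-102-110354", SERVER_2A_ID))
```

Under these assumptions the two calls reproduce the @@gtid_current_pos values shown in the description: the first stays at 0-201-55312,1-102-110354 and the second moves to 0-201-55312,1-101-113230.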

Comment by Guillaume Lefranc [ 2016-06-28 ]

How come in some cases gtid_current_pos is not updated by local writes?

For example I am here on server_id 2:

MariaDB [(none)]> show variables like 'gtid%';
+------------------------+---------+
| Variable_name          | Value   |
+------------------------+---------+
| gtid_binlog_pos        | 0-2-1   |
| gtid_binlog_state      | 0-2-1   |
| gtid_current_pos       | 0-1-616 |
| gtid_domain_id         | 0       |
| gtid_ignore_duplicates | OFF     |
| gtid_seq_no            | 0       |
| gtid_slave_pos         | 0-1-616 |
| gtid_strict_mode       | ON      |
+------------------------+---------+

I do a local write:

MariaDB [(none)]> show variables like 'gtid%';
+------------------------+---------+
| Variable_name          | Value   |
+------------------------+---------+
| gtid_binlog_pos        | 0-2-2   |
| gtid_binlog_state      | 0-2-2   |
| gtid_current_pos       | 0-1-616 |
| gtid_domain_id         | 0       |
| gtid_ignore_duplicates | OFF     |
| gtid_seq_no            | 0       |
| gtid_slave_pos         | 0-1-616 |
| gtid_strict_mode       | ON      |
+------------------------+---------+

Doc says: Such changes can either be master events (ie. local changes made by user or application)
In this case the local change did not update current_pos. Is that normal?

Comment by Kristian Nielsen [ 2016-06-28 ]

As discussed on IRC, this issue is not caused by an old master. It is the result of RESET MASTER, which deletes the GTID history and causes GTID 0-1-616 to be considered as not having occurred in the past (since RESET MASTER effectively deletes the past).
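One way to picture the values tanj reported is the following single-domain sketch. It is a simplified model under stated assumptions, not the server's code: it assumes a local binlog GTID replaces the slave position in gtid_current_pos only when it has the server's own server_id and a higher sequence number. The function name is hypothetical, and 0-2-617 is a made-up GTID included only for contrast.

```python
def pick_current_gtid(binlog_gtid, slave_gtid, own_server_id):
    """Pick the gtid_current_pos entry for ONE domain (simplified model)."""
    b_domain, b_server, b_seq = (int(p) for p in binlog_gtid.split("-"))
    s_domain, _s_server, s_seq = (int(p) for p in slave_gtid.split("-"))
    if b_domain != s_domain:
        raise ValueError("both GTIDs must belong to the same domain")
    # A local GTID only wins if it is ahead of the slave position.
    if b_server == own_server_id and b_seq > s_seq:
        return binlog_gtid
    return slave_gtid


# 0-2-1 and 0-2-2 are the local GTIDs from the output above;
# 0-2-617 is hypothetical and shows when the local GTID would take over.
for local_gtid in ("0-2-1", "0-2-2", "0-2-617"):
    print(local_gtid, "->",
          pick_current_gtid(local_gtid, "0-1-616", own_server_id=2))
```

In this model the local GTIDs 0-2-1 and 0-2-2 both leave the result at 0-1-616, which matches the gtid_current_pos shown in both outputs, because their sequence numbers restarted low and are not ahead of 616.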

Comment by Michaël de groot [ 2016-06-28 ]

knielsen, tanj, I'm pretty sure I experienced a different issue. My gtid_current_pos did not get updated for multiple hundreds of transactions coming in from an old master.

Currently, the transaction on node 2b does get a GTID somewhere on the way. The gtid_slave_pos in domain 0 on that node increases with the transactions coming in (indirectly) from server 1. The issue I am reporting is that the same transaction on server 2a does not have a GTID (domain 0 in gtid_current_pos does not increase).
I don't need the transaction to get a new server_id or gtid_domain_id; I am fine with domain 0 and the server_id of server 1 (although back there it really wasn't a GTID transaction). If you think that's better or needed, I have no problem with it though.

By the way, I am replicating from an ancient Percona server 5.6 (node 1) without GTID mode enabled. Sorry I did not mention that before, I don't think it makes a difference though.
