  MariaDB Server / MDEV-20720

Galera: Replicate MariaDB GTID to other nodes in the cluster

Details

    • Sprint: 10.2.4-4, 10.2.4-1, 10.3.1-2, 10.2.10, 10.2.11, 10.1.30, 10.2.12

    Description

      The MariaDB GTID is currently not transferred to the other nodes in the cluster. As a result,
      receiving nodes simply use the current gtid_domain_id (or wsrep_gtid_domain_id in 10.1)
      and server_id to tag the incoming transactions, along with the Galera-assigned sequence number.
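
      (For illustration, a minimal way to observe the symptom, assuming a two-node cluster with log-bin enabled; the table and GTID values are placeholders.)

        -- on node 1
        INSERT INTO test.t1 VALUES (1);
        SELECT @@gtid_binlog_pos;     -- e.g. 0-1-42: domain 0, tagged with node 1's server_id

        -- on node 2, after the write set has been applied
        SELECT @@gtid_binlog_pos;     -- e.g. 0-2-42: same change, but tagged with node 2's server_id
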

      Attachments

        Issue Links

          Activity

            nirbhay_c Nirbhay Choubey (Inactive) created issue -
            nirbhay_c Nirbhay Choubey (Inactive) made changes -
            Field Original Value New Value
            nirbhay_c Nirbhay Choubey (Inactive) made changes -
            nirbhay_c Nirbhay Choubey (Inactive) made changes -
            nirbhay_c Nirbhay Choubey (Inactive) made changes -
            nirbhay_c Nirbhay Choubey (Inactive) made changes -
            nirbhay_c Nirbhay Choubey (Inactive) made changes -
            Assignee Lixun Peng [ plinux ]

            m00dawg Tim Soderstrom added a comment -

            This sounds similar to what I ran into, but it seemed a tad vague. I am running MariaDB 10.1.18. I have a 3-node Galera cluster and an async slave. GTIDs are enabled, all nodes have 'log-bin' and 'log-slave-updates', and all are using 0 for the domain (the default).

            What I found was that all Galera nodes seem to be writing all data to their binary logs, but their GTIDs do not match. I can find things by the transaction ID across all the logs, but if I try to find things by GTID, the results are inconsistent. This means I cannot merely re-point the slave server to another node, because that node does not have the same GTID information as the current master and, thus, the slave does not know where to begin.

            It sounds like this issue applies to this bug? I see the target is 10.2. If so, it would be ideal if it was reflected in the KB documentation for 10.1.
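
            (For illustration, a minimal sketch of the re-pointing described above, assuming consistent GTIDs across nodes; the host name and credentials are placeholders.)

                STOP SLAVE;
                CHANGE MASTER TO
                  MASTER_HOST='node2.example.com',
                  MASTER_USER='repl',
                  MASTER_PASSWORD='secret',
                  MASTER_USE_GTID=slave_pos;
                START SLAVE;
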
            andrew.garner Andrew Garner added a comment -

            Tim, I think you may be running into MDEV-10944 (a 10.1.18 regression), although there are a myriad of other ways to get MariaDB GTIDs out of sync in a Galera cluster, even without that regression.

            m00dawg Tim Soderstrom added a comment -

            Doh, you are right, that sounds exactly like our problem. Bug search fail on my part - thank you for providing that!

            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Assignee Lixun Peng [ plinux ] Sachin Setiya [ sachin.setiya.007 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.19 [ 109 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.19 [ 109 ] 10.2.4-1 [ 110 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-1 [ 110 ] 10.2.4-2 [ 113 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-2 [ 113 ] 10.2.4-3 [ 115 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-3 [ 115 ] 10.2.4-4 [ 117 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-4 [ 117 ] 10.2.4-4, 10.1.20 [ 117, 119 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-4, 10.1.20 [ 117, 119 ] 10.2.4-4, 10.2.4-1 [ 117, 121 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower

            michaeldg Michaël de groot added a comment -

            Hi!

            I also noticed that the initial GTID is not passed on with all SST methods. If I remember correctly, only rsync SST will sync it.

            I think this implementation should make fixing that in SST unnecessary; could you please confirm that?

            Thanks,
            Michaël
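
            (For illustration: whether the donor's GTID state survived an SST can be checked on the joiner once it is Synced; the variables below are standard MariaDB/Galera ones, and the comparison against the donor is left to the reader.)

                SELECT @@gtid_binlog_pos, @@gtid_slave_pos, @@gtid_current_pos;
                SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';   -- expect 'Synced'
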
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]

            michaeldg Michaël de groot added a comment -

            sachin.setiya.007 can you please tell me why this issue is stalled? This is a very important issue to fix. Right now it is very inconvenient to replicate one Galera cluster to another, and with circular replication between two Galera clusters it becomes a real pain.

            michaeldg Michaël de groot made changes -
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-4, 10.2.4-1 [ 117, 121 ] 10.2.4-4, 10.2.4-1, 10.3.1-2 [ 117, 121, 174 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            GeoffMontee Geoff Montee (Inactive) made changes -

            GeoffMontee Geoff Montee (Inactive) added a comment -

            I assume that "Fix Version/s: 10.2" is not accurate anymore. Since 10.2 is already GA, I assume this would go into MariaDB 10.3 at the earliest. Is that correct?

            valerii Valerii Kravchuk made changes -
            Support case ID 13568 14263 15259 13568 14263 14944 15259

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            I am occupied by Galera merges and Galera bugs, so I did not get the time to do this. I will start working on this again, hopefully next week.

            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Priority Major [ 3 ] Blocker [ 1 ]
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.2 [ 14601 ]
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Fix Version/s 10.1 [ 16100 ]
            Fix Version/s 10.2 [ 14601 ]
            serg Sergei Golubchik made changes -
            Priority Blocker [ 1 ] Critical [ 2 ]
            serg Sergei Golubchik made changes -
            Sprint 10.2.4-4, 10.2.4-1, 10.3.1-2 [ 117, 121, 174 ] 10.2.4-4, 10.2.4-1, 10.3.1-2, 10.2.10 [ 117, 121, 174, 183 ]
            sachin.setiya.007 Sachin Setiya (Inactive) added a comment - http://lists.askmonty.org/pipermail/commits/2017-October/011552.html
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ] Andrei Elkin [ elkin ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            michaeldg Michaël de groot added a comment - - edited

            Cool, very nice that this issue is finally getting done! Thank you sachin.setiya.007.

            In the tests, please consider circular asynchronous replication between 2 or more Galera clusters:

            Cluster 1: A <> B <> C
            Cluster 2: D <> E <> F

            All nodes have log_slave_updates enabled. Bidirectional asynchronous replication runs between node A and node D. Writes originate from, for example, node B.
            Node D goes down. With this change, we should now be able to change the streams easily:
            On node A: STOP SLAVE; CHANGE MASTER TO MASTER_HOST='e'; START SLAVE;
            On node E: CHANGE MASTER TO MASTER_HOST='a', MASTER_USER='repl', MASTER_PASSWORD='insecure', MASTER_USE_GTID=slave_pos; START SLAVE;

            Can you please make sure this scenario is tested?

            sachin.setiya.007 maybe the implementation done here is not enough for this use case. How does node E recognize transactions that originated from node D? Maybe we need to set up an ignore-domain-ID rule on the asynchronous replication stream?
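
            (For illustration, a sketch of the "ignore domain ID" idea raised above, assuming cluster 1 uses gtid_domain_id 1 and cluster 2 uses gtid_domain_id 2; the domain values are placeholders.)

                -- on node A, which replicates from cluster 2: filter out event groups
                -- from its own cluster's domain so they cannot loop back
                STOP SLAVE;
                CHANGE MASTER TO IGNORE_DOMAIN_IDS=(1), MASTER_USE_GTID=slave_pos;
                START SLAVE;
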

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Hi michaeldg,

            Writing an mtr test case for this situation is a bit difficult, but I will try to simulate this on VMs.

            Regards,
            Sachin

            Elkin Andrei Elkin added a comment -

            Sachin, hello.

            Please check out a review mail I sent out.

            Cheers,

            Andrei.

            Elkin Andrei Elkin made changes -
            Status In Review [ 10002 ] Stalled [ 10000 ]
            serg Sergei Golubchik made changes -
            Assignee Andrei Elkin [ elkin ] Sachin Setiya [ sachin.setiya.007 ]
            neiltembo Neil Skrypuch made changes -
            serg Sergei Golubchik made changes -
            Sprint 10.2.4-4, 10.2.4-1, 10.3.1-2, 10.2.10 [ 117, 121, 174, 183 ] 10.2.4-4, 10.2.4-1, 10.3.1-2, 10.2.10, 10.2.11 [ 117, 121, 174, 183, 203 ]
            anikitin Andrii Nikitin (Inactive) made changes -
            serg Sergei Golubchik made changes -
            Fix Version/s 10.3 [ 22126 ]

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Status update: a test case for a 2x3-node Galera cluster setup has been created, but this test fails because of
            an issue with rpl_slave_state::hash.
            More information on this bug: Problem

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment - Branch Buildbot
            sachin.setiya.007 Sachin Setiya (Inactive) added a comment - - edited

            Status update: all issues solved.

            So the problem was: suppose a setup like this

            A <-> B <-> C (Galera Cluster 1)

            (circular, non-Galera replication between A <-> D)

            D <-> E <-> F (Galera Cluster 2)

            The event groups arriving from B and C were applied twice on A (and similarly, the event groups from E and F were applied twice on D).
            The reason is that a Galera event group does not contain a GTID_LOG_EVENT, so when A receives an event group from B, its rpl_slave_state::hash (gtid_slave_pos) is not updated. So when A gets the same event group back from D (because of the circular replication), it applies the event again. If we set ignore_server_ids while setting up the circular replication, this problem can be solved:
            A will ignore the server IDs of B and C, and D will ignore the server IDs of E and F. replicate-same-server-id should be turned off.
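
            (For illustration, a sketch of the ignore_server_ids workaround above, assuming server_ids 1-3 for nodes A-C and 4-6 for nodes D-F; the IDs are placeholders.)

                -- on node A, replicating from node D: skip events that originated on B and C
                STOP SLAVE;
                CHANGE MASTER TO IGNORE_SERVER_IDS=(2,3);
                START SLAVE;

                -- on node D, replicating from node A: skip events that originated on E and F
                STOP SLAVE;
                CHANGE MASTER TO IGNORE_SERVER_IDS=(5,6);
                START SLAVE;
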

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            There is one more constraint: in the case of master-slave replication into a GTID cluster, or GTID cluster to GTID cluster (async or possibly circular replication), the cluster should have a different domain ID with respect to its master or slave.
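
            (For illustration, a sketch of the domain-ID constraint above; the domain values are placeholders and would normally be set in my.cnf rather than at runtime, assuming wsrep GTID mode is in use on the cluster.)

                -- on every node of the Galera cluster
                SET GLOBAL wsrep_gtid_domain_id = 10;
                SET GLOBAL gtid_domain_id = 11;

                -- on the external async master or slave
                SET GLOBAL gtid_domain_id = 20;
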
            serg Sergei Golubchik made changes -
            Sprint 10.2.4-4, 10.2.4-1, 10.3.1-2, 10.2.10, 10.2.11 [ 117, 121, 174, 183, 203 ] 10.2.4-4, 10.2.4-1, 10.3.1-2, 10.2.10, 10.2.11, 10.1.30 [ 117, 121, 174, 183, 203, 215 ]
            serg Sergei Golubchik made changes -
            Rank Ranked higher
            serg Sergei Golubchik made changes -
            Rank Ranked higher
            serg Sergei Golubchik made changes -
            Sprint 10.2.4-4, 10.2.4-1, 10.3.1-2, 10.2.10, 10.2.11, 10.1.30 [ 117, 121, 174, 183, 203, 215 ] 10.2.4-4, 10.2.4-1, 10.3.1-2, 10.2.10, 10.2.11, 10.1.30, 10.2.12 [ 117, 121, 174, 183, 203, 215, 216 ]
            sachin.setiya.007 Sachin Setiya (Inactive) added a comment - http://lists.askmonty.org/pipermail/commits/2017-December/011761.html
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Component/s Galera [ 10124 ]
            Component/s Replication [ 10100 ]
            Fix Version/s 10.1.31 [ 22907 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.2.12 [ 22810 ]
            mstoute Mark Stoute added a comment -

            Thank you for this fix.
            I upgraded my production cluster to 10.1.32 via rolling-restart, and found GTIDs out of sync, and wsrep_provider_version is still behind (25.3.18(r3632)). Prod cluster was initially bootstrapped as v 10.1.18.

            In a dev cluster where I bootstrapped the cluster from 10.1.32, GTIDs are in sync and wsrep_provider_version is higher (25.3.23(r3789)).

            Is it true that in order to have my production cluster have GTIDs in sync, I will need to bootstrap with 10.1.32?

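
            (For illustration: a quick way to compare the two things Mark mentions on each node; the values shown are just the ones from his report.)

                SHOW GLOBAL STATUS LIKE 'wsrep_provider_version';   -- e.g. 25.3.18(r3632) vs 25.3.23(r3789)
                SELECT @@gtid_binlog_pos, @@gtid_current_pos;       -- compare across nodes
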

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            There has been some confusion: the GTID is transferred between nodes only if the cluster is an async slave. If we want to transfer the GTID inside the write set, that will be a bigger change, and will involve changing the Galera code
            so that the Galera GTID uses the same GTID format as MariaDB and that GTID is used at commit instead of generating a new one.

            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Resolution Fixed [ 1 ]
            Status Closed [ 6 ] Stalled [ 10000 ]
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.2.12 [ 22810 ]
            Fix Version/s 10.1.31 [ 22907 ]
            julien.fritsch Julien Fritsch made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ] Seppo Jaakola [ seppo ]
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Seppo Jaakola [ seppo ] Teemu Ollakka [ teemu.ollakka ]
            jplindst Jan Lindström (Inactive) made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            julien.fritsch Julien Fritsch made changes -
            Epic Link PT-78 [ 68559 ]
            arjen Arjen Lentz added a comment -

            sachin.setiya.007 it would be ok if Galera just passed the MariaDB GTID around as-is (as an extra arbitrary field as part of a commit), so it will be stored in each binlog. That would not require Galera to start using MariaDB GTIDs. Just see them as separate: Galera GTID and MariaDB GTID.
            The issue is that right now, what's happening with say MDEV-14153 is just horrendous.


            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Hi arjen

            Actually that won't work, because the MariaDB server would somehow have to generate GTIDs in sync. Let's say we have a 3-node cluster with each node at GTID 1-1-1, and then we do simultaneous writes on node 1 and node 2: both will generate GTID 1-1-2, and this will be a wrong sequence. So we need Galera to manage the GTID, since it is the transaction coordinator, not MariaDB.

            arjen Arjen Lentz added a comment -

            I'm sorry sachin.setiya.007 but that's just not correct. Remember that GTID also works in an async replication and master-master configuration.
            The format is S-D-# where S is the server-id (which should be unique in the cluster or replication environment), D is the replication domain (see the MariaDB docs; it tends to be 0 by default unless the application sets it to something else), and # is the number going up within that.
            So for your example, you'd actually see something like 1-0-1 and 2-0-1 on the two different servers, which is a perfectly correct flow of things, and the next transactions written on the servers after that will be something like 1-0-2 and 2-0-2.
            Hope this clarifies.

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            When we have different domain IDs, the user ensures that the binlog events don't conflict with each other, but this is not the case with Galera; Galera can handle conflicts. So I think within one cluster we should have one domain ID, and this is what Galera internally does: it has one UUID for one cluster.

            danblack Daniel Black added a comment -

            Format is D-S-# and I'm fairly sure arjen is talking about different server IDs on each galera node (despite a little dyslexia).


            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            danblack, right, the format is D-S-X. Actually my first comment is slightly wrong: each node will have GTID 1(constant)-X(node server_id)-Y(seq no); the server_id will be different on each node, but the seq no is still relative to the domain ID.

            arjen Arjen Lentz added a comment -

            yes thanks Dan - I had it right in a blogpost the other day.

            sachin.setiya.007 The seq# component on its own is not unique; it's the GTID as a whole that needs to be unique.
            The UUID you're referring to is the Galera cluster identifier, which is indeed a single unique ID across the entire cluster - it never changes; this is how a node can see whether it belongs in a cluster or not. If you bootstrap a new cluster, a new UUID is generated.

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment - - edited

            arjen, I never said that the sequence number is unique on its own; it is unique with respect to the domain ID. For example, 1-1-1 and 1-2-1 are conflicting GTIDs, whereas 1-1-1 and 2-1-1 are perfectly okay GTIDs: https://mariadb.com/kb/en/library/gtid/#the-domain-id

            danblack Daniel Black added a comment -

            GTIDs need to pass through the cluster. Consider this requirement:

            • A DB connection occurs through a DB load balancer; at the end of the transaction that updates a user's profile, the GTID is selected by the application.
            • The GTID is placed in the web session information for that user.
            • The user's next web request fetches a new page, going through the load balancer to a different cluster member (or even an async slave, for that matter).
            • Because Galera transactions or async slaves aren't applied immediately, a query of the user's profile may retrieve an out-of-date version. To prevent this, the DB application should be able to

              SELECT master_gtid_wait(@gtid, 0.1)

              to ensure it has the latest data that the user previously updated (it can deal with the timeout).

            I'm sure I'm not the only one of the 19 voters and 31 watchers wanting this.

            There should be no need for the application to consider that a *G*TID is anything but a global identifier.

            Galera needs to ensure sequential visibility when applying each D-S pair (i.e. 0-1-33 isn't visible when 0-1-22 isn't), so of course the GTID needs to be transferred in the write set.
            Galera should handle that 0-1-33 and 0-2-33 are unique transactions from different servers, no matter what replication path was taken to deliver them.

            Each server has its own server-id and can be responsible for GTID generation without coordination. If certification fails, the server skips a GTID value. The Galera GTID has a different purpose, so it needs to stay independent.

            If that server is part of the cluster, the Galera mechanism can ensure it is applied without conflict; however, this is independent of what the GTID actually is.
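
            (For illustration, a sketch of the read-your-writes flow described above; the table, column, and GTID values are made up, while @@last_gtid and MASTER_GTID_WAIT() are standard MariaDB features.)

                -- on the node that handled the write, in the application's own session
                UPDATE app.profile SET bio = 'new text' WHERE user_id = 42;
                SELECT @@last_gtid;                          -- store in the web session, e.g. '0-1-1234'

                -- on whichever node serves the next request
                SELECT MASTER_GTID_WAIT('0-1-1234', 0.1);    -- 0 once the position is reached, -1 on timeout
                SELECT bio FROM app.profile WHERE user_id = 42;
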
            GeoffMontee Geoff Montee (Inactive) made changes -
            valerii Valerii Kravchuk added a comment - - edited

            We should also consider the case of an ALTER running node by node in RSU mode. We should end up with consistent GTIDs in the cluster after this, or invent some workaround (do not generate local GTIDs while in RSU mode, or require doing everything with sql_log_bin=0?).

            knielsen Kristian Nielsen added a comment -

            RSU = rolling schema upgrade, perhaps?

            If you want the ALTERs to replicate to async slaves not part of the cluster, the GTID way is to binlog the ALTER in a separate domain id (SET SESSION gtid_domain_id=xxx). This will make them independent of the normal binlog stream. Grab the @@last_gtid from the first node, and use it to set server_id / gtid_seq_no on the other nodes to get the same GTID on all nodes for the ALTER.

            If you do not want the ALTERs to replicate to async slaves, SET SESSION sql_log_bin=0 is the way.
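
            (For illustration, a worked sketch of the workaround above, with placeholder values: domain 99 is reserved for the DDL, and the server_id / gtid_seq_no observed on the first node are reused on the others so every node binlogs the same GTID for the ALTER.)

                -- node 1, with the node in RSU mode
                SET SESSION gtid_domain_id = 99;
                ALTER TABLE db1.t1 ADD COLUMN c2 INT;
                SELECT @@last_gtid;              -- suppose it returns '99-1-7'

                -- nodes 2 and 3, also in RSU mode
                SET SESSION gtid_domain_id = 99;
                SET SESSION server_id = 1;
                SET SESSION gtid_seq_no = 7;
                ALTER TABLE db1.t1 ADD COLUMN c2 INT;
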
            ralf.gebhardt Ralf Gebhardt made changes -
            Rank Ranked higher
            GeoffMontee Geoff Montee (Inactive) made changes -
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.4 [ 22408 ]
            NRE Projects RM_104_galera RM_104_galera RM_removed_104
            ralf.gebhardt Ralf Gebhardt made changes -
            Epic Link PT-78 [ 68559 ]
            jira-update-service Jira Update Service made changes -
            Support case ID 26488 not-13568 not-14263 not-14944 not-15259 not-16077 not-16827 not-21020 not-13568 not-14263 not-14944 not-15259 not-16077 not-16827 not-21020 not-26488
            ralf.gebhardt Ralf Gebhardt made changes -
            NRE Projects RM_105_CANDIDATE RM_removed_104 RM_105_CANDIDATE RM_removed_104 RM_105_GALERA
            danblack Daniel Black made changes -

            SylvainArbaudie Sylvain ARBAUDIE added a comment -

            Would there be any issue with using the Galera seqno as the last part of the MariaDB GTID? Apart from RSU DDL, I mean?

            teemu.ollakka Teemu Ollakka added a comment -

            There are at least two major issues which need to be resolved in order to use Galera seqno as part of the MariaDB GTID:

            1. Occasionally Galera seqno is generated for a write set which do not commit a transaction, these include (but not limited to) write sets that fail certification and intermediate streaming replication fragments. In order to keep GTID sequences continuous, all of these events should be logged in binlog as dummy events, which could cause excessive clutter under certain workloads.
            2. Master-slave topology where Galera cluster acts as a slave: It is required that the original GTID from the master should be preserved in binlog events. However, as Galera will generate a write set/seqno for the applied transaction, there will be two GTIDs which should be persisted in binlog for each transaction. It is not clear how this could be handled to preserve compatibility with async master/slave replication.
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Teemu Ollakka [ teemu.ollakka ] Jan Lindström [ jplindst ]
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Jan Lindström [ jplindst ] Andrei Elkin [ elkin ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            julien.fritsch Julien Fritsch made changes -
            Assignee Andrei Elkin [ elkin ] Sachin Setiya [ sachin.setiya.007 ]
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Status In Review [ 10002 ] Stalled [ 10000 ]
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ] Mario Karuza [ mkaruza ]
            jplindst Jan Lindström (Inactive) made changes -
            Component/s Galera [ 14918 ]
            Component/s Replication [ 14976 ]
            Component/s Galera [ 10124 ]
            Component/s Replication [ 10100 ]
            Fix Version/s 10.5 [ 23608 ]
            Fix Version/s 10.5 [ 23123 ]
            Key MDEV-10715 MENT-400
            Project MariaDB Server [ 10000 ] MariaDB Enterprise [ 11500 ]
            GeoffMontee Geoff Montee (Inactive) made changes -
            jplindst Jan Lindström (Inactive) made changes -
            Component/s Galera [ 10124 ]
            Component/s Replication [ 10100 ]
            Component/s Galera [ 14918 ]
            Component/s Replication [ 14976 ]
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.5 [ 23608 ]
            Key MENT-400 MDEV-20720
            Project MariaDB Enterprise [ 11500 ] MariaDB Server [ 10000 ]
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Mario Karuza [ mkaruza ] Jan Lindström [ jplindst ]

            GeoffMontee Geoff Montee (Inactive) added a comment -

            A feature like MDEV-20715 could also improve Galera's support for MariaDB GTIDs. Specifically, it could prevent each node from generating GTIDs for local transactions, which could make it easier for replication slaves to use any cluster node as master without risking inconsistent GTIDs.

            GeoffMontee Geoff Montee (Inactive) made changes -
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Jan Lindström [ jplindst ] Mario Karuza [ mkaruza ]
            danblack Daniel Black made changes -
            mkaruza Mario Karuza (Inactive) made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            michaeldg Michaël de groot made changes -
            mkaruza Mario Karuza (Inactive) made changes -
            Assignee Mario Karuza [ mkaruza ] Sachin Setiya [ sachin.setiya.007 ]
            Status In Progress [ 3 ] In Review [ 10002 ]

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Okay to push

            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Status In Review [ 10002 ] Stalled [ 10000 ]
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ] Jan Lindström [ jplindst ]
            jplindst Jan Lindström (Inactive) made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            jplindst Jan Lindström (Inactive) made changes -
            issue.field.resolutiondate 2020-01-29 13:57:10.0 2020-01-29 13:57:10.942
            jplindst Jan Lindström (Inactive) made changes -
            Fix Version/s 10.5.1 [ 24029 ]
            Fix Version/s 10.5 [ 23123 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            greenman Ian Gilfillan added a comment -

            The pull request linked with this issue is still marked as open, although the task has been closed.

            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 76873 ] MariaDB v4 [ 132934 ]
            Richard Richard Stracke made changes -
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 165006 143802 152572 156786 167024 125122 187291 163119

            People

              Assignee: jplindst Jan Lindström (Inactive)
              Reporter: nirbhay_c Nirbhay Choubey (Inactive)
              Votes: 28
              Watchers: 41

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration
