MDEV-26632

multi-source replication filters breaking GTID semantics

Details

    Description

      How to reproduce:
      Take a leader-replica cluster on domain 0. On the leader of this cluster, add an extra replication source: a domain 1 server, with a table or database filter on that connection. Insert one record on domain 0 and one record on domain 1 into a table covered by the filter.
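
      A minimal sketch of this reproduction, with illustrative connection, host and table names (the filter is shown in config-file form):

        -- In the domain 0 leader's config, filter the extra connection:
        --   dom1.replicate-ignore-table = app.t_filtered
        -- Then, on the domain 0 leader, add the extra source:
        CHANGE MASTER 'dom1' TO
          MASTER_HOST = 'domain1-server', MASTER_USER = 'repl',
          MASTER_PASSWORD = '...', MASTER_USE_GTID = slave_pos;
        START SLAVE 'dom1';

        -- One write on the domain 0 leader:
        INSERT INTO app.t_local VALUES (1);
        -- One write on the domain 1 server, into the filtered table:
        INSERT INTO app.t_filtered VALUES (1);

        -- The leader's gtid_slave_pos now contains the filtered domain 1 GTID,
        -- but its replicas never see that GTID, since it was not binlogged:
        SELECT @@gtid_slave_pos, @@gtid_binlog_pos;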

      Issue:
      Switchover is not possible on domain 0, because gtid_slave_pos on the old leader is advanced by the filtered event, while on its replica it is not, since the filtered event is never written to the binlog.

      Fixing:
      One possible fix would be to not advance gtid_slave_pos and gtid_current_pos when an event is filtered. This would affect reconnects, which would have to re-fetch from a position prior to any non-filtered replication events. It could slow down reconnects when a majority of events are filtered, but this can be mitigated by keeping the filtered events in the relay log and checking, before connecting, whether the last event there was filtered; if it was not, it is safe to set slave_pos to the highest GTID in the relay log.

      Another fix would be to write the event to the binlog with an extra "filtered" flag and stream it to the replicas as well, so that the position exists, and a parameter could be added at any layer of the replication tree to restore those events:
      replicate-ignore-filters = boolean; ON (default): do not apply filtered events, OFF: apply filtered events

          Activity

            stephane@skysql.com VAROQUI Stephane created issue -
            Elkin Andrei Elkin added a comment - edited

            > One possible fix would be to not advance gtid_slave_pos and gtid_current_pos when an event is filtered

            I'd agree with such a concept. The filtered-out info should not affect the slave state. We need a clearer policy for that. Thank you for pointing out this issue!


            rdem Richard DEMONGEOT added a comment -

            Hello,

            Just tested with GTID Domain filtering, and it works fine.

            Filtered events are not written to the binlogs, nor to the parent's relay logs.

            Best regards,

            Elkin Andrei Elkin added a comment -

            rdem, thanks for trying. Yet the case is about no longer accounting for the filtered-out/ignored GTIDs in gtid_slave_pos.

            knielsen Kristian Nielsen made changes -
            Attachment rpl_mdev26632.cnf [ 71753 ]
            Attachment rpl_mdev26632.test [ 71754 ]

            knielsen Kristian Nielsen added a comment -

            I wrote a test case to reproduce what I think is the scenario described here.
            This setup is 1 -> 2 -> 3.
            Server 1 binlogs GTID 1-1-6, 1-1-7, 1-1-8.
            Server 2 filters out (--replicate-ignore-table) 1-1-7, so it binlogs 1-1-6, 1-1-8 (with a hole for the filtered event).
            Server 3 similarly has only 1-1-6, 1-1-8.

            When server 2 has gtid_slave_pos=1-1-7, we stop it and CHANGE MASTER to be a slave of server 3.
            If server 3 has not yet replicated 1-1-8, this fails with "GTID 1-1-7 is not in the master's binlog".
            Even if server 3 has replicated 1-1-8, it can still fail with "even though both a prior and a subsequent sequence number does exist" if gtid_strict_mode is set to 1.

            But as the test case demonstrates, this scenario can work by configuring --gtid-strict-mode=0 and --gtid-ignore-duplicates=1.

            Let me explain why.

            The server tries hard to protect the user from common mistakes with GTID in simple topologies. If a slave were allowed to connect with a GTID that never existed on the master, the slave could skip events indefinitely while searching for the missing GTID, which would probably be unexpected. That is why the server errors on the GTID 1-1-7 by default.

            In this case though, the user has set up a complex topology, where filtering is used so that server data and binlogs are not identical between servers. In particular, the user wants to allow the slave to connect at a "hole" in the master's binlog, e.g. at position 1-1-7, which is the position between the adjacent events 1-1-6 and 1-1-8 in the master's binlog.

            Since the binlogs are not identical across the topology, this is not a "strict" GTID setup, so we need to configure --gtid-strict-mode=0. This will allow the slave to use 1-1-7 to denote the "hole" between 1-1-6 and 1-1-8.

            We also need to configure --gtid-ignore-duplicates=1 to allow the slave to connect at a GTID position that is not yet available in the binlog. Even though we do not have duplicate GTIDs in play here, the point of --gtid-ignore-duplicates is to allow GTIDs to arrive in non-strict ways; in this case, the GTID 1-1-7 can be seen only as a "hole" when 1-1-8 is received later, and setting --gtid-ignore-duplicates=1 requests the server to allow this situation without giving an error.

            So in summary, the server behaviour is correct and supports this application by configuring --gtid-strict-mode=0 and --gtid-ignore-duplicates=1.
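
            A sketch of that repointing step on server 2, with host name and credentials as placeholders (both settings would normally also go in the config file so they survive a restart):

              -- On server 2, whose gtid_slave_pos still contains the filtered 1-1-7:
              STOP SLAVE;
              SET GLOBAL gtid_strict_mode = 0;        -- tolerate the "hole" at 1-1-7
              SET GLOBAL gtid_ignore_duplicates = 1;  -- allow connecting at a GTID the master does not (yet) have
              CHANGE MASTER TO
                MASTER_HOST = 'server3', MASTER_USER = 'repl',
                MASTER_PASSWORD = '...', MASTER_USE_GTID = slave_pos;
              START SLAVE;
              -- Replication resumes at the first GTID in the domain with a sequence
              -- number above 7, i.e. at 1-1-8 once server 3 has it.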

            I think it's correct to update the GTID for filtered events. If not, the gtid_slave_pos could become so far behind that binlog purge on the master makes the slave unable to connect, even though the slave is fully caught up.

            I also don't think the ignored events should be binlogged and replicated down the topology. It is an explicit design point of MariaDB GTID that it can tolerate holes in the GTID sequence, so events can be properly filtered without polluting the entire replication topology.

            Does it make sense, and solve the case at hand?


            stephane@skysql.com VAROQUI Stephane added a comment -

            Your test case does not cover the scenario we are after.

            Source A, server 1 -> source B, server 2 + filter -> source B, server 3 + filter

            We stop server 2's replication on a filtered event and elect server 3 to replace server 2.

            The question is: can a START SLAVE on a named source connect and succeed by just taking the last event that matches its domain vector in the leader's GTID state? ignore-duplicates scares us in such a scenario, as it could lead to some writes not being applied on the old leader, but if gtid-strict-mode=0 enables this named connection to connect, it could work?
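
            A hedged sketch of the named-connection switchover being asked about (connection name 'B', host name and credentials are illustrative; whether gtid_ignore_duplicates is really needed here is exactly what is addressed in Kristian Nielsen's reply below):

              STOP SLAVE 'B';
              SET GLOBAL gtid_strict_mode = 0;
              SET GLOBAL gtid_ignore_duplicates = 1;
              CHANGE MASTER 'B' TO
                MASTER_HOST = 'new-leader', MASTER_USER = 'repl',
                MASTER_PASSWORD = '...', MASTER_USE_GTID = slave_pos;
              START SLAVE 'B';
              SHOW SLAVE 'B' STATUS\G   -- check Slave_IO_Running, Slave_SQL_Running and Gtid_IO_Pos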


            rdem Richard DEMONGEOT added a comment -

            Hello Kristian Nielsen,

            Thanks for your time.

            After verifying, my setup is as follows.
            Across the whole setup, I only use NAMED replication connections.

            Primary cluster: srv1 and srv2 (gtid domain 11).
            There are many writes, some of which are useless on the downstream clusters.

            Replica connections to this cluster are named: C1

            Second cluster: srv3 and srv4.
            Writes made on this cluster use gtid domain 3.
            It uses only a subset of cluster 1, and adds many tables around it.
            Replica connections to srv3 or srv4 (depending on which is the primary) are named: C2

            On cluster 2, I have:
            gtid_strict_mode=OFF
            gtid_ignore_duplicates=OFF
            C1.replicate-rewrite-db="DB1->DB2"
            C1.replicate-do-table=DB2.TBL1
            C1.replicate-do-table=DB2.TBL2

            There are no filters on the C2 named connection.

            For now, the setup is:

            srv2 <--(C1)-- srv1 --(C1+filters)--> srv3 --(C2)--> srv4

            I plan to change gtid_ignore_duplicates to ON and test failovers between srv3 and srv4 again.
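
            A small sketch of that change, assuming it is applied on both srv3 and srv4 (and also persisted in their config files so it survives restarts):

              STOP ALL SLAVES;
              SET GLOBAL gtid_ignore_duplicates = ON;
              START ALL SLAVES;
              SELECT @@gtid_ignore_duplicates, @@gtid_strict_mode;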


            knielsen Kristian Nielsen added a comment -

            stephane@skysql.com, my test is intended to cover exactly the scenario you describe.

            rdem, this is the setup I try to cover; I just omitted srv2 as it is not involved in the failover. Using named slave->master connections should not matter, I think. The test uses domain_id=1 for srv1 and domain_id=0 for srv3/srv4.

            My understanding is the issue is with CHANGE MASTER on srv3 to replicate from srv4?

            In my test, GTID 1-1-7 is filtered. So srv4 has 1-1-6 and 1-1-8 in its binlog; it is missing GTID 1-1-7. srv3 has gtid_slave_pos="1-1-7", and it gets this error:

            'Error: connecting slave requested to start from GTID 1-1-7, which is not in the master's binlog'
            

            If this is not the error you are describing, let me know which error it is.
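
            To confirm which error is in play, one way is to compare the requested position with what the new master actually has (server names as in the test above):

              -- On srv3 (the connecting slave): the position it will ask for.
              SELECT @@gtid_slave_pos;      -- e.g. 1-1-7, the filtered GTID
              -- On srv4 (the new master): what its binlog actually contains.
              SELECT @@gtid_binlog_state;   -- e.g. 1-1-8; 1-1-7 is a hole
              SHOW BINLOG EVENTS;           -- Gtid events show 1-1-6 and then 1-1-8, with no 1-1-7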

            In sql/sql_repl.cc, there is code to disable exactly this error:

              if (info->slave_gtid_ignore_duplicates &&
                  domain_gtid.seq_no < slave_gtid->seq_no) {
                continue;   /* allow connecting at a GTID not (yet) in the binlog */
              }
            

            That is why setting --gtid-ignore-duplicates=1 is needed. With this setting, your scenario is valid and should work. The errors are only to help users with incorrect domain_id configuration.

            When domain_id is configured correctly, --gtid-ignore-duplicates=1 should not be scary and will not lead to events being lost. It only ignores events that have the same domain_id but a smaller seq_no than the previous event.

            To explain the wrong configuration that the errors are there to prevent, imagine a user with your setup who did not configure different domain ids (maybe an upgrade from 5.5 to 10.0). The events from srv1 and srv3 will then duplicate each other's seq_no, e.g.:

            0-1-10, 0-1-11, 0-3-9, 0-3-12, 0-1-12, 0-3-13, ...

            Now imagine that srv3 and srv4 filter out event 0-1-12. There is no way on srv3 and srv4 to know whether 0-1-12 should come before or after 0-3-12. Therefore the server code plays it safe and throws an error.

            Setting --gtid-ignore-duplicates=1 means the user did configure domains correctly, and sequence numbers will always be strictly increasing within each domain_id. Then this problem cannot occur, and the error can be safely silenced.

            I'm not sure this is documented anywhere outside the server source code, so your question/concerns are very valid.

            Also, I'm not sure whether --gtid-ignore-duplicates will allow connecting at a missing GTID if there is a switchover from srv1 to srv2 at the same time (e.g. if 1-1-7 is followed by 1-4-8 in my test, not by 1-1-8). If not, that may be a bug that should be fixed.


            rdem Richard DEMONGEOT added a comment -

            Hello,

            Yes, the scenario is the same.
            From memory, it is exactly the same error. I need to plan a test on the pre-production platform (next week) to verify that gtid-ignore-duplicates=ON solves the problem.

            I'll update you soon.

            Regards,

            knielsen Kristian Nielsen made changes -
            Assignee Andrei Elkin [ elkin ] Kristian Nielsen [ knielsen ]

            knielsen Kristian Nielsen added a comment -

            I made a patch that allows the described scenario with --gtid-strict-mode enabled (as long as --gtid-ignore-duplicates is also enabled):

            https://github.com/MariaDB/server/commits/knielsen_mdev26632

            Note that the scenario already works as described without the patch when --gtid-strict-mode=0. The patch just allows running with --gtid-strict-mode=1 to help avoid an incorrect GTID sequence.

            I also wrote some additional documentation of --gtid-ignore-duplicates for the KB, given below for reference.

            - Kristian.

            -----------------------------------------------------------------------
            New subsection for https://mariadb.com/kb/en/gtid/#use-with-multi-source-replication-and-other-multi-primary-setups
            To be added under "Use With Multi-Source Replication and Other Multi-Primary Setups",
            just before "Deleting Unused Domains"
             
            Multiple redundant replication paths
             
            Using GTID with multi-source replication, it is possible to set up multiple
            redundant replication paths. For example:
             
              M1 <-> M2
              M1 -> S1
              M1 -> S2
              M2 -> S1
              M2 -> S2
             
            Here, M1 and M2 are set up in a master-master ring. S1 and S2 both replicate
            from each of M1 and M2. Each event generated on M1 will now arrive twice at
            S1, through the paths M1->S1 and M1->M2->S1. This way, if the network
            connection between M1 and S1 is broken, the replication can continue
            uninterrupted through the alternate path through M2. Note that this is an
            advanced setup, and good familiarity with MariaDB replication is recommended
            to successfully operate such a setup.
             
            The option --gtid-ignore-duplicates must be enabled to use multiple
            redundant replication paths. This is necessary to avoid each event being
            applied twice on the slave as it arrives through each path. The GTID of
            every event will be compared against the sequence number of the current GTID
            slave position (within each domain), and will be skipped if less than or
            equal. Thus it is required that sequence numbers are strictly increasing
            within each domain for --gtid-ignore-duplicates to function correctly, and
            setting --gtid-strict-mode=1 to help enforce this is recommended.
             
            The --gtid-ignore-duplicates option also relaxes the requirements for
            connecting to the master. In the above example, when S1 connects to M2, it
            may connect at a GTID position from M1 that has not yet been applied on M2.
            When --gtid-ignore-duplicates is enabled, the connection will be allowed,
            and S1 will start receiving events from M2 once the GTID has been replicated
            from M1 to M2. This also makes it possible to use replication filters in
            parts of a replication topology, allowing a slave to connect at a GTID
            position that was filtered on a master. When --gtid-ignore-duplicates is
            connecting slave will start receiving events from the master at the first
            GTID sequence number that is larger than the connect-position.
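
            For illustration, S1 in the example above could be set up with two named
            connections roughly like this (connection names, host names and
            credentials are placeholders):

              SET GLOBAL gtid_ignore_duplicates = 1;

              CHANGE MASTER 'from_m1' TO
                MASTER_HOST = 'm1', MASTER_USER = 'repl',
                MASTER_PASSWORD = '...', MASTER_USE_GTID = slave_pos;
              CHANGE MASTER 'from_m2' TO
                MASTER_HOST = 'm2', MASTER_USER = 'repl',
                MASTER_PASSWORD = '...', MASTER_USE_GTID = slave_pos;

              START ALL SLAVES;
              -- Each event from M1 arrives over both connections; with
              -- gtid_ignore_duplicates=1 it is applied only once.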
            -----------------------------------------------------------------------
            And extend the documentation of --gtid-ignore-duplicates with a link to
            the above section and some extra text, on
            https://mariadb.com/kb/en/gtid/#gtid_ignore_duplicates :
             
            When --gtid-ignore-duplicates is set, a slave is allowed to connect at a GTID
            position that does not exist on the master. The slave will start receiving
            events once a GTID with a higher sequence number is available on the master
            (within that domain). This can be used to allow a slave to connect at a GTID
            position that was filtered on the master, e.g. using --replicate-ignore-table.
            -----------------------------------------------------------------------
            


            knielsen Kristian Nielsen added a comment -

            I have pushed a testcase to 10.4 that demonstrates this kind of setup and verifies that it is working (with --gtid-strict-mode=0 and --gtid-ignore-duplicates=1).

            There appears to be no consensus to change --gtid-strict-mode=1 to allow holes in the binlog stream (due to filtering) even with --gtid-ignore-duplicates=1, so this change is left out for now. So there are no functional changes; this already works in the server with suitable configuration, and a testcase was pushed to make sure it keeps working.

            knielsen Kristian Nielsen made changes -
            Fix Version/s 10.4.33 [ 29516 ]
            Fix Version/s 10.5.24 [ 29517 ]
            Fix Version/s 10.6.17 [ 29518 ]
            Fix Version/s 10.11.7 [ 29519 ]
            Fix Version/s 11.0.5 [ 29520 ]
            Fix Version/s 11.1.4 [ 29024 ]
            Fix Version/s 11.2.3 [ 29521 ]
            Fix Version/s 11.3.2 [ 29522 ]
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.6 [ 24028 ]
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Closed [ 6 ]

            People

              Assignee: knielsen Kristian Nielsen
              Reporter: stephane@skysql.com VAROQUI Stephane