Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27247

Add keywords "SQL_BEFORE_GTIDS" and "SQL_AFTER_GTIDS" for START SLAVE UNTIL

Details

    Description

      In MariaDB the UNTIL keyword is inclusive.

      "The replica will start replication from the current GTID position, run up to and including the event with the GTID specified, and then stop. "
      https://mariadb.com/kb/en/gtid/#start-slave-until-master_gtid_posxxx

      Keyword UNTIL origin is form imperative languages, where UNTIL is the end condition of a loop.

      Depending from the position of the counter (before or after reply), the UNTIL condition will be processed or not.

      SQL is a declarative language, so the counter position in the loop is hidden, so maybe UNTIL is different implemented in different databases (if used in an SQL-statement, not procedures). Procedures are also based on imperative paradigm.

      So I suggest to add keywords EXCLUSIVE and INCLUSIVE.

      The default would be INCLUSIVE, which is the current MariaDB behaviour.

      Example:

      START SLAVE UNTIL EXCLUSIVE master_gtid_pos = <GTID position>;
      

      Attachments

        Issue Links

          Activity

            Richard Richard Stracke created issue -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            Field Original Value New Value
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            julien.fritsch Julien Fritsch made changes -
            Assignee Ralf Gebhardt [ ralf.gebhardt@mariadb.com ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Assignee Ralf Gebhardt [ ralf.gebhardt@mariadb.com ] Richard Stracke [ richard ]
            Status Open [ 1 ] Needs Feedback [ 10501 ]
            Richard Richard Stracke made changes -
            Status Needs Feedback [ 10501 ] Open [ 1 ]
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            AirFocus AirFocus made changes -
            Description In MariaDB the until keyword is inclusive.

            "The replica will start replication from the current GTID position, run up to and including the event with the GTID specified, and then stop. "

            https://mariadb.com/kb/en/gtid/#start-slave-until-master_gtid_posxxx

            Keyword "UNTIL" origin is form imperative languages,
            where UNTIL is the end condition of a loop.

            Depending from the position of the counter (before or after reply), the UNTIL condition wlll be processed or not.

            SQL is a declarative language, so the counter position in the loop is hidden,
            so maybe "UNTIL" is different implemented in different Databases. (if used in a SQL-statement, not procedures. Procedures also based on imperative paradigma.

            So I suggest to add keywords "exclusive" and "inclusive".

            Default is "inclusive" , which is the current MariaDB behaviour.
            example:

            {code:java}

            START SLAVE UNTIL EXLUSIVE master_gtid_pos = <GTID position>
            {code}

            In MariaDB the until keyword is inclusive.

            "The replica will start replication from the current GTID position, run up to and including the event with the GTID specified, and then stop. "

            https://mariadb.com/kb/en/gtid/#start-slave-until\-master_gtid_posxxx

            Keyword "UNTIL" origin is form imperative languages,
            where UNTIL is the end condition of a loop.

            Depending from the position of the counter (before or after reply), the UNTIL condition wlll be processed or not.

            SQL is a declarative language, so the counter position in the loop is hidden,
            so maybe "UNTIL" is different implemented in different Databases. (if used in a SQL\-statement, not procedures. Procedures also based on imperative paradigma.

            So I suggest to add keywords "exclusive" and "inclusive".

            Default is "inclusive" , which is the current MariaDB behaviour.
            example:

            {code:java}

            START SLAVE UNTIL EXLUSIVE master_gtid_pos = <GTID position>
            {code}
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            julien.fritsch Julien Fritsch made changes -
            Assignee Richard Stracke [ richard ] Ralf Gebhardt [ ralf.gebhardt@mariadb.com ]
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            Elkin Andrei Elkin added a comment - - edited

            Thanks serg for the tip. Indeed if the user wants the exclusive handling of, say a gtid 0-1-5, START SLAVE UNTIL master_gtid_pos = "0-1-4" does the trick.
            Notice gtid = 0-1-4 does not have to exist.

            Elkin Andrei Elkin added a comment - - edited Thanks serg for the tip. Indeed if the user wants the exclusive handling of, say a gtid 0-1-5 , START SLAVE UNTIL master_gtid_pos = "0-1-4" does the trick. Notice gtid = 0-1-4 does not have to exist.
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -

            yes, for mysql compatibility we can use the same syntax

            serg Sergei Golubchik added a comment - yes, for mysql compatibility we can use the same syntax
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            ralf.gebhardt Ralf Gebhardt made changes -
            Assignee Ralf Gebhardt [ ralf.gebhardt@mariadb.com ] Andrei Elkin [ elkin ]
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            Elkin Andrei Elkin made changes -
            Assignee Andrei Elkin [ elkin ] Brandon Nesterenko [ JIRAUSER48702 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 11.2 [ 28603 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            Elkin Andrei Elkin added a comment -

            ralf.gebhardt, I believe yes.

            Elkin Andrei Elkin added a comment - ralf.gebhardt , I believe yes.
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 11.3 [ 28565 ]
            Fix Version/s 11.2 [ 28603 ]
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            bnestere Brandon Nesterenko made changes -
            Status Open [ 1 ] In Progress [ 3 ]

            GTID position in MariaDB GTID is always inclusive.
            We should not introduce some "exclusive" semantics just because of some reference to how the word "UNTIL" might be used in other contexts. What is the usecase of this, and what is the justification for the added code complexity this will introduce?

            START SLAVE UNTIL master_gtid_pos="0-1-4" is not a way to run until "exclusive" 0-1-5 in the general case. The event before 0-1-5 can be 0-2-4; or almost anything else in case of out-of-order GTID.

            From a quick look at the MySQL documentation, MySQL does not use the START SLAVE INCLUSIVE/EXCLUSIVE syntax.
            They have a different syntax, SQL_BEFORE_GTIDS | SQL_AFTER_GTIDS. But this means something different, since MySQL GTID does not have the concept of domain_id and ordered binlog sequence within each domain.

            knielsen Kristian Nielsen added a comment - GTID position in MariaDB GTID is always inclusive. We should not introduce some "exclusive" semantics just because of some reference to how the word "UNTIL" might be used in other contexts. What is the usecase of this, and what is the justification for the added code complexity this will introduce? START SLAVE UNTIL master_gtid_pos="0-1-4" is not a way to run until "exclusive" 0-1-5 in the general case. The event before 0-1-5 can be 0-2-4; or almost anything else in case of out-of-order GTID. From a quick look at the MySQL documentation, MySQL does not use the START SLAVE INCLUSIVE/EXCLUSIVE syntax. They have a different syntax, SQL_BEFORE_GTIDS | SQL_AFTER_GTIDS. But this means something different, since MySQL GTID does not have the concept of domain_id and ordered binlog sequence within each domain.

            To elaborate, consider the following GTID sequence in the master binlog:

              2-2-9
              1-1-9
              1-1-10
              2-2-10
              1-1-11
              2-2-11
            

            Current START SLAVE UNTIL master_gtid_pos="1-1-10,2-2-11" will stop with @@gtid_slave_pos="1-1-10,2-2-11".
            If I understand correctly (also reading the work-in-progress patch on branch bb-11.3-MDEV-27247), after
            START SLAVE UNTIL EXCLUSIVE master_gtid_pos="1-1-10,2-2-11" will stop with @@gtid_slave_pos="1-1-9,2-2-10".

            If I read the MySQL documentation correctly, in MySQL START SLAVE UNTIL SQL_BEFORE_GTIDS=""1-1-10,2-2-11" would stop with @@gtid_slave_pos="1-1-9,2-2-9". I don't think it's valuable to introduce a syntax to be compatible with MySQL, when that syntax does something completely different in MariaDB.

            Except for that discrepancy with UNTIL SQL_BEFORE_GTIDS, the WIP patch in bb-11.3-MDEV-27247 is quite simple and elegant.
            I still would be interested if there's an actual use case for this? If so, I suppose this looks fine. But if not, I think it's just causing more confusion to introduce an "exlusive" semantics to GTID positions, when everywhere else GTID positions are always "inclusive" by the fundamental design of MariaDB GTID.

            knielsen Kristian Nielsen added a comment - To elaborate, consider the following GTID sequence in the master binlog: 2-2-9 1-1-9 1-1-10 2-2-10 1-1-11 2-2-11 Current START SLAVE UNTIL master_gtid_pos="1-1-10,2-2-11" will stop with @@gtid_slave_pos="1-1-10,2-2-11". If I understand correctly (also reading the work-in-progress patch on branch bb-11.3- MDEV-27247 ), after START SLAVE UNTIL EXCLUSIVE master_gtid_pos="1-1-10,2-2-11" will stop with @@gtid_slave_pos="1-1-9,2-2-10". If I read the MySQL documentation correctly, in MySQL START SLAVE UNTIL SQL_BEFORE_GTIDS=""1-1-10,2-2-11" would stop with @@gtid_slave_pos="1-1-9,2-2-9". I don't think it's valuable to introduce a syntax to be compatible with MySQL, when that syntax does something completely different in MariaDB. Except for that discrepancy with UNTIL SQL_BEFORE_GTIDS, the WIP patch in bb-11.3- MDEV-27247 is quite simple and elegant. I still would be interested if there's an actual use case for this? If so, I suppose this looks fine. But if not, I think it's just causing more confusion to introduce an "exlusive" semantics to GTID positions, when everywhere else GTID positions are always "inclusive" by the fundamental design of MariaDB GTID.
            bnestere Brandon Nesterenko added a comment - - edited

            Hi knielsen!

            To your use-case concern, here is a paste from an old conversation from rob.schwyzer@mariadb.com:

            "Notice gtid = 0-1-4 does not have to exist" is really the concern here the customer is trying to solve- rather than the customer needing to make automation which parses GTIDs from mysqlbinlog to figure out what's viable, the customer wants to be able to take just the GTID thrown into MariaDB error log and to have some way of easily telling MariaDB to replicate up to the event prior to the failing event (which is why being able to specify ex START SLAVE UNTIL EXCLUSIVE master_gtid_pos = 0-1-5; would be ideal)

            To me, it seems the exclusivity would ideally be on a per-domain basis, and forcing the entire GTID set to be exclusive seems a bit awkward. I wonder if the syntax could be embedded into the GTID itself. Perhaps with a token to delimit an endpoint, such that it can be placed before/after each GTID to denote whether the replica should stop at the respective location in that domin. Extending your example GTID set, and using '*' because it resembles a closed dot on a number line, something like `START SLAVE UNTIL master_gtid_pos=*1-1-10,2-2-10*` would result in an end gtid_slave_pos of 1-1-9,2-2-10. Another alternative could be specifying both EXCLUSIVE and INCLUSIVE GTID sets in the same START SLAVE command, e.g. `START SLAVE UNTIL master_gtid_pos EXCLUSIVE=1-1-10,.. INCLUSIVE=2-2-10,..`.

            Your point about MySQL semantics of SQL_BEFORE_GTID and SQL_AFTER_GTID being different with MariaDB seems valid, as I don't think we can adopt the MySQL semantics because it wouldn't satisfy the use case (it could stop too early from a different domain). So I am okay removing it from my patch.

            bnestere Brandon Nesterenko added a comment - - edited Hi knielsen ! To your use-case concern, here is a paste from an old conversation from rob.schwyzer@mariadb.com : "Notice gtid = 0-1-4 does not have to exist" is really the concern here the customer is trying to solve- rather than the customer needing to make automation which parses GTIDs from mysqlbinlog to figure out what's viable, the customer wants to be able to take just the GTID thrown into MariaDB error log and to have some way of easily telling MariaDB to replicate up to the event prior to the failing event (which is why being able to specify ex START SLAVE UNTIL EXCLUSIVE master_gtid_pos = 0-1-5; would be ideal) To me, it seems the exclusivity would ideally be on a per-domain basis, and forcing the entire GTID set to be exclusive seems a bit awkward. I wonder if the syntax could be embedded into the GTID itself. Perhaps with a token to delimit an endpoint, such that it can be placed before/after each GTID to denote whether the replica should stop at the respective location in that domin. Extending your example GTID set, and using '*' because it resembles a closed dot on a number line, something like `START SLAVE UNTIL master_gtid_pos=*1-1-10,2-2-10*` would result in an end gtid_slave_pos of 1-1-9,2-2-10. Another alternative could be specifying both EXCLUSIVE and INCLUSIVE GTID sets in the same START SLAVE command, e.g. `START SLAVE UNTIL master_gtid_pos EXCLUSIVE=1-1-10,.. INCLUSIVE=2-2-10,..`. Your point about MySQL semantics of SQL_BEFORE_GTID and SQL_AFTER_GTID being different with MariaDB seems valid, as I don't think we can adopt the MySQL semantics because it wouldn't satisfy the use case (it could stop too early from a different domain). So I am okay removing it from my patch.

            Right, so the use case here is "replicate everything in the master's binlog sequentially, until the point just before this GTID".

            I think the key word here is "point". The usecase is not to find a multi-dimensional GTID position, it is actually to find a specific point (file/offset) at which to stop. In fact, that file/offset is also specified in the error message, so likely there is no need to use GTID, unless they want to do this after switching to a different master.

            This is exactly the semantics of SQL_BEFORE_GTID. Replicate until any of the listed GTIDs is reached, and then stop just before that event group.

            A main difference is what happens with domains not specified in START SLAVE UNTIL master_gtid_pos. In current code (which we now call "inclusive"), such domains stop immediately, nothing is replicated from them. This is not what is wanted in the usecase discussed here, which could leave other domains arbitrarily behind before the stop position is reached.

            The semantics of current START SLAVE UNTIL master_gtid_pos: replicate until the specified GTID position in each domain. A not-specified domain denotes the start of that domain, just as in @@gtid_slave_pos. The usecase here as described in https://mariadb.com/kb/en/gtid/#start-slave-until-master_gtid_posxxx is to make sure the specified GTID position will be valid for slaves to connect to after promoting this server as a master.

            So what we need here is: Replicate up to (but not including) the first GTID in the list is reached on the current master, and then stop all domains at that point. The usecase will be to stop at a @@gtid_slave_pos corresponding exactly to the point just before the (first of the) specified events. Here the meaning of a not specified domain is to not specify any stop position in that domain.

            It seems INCLUSIVE/EXCLUSIVE syntax is too ambiguous. BEFORE_GTID and AFTER_GTID seems better to me, though AFTER_GTID will not match exactly with the MySQL semantics due to having domains_ids.

            I don't think it's a good idea to make this even more complex with specifying "*" markers in the list or combining "inclusive" and "exclusive" in the same command, without any valid use case for it. GTID is already much too complex for most users as it is

            But the above two different semantics does seem to correspond to valid use cases, and looks from your code to be relatively easy to implement, just need a bit different handling for not-specified domains. And to stop when the first GTID is reached, not continue the other domains until all have been reached.

            Be sure to describe the intended usecase for the new option in the documentation

            - Kristian.

            knielsen Kristian Nielsen added a comment - Right, so the use case here is "replicate everything in the master's binlog sequentially, until the point just before this GTID". I think the key word here is "point". The usecase is not to find a multi-dimensional GTID position, it is actually to find a specific point (file/offset) at which to stop. In fact, that file/offset is also specified in the error message, so likely there is no need to use GTID, unless they want to do this after switching to a different master. This is exactly the semantics of SQL_BEFORE_GTID. Replicate until any of the listed GTIDs is reached, and then stop just before that event group. A main difference is what happens with domains not specified in START SLAVE UNTIL master_gtid_pos. In current code (which we now call "inclusive"), such domains stop immediately, nothing is replicated from them. This is not what is wanted in the usecase discussed here, which could leave other domains arbitrarily behind before the stop position is reached. The semantics of current START SLAVE UNTIL master_gtid_pos: replicate until the specified GTID position in each domain. A not-specified domain denotes the start of that domain, just as in @@gtid_slave_pos. The usecase here as described in https://mariadb.com/kb/en/gtid/#start-slave-until-master_gtid_posxxx is to make sure the specified GTID position will be valid for slaves to connect to after promoting this server as a master. So what we need here is: Replicate up to (but not including) the first GTID in the list is reached on the current master, and then stop all domains at that point. The usecase will be to stop at a @@gtid_slave_pos corresponding exactly to the point just before the (first of the) specified events. Here the meaning of a not specified domain is to not specify any stop position in that domain. It seems INCLUSIVE/EXCLUSIVE syntax is too ambiguous. BEFORE_GTID and AFTER_GTID seems better to me, though AFTER_GTID will not match exactly with the MySQL semantics due to having domains_ids. I don't think it's a good idea to make this even more complex with specifying "*" markers in the list or combining "inclusive" and "exclusive" in the same command, without any valid use case for it. GTID is already much too complex for most users as it is But the above two different semantics does seem to correspond to valid use cases, and looks from your code to be relatively easy to implement, just need a bit different handling for not-specified domains. And to stop when the first GTID is reached, not continue the other domains until all have been reached. Be sure to describe the intended usecase for the new option in the documentation - Kristian.

            I see your "point"

            To

            A main difference is what happens with domains not specified in START SLAVE UNTIL master_gtid_pos. In current code (which we now call "inclusive"), such domains stop immediately, nothing is replicated from them. This is not what is wanted in the usecase discussed here, which could leave other domains arbitrarily behind before the stop position is reached.
            ...
            So what we need here is: Replicate up to (but not including) the first GTID in the list is reached on the current master, and then stop all domains at that point.

            I wonder if we should allow for flexibility with other unspecified domains though. Perhaps by specifying an "ISOLATE" or "ALL" keyword to indicate whether the user wants to isolate just the provided domains (as is the default base behavior), or include all other domains from the primary as well (as you propose). "ISOLATE" could allow a user to focus just on a problematic stream, without any potential adverse effects from other domains (e.g. simply execution time). Although I imagine running CHANGE MASTER TO DO_DOMAIN_IDS=() could achieve a similar effect.

            bnestere Brandon Nesterenko added a comment - I see your "point" To A main difference is what happens with domains not specified in START SLAVE UNTIL master_gtid_pos. In current code (which we now call "inclusive"), such domains stop immediately, nothing is replicated from them. This is not what is wanted in the usecase discussed here, which could leave other domains arbitrarily behind before the stop position is reached. ... So what we need here is: Replicate up to (but not including) the first GTID in the list is reached on the current master, and then stop all domains at that point. I wonder if we should allow for flexibility with other unspecified domains though. Perhaps by specifying an "ISOLATE" or "ALL" keyword to indicate whether the user wants to isolate just the provided domains (as is the default base behavior), or include all other domains from the primary as well (as you propose). "ISOLATE" could allow a user to focus just on a problematic stream, without any potential adverse effects from other domains (e.g. simply execution time). Although I imagine running CHANGE MASTER TO DO_DOMAIN_IDS=() could achieve a similar effect.

            Again, I don't think we should introduce extra complexity/features without a reasonable usecase.

            Normally, one would not want to leave one domain arbitrarily behind another. If a single domain is far behind, then on next slave start, the dump thread will need to read back through a lot of old binlog files, skipping a lot of events, or at worst failing because the necessary binlog files have been purged.

            In fact, I'm not sure why I designed the START SLAVE UNTIL master_gtid_pos feature this way that unspecified domains are stopped immediately. Maybe I didn't think it through, and just used the normal semantics of "unspecified domain means start of binlog". It seems not very useful, and a nasty surprise for users that forget to specify one domain in a long START UNTIL, and then is left with a slave position spanning a large range of binlog files.

            It actually makes more sense that unspecified domain means "start at the beginning" for a start position, and "stop only at the end" for a stop position. So I'd actually prefer that unspecified domain in START SLAVE UNTIL means "don't stop until the other domain(s) do" instead of "stop immediately". It also works fine for the documented usecase. Not sure how dangerous it is to change it now.

            Maybe your ISOLATE/ALL idea could be used for this, making ISOLATE default for AFTER_GTID (for historic reasons) and ALL default for BEFORE_GTID. But I'd still prefer to not do this unless there's a real usecase for it. We can always add it later, but removing after GA release is much harder.

            CHANGE MASTER TO DO_DOMAIN_IDS=() I think would do something very different, which is to discard events but still update the @@gtid_slave_pos, wouldn't it?

            knielsen Kristian Nielsen added a comment - Again, I don't think we should introduce extra complexity/features without a reasonable usecase. Normally, one would not want to leave one domain arbitrarily behind another. If a single domain is far behind, then on next slave start, the dump thread will need to read back through a lot of old binlog files, skipping a lot of events, or at worst failing because the necessary binlog files have been purged. In fact, I'm not sure why I designed the START SLAVE UNTIL master_gtid_pos feature this way that unspecified domains are stopped immediately. Maybe I didn't think it through, and just used the normal semantics of "unspecified domain means start of binlog". It seems not very useful, and a nasty surprise for users that forget to specify one domain in a long START UNTIL, and then is left with a slave position spanning a large range of binlog files. It actually makes more sense that unspecified domain means "start at the beginning" for a start position, and "stop only at the end" for a stop position. So I'd actually prefer that unspecified domain in START SLAVE UNTIL means "don't stop until the other domain(s) do" instead of "stop immediately". It also works fine for the documented usecase. Not sure how dangerous it is to change it now. Maybe your ISOLATE/ALL idea could be used for this, making ISOLATE default for AFTER_GTID (for historic reasons) and ALL default for BEFORE_GTID. But I'd still prefer to not do this unless there's a real usecase for it. We can always add it later, but removing after GA release is much harder. CHANGE MASTER TO DO_DOMAIN_IDS=() I think would do something very different, which is to discard events but still update the @@gtid_slave_pos, wouldn't it?
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            bnestere Brandon Nesterenko made changes -
            Summary Add keywords "exclusive" and "inclusive" for START SLAVE UNTIL Add keywords "SQL_BEFORE_GTIDS" and "SQL_AFTER_GTIDS" for START SLAVE UNTIL
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -

            Howdy Andrei!

            This is ready for review PR-2736.

            bnestere Brandon Nesterenko added a comment - Howdy Andrei! This is ready for review PR-2736 .
            bnestere Brandon Nesterenko made changes -
            Assignee Brandon Nesterenko [ JIRAUSER48702 ] Andrei Elkin [ elkin ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            Elkin Andrei Elkin added a comment -

            knielsen, bnestere: to my view the BEFORE_GTID = a-list-of-gtids semantics fit well to a use case of stopping at a multi-domain gtid state without reaching it.

            If a user intent is to stop "at a point" (e.g an error message of the stopped slave), then the list of gtids is actually reduced to one element. Otherwise it contain redundant elements.

            Maybe this reasoning will lead you to a conclusion that both intents can be distinctly specified (with simply the option value *type* - a list or an "atom") and (were) covered by an initial version of the patch?

            Elkin Andrei Elkin added a comment - knielsen , bnestere : to my view the BEFORE_GTID = a-list-of-gtids semantics fit well to a use case of stopping at a multi-domain gtid state without reaching it. If a user intent is to stop "at a point" (e.g an error message of the stopped slave), then the list of gtids is actually reduced to one element. Otherwise it contain redundant elements. Maybe this reasoning will lead you to a conclusion that both intents can be distinctly specified (with simply the option value * type * - a list or an "atom") and (were) covered by an initial version of the patch?
            Elkin Andrei Elkin added a comment -

            Browsed the patch to approve the code part.
            Had a question to semantics of the new option.
            If my doubt will appeal it can be fixed as a bug.

            Elkin Andrei Elkin added a comment - Browsed the patch to approve the code part. Had a question to semantics of the new option. If my doubt will appeal it can be fixed as a bug.
            Elkin Andrei Elkin made changes -
            Assignee Andrei Elkin [ elkin ] Brandon Nesterenko [ JIRAUSER48702 ]
            Status In Review [ 10002 ] Stalled [ 10000 ]
            bnestere Brandon Nesterenko made changes -
            Status Stalled [ 10000 ] In Testing [ 10301 ]
            bnestere Brandon Nesterenko made changes -
            Assignee Brandon Nesterenko [ JIRAUSER48702 ] Roel Van de Paar [ roel ]
            Roel Roel Van de Paar made changes -
            Description In MariaDB the until keyword is inclusive.

            "The replica will start replication from the current GTID position, run up to and including the event with the GTID specified, and then stop. "

            https://mariadb.com/kb/en/gtid/#start-slave-until\-master_gtid_posxxx

            Keyword "UNTIL" origin is form imperative languages,
            where UNTIL is the end condition of a loop.

            Depending from the position of the counter (before or after reply), the UNTIL condition wlll be processed or not.

            SQL is a declarative language, so the counter position in the loop is hidden,
            so maybe "UNTIL" is different implemented in different Databases. (if used in a SQL\-statement, not procedures. Procedures also based on imperative paradigma.

            So I suggest to add keywords "exclusive" and "inclusive".

            Default is "inclusive" , which is the current MariaDB behaviour.
            example:

            {code:java}

            START SLAVE UNTIL EXLUSIVE master_gtid_pos = <GTID position>
            {code}
            In MariaDB the {{UNTIL}} keyword is inclusive.

            "The replica will start replication from the current GTID position, run up to and including the event with the GTID specified, and then stop. "
            https://mariadb.com/kb/en/gtid/#start-slave-until\-master_gtid_posxxx

            Keyword {{UNTIL}} origin is form imperative languages, where {{UNTIL}} is the end condition of a loop.

            Depending from the position of the counter (before or after reply), the {{UNTIL}} condition will be processed or not.

            SQL is a declarative language, so the counter position in the loop is hidden, so maybe {{UNTIL}} is different implemented in different databases (if used in an SQL\-statement, not procedures). Procedures are also based on imperative paradigm.

            So I suggest to add keywords {{EXCLUSIVE}} and {{INCLUSIVE}}.

            The default would be {{INCLUSIVE}}, which is the current MariaDB behaviour.

            Example:
            {code:sql}
            START SLAVE UNTIL EXCLUSIVE master_gtid_pos = <GTID position>;
            {code}

            Elkin, I'm sorry, but I couldn't understand what you are asking.

            The usecase for what is SQL_AFTER_GTIDS is described here: https://mariadb.com/kb/en/gtid/#start-slave-until-master_gtid_posxxx
            This usecase requires that the specified GTID is applied, and then the slave stops.

            The usecase for SQL_BEFORE_GTIDS is to not apply the specified GTID and stop just before it.

            So even with a single element GTID position, say 0-1-100, SQL_BEFORE_GTIDS and SQL_AFTER_GTIDS need to do different things. But maybe you were refering to something else when you wrote "with simply the option value type - a list or an 'atom'" ?

            knielsen Kristian Nielsen added a comment - Elkin , I'm sorry, but I couldn't understand what you are asking. The usecase for what is SQL_AFTER_GTIDS is described here: https://mariadb.com/kb/en/gtid/#start-slave-until-master_gtid_posxxx This usecase requires that the specified GTID is applied, and then the slave stops. The usecase for SQL_BEFORE_GTIDS is to not apply the specified GTID and stop just before it. So even with a single element GTID position, say 0-1-100, SQL_BEFORE_GTIDS and SQL_AFTER_GTIDS need to do different things. But maybe you were refering to something else when you wrote "with simply the option value type - a list or an 'atom'" ?
            Elkin Andrei Elkin added a comment - - edited

            knielsen: I meant the following two use case (should've come up with an example at once).

            SQL_BEFORE_GTIDS = A

            of course unambiguously stops w/o applying the gtid A.

            SQL_BEFORE_GTIDS = A,B

            would stop either way. And that may be ambiguous to some, myself incl .
            Indeed when A and B are of different domains, they can intertwine in the binlog arbitrary. Could be A,B on one server and B,A on another.
            At the same time if the user wants to stop at "a point" (e.g before to reach an expected error), (s)he just needs to specify one "atom" value.
            Oth the list type of the new option's value, like "("A,B")" ("()" are imaginary of course), describe pretty distinctly a multi-domain state "minus one gtid from each domain". That's what the original implementation aimed to capture.

            And that's what I wanted to contrast.

            Elkin Andrei Elkin added a comment - - edited knielsen : I meant the following two use case (should've come up with an example at once). SQL_BEFORE_GTIDS = A of course unambiguously stops w/o applying the gtid A . SQL_BEFORE_GTIDS = A,B would stop either way. And that may be ambiguous to some, myself incl . Indeed when A and B are of different domains, they can intertwine in the binlog arbitrary. Could be A,B on one server and B,A on another. At the same time if the user wants to stop at "a point" (e.g before to reach an expected error), (s)he just needs to specify one "atom" value. Oth the list type of the new option's value, like "("A,B")" ("()" are imaginary of course), describe pretty distinctly a multi-domain state "minus one gtid from each domain". That's what the original implementation aimed to capture. And that's what I wanted to contrast.
            serg Sergei Golubchik made changes -

            Ok, I think I see what you're saying.
            And yes, I agree it's not obvious what should be the expected semantics for all kinds of different ways to specify a stop position.

            That is why I suggested to focus on concrete usecases.

            One is the motivation for my original implementation: Replicate until you've reached all of these GTIDs, inclusive.
            In this usecase, we expect each domain to stop independently, as we are in a multi-domain setup.

            The new usecase is: replicate to a specific point in this particular master's binlog, as far as possible without applying any of these GTIDs.

            It does feel somewhat arbitrary that SQL_AFTER_GTIDS stops each domain individually, while SQL_BEFORE_GTIDS stops all domains at the same point in the specific master's binlog. Reading the previous comments, it looks like I mostly suggested this to not use the same syntax as MYSQL, but then do something differently. But MySQL doesn't really have domains and independent streams in the binlog, so maybe that's a flawed argument on my part?

            As you say, when specifying a single domain for SQL_BEFORE_GTID, it doesn't matter. But if we specify the full slave position in UNTIL SQL_BEFORE_GTID, this will not stop at the point in the master's binlog corresponding to the @@gtid_slave_pos. It will stop at the point before whatever domain that did not have any more events, which could easily be to stop immediately if one domain was active for a while. I agree that would not be expected behaviour.

            So now I'm not sure what the correct behaviour should be

            Maybe stopping each domain individually also for SQL_BEFORE_GTIDS is better after all. At least then it's consistent with SQL_AFTER_GTIDs. I was worried about leaving domains stopped very far apart in the binlog, but maybe that's me being overly cautious, it doesn't hurt as such unless the binlogs get purged - and that's a general problem, not specific to START SLAVE UNTIL.

            And for the main usecases, stopping each domain individually for SQL_BEFORE_GTIDS does seem to do the right thing:

            • Specifying a single domain stops at the point just before that GTID.
            • Specifying a multi-domain @@gtid_slave_pos on the same master also stops at the single point just before that position, since all domains in @@gtid_slave_pos match that point.

            Just for this we still need that SQL_BEFORE_GTIDS continues replicating any not specified domain freely until all specified domains are stopped, so that specifying a single GTID as the stop point works.

            bnestere, Elkin, what do you think, does it make sense?

            knielsen Kristian Nielsen added a comment - Ok, I think I see what you're saying. And yes, I agree it's not obvious what should be the expected semantics for all kinds of different ways to specify a stop position. That is why I suggested to focus on concrete usecases. One is the motivation for my original implementation: Replicate until you've reached all of these GTIDs, inclusive. In this usecase, we expect each domain to stop independently, as we are in a multi-domain setup. The new usecase is: replicate to a specific point in this particular master's binlog, as far as possible without applying any of these GTIDs. It does feel somewhat arbitrary that SQL_AFTER_GTIDS stops each domain individually, while SQL_BEFORE_GTIDS stops all domains at the same point in the specific master's binlog. Reading the previous comments, it looks like I mostly suggested this to not use the same syntax as MYSQL, but then do something differently. But MySQL doesn't really have domains and independent streams in the binlog, so maybe that's a flawed argument on my part? As you say, when specifying a single domain for SQL_BEFORE_GTID, it doesn't matter. But if we specify the full slave position in UNTIL SQL_BEFORE_GTID, this will not stop at the point in the master's binlog corresponding to the @@gtid_slave_pos. It will stop at the point before whatever domain that did not have any more events, which could easily be to stop immediately if one domain was active for a while. I agree that would not be expected behaviour. So now I'm not sure what the correct behaviour should be Maybe stopping each domain individually also for SQL_BEFORE_GTIDS is better after all. At least then it's consistent with SQL_AFTER_GTIDs. I was worried about leaving domains stopped very far apart in the binlog, but maybe that's me being overly cautious, it doesn't hurt as such unless the binlogs get purged - and that's a general problem, not specific to START SLAVE UNTIL. And for the main usecases, stopping each domain individually for SQL_BEFORE_GTIDS does seem to do the right thing: Specifying a single domain stops at the point just before that GTID. Specifying a multi-domain @@gtid_slave_pos on the same master also stops at the single point just before that position, since all domains in @@gtid_slave_pos match that point. Just for this we still need that SQL_BEFORE_GTIDS continues replicating any not specified domain freely until all specified domains are stopped, so that specifying a single GTID as the stop point works. bnestere , Elkin , what do you think, does it make sense?

            Hm, no, I think my previous comment was wrong regarding the multi-domain SQL_BEFORE_GTIDS. Stopping each domain individually before the GTID in each domain doesn't stop at the point just before @@gtid_slave_pos.
            To do that, we need to stop after the GTID specified for each domain, except for the last one where we need to stop just before.

            It seems at the root of this problem is that GTID position in MariaDB is fundamentally inclusive, it denotes the point in the binlog just after the specified GTID. This matches poorly with a feature that tries to specify a binlog position using an exclusive semantics, denoting a position just before the GTID.

            knielsen Kristian Nielsen added a comment - Hm, no, I think my previous comment was wrong regarding the multi-domain SQL_BEFORE_GTIDS. Stopping each domain individually before the GTID in each domain doesn't stop at the point just before @@gtid_slave_pos. To do that, we need to stop after the GTID specified for each domain, except for the last one where we need to stop just before . It seems at the root of this problem is that GTID position in MariaDB is fundamentally inclusive, it denotes the point in the binlog just after the specified GTID. This matches poorly with a feature that tries to specify a binlog position using an exclusive semantics, denoting a position just before the GTID.
            Elkin Andrei Elkin added a comment - - edited

            knielsen, to your own critique of "Specifying a multi-domain @@gtid_slave_pos on the same master also stops at the single point", I've suggested to stick to a perception of multi-domain "list" as a vector "state".
            So when a list is SQL_BEFORE_GTIDS we stop receiving as the result a (executed gtid) state described by the list shy of the last gtid in each domain.

            While it's consistent of course, I think it's a practical property, esp when gaps in the seq_no are allowed (so one can't compute the inclusive seq-no).

            Elkin Andrei Elkin added a comment - - edited knielsen , to your own critique of "Specifying a multi-domain @@gtid_slave_pos on the same master also stops at the single point", I've suggested to stick to a perception of multi-domain "list" as a vector "state". So when a list is SQL_BEFORE_GTIDS we stop receiving as the result a (executed gtid) state described by the list shy of the last gtid in each domain. While it's consistent of course, I think it's a practical property, esp when gaps in the seq_no are allowed (so one can't compute the inclusive seq-no).
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            Roel Roel Van de Paar made changes -

            Created MDEV-32184 rpl.rpl_gtid_until_before_after_gtids regularly times out

            Roel Roel Van de Paar added a comment - Created MDEV-32184 rpl.rpl_gtid_until_before_after_gtids regularly times out
            Roel Roel Van de Paar made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            ralf.gebhardt Ralf Gebhardt made changes -
            Labels Preview_11.3
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -

            bnestere OK to push

            Roel Roel Van de Paar added a comment - bnestere OK to push
            Roel Roel Van de Paar made changes -
            Assignee Roel Van de Paar [ roel ] Brandon Nesterenko [ JIRAUSER48702 ]
            Status In Testing [ 10301 ] Stalled [ 10000 ]
            Roel Roel Van de Paar made changes -
            Comment [ Testing completed ]

            Pushed into 11.3 as 0c1bf5e24

            bnestere Brandon Nesterenko added a comment - Pushed into 11.3 as 0c1bf5e24
            bnestere Brandon Nesterenko made changes -
            Fix Version/s 11.3.1 [ 29416 ]
            Fix Version/s 11.3 [ 28565 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 201658 201791 122248
            Zendesk active tickets 201658 201791
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk active tickets 201658 201791 201658

            People

              bnestere Brandon Nesterenko
              Richard Richard Stracke
              Votes:
              2 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.