Details

    Description

      Idea

      The purpose of this task is to create an easy-to-use facility for setting up a
      new MariaDB replication slave.

      Setting up a new slave currently involves: 1) installing MariaDB with the
      initial database; 2) pointing the slave to the master with CHANGE MASTER TO;
      3) copying the initial data from the master to the slave; and 4) starting the
      slave with START SLAVE. The idea is to automate step (3), which currently must
      be done manually.

      The syntax could be something as simple as

      LOAD DATA FROM MASTER

      This would connect to the currently configured master, load a snapshot of all
      the data on the master, and leave the slave position at the point of the
      snapshot, ready for START SLAVE to continue replication from there.

      Implementation:

      The idea is to do this without blocking the master, in a way that works for
      any storage engine. It relies on row-based replication being used between the
      master and the slave.

      At the start of LOAD DATA FROM MASTER, the slave will enter a special
      provisioning mode. It will start replicating events from the master at the
      master's current position.

      The master dump thread will send binlog events to the slave as normal. But in
      addition, it will interleave a dump of all the data on the master contained in
      tables, views, or stored functions. Whenever the dump thread would normally go
      to sleep waiting for more data to arrive in the binlog, the dump thread will
      instead send another chunk of data in the binlog stream for the slave to apply.

      A "chunk of data" can be:

      • A CREATE OR REPLACE TABLE / VIEW / PROCEDURE / FUNCTION
      • A range of N rows (N=100, for example). Each successive chunk will do a
        range scan on the primary key from the end position of the last chunk.

      Sending data in small chunks avoids the need for long-lived table locks or
      transactions that could adversely affect master performance.
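
      The chunked scan can be sketched as follows (a rough Python illustration with
      hypothetical names; the real implementation would live in the master's dump
      thread and operate on storage-engine handlers, not client-side code):

```python
# Sketch of "each successive chunk does a range scan on the primary key
# from the end position of the last chunk". All names are illustrative.

CHUNK_SIZE = 100  # N rows per chunk, as suggested in the description


def next_chunk(rows, last_pk):
    """Return up to CHUNK_SIZE rows with pk > last_pk.

    `rows` stands in for a table ordered by primary key; the real server
    would do an index range scan under a short-lived lock/transaction.
    """
    return [r for r in rows if r["pk"] > last_pk][:CHUNK_SIZE]


def dump_table(rows):
    """Yield the table in chunks, resuming each scan after the last PK sent."""
    ordered = sorted(rows, key=lambda r: r["pk"])
    last_pk = float("-inf")
    while True:
        chunk = next_chunk(ordered, last_pk)
        if not chunk:
            return
        yield chunk
        last_pk = chunk[-1]["pk"]  # end position of this chunk
```

      Because each chunk only needs a short-lived scan, no long-lived lock or
      transaction is held on the master between chunks.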

      The slave will connect in GTID mode. The master will send dumped chunks in a
      separate domain id, allowing the slave to process chunks in parallel with
      normal data.

      During the provisioning, all normal replication events from the master will
      arrive on the slave, and the slave will attempt to apply them locally. Some of
      these events will fail to apply, since the affected table or row may not yet
      have been loaded. In provisioning mode, all such errors will be silently
      ignored. Proper locking (e.g. an appropriate isolation level) must be used on
      the master when fetching chunks, to ensure that every update to a row is
      applied correctly on the slave, either as part of a chunk or by a later row
      event.
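
      The error handling in provisioning mode might look roughly like this (a
      sketch only; the exception names and event format are invented for
      illustration, and the server would check real handler error codes instead):

```python
# Illustrative sketch of "silently ignore apply errors during provisioning".
# The error classes and event dicts are made up; a real slave applier would
# inspect handler error codes such as "table doesn't exist" / "row not found".

class TableMissing(Exception):
    pass


class RowMissing(Exception):
    pass


IGNORABLE_IN_PROVISIONING = (TableMissing, RowMissing)


def apply_event(event, tables, provisioning):
    """Apply one replicated row event to `tables` (a dict of pk->row maps)."""
    try:
        if event["table"] not in tables:
            raise TableMissing(event["table"])
        tbl = tables[event["table"]]
        if event["op"] == "insert":
            tbl[event["pk"]] = event["row"]
        elif event["op"] in ("update", "delete"):
            if event["pk"] not in tbl:
                raise RowMissing(event["pk"])
            if event["op"] == "update":
                tbl[event["pk"]] = event["row"]
            else:
                del tbl[event["pk"]]
    except IGNORABLE_IN_PROVISIONING:
        if not provisioning:
            raise  # outside provisioning mode these are real errors
        # During provisioning the table/row will arrive in a later chunk,
        # so dropping the event here is safe given correct master locking.
```

      The key invariant is the one stated above: the master's chunk fetching must
      guarantee that any event dropped here is superseded by a chunk or by a later
      row event.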

      In order to make the first version of this feature feasible to implement in a
      reasonable amount of time, it should impose a number of restrictions (which
      could be relaxed in a later version of the feature):

      • Give up with an error if the slave is not configured for GTID mode
        (MASTER_USE_GTID != NO).
      • Give up with an error if the slave receives any event using statement-based
        binlogging (so the master must be running in row-based replication mode,
        and no DDL must be done while the provisioning is running).
      • Give up with an error if the master has a table without primary key.
      • Secondary indexes will be enabled during the provisioning; this means that
        tables with large secondary indexes could be expensive to provision.

      Attachments

        Issue Links

          Activity

            stephane@skysql.com VAROQUI Stephane added a comment - - edited

            There is another strategy that would enable resynchronization of a single table as well:

            1 - Create the DDL on the slave
            2 - Start replication, if needed, in a modified slave_exec_mode=IDEMPOTENT
            3 - INSERT IGNORE in chunks on the slave

            slave_exec_mode=LOADING

            IDEMPOTENT mode cannot be used in this scenario, because UPDATE ROW events are silently ignored on KEY NOT FOUND; the proposed LOADING mode would not ignore them, and would instead apply an INSERT ROW event on KEY NOT FOUND.

            A special DELETE row log would need to be replayed at the end of the dirty dump.

            This would enable a RESYNC TABLE FROM MASTER command.
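
            The distinction between IDEMPOTENT and the proposed LOADING mode could be
            sketched like this (a hypothetical Python illustration; the mode names
            follow the comment above, everything else is invented):

```python
# Sketch of applying an UPDATE row event when the key is not found, under
# the two slave_exec_mode behaviours discussed above. Details are guesses.

def apply_update(table, pk, after_image, mode):
    """Apply an UPDATE_ROWS event for `pk` to `table` (a pk->row dict)."""
    if pk in table:
        table[pk] = after_image          # normal case: the row exists
    elif mode == "IDEMPOTENT":
        pass                             # key not found: silently ignored
    elif mode == "LOADING":
        table[pk] = after_image          # key not found: insert the row
    else:
        raise KeyError(pk)               # strict mode: key not found is an error
```

            Under LOADING, the after-image of every update survives even if the
            initial chunk for that row has not arrived yet, which is why the mode
            only additionally needs the DELETE log replayed at the end.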


            serg Sergei Golubchik added a comment -

            One option could be MDEV-8925, a protocol extension. Another option would be to reuse the galera SST scripts (perhaps generalizing them appropriately).
            ibrararshad92 Ibrar Arshad added a comment -

            Hi,

            I am a prospective participant in Google Summer of Code 2017 and I am very interested in taking up this project. I have read the summary on the ticket and have reached a fair understanding of the problem to be solved, even if the implementation details are not clear to me (due to unfamiliarity with the codebase). The use-case as I understand it is that we want the slave to automatically replicate the data from the master once pointed at the master, and we want to do it in such a manner that the current as well as the past modifications to data on the master are relayed to the slave in parallel. I have a few questions related to the implementation suggested in the summary:

            1. Does an infrastructure already exist for exchanging data between two network nodes which could be utilized to implement this data exchange between master and slave? That is, is there an existing network socket wrapper library, or something of the sort, to take care of serialization/deserialization of data and transmitting it?
            2. As per my understanding, the model of data transfer is push-based from the master's perspective: after the slave connects, the master will launch some sort of new process/thread/fiber to send data to the slave based on GTID position. Could we make it pull-based instead, where the slave asks for new data and the master obliges? This could be some sort of remote method invocation.

            I would like to discuss the project further in detail before submitting a proposal, but I don't know whom to contact, since no name is mentioned in the mentor field of this project on the MariaDB GSoC page. If the project is happening, may I please know who is mentoring it? Also, it would be great if someone could respond to my queries so that we can discuss the details. Thanks.


            vishwanath_bharadwaj Vishwanath Bharadwaja added a comment -

            Hi,

            My name is Vishwanath Bharadwaj and I'm a student at Amrita School of Engineering, India. I have fair knowledge of clustered systems, socket programming, and data transfer between nodes. After reading the description I was fascinated by the idea, because I have experienced the same problem while working on Hadoop; this could be solved using the same mechanism Hadoop uses to replicate its nodes. I am very interested in doing this project. I would also like to know if this project is being taken by another student, so that I can opt for another project from the list that I'm interested in. I would like to discuss further proceedings of the project if you can provide me with the details of the mentor.

            Thank you.

            vishwanath_bharadwaj Vishwanath Bharadwaja added a comment -

            Hey,
            If you could provide me with an email address I can contact, it would be really helpful.
            Thank you

            serg Sergei Golubchik added a comment -

            Use maria-developers@lists.launchpad.net, as the GSoC page says.

            vishwanath_bharadwaj Vishwanath Bharadwaja added a comment -

            Thank you Mr. Sergei

            knielsen Kristian Nielsen added a comment -

            There are some important issues in this review of an earlier attempt:

            https://lists.launchpad.net/maria-developers/msg08867.html
            ibrararshad92 Ibrar Arshad added a comment -

            Thanks for pointing that out, Kristian. I will definitely look into it. Also, I sent a query to the dev mailing list earlier; I'd appreciate it if you could clarify a few things there.


            vishwanath_bharadwaj Vishwanath Bharadwaja added a comment -

            Thank you Kristian. It helped me gain some insight into how the semaphores are used in the code and gave me a clear view of what we are trying to achieve here.
            Thank you
            danblack Daniel Black added a comment -

            Submissions are open if you'd like to write a proposal: https://summerofcode.withgoogle.com


            manjot Manjot Singh (Inactive) added a comment -

            IMO the best way to implement this is KISS.

            Since MariaDB already contains the Galera libraries, the SST portion could be leveraged to allow async slave creation after a fresh install.

            Flow:

            1. New server is provisioned
            2. MariaDB is installed
            3. Server is brought up
            4. root user issues CHANGE MASTER TO ...... MASTER_SST=1;
            5. Server initiates SST utilizing the replication user/pass
            6. After the server comes back up, it connects as an asynchronous or semi-sync slave

            stephane@skysql.com VAROQUI Stephane added a comment -

            You can find SST-like reseeding in the replication-manager 1.2 feature. This task was intended to improve logical replication, cross-engine replication, and parallel replication. The implementation could be opened up to enable other types of logical operations, such as checksumming and syncing a table using an existing replication stream, and could possibly improve binlog management when binlogs are used for seeding or reseeding big tables. Note that the Galera library does not implement SST itself, but rather a collection of bash scripts with external dependencies on tools like xtrabackup and mariadb-backup. On the other hand, the lack of such a generic API in replication forces Galera and similar implementations to maintain the so-called gcache and to use external tools to reseed.

            manjot Manjot Singh (Inactive) added a comment -

            My point is: let's leverage those scripts and what already works to get a quick feature win, which we can then improve over time.

            manjot Manjot Singh (Inactive) added a comment -

            I like the concept of this idea, but it could take a week or more to restore the replica.

            I like the idea of leveraging Galera's pluggable SST methods (i.e. mysqldump, rsync, and mariabackup can all be used, with mariabackup preferred). This would allow much faster restores of 1 TB+ databases.

            Great ideas overall.

            GeoffMontee Geoff Montee (Inactive) added a comment -

            If MDEV-21106 were implemented in MariaDB Server, then the clone plugin could be used for this.

            stephane@skysql.com VAROQUI Stephane added a comment -

            I really don't think it's a matter of weeks, because parallel replication is faster than a mysqldump restore and can saturate any bandwidth. Moreover, after 6 years MDEV-11675 is still not progressing; a variation of this worklog for any non-blocking DDL would have covered MDEV-11675 in many cases, just by producing a table copy and doing an atomic rename à la Percona Toolkit, and no one would have to live with day-long delayed replication because of DDL.
            stephane@skysql.com VAROQUI Stephane added a comment - - edited

            This task, well designed, could enable any blocking DDL to be done without blocking the slave: a table copy can be created, rows injected in chunks in another replication domain, and the table renamed in the main domain at the end.

            rjasdfiii Rick James added a comment -

            I like the feature, but think there are some details to work out.

            • TRIGGERS will not be sent, correct?
            • Does the default, hidden, InnoDB PK qualify as a PK in your rule about what to do if a table does not have a PK?
            • For the rows that fail to apply during provisioning, how will they be applied later? Won't this lead to applying changes out of order? Think about 2 UPDATEs to the same row, but the first is not applied because the row has not arrived yet. Hence the second update is applied first. And what about transactions where some of the rows exist? I think the 'real' replication stream cannot be applied until all the data has arrived.
            • The provisioning stream will (potentially) be too large to keep in relay logs, hence the logs may need to be purged. Then, non-applied regular replication items should not be interspersed with that stream. Either have a separate logging mechanism, or copy the regular stream over as the provisioning is purged. (OK, the first version may require lots of disk on the Replica.)
            • How does Galera's pluggable SST address the above issues?
            monty Michael Widenius added a comment - - edited

            The current solution for this is to use MaxScale 22.08, which can do automatic rebuilds of servers.
            More information at MXS-2542.


            knielsen Kristian Nielsen added a comment -

            It's completely fair not to prioritise this, but why close it? It's a very valuable task, and a feature that anyone would expect to find in a replication system, were it not that we've become used to not having it for so many years with mysql/mariadb replication.

            People

              knielsen Kristian Nielsen
              knielsen Kristian Nielsen
              Votes: 31
              Watchers: 40

