MDEV-20769

MASTER_GTID_WAIT to work for replicated galera GTIDS

Details

    Description

      So that applications can handle replication delay (async or galera) in a variety of replication topologies without needing to know the topology (beyond which nodes are readers and which are writers), MASTER_GTID_WAIT should wait for GTIDs applied by both async and galera replication.

      The WSREP_LAST_WRITTEN_GTID / WSREP_LAST_SEEN_GTID / WSREP_SYNC_WAIT_UPTO_GTID functions that got dumped into MariaDB 10.4 / Galera-4 fail to consider that async replication topologies exist alongside Galera, do not acknowledge the deep frustration described in MDEV-20720, and make no attempt to correct it (https://github.com/MariaDB/server/pull/1317).

      The 10.4 target is based on ratzpo's comment: https://lists.launchpad.net/maria-developers/msg11691.html

          Activity

            jplindst Jan Lindström (Inactive) added a comment -

            mkaruza Does the current Galera GTID wait mechanism fulfill this requirement?

            mkaruza Mario Karuza (Inactive) added a comment - - edited

            There is a galera_sync_wait_upto test which tests the wait functionality with and without binlog.

            WSREP_LAST_WRITTEN_GTID / WSREP_LAST_SEEN_GTID / WSREP_SYNC_WAIT_UPTO_GTID previously used the galera UUID; they are now based on the GTID format, so there was no change in functionality.

            So to conclude, with these changes we reduced the "noise" seqnos which galera could produce. For example, galera could increase the seqno for any "internal" mechanism, while now GTIDs correspond to committed transactions. If there are real-life scenarios which are not covered by the current implementation, they can be reported and we will look into ways to improve this functionality.

            ralf.gebhardt Ralf Gebhardt added a comment -

            Hi danblack, can this task be closed from your point of view, given that MDEV-20720 is closed in 10.5?

            danblack Daniel Black added a comment -

            The MDEV-20720 comment described the usage scenario, and the galera-only case is covered by the WSREP* functions as implemented. Extending this to a scenario where classical replication is in the mix (per the 3rd bullet point) would require a different set of SQL calls.

            galera_sync_wait_upto has binlog enabled; however, it tests only WSREP_SYNC_WAIT_UPTO_GTID and doesn't have an async topology.

            An application dealing with a mixed replication topology, wanting an up-to-date read of something previously written, would do:

            if connection.{function to determine if this is a classical slave}:
                r = connection.query('SELECT MASTER_GTID_WAIT(%s,0.01)', expect_gtid)
            else:
                r = connection.query('SELECT WSREP_SYNC_WAIT_UPTO_GTID(%s,0.01)', expect_gtid)

            # a result of -1 is treated as "not caught up in time": read from the master instead
            if r[0].val[0] == -1:
                connection = master_connection
            

            The trouble is that the function to determine if this is a classical slave:

            • would still be wrong if the transaction was written from a galera member that is a peer of a replication slave; and
            • requires more privileges than a standard application user has; or
            • requires depending on querying an API (orchestrator, maxscale, proxysql); or
            • relies on some other fragile configuration where the application's understanding of the topology can differ from reality.

            Maybe the code should be:

            r = connection.query('SELECT MASTER_GTID_WAIT(%s,0.01), WSREP_SYNC_WAIT_UPTO_GTID(%s,0.01)', expect_gtid, expect_gtid)

            if r[0].val[0] == -1 and r[0].val[1] == -1:
                connection = master_connection
            

            However this is going to wait for both, right?

            The horizontal scaling of galera and mariadb classical replication is one of the great things about MariaDB. The introduction of `MASTER_GTID_WAIT` back in 10.0 meant a basic TCP load balancer between client and DB server could facilitate greater utilization of read-only replica servers while still giving the client application a consistent view of their data, despite the async delays.

            Application frameworks like django and drupal support multiple database sources, and an easy mechanism to extend this to account for read/write splits regardless of topology would be useful in producing an ecosystem-compatible product.

            So I'm asking for a single `GTID_WAIT` function (however named; WAIT_FOR_EXECUTED_GTID_SET, the same as MySQL, would allow consistent use across the community ecosystem) that doesn't depend on the underlying technology, which as far as I can tell doesn't exist yet. The server's historically fractured view of GTID is well on the way to being fixed with MDEV-20720; however, the one GTID needs to continue to permeate through all functions in how MariaDB delivers its product, hence this MDEV.

            This would put MariaDB functionality on equal footing with MySQL's WAIT_FOR_EXECUTED_GTID_SET which notably doesn't have different functions for group replication vs classical replication because to the end application, it doesn't, and shouldn't, actually matter.


            mkaruza Mario Karuza (Inactive) added a comment -

            I have discussed with Daniel his specific scenario, which is async replication between 2 galera clusters. Currently the WSREP GTID functions only work within the cluster the node belongs to, so it is not possible to use WSREP_SYNC_WAIT_UPTO with GTIDs that are not in the same cluster (i.e. one cannot wait for GTIDs defined in another cluster connected with async replication).

            So to synchronize 2 clusters with async replication, native functions should be used.

            For reference, one can look at a similar situation which we test with galera_3nodes.galera_gtid_2_cluster.

            Anyhow, it looks like providing a universal sync wait for all topologies could be beneficial, but this is outside of the current GTID scope.
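
For the async case described above, a minimal Python sketch of the client-side pattern might look like the following. It is only an illustration under stated assumptions: the `Query` callable type and the `pick_reader` helper are hypothetical names introduced here, not part of any MariaDB connector. The only server behaviour relied on is that `MASTER_GTID_WAIT` returns 0 once the given GTID position has been reached and -1 if the timeout expires first.

```python
from typing import Callable, Sequence

# Hypothetical type: runs one SQL statement with parameters and returns the
# first column of the first row. Not part of any MariaDB connector.
Query = Callable[[str, Sequence[object]], object]


def pick_reader(replica: Query, primary: Query, gtid: str,
                timeout: float = 0.01) -> Query:
    """Return `replica` if it applied `gtid` within `timeout` seconds, else `primary`."""
    result = replica("SELECT MASTER_GTID_WAIT(%s, %s)", (gtid, timeout))
    # MASTER_GTID_WAIT returns 0 when the position is reached and -1 on timeout.
    # Anything other than 0 (including NULL) is treated as "not caught up".
    return replica if result == 0 else primary
```

An application would record the GTID of its last write and pass it as `gtid` before the next read. This covers only the native async wait; it does not attempt the topology-independent wait that this issue asks for.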

            janlindstrom Jan Lindström added a comment -

            ralf.gebhardt MDEV-20720 is 10.5, and this talks about 10.4. Furthermore, based on Mario's comment, you can wait using native GTIDs inside a cluster. If async replication is used, then on the second cluster it seems you can't wait for a GTID that originated from the first cluster using WSREP_SYNC_WAIT_UPTO; you need to use the native one. If we have customers requesting a universal sync wait for all topologies, it is a feature request.

            bnestere Brandon Nesterenko added a comment -

            Hi sysprg!

            From the discussion above, it reads as though the desired functionality exists. Can you check and, if so, close this issue?

            People

              sysprg Julius Goryavsky
              danblack Daniel Black