  1. MariaDB Server
  2. MDEV-33551

Semi-sync Wait Point AFTER_COMMIT Slow on Workloads with Heavy Concurrency

Details

    Description

      When using semi-sync replication with rpl_semi_sync_master_wait_point=AFTER_COMMIT, the performance of the primary can drop to about 10% of AFTER_SYNC's performance for workloads with many concurrent users executing transactions. See the attached graph, where the "Async" bar reports the performance of a primary using asynchronous replication, the "Semi-sync AFTER_COMMIT" and "Semi-sync AFTER_SYNC" bars report the performance of the primary using the respective rpl_semi_sync_master_wait_point value, and "Semi-sync GROUP_ACK" is a prototype of the work outlined by MDEV-33491.

      The graph shows that the AFTER_COMMIT mode of semi-sync performs many times worse than the AFTER_SYNC mode. This is because all connections on the primary share the same condition variable/mutex pair, so any time an ACK is received from a replica, all waiting connections are woken to check whether the ACK was for them, and that check is done inside a mutual exclusion zone. See the attached Semi-sync Group Ack Proposal PDF for more details.

      Instead, each connection should use its own condition variable (e.g. THD::COND_wakeup_ready), and the ACK receiver thread should signal only the connections that have been ACKed.
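
The proposed design can be illustrated with a minimal, standalone C++ sketch. This is not the MariaDB patch: `Waiter`, `AckRegistry`, `add`, `ack_up_to` and `wait_for_ack` are hypothetical names, the standard library's `std::condition_variable` stands in for THD::COND_wakeup_ready, and the sketch assumes that an ACK for a given binlog position covers every waiter at or before that position. The point it shows is that the ACK receiver signals only the waiters it has actually acknowledged, so connections still waiting are never woken to contend on a shared mutex.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for one client connection waiting on a semi-sync ACK.
struct Waiter {
  std::mutex mtx;
  std::condition_variable cond;  // per-connection, like THD::COND_wakeup_ready
  bool acked = false;
};

// Hypothetical registry of waiting connections, ordered by binlog position.
class AckRegistry {
 public:
  // Register a connection that waits for the ACK covering position `pos`.
  std::shared_ptr<Waiter> add(std::uint64_t pos) {
    auto w = std::make_shared<Waiter>();
    std::lock_guard<std::mutex> guard(mtx_);
    waiters_.emplace(pos, w);
    return w;
  }

  // Called by the ACK receiver thread. Wakes only the waiters whose position
  // is <= acked_pos (assumption: an ACK covers all earlier positions).
  // Returns how many connections were signalled.
  std::size_t ack_up_to(std::uint64_t acked_pos) {
    std::vector<std::shared_ptr<Waiter>> ready;
    {
      // The registry lock is held only long enough to collect the waiters.
      std::lock_guard<std::mutex> guard(mtx_);
      auto end = waiters_.upper_bound(acked_pos);
      for (auto it = waiters_.begin(); it != end; ++it)
        ready.push_back(it->second);
      waiters_.erase(waiters_.begin(), end);
    }
    for (auto &w : ready) {
      {
        std::lock_guard<std::mutex> guard(w->mtx);
        w->acked = true;
      }
      w->cond.notify_one();  // targeted wakeup, no broadcast to everyone
    }
    return ready.size();
  }

 private:
  std::mutex mtx_;
  std::multimap<std::uint64_t, std::shared_ptr<Waiter>> waiters_;
};

// Called by a connection thread: block until its own ACK has arrived.
void wait_for_ack(Waiter &w) {
  std::unique_lock<std::mutex> lock(w.mtx);
  w.cond.wait(lock, [&w] { return w.acked; });
}
```

With waiters registered at positions 10, 20 and 30, a call to `ack_up_to(15)` signals only the first one; the other two stay asleep until a later ACK reaches their positions. In the shared-condition-variable scheme described above, all three would have been woken for every ACK.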

          Activity

            bnestere Brandon Nesterenko created issue -
            bnestere Brandon Nesterenko made changes -
            Field Original Value New Value
            bnestere Brandon Nesterenko made changes -
            Attachment semisync_patched.png [ 73209 ]
            bnestere Brandon Nesterenko made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            bnestere Brandon Nesterenko made changes -
            Attachment Semi-sync Group Ack Proposal.pdf [ 73210 ]
            bnestere Brandon Nesterenko made changes -
            Description Added the sentence "See the attached Semi-sync Group Ack Proposal PDF for more details." to the second paragraph; the old and new values are otherwise identical to the Description above.
            bnestere Brandon Nesterenko made changes -
            Assignee Brandon Nesterenko [ JIRAUSER48702 ] Kristian Nielsen [ knielsen ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            bnestere Brandon Nesterenko made changes -
            Attachment semisync_mdev_33551.png [ 73216 ]
            bnestere Brandon Nesterenko made changes -
            Assignee Kristian Nielsen [ knielsen ] Brandon Nesterenko [ JIRAUSER48702 ]
            bnestere Brandon Nesterenko made changes -
            Fix Version/s 10.6.18 [ 29627 ]
            Fix Version/s 10.6 [ 24028 ]
            Fix Version/s 10.11 [ 27614 ]
            Fix Version/s 11.0 [ 28320 ]
            Fix Version/s 11.1 [ 28549 ]
            Fix Version/s 11.3 [ 28565 ]
            Fix Version/s 11.2 [ 28603 ]
            Resolution Fixed [ 1 ]
            Status In Review [ 10002 ] Closed [ 6 ]
            JIraAutomate JiraAutomate made changes -
            Fix Version/s 10.11.8 [ 29630 ]
            Fix Version/s 11.0.6 [ 29628 ]
            Fix Version/s 11.1.5 [ 29629 ]
            Fix Version/s 11.2.4 [ 29631 ]
            bnestere Brandon Nesterenko made changes -
            Attachment semisync_mdev33551_patched.png [ 73298 ]
            elenst Elena Stepanova made changes -
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 164270
            knielsen Kristian Nielsen made changes -
            knielsen Kristian Nielsen made changes -
            ralf.gebhardt Ralf Gebhardt made changes -
            Labels crash
            ralf.gebhardt Ralf Gebhardt made changes -
            Labels crash

            People

              bnestere Brandon Nesterenko
              bnestere Brandon Nesterenko
              Votes: 0
              Watchers: 5
