  1. MariaDB Server
  2. MDEV-7249

Performance problem in parallel replication with multi-level slaves

Details

    Description

      In MariaDB 10.0, the primary way to get parallelism on the slave is to apply
      in parallel the batches of transactions that were group committed together on
      the master. Thus, to get good parallelism, it is necessary to have many
      transactions in each group commit.

      However, in a multi-level replication hierarchy such as M->S1->S2, the
      group commits on the slave S1 are done independently and do not necessarily
      match the group commits on M. The groups on S1 can therefore easily be
      smaller than the groups on M, which reduces parallelism on S2 compared to
      S1 and may leave S2 unable to keep up.
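
      The effect can be illustrated with a small sketch. The transaction names
      and group sizes below are made up for illustration and are not taken from
      a real workload; the helper functions are hypothetical and not part of the
      server.

```python
from typing import List


def regroup(commit_order: List[str], group_sizes: List[int]) -> List[List[str]]:
    """Split a commit-ordered list of transactions into consecutive groups."""
    if sum(group_sizes) != len(commit_order):
        raise ValueError("group sizes must cover every transaction exactly once")
    groups, start = [], 0
    for size in group_sizes:
        groups.append(commit_order[start:start + size])
        start += size
    return groups


def average_group_size(groups: List[List[str]]) -> float:
    """Average number of transactions per group commit."""
    return sum(len(g) for g in groups) / len(groups)


transactions = ["T1", "T2", "T3", "T4", "T5", "T6"]

# Assumed group commits on the master M: two groups of three.
groups_on_m = regroup(transactions, [3, 3])

# Assumed group commits on S1: same commit order, but timing differences
# have fragmented the batches into smaller groups.
groups_on_s1 = regroup(transactions, [1, 2, 1, 2])

print(average_group_size(groups_on_m))   # what S1 can apply in parallel
print(average_group_size(groups_on_s1))  # what S2 can apply in parallel
```

      In this made-up case S2 sees groups half the size of those S1 saw, so it
      has correspondingly less opportunity to apply transactions in parallel.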

      However, on S1 we often do know that some transactions T1, T2, T3 were
      group committed together on the master, and thus could very likely group
      commit together on the slave as well (we know this as long as the I/O thread
      has had time to fetch all of the transactions from the master). Thus, we
      could have a heuristic that waits more aggressively for all of T1, T2, T3 to
      queue up for group commit before committing T1. This would preserve more of
      the group commit batches from the original master M, giving S2 parallelism
      similar to S1.
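
      A minimal sketch of such a heuristic follows. It assumes S1 can tag each
      queued commit with an identifier and size for its group on M; the names
      QueuedCommit, master_group_id, master_group_size and should_keep_waiting
      are hypothetical and do not correspond to existing server code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QueuedCommit:
    name: str
    master_group_id: int    # hypothetical: identifies the group commit on M
    master_group_size: int  # hypothetical: number of transactions in that group


def should_keep_waiting(queue: List[QueuedCommit],
                        waited_usec: int,
                        max_wait_usec: int) -> bool:
    """Return True while it is worth delaying the group commit on S1.

    The wait ends once every transaction from the leader's master group has
    queued up, or once the maximum delay has elapsed.
    """
    if not queue:
        return False
    if waited_usec >= max_wait_usec:
        return False
    leader = queue[0]
    queued_from_group = sum(
        1 for c in queue if c.master_group_id == leader.master_group_id
    )
    return queued_from_group < leader.master_group_size


# T1 and T2 are queued, T3 from the same master group has not arrived yet.
queue = [QueuedCommit("T1", 7, 3), QueuedCommit("T2", 7, 3)]
print(should_keep_waiting(queue, waited_usec=100, max_wait_usec=1000))

# Once T3 queues up, the whole master group is present and the wait ends.
queue.append(QueuedCommit("T3", 7, 3))
print(should_keep_waiting(queue, waited_usec=100, max_wait_usec=1000))
```

      The timeout bounds the delay in case the rest of the master group has not
      yet been fetched by the I/O thread.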

      Another possibility to increase parallelism on S2 is to use the
      --binlog-commit-wait-* options to delay commits slightly and increase group
      commit batch sizes. The slave S1 is able to group commit transactions T1 and
      T2 together even if T1 and T2 were in different group commit batches on the
      master, and so could not be executed in parallel on S1.

      The --binlog-commit-wait-* options can be particularly effective on a slave,
      as we have future transactions available in the relay log. So delaying commit
      of T1 does not delay starting T2, unlike on the master where an application
      may be waiting for T1 to commit before initiating T2. Thus, in theory, S1
      could achieve very large group commit batches without any reduction in
      throughput; the only visible effect would be a moderate increase in the
      latency before an application sees a transaction on S1.

      However, this theory fails if T2 has a row lock conflict with T1. Then T2
      has to wait for T1 to commit. So if the commit of T1 is held back by a high
      --binlog-commit-wait-usec delay, the slave will waste a lot of time waiting.
      Thus, increasing --binlog-commit-wait-usec is currently dangerous on a slave:
      depending on the precise application load, it might cause the slave to lag
      behind.

      We could avoid this problem, as the slave in fact already has the information
      that T2 is waiting for a row lock of T1. (This information is provided by
      InnoDB, and is needed to break a possible deadlock if T2 would be waiting for
      a later T3). So whenever we see T2 waiting for T1, we could notify the group
      commit code to abort any --binlog-commit-wait-usec delay and group commit
      immediately. This way, we avoid stalling replication on a high
      --binlog-commit-wait-usec value, and are still able to collect large batches
      of group commits when possible.
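
      The idea of an abortable commit delay can be sketched as follows. This is
      only an illustration using Python's threading.Event; the class
      GroupCommitWait and its methods are hypothetical names, not the server's
      group commit code or the InnoDB lock wait interface.

```python
import threading


class GroupCommitWait:
    """A commit delay that can be cut short when a row lock conflict is seen."""

    def __init__(self, wait_usec: int) -> None:
        # Plays the role of --binlog-commit-wait-usec in this sketch.
        self._wait_seconds = wait_usec / 1_000_000
        self._conflict = threading.Event()

    def notify_row_lock_wait(self) -> None:
        """Called (hypothetically) when a later transaction T2 is found to be
        waiting for a row lock held by the transaction T1 being delayed."""
        self._conflict.set()

    def wait_for_more_commits(self) -> bool:
        """Block for up to the configured delay.

        Returns True if the wait was aborted because of a conflict, False if
        the full delay elapsed without one.
        """
        return self._conflict.wait(timeout=self._wait_seconds)


# With a conflict already reported, even a very long delay returns at once.
delay = GroupCommitWait(wait_usec=10_000_000)
delay.notify_row_lock_wait()
print(delay.wait_for_more_commits())
```

      In a real implementation the notification would arrive from another
      thread while the committing thread is blocked; the event-based wait
      handles that case in the same way.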

      We could have --binlog-commit-wait-heuristics=follow_master_commit|detect_conflict.
      follow_master_commit would break the wait once the group commit is as large
      as the one from the master. detect_conflict would break the wait if a row
      lock conflict is detected. If this is too large a change for a GA release,
      we could have it off by default in 10.0 and default to detect_conflict in
      10.1.

      We have a user who tested this and was able to get very good parallel
      replication speedup on S1, but was limited on S2 due to this issue.

            People

              knielsen Kristian Nielsen