MariaDB Server: MDEV-16509

MariaDB 10.1+ with wsrep_on=ON performs better when log_bin is enabled

Details

    Description

      Galera clusters do not necessarily need binary logging enabled on nodes, so configurations with binary logging disabled are quite common. It turned out that after an upgrade to 10.1.x, these configurations started to perform worse, as notable time is spent maintaining tc.log (in pre-10.1 versions tc.log was not maintained in such cases; see MDEV-16442).

      Simple sysbench tests can show this effect. For example, I prepared the following table on my old 2-core netbook running Ubuntu 14.04:

      sysbench --test=/usr/local/share/sysbench/tests/include/oltp_legacy/common.lua --mysql-host=127.0.0.1 --db-driver=mysql --mysql-user=root --mysql-db=test --oltp-table-size=2500000 prepare
      

      Then I started a single-node cluster on the same datadir with the following command (all defaults unless specified):

      openxs@ao756:~/dbs/maria10.1$ bin/mysqld_safe --no-defaults --binlog_format=row --wsrep_provider=/usr/lib/libgalera_smm.so --wsrep_on=1 --wsrep_cluster_address=gcomm://127.0.0.1:4567 --wsrep-new-cluster &
      [1] 6991
       
      openxs@ao756:~/dbs/maria10.1$ bin/mysql -uroot -e"select @@log_bin, @@wsrep_on, @@wsrep_cluster_address"
      +-----------+------------+-------------------------+
      | @@log_bin | @@wsrep_on | @@wsrep_cluster_address |
      +-----------+------------+-------------------------+
      |         0 |          1 | gcomm://127.0.0.1:4567  |
      +-----------+------------+-------------------------+
      

      and then ran the sysbench test with N concurrent threads, where N in (4, 8, 16):

       sysbench --test=/usr/local/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-host=127.0.0.1 --db-driver=mysql --mysql-user=root --mysql-db=test --oltp-table-size=2500000 --report-interval=1 --max_requests=0 --max-time=300 --threads=*N* run
      
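      The three runs can be scripted in a single loop. A minimal sketch, shown as a dry run that only prints the commands (drop the echo to actually execute them against a running server):

```shell
# Dry run: print one sysbench invocation per thread count.
# Remove the leading 'echo' to execute for real (requires a running server).
for N in 4 8 16; do
  echo sysbench --test=/usr/local/share/sysbench/tests/include/oltp_legacy/oltp.lua \
    --mysql-host=127.0.0.1 --db-driver=mysql --mysql-user=root --mysql-db=test \
    --oltp-table-size=2500000 --report-interval=1 --max_requests=0 \
    --max-time=300 --threads="$N" run
done
```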

      I've got the following (low) TPS values:

      threads | TPS
      4       |  6.08
      8       |  8.46
      16      |  8.96
      

      Then I stopped the node and restarted it with log_bin enabled:

      openxs@ao756:~/dbs/maria10.1$ bin/mysqld_safe --no-defaults --binlog_format=row --wsrep_provider=/usr/lib/libgalera_smm.so --wsrep_on=1 --wsrep_cluster_address=gcomm://127.0.0.1:4567 --wsrep-new-cluster --log_bin=1 &
       
      openxs@ao756:~/dbs/maria10.1$ 180618 13:46:14 mysqld_safe Logging to '/home/openxs/dbs/maria10.1/data/ao756.err'.
      180618 13:46:14 mysqld_safe Starting mysqld daemon with databases from /home/openxs/dbs/maria10.1/data
       
      openxs@ao756:~/dbs/maria10.1$ bin/mysql -uroot -e"select @@log_bin, @@wsrep_on, @@wsrep_cluster_address"
      +-----------+------------+-------------------------+
      | @@log_bin | @@wsrep_on | @@wsrep_cluster_address |
      +-----------+------------+-------------------------+
      |         1 |          1 | gcomm://127.0.0.1:4567  |
      +-----------+------------+-------------------------+
      

      I've got the following results, with TPS values still small but notably higher:

      threads | TPS
      4       |  14.78
      8       |  15.00
      16      |  15.56
      
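      For reference, the relative speedup implied by the two tables can be computed directly. A small sketch (the TPS figures are copied from this report):

```shell
# Compute the log_bin-enabled vs log_bin-disabled speedup per thread count,
# using the TPS values measured above in this report.
for pair in "4 6.08 14.78" "8 8.46 15.00" "16 8.96 15.56"; do
  set -- $pair   # $1 = threads, $2 = TPS without binlog, $3 = TPS with binlog
  awk -v t="$1" -v off="$2" -v on="$3" \
    'BEGIN { printf "%2d threads: %.2fx faster with log_bin\n", t, on/off }'
done
```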

      It seems we should either do something to reduce the impact of log_bin NOT being specified, or clearly recommend setting it explicitly in the KB articles for Galera nodes on 10.1.x+.
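      If the explicit-recommendation route is taken, a minimal configuration sketch could look like the following. The file path is an assumption for illustration; note that a bare log_bin enables logging with the default basename, whereas log_bin=1 would name the binlog files "1":

```shell
# Write a drop-in config fragment enabling the binary log on a Galera node.
# /tmp is used here purely for illustration; on a real node this would go
# under e.g. /etc/mysql/conf.d/ (path varies by distribution -- assumption).
cat > /tmp/galera-binlog.cnf <<'EOF'
[mysqld]
log_bin
binlog_format = ROW
EOF
grep '^log_bin$' /tmp/galera-binlog.cnf
```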

      Side note: I do NOT see the same effect without wsrep_on=ON; in that case the server runs a bit faster without log_bin enabled.


          Activity

            seppo Seppo Jaakola added a comment -

            I did some performance benchmarks with a write-only load to focus only on the binlog phase of execution.
            The write load consists of updates, both as autocommit statements and as 1-5 updates inside transactions, against a large table, to avoid conflicts between test clients. The number of test clients was raised from 1 to 32, to see when all CPU cores were saturated.

            Benchmarks were run against MariaDB 10.1 and 10.2 and, for comparison, also against MySQL-based 5.6 and 5.7 versions.
            I used a single node only, testing with log_bin = ON and OFF.
            I set sync_binlog = 1 for all tests, and experimented with different innodb_flush_log_at_trx_commit values.

            Tests were run on my 4-core laptop with an SSD disk.

            My results show practically no difference in performance with the binlog enabled or disabled, with any of the tested MariaDB/MySQL versions. Write rates varied between 600-1600/sec depending on test client count and other configuration; overall, innodb_flush_log_at_trx_commit had the highest impact on performance.

            valerii, your benchmark probably has a high number of reads; have you tested with more write-intensive benchmarks? When testing with a read-heavy load, please make sure wsrep_sync_wait=0 to rule out read causality waits.
            Does your sysbench run saturate all CPUs? The transaction rate is really low, even for sysbench.


            seppo, have you tried the exact sysbench test that I used, with 4, 8 and 16 concurrent threads, sysbench running on the same box, and a single Galera node? Note that I used default values for all variables besides those explicitly listed:

            bin/mysqld_safe --no-defaults --binlog_format=row --wsrep_provider=/usr/lib/libgalera_smm.so --wsrep_on=1 --wsrep_cluster_address=gcomm://127.0.0.1:4567 --wsrep-new-cluster --log_bin=1 &
            

            wsrep_sync_wait is 0 by default and probably should not matter for a single-node setup.

            These tests were run on an old and slow laptop: 2 cores, HDD. Surely the TPS rate is too low. But the difference with log_bin=1 was very obvious as long as wsrep_on=1. I can surely repeat them or try to find better hardware (no SSD here, sorry), but I'd like to see your results for literally the same test first.

            valerii Valerii Kravchuk added a comment -

            Tested with mariadb-10.1.38 and sysbench 1.0.15 on my laptop with Debian and an SSD.

            #threads	log_bin=default	log_bin=1
            4		173.04		267.99
            8		162.69		239.30
            16		194.05		239.45

            Based on this, there is a clear difference depending on whether log_bin is set or not. This is only single-node performance, so there is not much to say about e.g. a 3-node Galera cluster.

            jplindst Jan Lindström (Inactive) added a comment (edited)
            rgpublic Ranjan Ghosh added a comment -

            log_bin=default seems a bit misleading to me. The default is: no binary log. If you specify log_bin=default, it means: binary logging enabled with a filename called "default", the same as with "log_bin=1". Sorry for the nitpicking, but I just wanted to make crystal clear what's going on.

            Interestingly, as I already wrote under (possible duplicate?) MDEV-14339, I cannot really see any difference. Seppo Jaakola also didn't note any difference, so the question here seems to be: why do some people see noticeable differences and others don't?

            I'm using MariaDB 10.3.11, btw. According to MDEV-14339, this bug is also supposed to exist in recent versions, right?

            seppo Seppo Jaakola added a comment -

            My results with:
            • sysbench 0.5, same parameters as the original
            • Ubuntu laptop, 4 cores, 16 GB RAM, SSD disk
            • MariaDB 10.3 source build, release binaries
            • a single node started with mtr

            TPS rate with binlog:
            1 thread 39.65
            4 threads 121.45
            8 threads 146.27
            16 threads 147.62

            TPS rate without binlog
            1 thread 41.18
            4 threads 120.39
            8 threads 160.81
            16 threads 165.86

            The sysbench OLTP test does ~25% writes, so most of the test execution does not exercise binlogging at all.
            sysbench and the MariaDB server were running on the same laptop; the CPU saturates between the 8- and 16-thread tests.


            Tested this again with:

            • Environment: ASUS ZenBook UX305 laptop with SSD and Debian Unstable
            • Linux jan-laptop-asus 4.16.0-041600-generic #201804012230 SMP Sun Apr 1 22:31:39 UTC 2018 x86_64 GNU/Linux

            MariaDB 10.1.38 release build with
            commit 29c91e4219dc9dc113b51d0725b547424a52baaa (HEAD -> bb-10.1-galera, origin/bb-10.1-galera, 10.1)
            Merge: b22354680ef 169b9d0d4cf
            Author: Jan Lindström <jan.lindstrom@mariadb.com>
            Date:   Fri Jan 25 08:18:18 2019 +0200
             
                Merge branch 'codership-bb-10.1-MDEV-16509' into 10.1
            
            

            Server started by:

            default: ./mysqld --no-defaults --binlog_format=row --wsrep_provider=/usr/lib/libgalera_smm.so --wsrep_on=1 --wsrep_cluster_address=gcomm://127.0.0.1:4567 --wsrep-new-cluster --datadir=/home/jan/testdb --log-err=/home/jan/testdb/err.log
             
            log-bin: ./mysqld --no-defaults --binlog_format=row --wsrep_provider=/usr/lib/libgalera_smm.so --wsrep_on=1 --wsrep_cluster_address=gcomm://127.0.0.1:4567 --wsrep-new-cluster --datadir=/home/jan/testdb --log-err=/home/jan/testdb/err.log --log-bin
            

            sysbench: sysbench 1.0.15-49f1757 (using bundled LuaJIT 2.1.0-beta2)

            ./sysbench --mysql-host=127.0.0.1 --db-driver=mysql --mysql-user=root --mysql-db=test --oltp-table-size=2500000 --report-interval=1 --max_requests=0 --max-time=300 --test=/home/jan/mysql/sysbench/tests/include/oltp_legacy/oltp.lua prepare
            

            After this I shut down the server, restarted it, and ran:

            ./sysbench --mysql-host=127.0.0.1 --db-driver=mysql --mysql-user=root --mysql-db=test --oltp-table-size=2500000 --report-interval=1 --max_requests=0 --max-time=300 --test=/home/jan/mysql/sysbench/tests/include/oltp_legacy/oltp.lua --threads=4 run
            

            Still seeing a clear difference between the default setting and the log-bin setting:

            #threads	default		log-bin
            4		148.31		279.58
            8		157.40		238.48
            16		145.42		303.83
            

            jplindst Jan Lindström (Inactive) added a comment (edited)

            Is the performance difference caused by the fact that MariaDB has to use the memory-mapped file-based transaction coordinator log (i.e. tc.log) when the binary log is disabled? See MDEV-16442.

            I documented the following here:

            MariaDB Galera Cluster builds include a built-in plugin called wsrep. Prior to MariaDB 10.4.3, this plugin was internally considered an XA-capable storage engine. Consequently, these MariaDB Galera Cluster builds have multiple XA-capable storage engines by default, even if the only "real" storage engine that supports external XA transactions enabled on these builds by default is InnoDB. Therefore, when using one of these builds, MariaDB would be forced to use a transaction coordinator log by default, which could have performance implications.

            For example, MDEV-16509 describes performance problems where MariaDB Galera Cluster actually performs better when the binary log is enabled. It is possible that this is caused by the fact that MariaDB is forced to use the memory-mapped file-based transaction coordinator log in this case, which may not perform as well.

            This became a bigger issue in MariaDB 10.1 when the MySQL-wsrep patch that powers MariaDB Galera Cluster was enabled on most MariaDB builds on Linux by default. Consequently, this built-in wsrep plugin would exist on those MariaDB builds on Linux by default. Therefore, MariaDB users might pay a performance penalty, even if they never actually intended to use the MariaDB Galera Cluster features included in MariaDB 10.1.

            In MariaDB 10.4.3 and later, the built-in wsrep plugin has been changed to a replication plugin. Therefore, it is no longer considered an XA-capable storage engine, so it no longer forces MariaDB to use a transaction coordinator log by default.

            See MDEV-9214 and MDEV-16442 for more information.

            https://mariadb.com/kb/en/library/transaction-coordinator-log-overview/#mariadb-galera-cluster
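            A quick way to see whether the memory-mapped coordinator log is in effect is to look for the tc.log file in the datadir (tc.log is the default name of the --log-tc file; the datadir path below is an assumption):

```shell
# Check for the memory-mapped transaction coordinator log. With the binary
# log disabled and multiple XA-capable "engines" registered (e.g. InnoDB
# plus the pre-10.4.3 wsrep plugin), the server maintains this file; with
# log_bin enabled, the binary log itself acts as the TC log.
DATADIR=${DATADIR:-/var/lib/mysql}   # assumed path; adjust to your setup
if [ -f "$DATADIR/tc.log" ]; then
  echo "memory-mapped TC log present: $DATADIR/tc.log"
else
  echo "no tc.log file; the binary log is likely acting as TC log"
fi
```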

            GeoffMontee Geoff Montee (Inactive) added a comment -

            Found in disabled.def for ES that this was disabled; I'm also adding the missing label.

            julien.fritsch Julien Fritsch added a comment -

            People

              jplindst Jan Lindström (Inactive)
              valerii Valerii Kravchuk
              Votes: 1
              Watchers: 10
