MariaDB Server / MDEV-16589

default value for sync_binlog should be the safer value 1 instead of 0

Details

    • Type: New Feature
    • Status: Stalled
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: Replication, Server

    Description

      The default value of sync_binlog is 0 in MariaDB, whereas Oracle changed it to 1 starting with MySQL 5.7.7. I think we should also change the default to 1, as running a master with sync_binlog=0 is risky: any crash of the server host or of mysqld will leave the slaves inconsistent 99% of the time.
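
      To make the trade-off concrete, here is a minimal Python sketch of the general idea behind a "sync every N writes" setting. It is not MariaDB code: append_events, its sync_every parameter, and the toy file names are hypothetical, and each event is treated as one write, whereas the real server batches commits through group commit.

```python
import os
import tempfile


def append_events(path, events, sync_every):
    """Append each event to path and fsync after every `sync_every` writes.

    sync_every=0 never calls fsync and leaves flushing to the OS, which
    loosely mirrors sync_binlog=0. Returns the number of fsync calls made.
    """
    syncs = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for i, event in enumerate(events, start=1):
            os.write(fd, event)  # data may still sit in the OS page cache
            if sync_every > 0 and i % sync_every == 0:
                os.fsync(fd)  # force everything written so far to disk
                syncs += 1
    finally:
        os.close(fd)
    return syncs


if __name__ == "__main__":
    events = [b"commit-%d\n" % n for n in range(10)]
    with tempfile.TemporaryDirectory() as tmp:
        for n in (0, 1, 5):
            path = os.path.join(tmp, "toy-binlog-%d" % n)
            print("sync_every=%d fsync calls=%d" % (n, append_events(path, events, n)))
```

      With sync_every=0, a crash can lose any number of acknowledged writes; with sync_every=1, every write pays for one fsync. That cost is what the benchmarks in the comments try to measure.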

      Attachments

        1. group_commit_benchmark.png (75 kB)
        2. innodb_binlog_on_ssd.png (45 kB)
        3. innodb_binlog.png (49 kB)
        4. sysbench.pdf (21 kB)

          Activity

            danblack Daniel Black added a comment -

            Nice graphs, sujatha.sivakumar. So at 8-16 threads the throughput is higher, but latency still suffers, particularly for inserts.

            https://mariadb.org/fest2020/ssd/ at the 10:48 offset talks about fsync (for the redo log, but the same applies to the binlog) and notes that consecutive fsyncs can hit the same data sector. Would aligning every binlog unit to the beginning of a new 4k block on disk after an fsync (the block size is discoverable via fstat, st_blksize) be acceptable and show gains? And/or could we piggyback on the io_uring (MDEV-24883) implementation to have the kernel process both the binlog fsync and the other fsyncs for a transaction at the same time?
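
            The alignment idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: next_aligned_offset and write_padded_and_sync are hypothetical helpers, zero padding is a placeholder for whatever a real binlog format would need so that readers can skip it, and st_blksize is the file system's preferred I/O size, which is not necessarily the device sector size.

```python
import os
import tempfile


def next_aligned_offset(offset, block_size):
    """Round offset up to the next multiple of block_size."""
    return -(-offset // block_size) * block_size


def write_padded_and_sync(fd, payload):
    """Append payload, zero-pad to the next block boundary, then fsync.

    Returns the offset at which the next write would start, so that the
    next synced unit never shares a block with this one.
    """
    block_size = os.fstat(fd).st_blksize  # preferred I/O block size (Unix)
    start = os.lseek(fd, 0, os.SEEK_END)
    end = start + len(payload)
    padding = next_aligned_offset(end, block_size) - end
    os.write(fd, payload + b"\0" * padding)
    os.fsync(fd)
    return end + padding


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        for unit in (b"first commit group", b"second commit group"):
            print(write_padded_and_sync(f.fileno(), unit))
```

            The obvious cost is wasted space: a small commit group still consumes a full block, so whether this shows gains would have to be measured.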

            bnestere Brandon Nesterenko added a comment -

            On a benchmark that aims to isolate transaction commits alone, varying several group commit parameters (i.e., binlog_commit_wait_count, binlog_commit_wait_usec, innodb_flush_log_at_trx_commit, sync_binlog, and the concurrent connection count), we have the following results:

            bnestere Brandon Nesterenko added a comment -

            Added a new benchmark result that prototypes using an InnoDB table as the binlog (with the normal binary log disabled), tested against various modes of flushing the binary log. In the legend, b stands for sync_binlog and i stands for innodb_flush_log_at_trx_commit.

            bnestere Brandon Nesterenko added a comment -

            Running the InnoDB table prototype benchmark on an SSD (rather than on a ramdisk and WITH_PMEM) shows that at low concurrency the InnoDB binary log implementation outperforms the current file implementation; at higher concurrency, however, the ranking reverses. I will do further analysis of where the time is spent, along with more benchmarks comparing different binlog group commit parameters.

            marko Marko Mäkelä added a comment -

            As far as I understand, implementing MDEV-34705 would significantly reduce the impact of setting sync_binlog. It should also make sync_binlog=0 (which would effectively be an alias of innodb_flush_log_at_trx_commit=0) crash-safe in a non-DDL workload. Yes, you might lose some of the latest committed transactions, but the binlog would be in sync with the storage engine.

            For any crash safety, DDL operations seem to require fdatasync() or fsync(), as long as a separate ddl_recovery.log file is being maintained.
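
            For readers unfamiliar with the two calls mentioned above, here is a minimal Python sketch of a durable append that uses either one. It is not how the server writes ddl_recovery.log: durable_append and the toy file name are hypothetical. fdatasync() may skip flushing metadata that is not needed to read the data back (such as timestamps), while fsync() flushes data and metadata.

```python
import os
import tempfile


def durable_append(path, record, data_only=True):
    """Append record to path and force it to stable storage.

    data_only=True uses fdatasync() where the platform offers it and
    falls back to fsync() otherwise. Returns the resulting file size.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, record)
        sync = getattr(os, "fdatasync", os.fsync) if data_only else os.fsync
        sync(fd)  # returns only after the kernel reports the flush as done
    finally:
        os.close(fd)
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "toy_ddl_recovery.log")
        print(durable_append(log, b"CREATE TABLE t1 started\n"))
        print(durable_append(log, b"CREATE TABLE t1 done\n", data_only=False))
```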

            People

              Elkin Andrei Elkin
              rpizzi Rick Pizzi (Inactive)
              Votes: 7
              Watchers: 19
