[MDEV-16589] default value for sync_binlog should be the safer value 1 instead of 0 Created: 2018-06-26  Updated: 2023-12-07

Status: Stalled
Project: MariaDB Server
Component/s: Replication, Server
Fix Version/s: 11.5

Type: New Feature Priority: Major
Reporter: Rick Pizzi Assignee: Andrei Elkin
Resolution: Unresolved Votes: 7
Labels: None

Attachments: PNG File group_commit_benchmark.png, PNG File innodb_binlog.png, PNG File innodb_binlog_on_ssd.png, PDF File sysbench.pdf
Issue Links:
Blocks
is blocked by MDEV-18959 Engine transaction recovery through p... Stalled
is blocked by MDEV-24386 MDEV-16589 benchmark & analysis Closed
Sub-Tasks:
Key         Summary                          Type            Status  Assignee
MDEV-24386  MDEV-16589 benchmark & analysis  Technical task  Closed  Axel Schwenke

 Description   

The default value of sync_binlog is 0 in MariaDB, whereas Oracle changed it to 1 starting with MySQL 5.7.7. I think we should also change the default to 1, as running a master with sync_binlog=0 is risky: any crash of the server or of mysqld will create inconsistent slaves 99% of the time.



 Comments   
Comment by Jean-François Gagné [ 2019-01-06 ]

IMHO, this should be a priority Major: a database without the D of ACID is not a "real" database.
https://jfg-mysql.blogspot.com/2018/10/consequences-sync-binlog-neq-1-part-1.html
https://fosdem.org/2019/schedule/event/sync_binlog_use_default/
https://twitter.com/jfg956/status/1081697022267351040

Comment by Marko Mäkelä [ 2020-10-20 ]

I think that we should run benchmarks to determine the performance impact of sync_binlog=1. Maybe it is insignificant enough on SSD and with group commit, so that we can enable this setting by default?
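
A minimal sketch of how the raw cost of a sync could be measured on a given device, assuming a plain write followed by fsync is a reasonable stand-in for one binlog sync. This is not the benchmark used for this issue; the function name `measure_fsync_latency` and its parameters are hypothetical.

```python
import os
import tempfile
import time


def measure_fsync_latency(directory=None, writes=50, payload_size=512):
    """Return the mean time in seconds of one write+fsync pair on a temp file."""
    payload = b"x" * payload_size
    # The temp file is created in `directory`, so point it at the storage under test.
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        fd = f.fileno()
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to push the data to stable storage
        elapsed = time.perf_counter() - start
    return elapsed / writes


if __name__ == "__main__":
    mean = measure_fsync_latency()
    print(f"mean write+fsync latency: {mean * 1000:.3f} ms")
```

If the default temporary directory is memory-backed (tmpfs or a ramdisk), the numbers will say nothing about SSD or HDD behaviour, so `directory` should be set to a path on the device of interest.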

Comment by Andrei Elkin [ 2020-10-20 ]

to the benchmarking.

Comment by Jean-François Gagné [ 2020-10-20 ]

It is not a question of benchmark / speed, it is a question of trust in the database. IMHO, a database should ship "safe" by default, and that is currently not the case for MariaDB, whose default configuration has sync_binlog = 0.

There will always be situations where running a safe configuration is slower. With HDD, the latency of a sync is ~10ms, which is slow, but this is not a reason for an unsafe configuration. With a RAID cache, sync latency is less than 1ms, and it is the same with SSD, but Cloud / Network storage is bringing this latency back to 1ms. If a DBA wants better performance, he can change the sync_binlog and trx_commit parameters, but this has consequences that I detailed in [1], [2] and [3].

[1]: https://jfg-mysql.blogspot.com/2018/10/consequences-sync-binlog-neq-1-part-1.html

[2]: https://archive.fosdem.org/2020/schedule/event/sync_binlog/

[3]: https://www.slideshare.net/JeanFranoisGagn/the-consequences-of-syncbinlog-1

Please ship MariaDB with a safe default configuration, which means sync_binlog = 1.
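
The latency figures above translate into a rough ceiling on commit rate. The sketch below assumes syncs are strictly serialized and that each sync makes a fixed number of commits durable; `max_commits_per_second` and `commits_per_sync` are hypothetical names used only for illustration, and 1ms is taken as the upper figure for RAID cache / SSD.

```python
def max_commits_per_second(sync_latency_ms, commits_per_sync=1):
    """Upper bound on commit rate when one sync must finish before the next starts."""
    if sync_latency_ms <= 0:
        raise ValueError("sync_latency_ms must be positive")
    if commits_per_sync < 1:
        raise ValueError("commits_per_sync must be at least 1")
    syncs_per_second = 1000.0 / sync_latency_ms
    # Group commit lets several transactions share the cost of one sync.
    return syncs_per_second * commits_per_sync


if __name__ == "__main__":
    for label, latency_ms in [("HDD", 10.0), ("RAID cache / SSD", 1.0)]:
        single = max_commits_per_second(latency_ms)
        grouped = max_commits_per_second(latency_ms, commits_per_sync=8)
        print(f"{label}: {single:.0f} commits/s alone, {grouped:.0f} with 8 per sync")
```

Under these assumptions a 10ms sync allows at most 100 serialized syncs per second, which is why the number of commits sharing each sync matters more than the raw latency once concurrency is available.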

Comment by Andrei Elkin [ 2020-10-21 ]

jeanfrancois.gagne, thanks for your comments and the compilation of valuable analysis! With my endorsement I actually meant MDEV-18959 (which aims at overcoming `sync_binlog=1 && innodb_flush_log_at_trx_commit=1` as the only safe configuration). I'd personally agree to change to `sync_binlog=1`, but since there's some legacy involved it should be widely discussed in engineering and support.

So we've actually started on that...

Comment by Rick Pizzi [ 2020-10-21 ]

I insist that we should either ship with sync_binlog=1, or clearly state that we are not ACID compliant unless the setting is changed...

Comment by Sergei Golubchik [ 2020-10-21 ]

let's change it in 10.6, unless there're convincing reasons not to

Comment by Daniel Black [ 2021-02-24 ]

Nice graphs sujatha.sivakumar. So at 8-16 threads the throughput is higher. Still suffering on latency, particularly insert.

https://mariadb.org/fest2020/ssd/ at 10:48 offset - talking about fsync (redo, but same applies to binlog), that each fsync can be on the same data sector. Would aligning every binlog unit to the beginning of a new 4k block on disk (block size discoverable via fstat - blksize) after an fsync be acceptable or show gains? And/or piggyback on the io_uring (MDEV-24883) implementation to have the kernel process both the binlog and other fsyncs for a transaction at the same time.
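
The alignment idea can be illustrated with a small sketch: read the preferred block size from fstat and round a write offset up to the next block boundary. This is only an illustration of the arithmetic, not how the server writes its binlog; `preferred_block_size` and `align_up` are hypothetical helpers, and the 4096 fallback is an assumption for platforms that do not report `st_blksize`.

```python
import os
import tempfile


def preferred_block_size(fd, fallback=4096):
    """Return st_blksize for an open file, or `fallback` if it is unavailable."""
    return getattr(os.fstat(fd), "st_blksize", fallback) or fallback


def align_up(offset, block_size):
    """Round `offset` up to the next multiple of `block_size`."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return ((offset + block_size - 1) // block_size) * block_size


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        block = preferred_block_size(f.fileno())
        position = 5000  # hypothetical end of the last synced unit
        print(f"block size {block}, next aligned offset {align_up(position, block)}")
```

The trade-off is padding: every sync would leave the remainder of the current block unused, so any gain from not rewriting the same sector has to outweigh the extra space and write volume.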

Comment by Brandon Nesterenko [ 2022-11-30 ]

On a benchmark which aims to isolate the transaction commits alone, and varying different group commit parameters (i.e., binlog_commit_wait_count, binlog_commit_wait_usec, innodb_flush_log_at_trx_commit, sync_binlog, and concurrent connection count), we have the following results:

Comment by Brandon Nesterenko [ 2022-12-05 ]

Added a new benchmark result prototyping the use of an innodb table to serve as the binlog (with the normal binary log disabled). Tested against various modes of flushing the binary log. In the legend, b stands for sync_binlog, and i stands for innodb_flush_log_at_trx_commit.

Comment by Brandon Nesterenko [ 2022-12-07 ]

Running the innodb table prototype benchmark on an SSD (rather than ramdisk and WITH_PMEM) shows that with low concurrency, the innodb binary log implementation outperforms the current file implementation; with more concurrency, however, the ranking reverses. Will do further analysis into the time spent, along with more benchmarks comparing differing binlog group commit parameters.

Generated at Thu Feb 08 08:30:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.