Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27772

Performance regression with default configuration in 10.6

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.6
    • None
    • Server
    • None
    • SUSE Enterprise Linux 15 SP4, openSUSE Tumbleweed

    Description

      MariaDB 10.6 has O_DIRECT value as default for innodb_flush_method instead of fsync. This lead to a performance regression in the default configuration, regardless of the filesystem used.
      To benchmark the performance, I have used sysbench-0.4.8 oltp read test on btrfs (the most affected), ext4 and xfs.

      Filesystem threads innodb_flush_method=fsync innodb_flush_method=O_DIRECT
      ext4 1 6.5s 7.0s
      ext4 16 0.97s 1.15s
      brtfs 1 6.5s 77s
      brtfs 16 0.8s 3.1s
      xfs 1 6.4s 7.2s
      xfs 16 0.8s 1.1s

      I have used a mount point in /abuild and the following configuration in /etc/my.cnf.d/sysbench_testing.cnf:

      [mysqld]
      port = 3306
      datadir=/abuild/mysql-sysbench
      socket=/abuild/mysql-sysbench/sysbench-mysql.sock
      #innodb_flush_method=fsync
      

      To reproduce, run the following commands:

      $ # prepare the db
      $ sysbench --test=oltp --mysql-user=root \
             --mysql-socket=/abuild/mysql-sysbench/sysbench-mysql.sock \
             --mysql-db=test --mysql-port=3306 \
             --oltp-dist-type=uniform --oltp-read-only --oltp-order-ranges=0 \
             --oltp-table-size=1000000 --oltp-range-size=1000 prepare
      $ # run the benchmark
      $ sysbench --test=oltp --mysql-user=root \
             --mysql-socket=/abuild/mysql-sysbench/sysbench-mysql.sock \
             --mysql-db=test --mysql-port=3306 \
             --oltp-dist-type=uniform --oltp-read-only --oltp-order-ranges=0 \
             --oltp-table-size=1000000 --oltp-range-size=1000 \
             --num-threads=$NTHREADS run
      $ # cleanup the db
      $ sysbench --test=oltp --mysql-user=root \
               --mysql-socket=/abuild/mysql-sysbench/sysbench-mysql.sock \
      	 --mysql-db=test --mysql-port=3306  cleanup >/dev/null 1>&2
      

      I am also aware that some testing by some colleague lead to similar regression in xfs (353.5476s in 10.6 vs 16.9560s in 10.5).

      Attachments

        Issue Links

          Activity

            wlad Vladislav Vaintroub added a comment - - edited

            Note, that while I do support fsync (if innodb_buffer_pool_size is not configured), I think you're pushing this 20x a little bit too much. First, why would anyone consider btrfs at all - that's not the default, and its read speed as shown by your benchmark, is questionable. So for a database, maybe I'd stay away from btrfs pretty much.

            Second, asking about how much buffer pool size to dedicate to database, might be considered the installer's or if you want, packager's responsibility . MariaDB's Windows MSI by default asks about 12.5% of RAM, and also lets user overwrite this, in both UI and silent mode. MSI has been doing it since 10 years, so maybe zypper also can consider asking user that one question.

            wlad Vladislav Vaintroub added a comment - - edited Note, that while I do support fsync (if innodb_buffer_pool_size is not configured), I think you're pushing this 20x a little bit too much. First, why would anyone consider btrfs at all - that's not the default, and its read speed as shown by your benchmark, is questionable. So for a database, maybe I'd stay away from btrfs pretty much. Second, asking about how much buffer pool size to dedicate to database, might be considered the installer's or if you want, packager's responsibility . MariaDB's Windows MSI by default asks about 12.5% of RAM, and also lets user overwrite this, in both UI and silent mode. MSI has been doing it since 10 years, so maybe zypper also can consider asking user that one question.

            Can't we make MariaDB/InnoDB emit a warning or suggestion to users if it detects that the below is not the case?

            If the working set fits in the default-size buffer pool, or if you have configured the buffer pool size to match the available RAM and buffer pool, it works well.

            We can't rely on deb/rpm/msi packaging here. Many systems run installs/upgrades in an automatic and unattended way, and there is no human to tell things to and even less to ask them to decide on something. And anyway workloads might change over time long after the installation was done. Best place to decide on good defaults is MariaDB itself and best place to tell users something is wrong or suboptimal is by emitting warnings in the error logs of MariaDB at runtime.

            otto Otto Kekäläinen added a comment - Can't we make MariaDB/InnoDB emit a warning or suggestion to users if it detects that the below is not the case? If the working set fits in the default-size buffer pool, or if you have configured the buffer pool size to match the available RAM and buffer pool, it works well. We can't rely on deb/rpm/msi packaging here. Many systems run installs/upgrades in an automatic and unattended way, and there is no human to tell things to and even less to ask them to decide on something. And anyway workloads might change over time long after the installation was done. Best place to decide on good defaults is MariaDB itself and best place to tell users something is wrong or suboptimal is by emitting warnings in the error logs of MariaDB at runtime.

            First, why would anyone consider btrfs at all

            The 20x difference is on xfs. My benchmark shows regression on brtfs as well. I don't think making assumptions on the filesystem used by default is good idea either.

            read speed as shown by your benchmark, is questionable

            Why is it questionable? Can you please explain better?

            danyspin97 Danilo Spinella added a comment - First, why would anyone consider btrfs at all The 20x difference is on xfs. My benchmark shows regression on brtfs as well. I don't think making assumptions on the filesystem used by default is good idea either. read speed as shown by your benchmark, is questionable Why is it questionable? Can you please explain better?

            There exist monitoring tools that can build on existing instrumentation in the server to highlight problems. I do not think that such logic should be duplicated in the database server itself. The server can provide raw data. It cannot guess the intentions of the user.

            My intuition suggests that any copy-on-write file system, such as xfs or btrfs, could be a bad match for a write-heavy InnoDB setup, or for any database that is based on B-trees and a circular write-ahead log.

            marko Marko Mäkelä added a comment - There exist monitoring tools that can build on existing instrumentation in the server to highlight problems. I do not think that such logic should be duplicated in the database server itself. The server can provide raw data. It cannot guess the intentions of the user. My intuition suggests that any copy-on-write file system, such as xfs or btrfs, could be a bad match for a write-heavy InnoDB setup, or for any database that is based on B-trees and a circular write-ahead log.
            timojyrinki Timo Jyrinki added a comment -

            For a future MariaDB LTS release, it could be beneficial to have fallback scenarios in case MDEV-19895 does not get implemented by then. The pool size default could be bumped to the mentioned 1GB, or reverting to fsync if innodb_buffer_pool_size not configured, maybe with an informational message suggesting to set the value. This would likely improve the out-of-the-box experience for many.

            With O_DIRECT we generally see good results given the 1GB pool size (or higher). The plea for a warning message only comes from the fact that the default does not seem like an optimal compromise, and that the new behavior needs manual intervention compared to more "forgiving" fsync.

            All in all, learning about these is an acceptable learning curve about MariaDB 10.6 and good documentation helps. O_DIRECT has been the correct default choice in general. Good (or at least ok) defaults would be always useful too, though.

            timojyrinki Timo Jyrinki added a comment - For a future MariaDB LTS release, it could be beneficial to have fallback scenarios in case MDEV-19895 does not get implemented by then. The pool size default could be bumped to the mentioned 1GB, or reverting to fsync if innodb_buffer_pool_size not configured, maybe with an informational message suggesting to set the value. This would likely improve the out-of-the-box experience for many. With O_DIRECT we generally see good results given the 1GB pool size (or higher). The plea for a warning message only comes from the fact that the default does not seem like an optimal compromise, and that the new behavior needs manual intervention compared to more "forgiving" fsync. All in all, learning about these is an acceptable learning curve about MariaDB 10.6 and good documentation helps. O_DIRECT has been the correct default choice in general. Good (or at least ok) defaults would be always useful too, though.

            People

              Unassigned Unassigned
              danyspin97 Danilo Spinella
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.