Perf regression from removing innodb_flush_method=O_DIRECT_NO_FSYNC




      Back when I did web-scale InnoDB I always set innodb_flush_method to O_DIRECT_NO_FSYNC. At one point we had a bug in the FB patch for MySQL similar to the bug described in this comment, but we fixed that and upstream MySQL has a correct implementation of it.

      The MariaDB docs for innodb_flush_method now include this claim, and this claim is news to me. Do you have more detail as to why O_DIRECT_NO_FSYNC isn't good with XFS?
      "Not suitable for XFS filesystems."

      Finally, this is an example of a performance regression from not having O_DIRECT_NO_FSYNC. I ran a subset of the insert benchmark and then timed how long it took for the server to shut down.
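For anyone who wants to reproduce the measurement, here is a minimal sketch of one way to time a shutdown. It is not necessarily how the numbers in this report were collected. It assumes the shutdown command and the server's PID file path are supplied by the caller, and that the server removes its PID file once it has fully stopped; `time_shutdown` is a hypothetical helper name.

```python
import os
import subprocess
import time


def time_shutdown(cmd, pid_file, poll_secs=0.5):
    """Run a shutdown command and return the seconds until the PID file is gone.

    cmd      -- argument list for the shutdown command (an assumption, e.g. mysqladmin shutdown)
    pid_file -- path to the server's PID file (an assumption about the setup)
    """
    start = time.monotonic()
    # Ask the server to shut down; raises CalledProcessError if the command fails.
    subprocess.run(cmd, check=True)
    # The command may return before the server has finished flushing dirty pages,
    # so keep polling until the PID file has been removed.
    while os.path.exists(pid_file):
        time.sleep(poll_secs)
    return time.monotonic() - start
```

A call might look like `time_shutdown(["mysqladmin", "shutdown"], "/path/to/mysqld.pid")`, where both the client command and the path are placeholders that depend on the installation.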

      The server in this case is a 32-core (yes, real cores) AMD with 128G RAM, Ubuntu 22.04, XFS and SW RAID 10 across 2 NVMe devices. The my.cnf files are here for MariaDB 10.11, for MariaDB 11.4 and for MySQL 8.0.36.

      From the results below, the shutdown is much faster with O_DIRECT_NO_FSYNC for the dbms that support it (MariaDB 10.11, MySQL 8.0.36).


      • a - innodb_flush_method=O_DIRECT_NO_FSYNC
      • b - innodb_flush_method=O_DIRECT (or equivalent)
      • c - innodb_flush_method=fsync (or equivalent)


      • 10.11.7, 11.4.1 - MariaDB with InnoDB
      • 8.0.36 - MySQL with InnoDB

      This is from 24 clients, 20M rows/table, table/client == 480M rows

      Numbers are seconds for shutdown (n/a = config not available for that dbms)
      dbms      a      b      c
      10.11.7   117    454    1247
      11.4.1    n/a    529    1259
      8.0.36    76     2667   3390

      This is from 24 clients, 10M rows/table, table/client == 240M rows
      dbms      a      b      c
      10.11.7   87     254    732
      11.4.1    n/a    258    730
      8.0.36    37     1972   2478
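To put a number on "much faster", the shutdown times can be expressed as a slowdown relative to config a. The sketch below copies the values from the two result tables in this report and treats the missing 11.4.1 entry for config a as unavailable; `SHUTDOWN_SECS` and `slowdown_vs_a` are names made up for this illustration.

```python
# Shutdown times in seconds, copied from the result tables in this report.
# None marks a configuration with no reported result.
SHUTDOWN_SECS = {
    "480M rows": {
        "10.11.7": {"a": 117, "b": 454, "c": 1247},
        "11.4.1": {"a": None, "b": 529, "c": 1259},
        "8.0.36": {"a": 76, "b": 2667, "c": 3390},
    },
    "240M rows": {
        "10.11.7": {"a": 87, "b": 254, "c": 732},
        "11.4.1": {"a": None, "b": 258, "c": 730},
        "8.0.36": {"a": 37, "b": 1972, "c": 2478},
    },
}


def slowdown_vs_a(times):
    """Return how many times slower configs b and c are than config a."""
    base = times["a"]
    if base is None:
        # No O_DIRECT_NO_FSYNC result to compare against.
        return {}
    return {cfg: secs / base for cfg, secs in times.items() if cfg != "a"}


for workload, by_dbms in SHUTDOWN_SECS.items():
    for dbms, times in by_dbms.items():
        for cfg, ratio in slowdown_vs_a(times).items():
            print(f"{workload} {dbms} {cfg}: {ratio:.1f}x slower than a")
```

For example, MySQL 8.0.36 with 480M rows takes 2667 / 76, or roughly 35 times longer, to shut down with config b than with config a.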

      If I look at PMP call stacks during the shutdown, this is a common stack with MariaDB 11.4.1 showing that things are frequently waiting on the fsync for the doublewrite buffer:

      When I look at stack traces for configs a and b (a=O_DIRECT_NO_FSYNC, b=O_DIRECT) for MySQL 8.0.36, I also see many stack traces where things are waiting on the doublewrite buffer fsync.



