MariaDB Server / MDEV-13190

InnoDB write-only performance regression

Details

    Description

      Starting with MariaDB 10.2.2, there is a heavy performance regression for write-only workloads at low thread counts.

      thread count   10.2.1 QPS   10.2.2 QPS   Change
                 1       7233.9       4622.5     -36%
                 8        43185        32226     -25%
                16        75237        61746     -18%
                32       130255       114321     -12%
                64       168282       173745      +3%
               128       179679       205590     +14%
               256       179779       209018     +16%
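The Change column can be reproduced from the two QPS columns. A minimal Python sketch, with the figures copied from the table above; the helper name `percent_change` is made up for this illustration:

```python
# QPS figures copied from the table above: threads -> (10.2.1 QPS, 10.2.2 QPS)
QPS = {
    1: (7233.9, 4622.5),
    8: (43185, 32226),
    16: (75237, 61746),
    32: (130255, 114321),
    64: (168282, 173745),
    128: (179679, 205590),
    256: (179779, 209018),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Round to whole percent, as in the table.
changes = {
    threads: round(percent_change(old, new))
    for threads, (old, new) in QPS.items()
}

for threads, change in changes.items():
    print(f"{threads:>3} threads: {change:+d}%")
```

The sign flips between 32 and 64 threads, which is where 10.2.2 starts to overtake 10.2.1 in this data set.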

      Benchmark: sysbench OLTP with --oltp_simple_ranges=0 --oltp-distinct-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-point-selects=0 --oltp_non_index_updates=9 --oltp_index_updates=9. The data set is 32 tables with 10 million rows in total (about 2.5 GB of tablespaces). The datadir resides on a RAID-0 of two SSDs.

      my.cnf

      [mysqld]
      performance_schema = 0
      secure-file-priv = /tmp
       
      max_connections = 600
      table_open_cache = 1200
      query_cache_type = 0
       
      innodb-file-per-table = true
      innodb-flush-method = O_DIRECT
      innodb-buffer-pool-size = 16G
      innodb_log_file_size = 2G
      innodb_log_buffer_size = 8M
      innodb_buffer_pool_instances = 8
      loose-innodb_adaptive_hash_index_parts = 16
      loose-innodb_adaptive_hash_index_partitions = 16
      innodb_io_capacity = 5000
      loose-innodb_flush_neighbors = 0
      innodb_write_io_threads = 8
      

      This issue popped up during work on MDEV-10123 (see there for more numbers). This could be the same issue as MDEV-11937.

      Attachments

        1. MDEV-13190.pdf
          31 kB
        2. sysbench_de.pdf
          24 kB
        3. sysbench_ps.pdf
          24 kB
        4. sysbench_read_uncommitted.pdf
          16 kB

          Activity

            marko Marko Mäkelä added a comment -

            I believe that this is simply caused by the merge of MySQL 5.7.9 into MariaDB 10.2.2.
            Because Jan Lindström did that, I am assigning this to him for now.

            marko Marko Mäkelä added a comment -

            I am assigning this back to me.

            I wonder how much we have improved since those days. I think that 10.5.12 or 10.6.4 should already perform considerably better, and 10.6.5 includes some more improvements.

            marko Marko Mäkelä added a comment -

            Today, I observed 130ktps at 16 threads on 10.6 when testing the Linux kernel io_uring hang (MDEV-26674). But the hardware is newer and the workload is different, so this is not directly comparable. Besides, it was with an insanely small redo log, to trigger very frequent page writes in order to test the kernel bug. During furious page flushing (frequent checkpoints due to the intentional misconfiguration), the throughput dropped to the claimed 10.2 level (60ktps).
            axel Axel Schwenke added a comment -

            Things got no better in recent releases:

            thread count   10.2.1 QPS   10.2.2 QPS   10.6.5 QPS
                       1       7233.9       4622.5       4874.7
                       8        43185        32226        32260
                      16        75237        61746        59867
                      32       130255       114321        98753
                      64       168282       173745       155256
                     128       179679       205590       191656
                     256       179779       209018       203569
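To put the 10.6.5 column in relation to both older releases, the relative differences can be computed per thread count. This is a rough Python sketch using only the figures from the table above; the names `ROWS`, `vs_1021` and `vs_1022` are illustrative:

```python
# threads -> (10.2.1 QPS, 10.2.2 QPS, 10.6.5 QPS), copied from the table above
ROWS = {
    1: (7233.9, 4622.5, 4874.7),
    8: (43185, 32226, 32260),
    16: (75237, 61746, 59867),
    32: (130255, 114321, 98753),
    64: (168282, 173745, 155256),
    128: (179679, 205590, 191656),
    256: (179779, 209018, 203569),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# 10.6.5 relative to the pre-regression release and to the regressed release.
vs_1021 = {t: round(percent_change(a, c), 1) for t, (a, b, c) in ROWS.items()}
vs_1022 = {t: round(percent_change(b, c), 1) for t, (a, b, c) in ROWS.items()}

print("threads  vs 10.2.1  vs 10.2.2")
for t in ROWS:
    print(f"{t:>7}  {vs_1021[t]:>+8.1f}%  {vs_1022[t]:>+8.1f}%")
```

In these numbers 10.6.5 stays below 10.2.1 up to 64 threads, and from 16 threads upward it is also below 10.2.2.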
            axel Axel Schwenke added a comment -

            Attached a summary of all InnoDB-related benchmarks from the regression suite for 10.2.1, 10.2.2 and 10.6.5 - MDEV-13190.pdf

            axel Axel Schwenke added a comment -

            I have rerun the benchmarks on the current kernel (with mitigations against SPECTRE & co) and have also recompiled everything, so all versions use the same compiler and libraries.

            I still see an increase in latency, and accordingly a decrease in performance, at low thread counts for 10.2.2 (vs. 10.2.1). 10.6.5 looks quite good in direct comparison. For read-only it is better due to a faster collation algorithm. For writes 10.6 is just better.

            Since the regression vanishes at higher load, it seems like the extra time is spent waiting in the kernel with the user threads suspended.

            Attached: sysbench_de.pdf (direct execution), sysbench_ps.pdf (prepared statements)
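The link between latency and throughput at a fixed thread count can be illustrated with Little's law. The sketch below assumes a closed-loop benchmark with no think time, so that every client thread has exactly one query in flight; that assumption is not stated in this report. The rerun figures are only in the attached PDFs, so the example uses the single-thread QPS from the original description:

```python
def mean_latency_us(threads, qps):
    """Little's law for a closed loop: latency = concurrency / throughput.

    Returns the mean time per query in microseconds, assuming each of the
    `threads` clients always has exactly one query outstanding.
    """
    return threads / qps * 1e6


# Single-thread QPS taken from the table in the description.
lat_1021 = mean_latency_us(1, 7233.9)
lat_1022 = mean_latency_us(1, 4622.5)

print(f"10.2.1: {lat_1021:.1f} us per query")
print(f"10.2.2: {lat_1022:.1f} us per query")
print(f"extra:  {lat_1022 - lat_1021:.1f} us per query")
```

Under that assumption a fixed extra wait per query costs a single thread its full share of throughput, while at high thread counts other threads can run during the wait, which fits the observation that the regression vanishes under load.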


            marko Marko Mäkelä added a comment -

            Thank you. Some regression for updates at low thread counts is still present. I hope that applying https://www.brendangregg.com/offcpuanalysis.html in a similar way as in MDEV-26004 could identify the bottleneck.

            The regression for point selects happens at any thread count. I wonder if using the READ UNCOMMITTED isolation level would make it go away. Possibly MDEV-21423 could address this.
            axel Axel Schwenke added a comment -

            Tested with transaction-isolation = 'READ-UNCOMMITTED'. For point-selects the performance didn't change at all; for read-only it actually became slower.

            Attached: sysbench_read_uncommitted.pdf


            People

              marko Marko Mäkelä
              axel Axel Schwenke