MariaDB Server / MDEV-35155

Performance degradation and instability observed on 10.6.19

Details

    Description

      I plan to migrate our MariaDB instances from `10.2.10` to `10.6.19` and have run some performance benchmarks. I observed that performance is not stable compared to `10.2.10`, especially for the in-memory workload.

      Here is my test setup.
      Test tool: sysbench 1.0.X
      OS: CentOS 7.9 X86_64
      MariaDB versions: 10.2.10 and 10.6.19
      Dataset: 10 tables, each with 5M rows; each table is ~1.2 GB, so the total size is ~12 GB
      Almost all config options are the same, except that I removed some options which are deprecated or removed in 10.6, e.g. `innodb_buffer_pool_instances`, `innodb_page_cleaners`, `innodb-thread-concurrency`, `innodb_checksum_algorithm`, etc.
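
      (As a side note, one way to double-check which old options a given server binary still accepts is to grep the option list the binary prints itself; this is only a sketch, and the option name in the grep is just an example:)

      # list every option the 10.6 binary recognizes; an option that does not appear
      # here (e.g. innodb-buffer-pool-instances) was removed and must be dropped from
      # my.cnf, otherwise the server will refuse to start
      mariadbd --no-defaults --verbose --help 2>/dev/null | grep -i 'buffer-pool-instances'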

      Test 1.
      In-memory working set, with `innodb_buffer_pool_size`=188GB
      > NOTE:
      > TPS-X means running the sysbench `oltp_read_write.lua` test with X threads

      10.2.10 (TPS chart; see attached screenshots)

      10.6.19 (TPS chart; see attached screenshots)

      We can see periodic performance drops with version `10.6.19`. `10.6.19` stays stable only in the 4-thread case, while `10.2.10`'s performance is stable at 4, 8, 16, and 32 threads.

      Test 2:
      Disk I/O-bound test with `innodb_buffer_pool_size=2G`

      10.2.10 (TPS chart; see attached screenshots)

      10.6.19 (TPS chart; see attached screenshots)

      You can see that `10.2.10` is also more stable than `10.6.19` here.

      Attachments

        1. 10.6.20_write_only.zip (1.08 MB)
        2. image-2024-10-15-11-41-04-176.png (44 kB)
        3. image-2024-10-15-11-43-21-120.png (60 kB)
        4. image-2024-10-15-11-46-38-886.png (101 kB)
        5. image-2024-10-15-11-47-23-300.png (140 kB)
        6. image-2024-10-16-14-52-49-455.png (59 kB)
        7. image-2024-10-16-16-03-42-724.png (30 kB)
        8. screenshot-1.png (56 kB)
        9. screenshot-2.png (42 kB)
        10. screenshot-3.png (40 kB)
        11. screenshot-4.png (44 kB)
        12. screenshot-5.png (40 kB)
        13. screenshot-6.png (145 kB)
        14. screenshot-7.png (39 kB)


          Activity

            lujinke Luke added a comment - - edited

            Performance degradation was also observed in the io-bound workload test.
            In the in-memory workload, 10.6.19 and 10.2.10 can achieve almost the same TPS and the CPU usage is also nearly identical. But in the io-bound workload test, 10.6.19's performance is far behind that of 10.2.10 in terms of average TPS:

            test     TPS of 10.2.10    TPS of 10.6.19 (percentage of 10.2.10)
            TPS-4    125.7586111       116.6411667 (92.7%)
            TPS-8    187.4434444       164.4717778 (87.7%)
            TPS-16   253.4766667       209.8410556 (82.7%)
            TPS-32   321.2640556       248.6147778 (77.4%)

            As concurrency increases, the gap widens: 10.6.19 drops from 92.7% of 10.2.10's TPS at 4 threads to 77.4% at 32 threads.
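
            (For reference, the average TPS figures above can be computed from the per-interval sysbench output with a small script along these lines; a sketch, assuming the logs/oltp_t*.log files produced by the test script in the comment below:)

            # each --report-interval line looks like: "[ 5s ] thds: 8 tps: 187.40 qps: ..."
            for f in logs/oltp_t*.log; do
                awk '/ tps: / { for (i = 1; i <= NF; i++) if ($i == "tps:") { sum += $(i + 1); n++ } }
                     END { if (n) printf "%s: avg tps = %.2f over %d intervals\n", FILENAME, sum / n, n }' "$f"
            done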

            wlad Vladislav Vaintroub added a comment - - edited

            lujinke, it would be helpful to provide the sysbench command line and the my.cnf you are using, so that the benchmark results can be reproduced exactly (maybe also describe the hardware, although that is of less importance).

            lujinke Luke added a comment - - edited

            Sure. Here is how to reproduce the sysbench test.

            sysbench version : 1.0.20
            OS               : CentOS Linux release 7.9.2009 (Core)
            KERNEL           : 3.10.0-1160.31.1.el7.x86_64
            CPU              : Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
            CPU COUNT        : 48
            RAM              : 256 GB
            MariaDB          : 10.2.10 and 10.6.19
            Filesystem       : xfs
            

            1. Dataset: as I mentioned before, create 10 tables, each with 5M rows

            sysbench /usr/share/sysbench/oltp_read_write.lua --mysql-host=localhost --mysql-user=sbtest --mysql-password=sbtest --mysql-socket=/tmp/mysql.sock --mysql-db=sbtest --threads=4 --tables=10 --table-size=5000000 prepare
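
            (After the prepare step, the on-disk dataset size can be sanity-checked; the path below assumes the datadir and innodb-file-per-table layout from the my.cnf further down, and should show roughly 1.2 GB per table, ~12 GB in total:)

            # per-table and total size of the sysbench dataset
            du -ch /mysqldb/data/sbtest/sbtest*.ibd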
            

            2. The test script

            #!/bin/bash
            # the first 4-thread run is intended to warm up the buffer pool
            mkdir -p logs
            for i in 4 4 8 16 32 64
            do
                    echo "run sysbench with $i threads"
                    sysbench /usr/share/sysbench/oltp_read_write.lua --mysql-host=remote_host --mysql-user=sbtest --mysql-password=sbtest --mysql-port=3306 --mysql-db=sbtest --threads=$i --tables=10 --table-size=5000000 --report-interval=5 --time=900 run > logs/oltp_t$i.log
                    echo "sleep 60 seconds ..."
                    sleep 60                # 10.2.10 flushes dirty data intensively here, after each run finishes, while 10.6.19 flushes dirty data intensively during the workload
            done
             
            echo "done!"
            

            3. The config file

            [mysqld]
            server-id                       = 6301
            socket                          = /tmp/mysql.sock
            port                            = 3306
             
            character-set-server            = utf8mb4
            collation-server                = utf8mb4_general_ci
            default-storage-engine          = InnoDB
            transaction-isolation           = read-committed
             
            basedir                   = /usr
            datadir                         = /mysqldb/data
            innodb-data-home-dir            = /mysqldb/data
            innodb-log-group-home-dir       = /mysqldb/data
            log-bin                         = /mysqldb/binlogs/mysql-bin
            log-error                       = /mysqldb/logs/error.log
            relay-log                       = /mysqldb/logs/mysql-relay.log
            relay-log-space-limit           = 25G
            relay-log-index                 = /mysqldb/logs/mysql-relay-log.index
            relay-log-info-file             = /mysqldb/logs/mysql-relay-log.info
            slow-query-log-file             = /mysqldb/logs/slow-query.log
            tmpdir                          = /mysqldb/tmp
             
            key-buffer-size                 = 8G
            myisam-sort-buffer-size         = 256M
            max-sort-length                 = 256
            myisam-max-sort-file-size       = 120G
            myisam-recover-options          = DEFAULT
             
             
            # innodb_buffer_pool_instances  = 8  # removed for 10.6.19 test
            # innodb_checksum_algorithm     = innodb  # removed for 10.6.19 test
             
            innodb_strict_mode              = 0
            sql_mode                        = ''
            binlog_checksum         = none
            gtid_strict_mode                = 0
            innodb_temp_data_file_path      = ibtmp1:12M:autoextend
            # innodb_page_cleaners            = 8              # removed for 10.6.19 test
            plugin_load_add                 = metadata_lock_info
             
            max-allowed-packet              = 32M
            max-connections                 = 1000
            max-connect-errors              = 100
            skip-name-resolve
            wait-timeout                    = 86400
             
            bulk-insert-buffer-size         = 128M
            join-buffer-size                = 64M
            max-heap-table-size             = 640M
            open-files-limit                = 128000
            query-cache-limit               = 8M
            query-cache-size                = 0
            query-cache-type                = 1
            read-buffer-size                = 8M
            read-rnd-buffer-size            = 8M
            sort-buffer-size                = 14M
            table-cache                     = 10000
            table-definition-cache          = 40K
            thread-cache-size               = 400
            tmp-table-size                  = 640M
            userstat                        = ON
            optimizer-switch                = 'extended_keys=on'
             
            innodb-adaptive-hash-index      = 1
            # innodb-locks-unsafe-for-binlog  = 0          # removed for 10.6.19 test
            #innodb-concurrency-tickets     = 5000         # removed for 10.6.19 test
            innodb-io-capacity              = 2000
            # innodb-thread-concurrency       = 0      # removed for 10.6.19 test
            # innodb-thread-sleep-delay       = 0        # removed for 10.6.19 test
            innodb-max-dirty-pages-pct      = 20
            innodb-write-io-threads         = 4
            # General InnoDB options
            innodb-autoextend-increment     = 8
            innodb-buffer-pool-size         = 188G      # in-memory test
            # innodb-buffer-pool-size         = 2G       # io-bound test
            # innodb-checksums                = 1              # removed for 10.6.19 test
            innodb-data-file-path           = ibdata1:1024M;ibdata2:10M:autoextend
            innodb-doublewrite              = 1
            # innodb-file-format              = 1           # removed for 10.6.19 test
            innodb-file-per-table           = 1
            innodb-flush-log-at-trx-commit  = 2
            innodb-flush-method             = O_DIRECT
            innodb-lock-wait-timeout        = 400
            innodb-log-buffer-size          = 32M
            innodb-log-file-size            = 4000M
            innodb_purge_threads            = 8
            innodb-open-files               = 96000
            innodb-read-ahead-threshold     = 32
            innodb-read-io-threads          = 8
            innodb-stats-on-metadata        = OFF
            # innodb-support-xa               = 1         # removed for 10.6.19 test
            innodb-use-native-aio           = 1
             
            slow-query-log                  = ON
            log-output                      = TABLE,FILE
             
            binlog-cache-size               = 1M
            binlog-format                   = MIXED
            expire-logs-days                = 7
            log-slave-updates
            max-binlog-cache-size           = 16G
            max-binlog-size                 = 256M
            slave-transaction-retries       = 60
            sync-binlog                     = 1
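
            (To double-check that the two servers end up with comparable effective settings, the full variable list can be dumped on each instance and diffed; a sketch:)

            # run on each instance, then diff the two files
            mysql -uroot -S /tmp/mysql.sock -Nse "SHOW GLOBAL VARIABLES" | sort > vars_10.6.19.txt
            # ... same against the 10.2.10 instance > vars_10.2.10.txt, then:
            # diff vars_10.2.10.txt vars_10.6.19.txt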
            

            BTW: the 10.2.10 buffer pool dashboard above contains only 3 runs; TPS-4 is not included in that dashboard, and the 3 segments correspond to TPS-8, TPS-16 and TPS-32.


            marko Marko Mäkelä added a comment -

            There are a couple of "adaptive flushing" settings that could be configured, or maybe the logic around them could be improved.

            If you set innodb_max_dirty_pages_pct_lwm=1 or some even smaller positive value, then InnoDB should write out dirty pages proportionally to innodb_io_capacity. This should reduce the growth of the checkpoint age, at the risk of increasing write amplification (not being able to combine multiple modifications to a page within a single write). By default, this logic is disabled.

            There is also a setting innodb_adaptive_flushing_lwm, which you could try to set to a smaller nonnegative value than the default 10. I am not very familiar with the associated logic; it could be that it is not working as it was originally intended.
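
            (Both variables are dynamic, so they can be changed between sysbench runs without a server restart; a sketch, with purely illustrative values:)

            # quick A/B test of the two settings mentioned above
            mysql -uroot -S /tmp/mysql.sock -e "
                SET GLOBAL innodb_max_dirty_pages_pct_lwm = 1;
                SET GLOBAL innodb_adaptive_flushing_lwm   = 5;"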

            axel Axel Schwenke added a comment -

            lujinke what you see looks familiar to me. It is the impact of log flushing (flushing pages in order to make transactions persistent before they are overwritten in the REDO log). The exact reason why that happens at all is still unclear. But it seems to be connected to the new adaptive flushing algorithm.

            Configuration options that helped me to mitigate the effects:

            1. A bigger redo log. With the new recovery code it is now safe to make the redo log big, even as big as the buffer pool. I have gotten good results with a redo log size of 1/4 to 1/2 of the buffer pool.
            2. Tuning innodb_io_capacity. I guess your datadir is on an SSD? Then please set innodb_flush_neighbors = 0 and increase innodb_io_capacity. I use innodb_io_capacity = 10000.
            3. Tuning adaptive flushing. I have

              innodb_adaptive_flushing = 1
              innodb_max_dirty_pages_pct = 95
              innodb_max_dirty_pages_pct_lwm = 75
              

              in my.cnf.
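
              (Put together as a my.cnf excerpt, the suggestions above might look like the following sketch; the redo log size assumes the 188G buffer pool from the in-memory test, and all values are starting points rather than tuned recommendations:)

              # consolidated sketch of the mitigations above (illustrative values)
              innodb_log_file_size            = 48G      # ~1/4 of the 188G buffer pool
              innodb_io_capacity              = 10000    # assumes the datadir is on SSD
              innodb_flush_neighbors          = 0
              innodb_adaptive_flushing        = 1
              innodb_max_dirty_pages_pct      = 95
              innodb_max_dirty_pages_pct_lwm  = 75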


            People

              axel Axel Schwenke
              lujinke Luke

              Votes: 0
              Watchers: 7

