MariaDB Server / MDEV-26055

Adaptive flushing is still not getting invoked in 10.5.11

Details

    Description

      Hi Team,

      This issue is related to MDEV-25093. Adaptive flushing is not being invoked after the checkpoint age crosses the configured innodb_adaptive_flushing_lwm. I have attached a graph of the checkpoint age.

      My setup is a 3-node Galera cluster running MariaDB 10.5.11 with the galera 26.4.8-focal package, installed on Ubuntu 20.04 (Focal).

      MDEV-25093 is supposed to be fixed in 10.5.10, but I see similar behaviour in all MariaDB versions from 10.5.7 to 10.5.11. The graph I have shared is from 10.5.11.

      Below are the configs I used.

       
      MariaDB [(none)]> select version();
      +-------------------------------------------+
      | version()                                 |
      +-------------------------------------------+
      | 10.5.11-MariaDB-1:10.5.11+maria~focal-log |
      +-------------------------------------------+
       
       
      MariaDB [(none)]> show global variables like '%innodb_max_dirty_pages_pct%';
      +--------------------------------+-----------+
      | Variable_name                  | Value     |
      +--------------------------------+-----------+
      | innodb_max_dirty_pages_pct     | 90.000000 |
      | innodb_max_dirty_pages_pct_lwm | 10.000000 |
      +--------------------------------+-----------+
       
      MariaDB [(none)]> show global variables like '%innodb_adaptive_flushing%';
      +------------------------------+-----------+
      | Variable_name                | Value     |
      +------------------------------+-----------+
      | innodb_adaptive_flushing     | ON        |
      | innodb_adaptive_flushing_lwm | 10.000000 |
      +------------------------------+-----------+
       
      MariaDB [(none)]> show variables like '%innodb_flush%';
      +--------------------------------+----------+
      | Variable_name                  | Value    |
      +--------------------------------+----------+
      | innodb_flush_log_at_timeout    | 1        |
      | innodb_flush_log_at_trx_commit | 0        |
      | innodb_flush_method            | O_DIRECT |
      | innodb_flush_neighbors         | 1        |
      | innodb_flush_sync              | ON       |
      | innodb_flushing_avg_loops      | 30       |
      +--------------------------------+----------+
       
      MariaDB [(none)]> show global variables like '%io_capa%';
      +------------------------+-------+
      | Variable_name          | Value |
      +------------------------+-------+
      | innodb_io_capacity     | 3000  |
      | innodb_io_capacity_max | 5000  |
      +------------------------+-------+
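      With innodb_adaptive_flushing_lwm = 10, adaptive flushing is expected to engage once the checkpoint age exceeds 10% of the maximum checkpoint age. A minimal sketch of that trigger condition, using the Innodb_checkpoint_age figures reported in the SHOW STATUS output later in this report (this is a simplified model of the documented behaviour, not the actual buf0flu.cc heuristic, which also factors in redo generation rate and innodb_io_capacity):

```python
# Simplified model of the adaptive-flushing trigger condition
# (an assumption based on documented behaviour, not server source code).

def adaptive_flushing_should_run(checkpoint_age, max_checkpoint_age,
                                 adaptive_flushing_lwm=10.0):
    """True when the checkpoint age, as a percentage of the maximum
    checkpoint age, crosses innodb_adaptive_flushing_lwm."""
    age_pct = 100.0 * checkpoint_age / max_checkpoint_age
    return age_pct >= adaptive_flushing_lwm

# Values from the SHOW STATUS output in this report:
print(adaptive_flushing_should_run(26090991889, 26091216876))  # True
```

      By this check, adaptive flushing should have been running long before the checkpoint age approached its maximum, which is the behaviour this report says is missing.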
      
      

      I tried increasing and decreasing innodb_adaptive_flushing_lwm and innodb_max_dirty_pages_pct_lwm but am still facing the same issue.

      I also tweaked innodb_io_capacity and innodb_io_capacity_max with no luck. By the way, we are using NVMe disks for this setup.

      Below is the status output when Innodb_checkpoint_age reaches Innodb_checkpoint_max_age:

      MariaDB [(none)]> show status like 'Innodb_buffer_pool_pages%'; show status like 'Innodb_checkpoint_%';
       
      +-----------------------------------------+-----------+
      | Variable_name                           | Value     |
      +-----------------------------------------+-----------+
      | Innodb_buffer_pool_pages_data           | 10654068  |
      | Innodb_buffer_pool_pages_dirty          | 9425873   |
      | Innodb_buffer_pool_pages_flushed        | 100225151 |
      | Innodb_buffer_pool_pages_free           | 16726607  |
      | Innodb_buffer_pool_pages_made_not_young | 12247924  |
      | Innodb_buffer_pool_pages_made_young     | 565734067 |
      | Innodb_buffer_pool_pages_misc           | 0         |
      | Innodb_buffer_pool_pages_old            | 3932829   |
      | Innodb_buffer_pool_pages_total          | 27380675  |
      | Innodb_buffer_pool_pages_lru_flushed    | 0         |
      +-----------------------------------------+-----------+
      10 rows in set (0.000 sec)
       
      +---------------------------+-------------+
      | Variable_name             | Value       |
      +---------------------------+-------------+
      | Innodb_checkpoint_age     | 26090991889 |
      | Innodb_checkpoint_max_age | 26091216876 |
      +---------------------------+-------------+
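      To make the numbers above concrete, both configured low-water marks are far exceeded at this point, computed directly from the status output (a quick sanity check using only the figures reported above):

```python
# Sanity check on the reported SHOW STATUS values.
dirty = 9_425_873          # Innodb_buffer_pool_pages_dirty
total = 27_380_675         # Innodb_buffer_pool_pages_total
age = 26_090_991_889       # Innodb_checkpoint_age
max_age = 26_091_216_876   # Innodb_checkpoint_max_age

dirty_pct = 100.0 * dirty / total  # vs innodb_max_dirty_pages_pct_lwm = 10
age_pct = 100.0 * age / max_age    # vs innodb_adaptive_flushing_lwm = 10

print(f"dirty pages: {dirty_pct:.1f}%")   # ~34.4%, well above the 10% lwm
print(f"checkpoint age: {age_pct:.3f}%")  # ~99.999%, essentially at max age
```

      So roughly a third of the buffer pool is dirty and the checkpoint age has reached its ceiling, yet Innodb_buffer_pool_pages_lru_flushed is still 0.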
      
      

      iostat output:
       
      Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz  aqu-sz  %util
      dm-0            28.80    460.80     0.00   0.00    0.22    16.00 30756.60 992005.60     0.00   0.00    0.04    32.25    0.00      0.00     0.00   0.00    0.00     0.00    1.27  99.68
      dm-1             0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00   0.00
      dm-2             0.00      0.00     0.00   0.00    0.00     0.00    1.80     21.60     0.00   0.00    0.00    12.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00   0.48
      dm-3             0.00      0.00     0.00   0.00    0.00     0.00   13.60  31140.80     0.00   0.00    1.29  2289.76    0.00      0.00     0.00   0.00    0.00     0.00    0.02   6.32
      sda              0.00      0.00     0.00   0.00    0.00     0.00    0.80     27.20     6.00  88.24    0.25    34.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00   0.24
      sdb              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00   0.00
      sdc              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00   0.00
      sfd0n1          28.80    460.80     0.00   0.00    0.29    16.00 32618.60 1023168.00     0.00   0.00    0.05    31.37    0.00      0.00     0.00   0.00    0.00     0.00    0.16  99.68
      

      QPS around the issue

      [ 1252s ] thds: 256 tps: 69863.98 qps: 69863.98 (r/w/o: 0.00/49843.99/20019.99) lat (ms,99%): 5.88 err/s: 0.00 reconn/s: 0.00
      [ 1253s ] thds: 256 tps: 20750.41 qps: 20750.41 (r/w/o: 0.00/14817.86/5932.55) lat (ms,99%): 6.09 err/s: 0.00 reconn/s: 0.00
      [ 1254s ] thds: 256 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,99%): 0.00 err/s: 0.00 reconn/s: 0.00
      [ 1255s ] thds: 256 tps: 58797.66 qps: 58797.66 (r/w/o: 0.00/41615.64/17182.02) lat (ms,99%): 7.98 err/s: 0.00 reconn/s: 0.00
      [ 1256s ] thds: 256 tps: 67527.27 qps: 67527.27 (r/w/o: 0.00/48173.05/19354.22) lat (ms,99%): 7.43 err/s: 0.00 reconn/s: 0.00
      [ 1257s ] thds: 256 tps: 58384.64 qps: 58384.64 (r/w/o: 0.00/41503.59/16881.05) lat (ms,99%): 7.43 err/s: 0.00 reconn/s: 0.00
       
      [ 1273s ] thds: 256 tps: 69908.70 qps: 69908.70 (r/w/o: 0.00/49996.07/19912.63) lat (ms,99%): 6.09 err/s: 0.00 reconn/s: 0.00
      [ 1274s ] thds: 256 tps: 43327.15 qps: 43327.15 (r/w/o: 0.00/30641.10/12686.04) lat (ms,99%): 167.44 err/s: 0.00 reconn/s: 0.00
      [ 1275s ] thds: 256 tps: 32849.45 qps: 32849.45 (r/w/o: 0.00/23436.04/9413.42) lat (ms,99%): 118.92 err/s: 0.00 reconn/s: 0.00
      [ 1276s ] thds: 256 tps: 38600.51 qps: 38600.51 (r/w/o: 0.00/27316.24/11284.27) lat (ms,99%): 223.34 err/s: 0.00 reconn/s: 0.00
      [ 1277s ] thds: 256 tps: 41558.22 qps: 41558.22 (r/w/o: 0.00/29635.58/11922.64) lat (ms,99%): 15.83 err/s: 0.00 reconn/s: 0.00
      [ 1278s ] thds: 256 tps: 49127.92 qps: 49127.92 (r/w/o: 0.00/35045.52/14082.40) lat (ms,99%): 183.21 err/s: 0.00 reconn/s: 0.00
      [ 1279s ] thds: 256 tps: 47763.57 qps: 47763.57 (r/w/o: 0.00/33943.82/13819.74) lat (ms,99%): 15.27 err/s: 0.00 reconn/s: 0.00
      [ 1280s ] thds: 256 tps: 41276.96 qps: 41276.96 (r/w/o: 0.00/29521.97/11754.99) lat (ms,99%): 164.45 err/s: 0.00 reconn/s: 0.00
      [ 1281s ] thds: 256 tps: 40896.95 qps: 40896.95 (r/w/o: 0.00/29010.55/11886.41) lat (ms,99%): 164.45 err/s: 0.00 reconn/s: 0.00
      [ 1282s ] thds: 256 tps: 51294.76 qps: 51294.76 (r/w/o: 0.00/36633.83/14660.93) lat (ms,99%): 16.12 err/s: 0.00 reconn/s: 0.00
      [ 1283s ] thds: 256 tps: 36653.38 qps: 36653.38 (r/w/o: 0.00/26246.55/10406.82) lat (ms,99%): 227.40 err/s: 0.00 reconn/s: 0.00
      [ 1284s ] thds: 256 tps: 47463.39 qps: 47463.39 (r/w/o: 0.00/33757.41/13705.98) lat (ms,99%): 8.43 err/s: 0.00 reconn/s: 0.00
      [ 1285s ] thds: 256 tps: 44322.09 qps: 44322.09 (r/w/o: 0.00/31595.92/12726.16) lat (ms,99%): 176.73 err/s: 0.00 reconn/s: 0.00
      [ 1286s ] thds: 256 tps: 45233.40 qps: 45233.40 (r/w/o: 0.00/32176.58/13056.83) lat (ms,99%): 139.85 err/s: 0.00 reconn/s: 0.00
      [ 1287s ] thds: 256 tps: 43981.65 qps: 43981.65 (r/w/o: 0.00/31424.61/12557.04) lat (ms,99%): 137.35 err/s: 0.00 reconn/s: 0.00
      [ 1288s ] thds: 256 tps: 45362.25 qps: 45362.25 (r/w/o: 0.00/32337.89/13024.36) lat (ms,99%): 164.45 err/s: 0.00 reconn/s: 0.00
      [ 1289s ] thds: 256 tps: 47941.21 qps: 47941.21 (r/w/o: 0.00/34073.42/13867.80) lat (ms,99%): 15.55 err/s: 0.00 reconn/s: 0.00
      [ 1290s ] thds: 256 tps: 48869.94 qps: 48869.94 (r/w/o: 0.00/34709.96/14159.98) lat (ms,99%): 10.09 err/s: 0.00 reconn/s: 0.00
      [ 1291s ] thds: 256 tps: 46091.53 qps: 46091.53 (r/w/o: 0.00/32806.82/13284.71) lat (ms,99%): 10.09 err/s: 0.00 reconn/s: 0.00
      [ 1292s ] thds: 256 tps: 46741.92 qps: 46741.92 (r/w/o: 0.00/33258.23/13483.69) lat (ms,99%): 73.13 err/s: 0.00 reconn/s: 0.00
      [ 1293s ] thds: 256 tps: 49509.12 qps: 49509.12 (r/w/o: 0.00/35481.93/14027.18) lat (ms,99%): 66.84 err/s: 0.00 reconn/s: 0.00
      [ 1294s ] thds: 256 tps: 43450.71 qps: 43450.71 (r/w/o: 0.00/30950.49/12500.22) lat (ms,99%): 15.83 err/s: 0.00 reconn/s: 0.00
      [ 1295s ] thds: 256 tps: 48791.38 qps: 48791.38 (r/w/o: 0.00/34645.01/14146.37) lat (ms,99%): 183.21 err/s: 0.00 reconn/s: 0.00
      [ 1296s ] thds: 256 tps: 42562.93 qps: 42562.93 (r/w/o: 0.00/30257.95/12304.98) lat (ms,99%): 16.41 err/s: 0.00 reconn/s: 0.00
      [ 1297s ] thds: 256 tps: 65641.78 qps: 65641.78 (r/w/o: 0.00/46758.54/18883.24) lat (ms,99%): 7.56 err/s: 0.00 reconn/s: 0.00
      [ 1298s ] thds: 256 tps: 38736.94 qps: 38736.94 (r/w/o: 0.00/27624.96/11111.98) lat (ms,99%): 15.55 err/s: 0.00 reconn/s: 0.00
      [ 1299s ] thds: 256 tps: 51147.80 qps: 51147.80 (r/w/o: 0.00/36503.14/14644.66) lat (ms,99%): 16.12 err/s: 0.00 reconn/s: 0.00
      [ 1300s ] thds: 256 tps: 34908.85 qps: 34908.85 (r/w/o: 0.00/24826.60/10082.24) lat (ms,99%): 16.12 err/s: 0.00 reconn/s: 0.00
      [ 1301s ] thds: 256 tps: 60257.94 qps: 60257.94 (r/w/o: 0.00/42931.96/17325.98) lat (ms,99%): 13.46 err/s: 0.00 reconn/s: 0.00
      [ 1302s ] thds: 256 tps: 44607.33 qps: 44607.33 (r/w/o: 0.00/31644.10/12963.22) lat (ms,99%): 179.94 err/s: 0.00 reconn/s: 0.00
      [ 1303s ] thds: 256 tps: 52189.03 qps: 52189.03 (r/w/o: 0.00/37171.75/15017.28) lat (ms,99%): 7.43 err/s: 0.00 reconn/s: 0.00
       
      [ 1319s ] thds: 256 tps: 67309.85 qps: 67309.85 (r/w/o: 0.00/47982.61/19327.24) lat (ms,99%): 9.39 err/s: 0.00 reconn/s: 0.00
      [ 1320s ] thds: 256 tps: 6857.52 qps: 6857.52 (r/w/o: 0.00/4928.37/1929.15) lat (ms,99%): 189.93 err/s: 0.00 reconn/s: 0.00
      [ 1321s ] thds: 256 tps: 6017.51 qps: 6017.51 (r/w/o: 0.00/4291.65/1725.86) lat (ms,99%): 1618.78 err/s: 0.00 reconn/s: 0.00
      [ 1322s ] thds: 256 tps: 64763.63 qps: 64763.63 (r/w/o: 0.00/46201.32/18562.32) lat (ms,99%): 7.98 err/s: 0.00 reconn/s: 0.00
      
      

      QPS dips and disk utilization increases when MariaDB starts furious flushing.

      Attachments

        Issue Links

          Activity

            gktomar Gaurav Tomar created issue -

            [ 1319s ] thds: 256 tps: 67309.85 qps: 67309.85 (r/w/o: 0.00/47982.61/19327.24) lat (ms,99%): 9.39 err/s: 0.00 reconn/s: 0.00
            [ 1320s ] thds: 256 tps: 6857.52 qps: 6857.52 (r/w/o: 0.00/4928.37/1929.15) lat (ms,99%): 189.93 err/s: 0.00 reconn/s: 0.00
            [ 1321s ] thds: 256 tps: 6017.51 qps: 6017.51 (r/w/o: 0.00/4291.65/1725.86) lat (ms,99%): 1618.78 err/s: 0.00 reconn/s: 0.00
            [ 1322s ] thds: 256 tps: 64763.63 qps: 64763.63 (r/w/o: 0.00/46201.32/18562.32) lat (ms,99%): 7.98 err/s: 0.00 reconn/s: 0.00


            QPS dips and disk utilization increases when MariaDB starts furious flushing.
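Stalls like the one at 1320-1321s are easier to spot programmatically. Below is a hypothetical helper (not part of the original report) that parses the sysbench per-second lines and flags seconds whose TPS drops sharply below the recent average:

```python
import re

def parse_tps(lines):
    """Extract (second, tps) pairs from sysbench per-second report lines."""
    out = []
    for line in lines:
        m = re.search(r"\[\s*(\d+)s\s*\].*?tps:\s*([\d.]+)", line)
        if m:
            out.append((int(m.group(1)), float(m.group(2))))
    return out

def find_stalls(samples, drop_ratio=0.5, window=5):
    """Flag seconds whose TPS falls below drop_ratio times the mean of the preceding window."""
    stalls = []
    for i in range(window, len(samples)):
        baseline = sum(t for _, t in samples[i - window:i]) / window
        sec, tps = samples[i]
        if baseline > 0 and tps < drop_ratio * baseline:
            stalls.append(sec)
    return stalls
```

Running `find_stalls` over the log above flags the 1320s and 1321s samples, which coincide with the checkpoint-age spikes.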
            gktomar Gaurav Tomar made changes -
            Attachment BP data.png [ 58171 ]

            marko Marko Mäkelä added a comment -

            gktomar, can you please try to collect Innodb_buffer_pool_pages_flushed and Innodb_checkpoint_age during the workload?

            Also, can you share a minimal complete configuration for reproducing this? I do not think that Galera plays a role here, so the minimal configuration should be without Galera.
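One way to collect these counters during the workload is to poll SHOW GLOBAL STATUS on a timer. The sketch below is hypothetical (not from the ticket): it accepts any fetch_status callable, e.g. one built on a MariaDB Python connector, so the sampling logic itself stays connector-agnostic:

```python
import time

STATUS_VARS = ("Innodb_buffer_pool_pages_flushed", "Innodb_checkpoint_age")

def sample_status(fetch_status, seconds=60, interval=1.0, sleep=time.sleep):
    """Poll the requested status counters once per interval.

    fetch_status() must return a dict mapping status variable names to values,
    e.g. built from the rows of SHOW GLOBAL STATUS via any MariaDB connector.
    Returns a list of (tick, (pages_flushed, checkpoint_age)) tuples.
    """
    rows = []
    for tick in range(int(seconds / interval)):
        status = fetch_status()
        rows.append((tick, tuple(int(status[v]) for v in STATUS_VARS)))
        sleep(interval)
    return rows
```

Plotting checkpoint_age alongside the delta of pages_flushed per interval makes it easy to see whether flushing ramps up as the redo log fills.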
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ]

            marko Marko Mäkelä added a comment -

            I now see that Checkpoint.png is a graph of Innodb_checkpoint_age, but it is not clear to me what BP data.png represents.
            gktomar Gaurav Tomar added a comment -

            The estimated dirty limit is derived from the innodb_max_dirty_pages_pct variable, which is set to 90; the total data in the buffer pool is 163GB, hence the estimated dirty limit becomes 146.7GB. "Data dirty" means the modified db pages. What the BP data image shows is that dirty pages are flushed only once innodb_max_dirty_pages_pct is crossed, whereas according to the documentation flushing should have started after crossing innodb_max_dirty_pages_pct_lwm.
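As a sanity check on the arithmetic above (assuming the default 16 KiB InnoDB page size and the status numbers from the description):

```python
PAGE_SIZE = 16 * 1024   # bytes; the default InnoDB page size is assumed here

pages_data = 10654068   # Innodb_buffer_pool_pages_data from the description
pages_dirty = 9425873   # Innodb_buffer_pool_pages_dirty
max_dirty_pct = 90.0    # innodb_max_dirty_pages_pct

data_gb = pages_data * PAGE_SIZE / 1024**3        # ~163 GB of data pages
dirty_limit_gb = data_gb * max_dirty_pct / 100    # ~146 GB estimated dirty limit
dirty_gb = pages_dirty * PAGE_SIZE / 1024**3      # ~144 GB currently dirty
```

The dirty set sits just under the 90% limit, consistent with the claim that flushing only starts at the high watermark rather than at the 10% low watermark.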

            One other impact of this issue is that stopping MariaDB takes longer, since it has to flush all the dirty pages to disk during shutdown.

            This issue can be reproduced on a standalone MariaDB instance with the configuration below:

            {noformat}
            innodb_log_file_size = 2G
            innodb_buffer_pool_size = 32G
            innodb_file_per_table = 1
            innodb_open_files = 400
            innodb_io_capacity = 3000
            innodb_io_capacity_max = 5000
            innodb_flush_method = O_DIRECT
            innodb_flush_log_at_trx_commit = 0
            innodb_adaptive_flushing_lwm = 10
            innodb_max_dirty_pages_pct = 90
            innodb_max_dirty_pages_pct_lwm = 10
            innodb_flush_sync = 1
            {noformat}

            Innodb_buffer_pool_pages_flushed and Innodb_checkpoint_age around the issue are provided in the description of the ticket.

            krunalbauskar Krunal Bauskar added a comment - - edited

            I tried to re-validate the issue with 10.5.11 and found that the adaptive flushing does kick in as expected.
            ------------------------------------------------------------------------

            a. Started server with
            innodb_max_dirty_pages_pct=90
            innodb_max_dirty_pages_pct_lwm=90 (***** note the high value)
            (innodb_adaptive_flushing_lwm=10 (default))

            b. This means adaptive flushing will kick in when either

            (A): dirty_pages > 90% of the buffer pool pages
            ---- OR ----
            (B): redo log > 10% full.

            [I purposely used this configuration so that we see the effect of (B) before (A) kicks in].
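The two trigger conditions can be sketched as a simple predicate. This is a simplified model of the behaviour described in this comment, not the actual InnoDB page-cleaner code:

```python
def adaptive_flushing_active(dirty_pages, total_pages,
                             checkpoint_age, max_checkpoint_age,
                             max_dirty_pct=90.0, adaptive_lwm_pct=10.0):
    """True when either trigger condition from the comment above holds:
    (A) dirty pages exceed max_dirty_pct of the buffer pool, or
    (B) the redo log is more than adaptive_lwm_pct full."""
    dirty_pct = 100.0 * dirty_pages / total_pages
    redo_pct = 100.0 * checkpoint_age / max_checkpoint_age
    return dirty_pct > max_dirty_pct or redo_pct > adaptive_lwm_pct
```

With both watermarks at 90, only condition (B) can fire first, which is exactly what the configuration above is designed to demonstrate.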

            ------------------------------

            at-start
            =========

            {noformat}
            MariaDB [(none)]> show status like 'Innodb_buffer_pool%pages%'; show status like '%check%age%';

            Variable_name Value
            Innodb_buffer_pool_pages_data 2055511
            Innodb_buffer_pool_pages_dirty 0
            Innodb_buffer_pool_pages_flushed 0
            Innodb_buffer_pool_pages_free 525289
            Innodb_buffer_pool_pages_made_not_young 0
            Innodb_buffer_pool_pages_made_young 0
            Innodb_buffer_pool_pages_misc 0
            Innodb_buffer_pool_pages_old 758792
            Innodb_buffer_pool_pages_total 2580800
            Innodb_buffer_pool_pages_lru_flushed 0
            Variable_name Value
            Innodb_checkpoint_age 12
            Innodb_checkpoint_max_age 8696599327

            2 rows in set (0.000 sec)
            {noformat}

            -----------------

            adaptive flushing kicked in
            ======================

            {noformat}
            MariaDB [(none)]> show status like 'Innodb_buffer_pool%pages%'; show status like '%check%age%';

            Variable_name Value
            Innodb_buffer_pool_pages_data 2198223
            Innodb_buffer_pool_pages_dirty 142583
            Innodb_buffer_pool_pages_flushed 4158
            Innodb_buffer_pool_pages_free 382577
            Innodb_buffer_pool_pages_made_not_young 542
            Innodb_buffer_pool_pages_made_young 23750
            Innodb_buffer_pool_pages_misc 0
            Innodb_buffer_pool_pages_old 811433
            Innodb_buffer_pool_pages_total 2580800
            Innodb_buffer_pool_pages_lru_flushed 0
            Variable_name Value
            Innodb_checkpoint_age 1003881850
            Innodb_checkpoint_max_age 8696599327
            {noformat}

            ------------------------------

            Observations:
            1. Looking at output (2) above, the dirty page percentage is only about 5.5% (142583 of 2580800 pages), so our 90% threshold (A) is not satisfied.

            2. The checkpoint age (1003881850) is more than 10% of the maximum (8696599327), so (B) is satisfied and adaptive flushing kicks in.
            The effect of adaptive flushing can be seen in the increasing Innodb_buffer_pool_pages_flushed.
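            These percentages can be verified directly from the status output above (a quick sketch; the 90% and 10% thresholds come from the configuration quoted earlier):

```python
# Values taken from the SHOW STATUS output above.
total_pages = 2_580_800             # Innodb_buffer_pool_pages_total
dirty_pages = 142_583               # Innodb_buffer_pool_pages_dirty
checkpoint_age = 1_003_881_850      # Innodb_checkpoint_age
checkpoint_max_age = 8_696_599_327  # Innodb_checkpoint_max_age

dirty_pct = dirty_pages / total_pages * 100                # ~5.5%
redo_fill_pct = checkpoint_age / checkpoint_max_age * 100  # ~11.5%

# (A): dirty pages above innodb_max_dirty_pages_pct (90) -- not satisfied
condition_a = dirty_pct > 90
# (B): redo more than innodb_adaptive_flushing_lwm (10) percent full -- satisfied
condition_b = redo_fill_pct > 10

print(round(dirty_pct, 1), round(redo_fill_pct, 1), condition_a, condition_b)
```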

            krunalbauskar Krunal Bauskar added a comment - - edited

            Gaurav Tomar,

            1. As Marko pointed out, you should monitor the pages flushed.

            2. I see that your standalone configuration uses a pretty small redo log file size (2G).
            That means once you cross the ~200 MB mark (10% of 2G), you should see adaptive flushing becoming active.

            [There can be a thin line between meeting condition (A) or (B).
            To be sure, you can increase innodb_max_dirty_pages_pct_lwm so that
            (A) stays far away and (B) is met first.]

            3. Despite the said fix, there is a good chance that you may still hit a flush storm.
            Enabling adaptive flushing does not guarantee that pages are flushed fast enough to match
            the rate of dirty-page generation.

            This is another known issue: the adaptive flushing algorithm needs to be either more aggressive or better tuned to handle increasing pressure.
            I recall discussing this with Marko a while back; some experiments were done in https://jira.mariadb.org/browse/MDEV-25113 but I have not tested them yet.

            krunalbauskar Krunal Bauskar added a comment -
            gktomar Gaurav Tomar added a comment -

            @Krunal: Yes, it seems adaptive flushing is actually kicking in, but the rate of flushing is not enough to keep Innodb_checkpoint_age below Innodb_checkpoint_max_age.

            Will wait for the 10.5.12 release to test this scenario again.


            I do not think that anything has recently changed in the old adaptive flushing code. But we may have made that code obsolete.

            Slightly after the 10.5.11 release, we reduced the latency related to page flushing and LRU eviction, to ultimately address MDEV-25451: MDEV-25954, MDEV-25948, MDEV-25113, MDEV-26004.

            The reported MDEV-25113 scenario (read-only workload being slow after a read-write workload) was addressed by 3 fixes:

            1. removing buf_pool.flush_list_mutex acquisition from the page write completion callback
            2. (maybe this should have been called the MDEV-25451 fix) implementing a ‘soft’ aggressive flushing trigger
            3. (the main MDEV-25113 fix, postponed to MDEV-26004) reduce the latency of buf_LRU_get_free_block()

            It may be that triggering a little more aggressive flushing earlier would reduce or eliminate the need for the old adaptive flushing mechanism. In control theory, PID controllers are well established and understood. The pre-flushing trigger is an example of a Positional controller with no I or D component. The "position" would be the checkpoint age.

            If we still need the old adaptive flushing mechanism, I think that we should spend time to fully understand it, and conduct some experiments where we would collect more data and try to tune the P,I,D coefficients (if the old code really forms a PID controller).
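            To illustrate, a proportional ("P-only") controller over the checkpoint-age "position" could be sketched as follows. This is a hypothetical simplification for discussion, not the actual buf_flush_page_cleaner() code; flush_rate, lwm_pct, and io_capacity are made-up names:

```python
def flush_rate(checkpoint_age: int, max_age: int,
               lwm_pct: float = 10.0, io_capacity: int = 200) -> float:
    """Hypothetical P-controller: scale the page write rate linearly
    with how far the checkpoint age is past the low-water mark."""
    position = checkpoint_age / max_age * 100  # the "position", in percent
    if position <= lwm_pct:
        return 0.0                 # below the LWM: no adaptive flushing
    # Proportional term: error between the current position and the LWM,
    # normalized over the remaining headroom up to 100%.
    error = (position - lwm_pct) / (100 - lwm_pct)
    return min(error, 1.0) * io_capacity

print(flush_rate(0, 1000))        # 0.0 (idle)
print(flush_rate(500, 1000))      # ~88.9 pages/s at 50% redo fill
print(flush_rate(1000, 1000))     # 200.0 (full io_capacity)
```

            A full PID controller would add an integral term (to eliminate steady-state error) and a derivative term (to damp oscillation) on top of this proportional one.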

            marko Marko Mäkelä added a comment -
            serg Sergei Golubchik made changes -
            Priority Blocker [ 1 ] Major [ 3 ]

            gktomar, did you try MariaDB Server 10.5.12 yet?

            marko Marko Mäkelä added a comment -
            marko Marko Mäkelä made changes -
            Labels regression need_feedback regression
            gktomar Gaurav Tomar added a comment -

            @marko I'm currently testing this on 10.5.12, will get back with the results.


            gktomar, do you have any results yet?

            marko Marko Mäkelä added a comment -
            gktomar Gaurav Tomar added a comment -

            @Marko I tested the dirty pages flushing behavior in MariaDB 10.5.12, and below are my observations.

            1. Adaptive flushing is enabled only when innodb_max_dirty_pages_pct_lwm<>0, even if innodb_adaptive_flushing is set to ON.
            2. Adaptive flushing is triggered both when the Innodb_checkpoint_age crosses the configured innodb_adaptive_flushing_lwm and when dirty pages cross the configured innodb_max_dirty_pages_pct_lwm.
            3. Innodb_checkpoint_age does not reach Innodb_checkpoint_max_age in this version as it used to in 10.5.11.
            4. Instead, when Innodb_checkpoint_age reaches ~71% of the total redo log size, async flushing starts, keeping Innodb_checkpoint_age below Innodb_checkpoint_max_age. Since innodb_flush_sync is enabled in our setup, we see a burst of IO at this point (expected with furious flushing, a.k.a. sync flush, but I am not sure whether it should also happen with async flushing) and a small dip in QPS.

            [ 466s ] thds: 36 tps: 8597.20 qps: 171976.99 (r/w/o: 120391.79/34389.80/17195.40) lat (ms,99%): 5.99 err/s: 0.00 reconn/s: 0.00
            [ 467s ] thds: 36 tps: 6853.07 qps: 137122.31 (r/w/o: 95985.92/27431.26/13705.13) lat (ms,99%): 6.79 err/s: 0.00 reconn/s: 0.00
            [ 468s ] thds: 36 tps: 8804.10 qps: 176017.02 (r/w/o: 123205.41/35202.40/17609.20) lat (ms,99%): 5.99 err/s: 0.00 reconn/s: 0.00

            [ 696s ] thds: 36 tps: 8634.82 qps: 172760.46 (r/w/o: 120947.52/34541.29/17271.65) lat (ms,99%): 6.21 err/s: 0.00 reconn/s: 0.00
            [ 697s ] thds: 36 tps: 6767.29 qps: 135356.78 (r/w/o: 94745.04/27077.16/13534.58) lat (ms,99%): 7.04 err/s: 0.00 reconn/s: 0.00
            [ 698s ] thds: 36 tps: 8811.16 qps: 176146.18 (r/w/o: 123289.22/35235.64/17621.32) lat (ms,99%): 6.55 err/s: 0.00 reconn/s: 0.00

            gktomar Gaurav Tomar made changes -
            Attachment checkpoint_10_5_12.png [ 59017 ]
            gktomar Gaurav Tomar added a comment - - edited

            Here is the graph for checkpoint age in 10.5.12.


            gktomar, thank you. It sounds like the flush-ahead patch is working as designed and preventing a deeper stall.

            We may still want to do something about adaptive flushing, to have some background flushing prevent even that small drop of performance. On faster storage (such as NVMe), the write bursts should be less of a problem.

            I have been running some benchmarks on 10.6 today. On a SATA SSD, when having about 40GiB of data in a 30GiB buffer pool (so that LRU eviction will be constantly happening during the oltp_update_non_index workload), I am seeing rather stable throughput. If I increase the buffer pool size to 40GiB so that no LRU eviction will take place, then I am seeing 5-second intervals with 0 tps every now and then. We might dismiss that as a user error, because innodb_io_capacity was configured way too high (for my NVMe, not the much slower SATA SSD).

            marko Marko Mäkelä added a comment -

            It remains to be seen how much MDEV-26626 would improve this.

            marko Marko Mäkelä added a comment -
            serg Sergei Golubchik made changes -
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.6 [ 24028 ]
            serg Sergei Golubchik made changes -
            Labels need_feedback regression regression
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 123139 ] MariaDB v4 [ 142967 ]
            shalogo Leonard Sinquer added a comment - - edited

            I have the same issue on 10.6.7. Regardless of `innodb_io_capacity` or `innodb_adaptive_flushing_lwm`, the checkpoint age keeps growing until it reaches 75%, and then it is flushed furiously.

            I agree with @Gaurav Tomar that:

            Adaptive flushing is enabled only when `innodb_max_dirty_pages_pct_lwm<>0`, even if `innodb_adaptive_flushing` is set to `on`.

            When I set `innodb_max_dirty_pages_pct_lwm=0.000001`, adaptive flushing seems to kick in properly. With `innodb_max_dirty_pages_pct_lwm=0`, adaptive flushing never kicks in, even when `innodb_adaptive_flushing_lwm` is reached.
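            Based on this observation, a possible workaround (assuming the behaviour described above; the exact value is illustrative) is to set the low-water mark to a tiny non-zero value at runtime:

```sql
-- Workaround sketch: any non-zero value re-enables the dirty-page trigger.
SET GLOBAL innodb_max_dirty_pages_pct_lwm = 0.000001;

-- Verify the resulting settings:
SHOW GLOBAL VARIABLES LIKE 'innodb_max_dirty_pages_pct_lwm';
SHOW GLOBAL VARIABLES LIKE 'innodb_adaptive_flushing%';
```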


            shalogo, yes, the value innodb_max_dirty_pages_pct_lwm=0 is special ("use innodb_max_dirty_pages_pct"). For a few releases between MDEV-23855 and MDEV-24537, that special meaning had been lost. See also MDEV-27295.

            axel mentioned some time ago that the adaptive flushing does not work as well as he would like to. I think that some test parameters and results are needed before we can improve this area.

            marko Marko Mäkelä added a comment -
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Axel Schwenke [ axel ]

            MySQL Bug #74637 has been filed for an interesting idea. Rephrased in the MariaDB 10.5+ terminology: If the innodb_max_dirty_pages_pct or innodb_max_dirty_pages_pct_lwm condition is satisfied but we are not yet near the log_free_check() limit (neither buf_flush_async_lsn nor buf_flush_sync_lsn has been set), then buf_flush_page_cleaner() could write out pages from buf_pool.LRU instead of buf_pool.flush_list.

            marko Marko Mäkelä added a comment -
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            marko Marko Mäkelä made changes -
            Assignee Axel Schwenke [ axel ] Marko Mäkelä [ marko ]
            marko Marko Mäkelä made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]

            There are two parameters related to background flushing: innodb_max_dirty_pages_pct_lwm>0 (default 0) and innodb_adaptive_flushing=ON (default ON).

            marko Marko Mäkelä added a comment -

            I noticed that there is one more parameter related to adaptive flushing: innodb_adaptive_flushing_lwm (default: 10 percent of the log capacity). That could enable adaptive flushing even when innodb_max_dirty_pages_pct_lwm=0. If the innodb_log_file_size is large and the workload consists of update-in-place of non-indexed columns and involves a lot of thrashing (the working set is larger than the buffer pool), such checkpoint-oriented adaptive flushing may not help much. You’d additionally need to set innodb_max_dirty_pages_pct_lwm to the maximum allowed percentage of dirty pages in the buffer pool before page writes would be initiated.

            marko Marko Mäkelä added a comment -

            jgagne suggested to me that 1 second is a long time to wait for buf_flush_page_cleaner() to free up some pages. But in an oltp_update_index test with innodb_adaptive_flushing_lwm=0, I observed a significant throughput regression from reducing the scheduling interval to 0.1 seconds:

            revision throughput/tps 99th-percentile latency/ms
            10.6 67a6ad0a4a36bd59fabbdd6b1cdd38de54e82c79 123737.93 0.30
            e6b4c770dba6f04c22dc2f66e7133f202b877f61 122920.34 0.30
            52d65c66fdb5fd1554e6aefcb81587490424d8ee 111422.78 0.39

            This was a quick 120-second run with 16 concurrent connections, 4GiB buffer pool and log, 8×2M rows.

            The first revision (making buf_flush_page_cleaner() stock the buf_pool.free list) could be worth broader testing.

            marko Marko Mäkelä added a comment -

            I revised the main commit, still based on the same 10.6 parent commit 67a6ad0a4a36bd59fabbdd6b1cdd38de54e82c79. I did not run any performance tests yet, but I think that this one should work more accurately and with fewer operations on buf_pool.mutex. I think that the original commit may have been wrongly evicting pages that had been freed.

            marko Marko Mäkelä added a comment -
            marko Marko Mäkelä made changes -
            Fix Version/s 10.6.13 [ 28514 ]
            Fix Version/s 10.8.8 [ 28518 ]
            Fix Version/s 10.9.6 [ 28520 ]
            Fix Version/s 10.10.4 [ 28522 ]
            Fix Version/s 10.11.3 [ 28524 ]
            Fix Version/s 11.0.2 [ 28706 ]
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.6 [ 24028 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]

            Many performance fixes on top of this were made in MDEV-26827.

            marko Marko Mäkelä added a comment -

            axel managed to reproduce a couple of times a strange anomaly where the system throughput would soar for a while, and the buf_flush_page_cleaner thread would remain idle.

            It looks like in MDEV-24369, a multiplication by 100 for calculating pct_for_dirty was inadvertently deleted. This follow-up fix should address that and make the adaptive flushing proportionally reduce the page write rate below innodb_io_capacity/s. By MDEV-26827, any ‘unused’ page writes in each one-second slot would be used for writing out up to innodb_lru_scan_depth dirty pages from the end of buf_pool.LRU list.
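            A simplified sketch of the arithmetic involved (hypothetical names, not the actual InnoDB source) shows why the lost multiplication matters:

```python
def pct_for_dirty(dirty_pages: int, total_pages: int, max_dirty_pct: float) -> float:
    """Simplified sketch of the dirty-page pressure factor.
    Returns the fraction (0..1) of innodb_io_capacity to use."""
    dirty_pct = dirty_pages / total_pages * 100  # the "* 100" that went missing
    return min(dirty_pct / max_dirty_pct, 1.0)

# With the multiplication, 45% dirty pages against a 90% limit gives half pressure:
print(pct_for_dirty(45, 100, 90.0))   # 0.5
# Without the "* 100", the ratio would be 0.45 / 90 = 0.005 -- almost no
# flushing, consistent with the observed idle page cleaner thread.
```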

            marko Marko Mäkelä added a comment -

            Something similar to jeanfrancois.gagne’s idea had been implemented in MDEV-6932 but probably lost when the InnoDB changes from MySQL 5.7.9 were applied to MariaDB Server 10.2.2.

            marko Marko Mäkelä added a comment -

            In order to fix a frequently failing regression test due to a timeout (MDEV-32681), I ported this change to 10.5 as well.

            I observed a small improvement when running a 120-second, 16-client Sysbench oltp_update_index test on Intel Optane 960 NVMe, with 4 GiB buffer pool and log size, 8×2M rows, and innodb_flush_log_at_trx_commit=0. This is an I/O bound workload; the data directory would grow to 12 or 13 GiB (that is, 8 or 9 GiB of data files in addition to the 4 GiB ib_logfile0).

            revision throughput/tps 99th-percentile latency/ms max latency/ms sum of latency/ms
            baseline 113246.16 0.59 35.80 1913510.12
            patch 116098.43 0.46 60.58 1914245.80

            The maximum latency as well as the sum of latency are slightly worse, but those are a little random in nature. We can observe some improvement in the throughput as well as in the typical latency.

            marko Marko Mäkelä added a comment -
            marko Marko Mäkelä made changes -
            Fix Version/s 10.5.24 [ 29517 ]

            People

              Assignee: marko Marko Mäkelä
              Reporter: gktomar Gaurav Tomar