  MariaDB Server / MDEV-29967

innodb_read_ahead_threshold (linear read-ahead) does not work

Details

    Description

      We upgraded a MariaDB Galera Cluster from version 10.5.13 to 10.9.3. Everything is up and running from a client's point of view.

      However, we see a regression in I/O performance when running mariadb-dump. A custom script (dumping and compressing every table into a separate file) ran for about 45 minutes with MariaDB 10.5.13; with MariaDB 10.9.3 the time increased to about two and a half hours.

      A simple cp can do about 500 MB/s on the machines, but with mariadb-dump we get about 25 MB/s, limited by disk I/O (mariadbd is in I/O wait, state "D"). MariaDB 10.5.13 did about 100 MB/s, limited by compression, so it could probably do a lot more.

      This happens with all available I/O methods; we tried aio, uring and native. Different filesystems (ext4 and btrfs) make no difference either. The results are the same on a cluster node and on a standalone (non-Galera) installation.

      We have no idea whether I/O for regular SQL queries regressed as well. The machines have a reasonable amount of RAM, so no delay is noticeable there. Running mariabackup for Galera state transfer reaches the expected transfer rates.

          Activity

            I reviewed the logic of buf_read_ahead_random(), and I do not notice any regression there. I see that the default is innodb_random_read_ahead=OFF; only linear read-ahead is enabled by default. Linear read-ahead would be disabled if innodb_read_ahead_threshold=0; the default value is 56 and the maximum is 64.

            The next step is to test with innodb_random_read_ahead=ON and innodb_read_ahead_threshold=0 to see if the random read-ahead will kick in.

            marko Marko Mäkelä added a comment
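            For reference, that proposed next step can be expressed as follows (a minimal sketch; both variables are dynamic, so no server restart should be needed):

            -- check the current read-ahead settings (defaults:
            -- innodb_random_read_ahead=OFF, innodb_read_ahead_threshold=56)
            show global variables like 'innodb%read_ahead%';
            -- disable linear read-ahead and enable random read-ahead
            SET GLOBAL innodb_read_ahead_threshold=0;
            SET GLOBAL innodb_random_read_ahead=ON;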

            Random read-ahead seems to be fine. I added a wait for everything to be purged, so that we can compare the total execution time between 10.4 and 10.5.

            --source include/have_innodb.inc
            --source include/have_sequence.inc
            create table t1 engine=innodb as select seq from seq_1_to_10000000;
            let $datadir=`select @@datadir`;
            --source include/shutdown_mysqld.inc
            remove_file $datadir/ib_buffer_pool;
            --source include/start_mysqld.inc
            show global status like 'innodb_buffer_pool_read%';
            SET GLOBAL innodb_max_purge_lag_wait=0;
            show global status like 'innodb_buffer_pool_read%';
            DROP TABLE t1;
            

            ./mtr --mysqld=--innodb-{buffer-pool,log-file}-size=1g --mysqld=--innodb-read-ahead-threshold=0 --mysqld=--innodb-random-read-ahead=ON --mysqld=--innodb-purge-threads=32 main.name_of_test
            

            10.4 84b9fc25a29b94a37eb9d5ac2e2c0f75c0efafda

            show global status like 'innodb_buffer_pool_read%';
            Variable_name	Value
            Innodb_buffer_pool_read_ahead_rnd	283
            Innodb_buffer_pool_read_ahead	0
            Innodb_buffer_pool_read_ahead_evicted	0
            Innodb_buffer_pool_read_requests	292045
            Innodb_buffer_pool_reads	515
            SET GLOBAL innodb_max_purge_lag_wait=0;
            show global status like 'innodb_buffer_pool_read%';
            Variable_name	Value
            Innodb_buffer_pool_read_ahead_rnd	24883
            Innodb_buffer_pool_read_ahead	0
            Innodb_buffer_pool_read_ahead_evicted	0
            Innodb_buffer_pool_read_requests	31552786
            Innodb_buffer_pool_reads	32008
            …
            Spent 102.762 of 110 seconds executing testcases
            

            10.5 d821fd7fab2723b84f774aaa4882acb86b35769d

            show global status like 'innodb_buffer_pool_read%';
            Variable_name	Value
            Innodb_buffer_pool_read_ahead_rnd	383
            Innodb_buffer_pool_read_ahead	0
            Innodb_buffer_pool_read_ahead_evicted	0
            Innodb_buffer_pool_read_requests	408332
            Innodb_buffer_pool_reads	640
            SET GLOBAL innodb_max_purge_lag_wait=0;
            show global status like 'innodb_buffer_pool_read%';
            Variable_name	Value
            Innodb_buffer_pool_read_ahead_rnd	24883
            Innodb_buffer_pool_read_ahead	0
            Innodb_buffer_pool_read_ahead_evicted	0
            Innodb_buffer_pool_read_requests	40023169
            Innodb_buffer_pool_reads	32001
            …
            Spent 64.291 of 67 seconds executing testcases
            

            We can see that 10.5 was a little more eager to issue random read-ahead requests during the test. Despite there being 25% more read requests from the buffer pool (we should probably find the reason why), 10.5 completed the test much faster. During the wait for purge, 10.4 was using some 660% of CPU, despite there being 32 purge threads available; 10.5 was using only 100% of CPU during that phase, possibly due to MDEV-16678 and related changes. If there had been DML into multiple tables or multiple partitions of a table, it should have been able to use more threads. Starting with 10.6, enabling the MDEV-515 bulk load would avoid the need to purge individual rows altogether.

            marko Marko Mäkelä added a comment
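            As a side note on the last point, a minimal sketch of how the MDEV-515 bulk load can be used in 10.6 and later when loading an empty table, assuming the documented requirement that both unique_checks and foreign_key_checks are disabled for the session:

            -- with both checks disabled, an insert into an empty table can use
            -- table-level undo logging (bulk insert), leaving no individual
            -- rows to purge afterwards
            SET unique_checks=0, foreign_key_checks=0;
            INSERT INTO t1 SELECT seq FROM seq_1_to_10000000;
            SET unique_checks=1, foreign_key_checks=1;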

            origin/bb-10.5-MDEV-29967 9e1ff0435dfa31c16d3138e802971dda8b25e3b1 2023-05-10T12:43:17+03:00
            behaved well in RQG testing.

            mleich Matthias Leich added a comment

            In 10.6, the test would not trigger any read-ahead, because the CREATE…SELECT statement is executed with table-level rather than row-level undo logging (MDEV-515). It would complete for me in about 31 seconds. A test with row-level undo logging would start like this:

            create table t1(a int primary key) engine=innodb;
            insert into t1 select seq from seq_1_to_10000000;
            

            10.6 c271057288f71746d1816824f338f2d9c47f67c1

            show global status like 'innodb_buffer_pool_read%';
            Variable_name	Value
            Innodb_buffer_pool_read_ahead_rnd	0
            Innodb_buffer_pool_read_ahead	0
            Innodb_buffer_pool_read_ahead_evicted	0
            Innodb_buffer_pool_read_requests	5903
            Innodb_buffer_pool_reads	413
            SET GLOBAL innodb_max_purge_lag_wait=0;
            show global status like 'innodb_buffer_pool_read%';
            Variable_name	Value
            Innodb_buffer_pool_read_ahead_rnd	0
            Innodb_buffer_pool_read_ahead	14720
            Innodb_buffer_pool_read_ahead_evicted	0
            Innodb_buffer_pool_read_requests	670313
            Innodb_buffer_pool_reads	24061
            …
            Spent 48.995 of 52 seconds executing testcases
            

            marko Marko Mäkelä added a comment

            In MDEV-30986 I just found out that setting innodb_random_read_ahead=ON may improve read performance for some workloads. It is disabled by default.

            marko Marko Mäkelä added a comment
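            For anyone affected by this, a quick way to check whether that setting helps (a sketch; the counter name matches the status output shown above):

            SET GLOBAL innodb_random_read_ahead=ON;
            show global status like 'innodb_buffer_pool_read_ahead_rnd';
            -- run the read-heavy workload (for example mariadb-dump) here, then:
            show global status like 'innodb_buffer_pool_read_ahead_rnd';
            -- a growing counter confirms that random read-ahead is kicking in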

            People

              Assignee: marko Marko Mäkelä
              Reporter: eworm Christian Hesse