Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19747

Deprecate and ignore innodb_log_optimize_ddl

Details

    Description

      MariaDB Server 10.2.2 disabled redo logging for certain ALTER TABLE operations. In MDEV-16809 (10.2.17, 10.3.9) a possibility to enable the redo logging was introduced, by SET GLOBAL innodb_log_optimize_ddl=OFF. By default, the redo logging remains disabled. Backup can be much slower if such non-logged ALTER TABLE are performed while mariabackup --backup is running.

      MDEV-19738 in 10.5 will skip the InnoDB doublewrite buffer in some cases. But, this cannot be done safely if redo logging can be disabled during ALTER TABLE. Hence, to get full advantage of MDEV-19738, we should change the default value of innodb_log_optimize_ddl to OFF, and possibly deprecate and ignore this parameter altogether.

      The reason why I would consider making innodb_log_optimize_ddl read-only is that with MDEV-19738, changing the setting at runtime soon before or after affected ALTER TABLE is not crash-safe if the server is killed during a page write. But on the other hand, for ‘persons who know what they are doing’, maybe we should allow the parameter to modified while the server is running. SET GLOBAL requires special privileges anyway.

      Before changing the default value, we should find out how much the speed of ALTER TABLE operations is affected by the change. With innodb_log_optimize_ddl=OFF, the extra page flushing will be avoided, and the affected forms of ALTER TABLE should become durable solely via redo log. Page flushing for the ALTER TABLE should be a little faster, because the doublewrite buffer can be avoided.

      The affected forms of ALTER TABLE are:

      --echo # rebuilding a table and all its indexes
      ALTER TABLE t1 FORCE, ALGORITHM=INPLACE, LOCK=SHARED;
      --echo # creating secondary indexes (one or several)
      ALTER TABLE t1 ADD INDEX (b), ADD INDEX (c), ALGORITHM=INPLACE, LOCK=SHARED;
      

      I would like to know the performance of this kind of ALTER TABLE on larger tables, with both settings of innodb_log_optimize_ddl.

      Attachments

        Issue Links

          Activity

            I executed this crude benchmark:

            --source include/have_innodb.inc
            --source include/have_sequence.inc
             
            SET @saved_frequency = @@GLOBAL.innodb_purge_rseg_truncate_frequency;
            SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
             
            CREATE TABLE t1 (
             a BIGINT PRIMARY KEY,
             b CHAR(255) NOT NULL DEFAULT '',
             c CHAR(255) NOT NULL DEFAULT '',
             d CHAR(255) NOT NULL DEFAULT '',
             e CHAR(255) NOT NULL DEFAULT '',
             f CHAR(255) NOT NULL DEFAULT '',
             g CHAR(255) NOT NULL DEFAULT '',
             h CHAR(255) NOT NULL DEFAULT ''
            ) ENGINE=InnoDB;
            INSERT INTO t1 (a) SELECT seq FROM seq_1_to_100000;
             
            SET profiling = 1;
            --enable_info
            ALTER TABLE t1 ADD INDEX(b),ADD INDEX(c),ADD INDEX(d),
            ADD INDEX(e(10)),ADD INDEX(f(4)),ADD INDEX(g(5)),ADD INDEX(h(5));
            ALTER TABLE t1 FORCE;
            --disable_info
            SHOW PROFILES;
            SET profiling = 0;
             
            show status like 'innodb_d%_writes';
            FLUSH TABLES t1 FOR EXPORT;
            show status like 'innodb_d%_writes';
             
            UNLOCK TABLES;
            DROP TABLE t1;
            SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency;
            

            ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=1} innodb.benchmark
            ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=0} innodb.benchmark
            

            The results:

            step log_optimize_ddl=1 log_optimize_ddl=0
            ADD INDEX 3.94s 7.10s
            FORCE 3.25s 13.49s
            total test time 18.7s 32.4s

            Note: the "affected rows: 0" for a non-empty table means that native ALTER TABLE was used.

            The time for rebuilding tables is much longer. In 10.5, we expect this to go down with the redo log format improvements (MDEV-12353 and MDEV-14425). Until then, maybe we should allow the parameter to be set while the server is running, so that users can take the risk and temporarily disable redo logging for large table-rebuilding ALTER operations.

            We should also keep in mind that thanks to MDEV-11369 and various subtasks of MDEV-11424, many ALTER TABLE operations are instantaneous, and table rebuilds should not occur very often.

            I would be interested in seeing how much the innodb_dblwr_writes is reduced (and performance is improved) on generic DML workloads. Maybe it is enough justification to change the parameter to the safe default.

            marko Marko Mäkelä added a comment - I executed this crude benchmark: --source include/have_innodb.inc --source include/have_sequence.inc   SET @saved_frequency = @@ GLOBAL .innodb_purge_rseg_truncate_frequency; SET GLOBAL innodb_purge_rseg_truncate_frequency=1;   CREATE TABLE t1 ( a BIGINT PRIMARY KEY , b CHAR (255) NOT NULL DEFAULT '' , c CHAR (255) NOT NULL DEFAULT '' , d CHAR (255) NOT NULL DEFAULT '' , e CHAR (255) NOT NULL DEFAULT '' , f CHAR (255) NOT NULL DEFAULT '' , g CHAR (255) NOT NULL DEFAULT '' , h CHAR (255) NOT NULL DEFAULT '' ) ENGINE=InnoDB; INSERT INTO t1 (a) SELECT seq FROM seq_1_to_100000;   SET profiling = 1; --enable_info ALTER TABLE t1 ADD INDEX (b), ADD INDEX (c), ADD INDEX (d), ADD INDEX (e(10)), ADD INDEX (f(4)), ADD INDEX (g(5)), ADD INDEX (h(5)); ALTER TABLE t1 FORCE ; --disable_info SHOW PROFILES; SET profiling = 0;   show status like 'innodb_d%_writes' ; FLUSH TABLES t1 FOR EXPORT; show status like 'innodb_d%_writes' ;   UNLOCK TABLES; DROP TABLE t1; SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency; . /mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=1} innodb.benchmark . /mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=0} innodb.benchmark The results: step log_optimize_ddl=1 log_optimize_ddl=0 ADD INDEX 3.94s 7.10s FORCE 3.25s 13.49s total test time 18.7s 32.4s Note: the "affected rows: 0" for a non-empty table means that native ALTER TABLE was used. The time for rebuilding tables is much longer. In 10.5, we expect this to go down with the redo log format improvements ( MDEV-12353 and MDEV-14425 ). Until then, maybe we should allow the parameter to be set while the server is running, so that users can take the risk and temporarily disable redo logging for large table-rebuilding ALTER operations. We should also keep in mind that thanks to MDEV-11369 and various subtasks of MDEV-11424 , many ALTER TABLE operations are instantaneous, and table rebuilds should not occur very often. I would be interested in seeing how much the innodb_dblwr_writes is reduced (and performance is improved) on generic DML workloads. Maybe it is enough justification to change the parameter to the safe default.

            I plan to deprecate and ignore the parameter innodb_log_optimize_ddl=OFF as part of the MDEV-12353 implementation. The different redo log format should alleviate any performance regressions.

            marko Marko Mäkelä added a comment - I plan to deprecate and ignore the parameter innodb_log_optimize_ddl=OFF as part of the MDEV-12353 implementation. The different redo log format should alleviate any performance regressions.

            I repeated the microbenchmark after implementing some optimization for MDEV-12353. The physical format is generating slightly more redo log.

            I amended the test case with a few reads of innodb_lsn_current (log_sys.lsn):

            --source include/have_innodb.inc
            --source include/have_sequence.inc
             
            SET @saved_frequency = @@GLOBAL.innodb_purge_rseg_truncate_frequency;
            SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
             
            CREATE TABLE t1 (
             a BIGINT PRIMARY KEY,
             b CHAR(255) NOT NULL DEFAULT '',
             c CHAR(255) NOT NULL DEFAULT '',
             d CHAR(255) NOT NULL DEFAULT '',
             e CHAR(255) NOT NULL DEFAULT '',
             f CHAR(255) NOT NULL DEFAULT '',
             g CHAR(255) NOT NULL DEFAULT '',
             h CHAR(255) NOT NULL DEFAULT ''
            ) ENGINE=InnoDB;
            INSERT INTO t1 (a) SELECT seq FROM seq_1_to_100000;
             
            SET profiling = 1;
            show status like 'innodb_lsn_current';
            --enable_info
            ALTER TABLE t1 ADD INDEX(b),ADD INDEX(c),ADD INDEX(d),
            ADD INDEX(e(10)),ADD INDEX(f(4)),ADD INDEX(g(5)),ADD INDEX(h(5));
            show status like 'innodb_lsn_current';
            ALTER TABLE t1 FORCE;
            --disable_info
            SHOW PROFILES;
            SET profiling = 0;
            show status like 'innodb_lsn_current';
             
            show status like 'innodb_d%_writes';
            FLUSH TABLES t1 FOR EXPORT;
            show status like 'innodb_d%_writes';
             
            UNLOCK TABLES;
            DROP TABLE t1;
            SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency;
            

            The execution time with the original settings

            ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=1} innodb.benchmark
            ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=0} innodb.benchmark
            

            did not change much. However, the default redo log size must be so small that it is forcing page flushes. Let us fix that:

            ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-file-size=512m,log-optimize-ddl=1} innodb.benchmark
            ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-file-size=512m,log-optimize-ddl=1} innodb.benchmark
            

            This will shrink the test time as follows:

            step log_optimize_ddl=1 log_optimize_ddl=0 MDEV-12353
            ADD INDEX 3.84s 2.94s 2.73s
            FORCE 3.21s 3.44s 3.45s
            entire test 8.68s 8.01s 7.87s

            I also collected statistics on the redo log bytes written:

            step log_optimize_ddl=1 log_optimize_ddl=0 MDEV-12353
            setup 197,240,744 197,250,640 198,351,963
            ADD INDEX 3,512,609 25,724,276 28,557,329
            FORCE 1,419,512 219,959,416 220,527,231

            So, it looks like with a reasonably large innodb_log_file_size, setting innodb_log_file_size=OFF may slightly improve performance. In MDEV-12353, the setting is deprecated and ignored. We can also note that the redo log records in the physical redo log format of MDEV-12353 are slightly larger than the old physiological format. The situation could slightly improve after btr_page_reorganize() has been rewritten to write less log. There should be some room to reduce the log record volume further without changing the format.

            marko Marko Mäkelä added a comment - I repeated the microbenchmark after implementing some optimization for MDEV-12353 . The physical format is generating slightly more redo log. I amended the test case with a few reads of innodb_lsn_current ( log_sys.lsn ): --source include/have_innodb.inc --source include/have_sequence.inc   SET @saved_frequency = @@ GLOBAL .innodb_purge_rseg_truncate_frequency; SET GLOBAL innodb_purge_rseg_truncate_frequency=1;   CREATE TABLE t1 ( a BIGINT PRIMARY KEY , b CHAR (255) NOT NULL DEFAULT '' , c CHAR (255) NOT NULL DEFAULT '' , d CHAR (255) NOT NULL DEFAULT '' , e CHAR (255) NOT NULL DEFAULT '' , f CHAR (255) NOT NULL DEFAULT '' , g CHAR (255) NOT NULL DEFAULT '' , h CHAR (255) NOT NULL DEFAULT '' ) ENGINE=InnoDB; INSERT INTO t1 (a) SELECT seq FROM seq_1_to_100000;   SET profiling = 1; show status like 'innodb_lsn_current' ; --enable_info ALTER TABLE t1 ADD INDEX (b), ADD INDEX (c), ADD INDEX (d), ADD INDEX (e(10)), ADD INDEX (f(4)), ADD INDEX (g(5)), ADD INDEX (h(5)); show status like 'innodb_lsn_current' ; ALTER TABLE t1 FORCE ; --disable_info SHOW PROFILES; SET profiling = 0; show status like 'innodb_lsn_current' ;   show status like 'innodb_d%_writes' ; FLUSH TABLES t1 FOR EXPORT; show status like 'innodb_d%_writes' ;   UNLOCK TABLES; DROP TABLE t1; SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency; The execution time with the original settings ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=1} innodb.benchmark ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=0} innodb.benchmark did not change much. However, the default redo log size must be so small that it is forcing page flushes. Let us fix that: ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-file-size=512m,log-optimize-ddl=1} innodb.benchmark ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-file-size=512m,log-optimize-ddl=1} innodb.benchmark This will shrink the test time as follows: step log_optimize_ddl=1 log_optimize_ddl=0 MDEV-12353 ADD INDEX 3.84s 2.94s 2.73s FORCE 3.21s 3.44s 3.45s entire test 8.68s 8.01s 7.87s I also collected statistics on the redo log bytes written: step log_optimize_ddl=1 log_optimize_ddl=0 MDEV-12353 setup 197,240,744 197,250,640 198,351,963 ADD INDEX 3,512,609 25,724,276 28,557,329 FORCE 1,419,512 219,959,416 220,527,231 So, it looks like with a reasonably large innodb_log_file_size , setting innodb_log_file_size=OFF may slightly improve performance. In MDEV-12353 , the setting is deprecated and ignored. We can also note that the redo log records in the physical redo log format of MDEV-12353 are slightly larger than the old physiological format. The situation could slightly improve after btr_page_reorganize() has been rewritten to write less log. There should be some room to reduce the log record volume further without changing the format.

            Given that there does not appear to be any performance regression when using a reasonably sized redo log file, I think that we can go one step further and deprecate and ignore the parameter. In MDEV-12353 there is no replacement for the MLOG_INDEX_LOAD record, which was written when some redo log was skipped.

            marko Marko Mäkelä added a comment - Given that there does not appear to be any performance regression when using a reasonably sized redo log file, I think that we can go one step further and deprecate and ignore the parameter. In MDEV-12353 there is no replacement for the MLOG_INDEX_LOAD record, which was written when some redo log was skipped.

            I ran a bigger benchmark, with 10× the rows (seq_1_to_1000000) and the data directory on PCIe NVMe:

            ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=1g,log-file-size=1g,log-files-in-group=1,flush-log-at-trx-commit=1,log-optimize-ddl=1} innodb.benchmark1m
            ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=1g,log-file-size=1g,log-files-in-group=1,flush-log-at-trx-commit=1,log-optimize-ddl=0} innodb.benchmark1m
            

            Before ADD INDEX, the t1.ibd file size is 2.5 GiB, and before the FORCE it is 3.3GiB. The tested revisions were as follows:

            • 10.4 9398c3dfa5f8c2b2b5bc51dbf079e3edd343ae9e
            • 10.5 2655984ed4d22cf6eeec59637e1bf31d78cac679
            step 10.4 log_optimize_ddl=1 10.4 log_optimize_ddl=0 10.5
            ADD INDEX 76.9s 39.7s 30.6s
            FORCE 45.5s 43.9s 37.7s
            entire test 154.9s 122.2s 100.3s

            Even with a larger test, we can observe that innodb_log_optimize_ddl=0 (reverting the change in MySQL 5.7 to write less log on ALTER TABLE) is actually reducing performance, when the redo log file is reasonably large. The reason for this ought to be the extra flushing of the buffer pool right before the redo logging is re-enabled.

            The variable innodb_lsn_current is not available before 10.5. On 10.5, I tried one more variant: performing FORCE, ALGORITHM=COPY as the last step. That should write roughly the same amount of pages, but the pre-sorting of index records will be disabled.

            10.5 step LSN,INPLACE time LSN,COPY time
            setup 1,927,534,933 32.0s 1,927,534,904 31.8s
            ADD INDEX 283,453,793 30.6s 283,453,765 30.4s
            FORCE 2,198,868,500 37.7s 2,088,222,782 64.9s

            Only the last step should differ. The first two steps are identical, but there is minor fluctuation, likely due to purge (MDEV-12288).
            Interestingly, for the non-native rebuild with ALGORITHM=COPY we are writing slightly less redo log than for the faster native rebuild. We might want to check if PageBulk::finishPage() could emit the higher-level redo log records that were introduced in MDEV-21724.

            Conclusion: 10.5 with a properly configured redo log file size is faster than 10.4, and that innodb_log_optimize_ddl=0 (disabling the supposed optimization that was introduced in MySQL 5.7 and inherited to MariaDB 10.2.2) is likely improving performance in MariaDB Server 10.2, 10.3, and 10.4.

            marko Marko Mäkelä added a comment - I ran a bigger benchmark, with 10× the rows ( seq_1_to_1000000 ) and the data directory on PCIe NVMe: ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=1g,log-file-size=1g,log-files-in-group=1,flush-log-at-trx-commit=1,log-optimize-ddl=1} innodb.benchmark1m ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=1g,log-file-size=1g,log-files-in-group=1,flush-log-at-trx-commit=1,log-optimize-ddl=0} innodb.benchmark1m Before ADD INDEX , the t1.ibd file size is 2.5 GiB, and before the FORCE it is 3.3GiB. The tested revisions were as follows: 10.4 9398c3dfa5f8c2b2b5bc51dbf079e3edd343ae9e 10.5 2655984ed4d22cf6eeec59637e1bf31d78cac679 step 10.4 log_optimize_ddl=1 10.4 log_optimize_ddl=0 10.5 ADD INDEX 76.9s 39.7s 30.6s FORCE 45.5s 43.9s 37.7s entire test 154.9s 122.2s 100.3s Even with a larger test, we can observe that innodb_log_optimize_ddl=0 (reverting the change in MySQL 5.7 to write less log on ALTER TABLE ) is actually reducing performance, when the redo log file is reasonably large. The reason for this ought to be the extra flushing of the buffer pool right before the redo logging is re-enabled. The variable innodb_lsn_current is not available before 10.5. On 10.5, I tried one more variant: performing FORCE, ALGORITHM=COPY as the last step. That should write roughly the same amount of pages, but the pre-sorting of index records will be disabled. 10.5 step LSN,INPLACE time LSN,COPY time setup 1,927,534,933 32.0s 1,927,534,904 31.8s ADD INDEX 283,453,793 30.6s 283,453,765 30.4s FORCE 2,198,868,500 37.7s 2,088,222,782 64.9s Only the last step should differ. The first two steps are identical, but there is minor fluctuation, likely due to purge ( MDEV-12288 ). Interestingly, for the non-native rebuild with ALGORITHM=COPY we are writing slightly less redo log than for the faster native rebuild. We might want to check if PageBulk::finishPage() could emit the higher-level redo log records that were introduced in MDEV-21724 . Conclusion: 10.5 with a properly configured redo log file size is faster than 10.4, and that innodb_log_optimize_ddl=0 (disabling the supposed optimization that was introduced in MySQL 5.7 and inherited to MariaDB 10.2.2) is likely improving performance in MariaDB Server 10.2, 10.3, and 10.4.

            I accidentally ran the 10.5 benchmark on /dev/shm instead of NVMe (ext4fs). For a true apples-to-apples comparison, here are the tables with updated numbers for 10.5:

            step 10.4 log_optimize_ddl=1 10.4 log_optimize_ddl=0 10.5
            ADD INDEX 76.9s 39.7s 49.1s
            FORCE 45.5s 43.9s 44.7s
            entire test 154.9s 122.2s 130.4s

            For ADD INDEX, 10.5 is clearly slower than 10.4 with innodb_log_optimize_ddl=0. This might be improved by reducing the log volume of PageBulk::finishPage().

            Here is the comparison between FORCE and ALGORITHM=COPY again, on NVMe instead of RAM disk:

            10.5 step LSN,INPLACE time LSN,COPY time
            setup 1,927,534,817 36.6s 1,927,534,788 36.0s
            ADD INDEX 283,454,354 49.1s 283,454,513 48.9s
            FORCE 2,198,868,733 44.7s 2,088,222,581 65.5s
            marko Marko Mäkelä added a comment - I accidentally ran the 10.5 benchmark on /dev/shm instead of NVMe (ext4fs). For a true apples-to-apples comparison, here are the tables with updated numbers for 10.5: step 10.4 log_optimize_ddl=1 10.4 log_optimize_ddl=0 10.5 ADD INDEX 76.9s 39.7s 49.1s FORCE 45.5s 43.9s 44.7s entire test 154.9s 122.2s 130.4s For ADD INDEX , 10.5 is clearly slower than 10.4 with innodb_log_optimize_ddl=0 . This might be improved by reducing the log volume of PageBulk::finishPage() . Here is the comparison between FORCE and ALGORITHM=COPY again, on NVMe instead of RAM disk: 10.5 step LSN,INPLACE time LSN,COPY time setup 1,927,534,817 36.6s 1,927,534,788 36.0s ADD INDEX 283,454,354 49.1s 283,454,513 48.9s FORCE 2,198,868,733 44.7s 2,088,222,581 65.5s

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.