[MDEV-19747] Deprecate and ignore innodb_log_optimize_ddl - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
Fix Version/s: 10.5.1
Component/s: mariabackup, Storage Engine - InnoDB
Labels:

Description

MariaDB Server 10.2.2 disabled redo logging for certain ALTER TABLE operations. In ~~MDEV-16809~~ (10.2.17, 10.3.9) a possibility to enable the redo logging was introduced, by SET GLOBAL innodb_log_optimize_ddl=OFF. By default, the redo logging remains disabled. Backup can be much slower if such non-logged ALTER TABLE are performed while mariabackup --backup is running.

~~MDEV-19738~~ in 10.5 will skip the InnoDB doublewrite buffer in some cases. But, this cannot be done safely if redo logging can be disabled during ALTER TABLE. Hence, to get full advantage of ~~MDEV-19738~~, we should change the default value of innodb_log_optimize_ddl to OFF, and possibly deprecate and ignore this parameter altogether.

The reason why I would consider making innodb_log_optimize_ddl read-only is that with ~~MDEV-19738~~, changing the setting at runtime soon before or after affected ALTER TABLE is not crash-safe if the server is killed during a page write. But on the other hand, for ‘persons who know what they are doing’, maybe we should allow the parameter to modified while the server is running. SET GLOBAL requires special privileges anyway.

Before changing the default value, we should find out how much the speed of ALTER TABLE operations is affected by the change. With innodb_log_optimize_ddl=OFF, the extra page flushing will be avoided, and the affected forms of ALTER TABLE should become durable solely via redo log. Page flushing for the ALTER TABLE should be a little faster, because the doublewrite buffer can be avoided.

The affected forms of ALTER TABLE are:

--echo # rebuilding a table and all its indexes

ALTER TABLE t1 FORCE, ALGORITHM=INPLACE, LOCK=SHARED;

--echo # creating secondary indexes (one or several)

ALTER TABLE t1 ADD INDEX (b), ADD INDEX (c), ALGORITHM=INPLACE, LOCK=SHARED;

I would like to know the performance of this kind of ALTER TABLE on larger tables, with both settings of innodb_log_optimize_ddl.

Attachments

Issue Links

blocks

MDEV-12353 Efficient InnoDB redo log record format

Closed

relates to

MDEV-12353 Efficient InnoDB redo log record format

Closed

MDEV-16791 mariabackup : allow consistent backup, in presence of concurrent DDL, also without --lock-ddl-per-table

Closed

MDEV-19768 Mariabackup should write warning during backup if server does not have innodb_log_optimize_ddl=OFF set

Open

MDEV-23720 Change innodb_log_optimize_ddl=OFF by default

Closed

MDEV-36301 ALTER ENGINE INNODB LOG ...

Open

MDEV-16809 Allow full redo logging for ALTER TABLE

Closed

MDEV-17596 Assertion `block->page.flush_observer == __null || block->page.flush_observer == observer' failed in buf_flush_note_modification

Closed

MDEV-19738 Doublewrite buffer is unnecessarily used for newly (re)initialized pages

Closed

MDEV-20608 innodb_log_optimize_ddl=OFF may omit some redo log

Closed

(5 relates to)

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2019-06-13 15:14

I executed this crude benchmark:

--source include/have_innodb.inc

--source include/have_sequence.inc

SET @saved_frequency = @@GLOBAL.innodb_purge_rseg_truncate_frequency;

SET GLOBAL innodb_purge_rseg_truncate_frequency=1;

CREATE TABLE t1 (

 a BIGINT PRIMARY KEY,

 b CHAR(255) NOT NULL DEFAULT '',

 c CHAR(255) NOT NULL DEFAULT '',

 d CHAR(255) NOT NULL DEFAULT '',

 e CHAR(255) NOT NULL DEFAULT '',

 f CHAR(255) NOT NULL DEFAULT '',

 g CHAR(255) NOT NULL DEFAULT '',

 h CHAR(255) NOT NULL DEFAULT ''

) ENGINE=InnoDB;

INSERT INTO t1 (a) SELECT seq FROM seq_1_to_100000;

SET profiling = 1;

--enable_info

ALTER TABLE t1 ADD INDEX(b),ADD INDEX(c),ADD INDEX(d),

ADD INDEX(e(10)),ADD INDEX(f(4)),ADD INDEX(g(5)),ADD INDEX(h(5));

ALTER TABLE t1 FORCE;

--disable_info

SHOW PROFILES;

SET profiling = 0;

show status like 'innodb_d%_writes';

FLUSH TABLES t1 FOR EXPORT;

show status like 'innodb_d%_writes';

UNLOCK TABLES;

DROP TABLE t1;

SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency;

./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=1} innodb.benchmark

./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=0} innodb.benchmark

The results:

step	log_optimize_ddl=1	log_optimize_ddl=0
ADD INDEX	3.94s	7.10s
FORCE	3.25s	13.49s
total test time	18.7s	32.4s

Note: the "affected rows: 0" for a non-empty table means that native ALTER TABLE was used.

The time for rebuilding tables is much longer. In 10.5, we expect this to go down with the redo log format improvements (~~MDEV-12353~~ and ~~MDEV-14425~~). Until then, maybe we should allow the parameter to be set while the server is running, so that users can take the risk and temporarily disable redo logging for large table-rebuilding ALTER operations.

We should also keep in mind that thanks to ~~MDEV-11369~~ and various subtasks of ~~MDEV-11424~~, many ALTER TABLE operations are instantaneous, and table rebuilds should not occur very often.

I would be interested in seeing how much the innodb_dblwr_writes is reduced (and performance is improved) on generic DML workloads. Maybe it is enough justification to change the parameter to the safe default.

Marko Mäkelä added a comment - 2019-06-13 15:14 I executed this crude benchmark: --source include/have_innodb.inc --source include/have_sequence.inc SET @saved_frequency = @@ GLOBAL .innodb_purge_rseg_truncate_frequency; SET GLOBAL innodb_purge_rseg_truncate_frequency=1; CREATE TABLE t1 ( a BIGINT PRIMARY KEY , b CHAR (255) NOT NULL DEFAULT '' , c CHAR (255) NOT NULL DEFAULT '' , d CHAR (255) NOT NULL DEFAULT '' , e CHAR (255) NOT NULL DEFAULT '' , f CHAR (255) NOT NULL DEFAULT '' , g CHAR (255) NOT NULL DEFAULT '' , h CHAR (255) NOT NULL DEFAULT '' ) ENGINE=InnoDB; INSERT INTO t1 (a) SELECT seq FROM seq_1_to_100000; SET profiling = 1; --enable_info ALTER TABLE t1 ADD INDEX (b), ADD INDEX (c), ADD INDEX (d), ADD INDEX (e(10)), ADD INDEX (f(4)), ADD INDEX (g(5)), ADD INDEX (h(5)); ALTER TABLE t1 FORCE ; --disable_info SHOW PROFILES; SET profiling = 0; show status like 'innodb_d%_writes' ; FLUSH TABLES t1 FOR EXPORT; show status like 'innodb_d%_writes' ; UNLOCK TABLES; DROP TABLE t1; SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency; . /mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=1} innodb.benchmark . /mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=0} innodb.benchmark The results: step log_optimize_ddl=1 log_optimize_ddl=0 ADD INDEX 3.94s 7.10s FORCE 3.25s 13.49s total test time 18.7s 32.4s Note: the "affected rows: 0" for a non-empty table means that native ALTER TABLE was used. The time for rebuilding tables is much longer. In 10.5, we expect this to go down with the redo log format improvements ( MDEV-12353 and MDEV-14425 ). Until then, maybe we should allow the parameter to be set while the server is running, so that users can take the risk and temporarily disable redo logging for large table-rebuilding ALTER operations. We should also keep in mind that thanks to MDEV-11369 and various subtasks of MDEV-11424 , many ALTER TABLE operations are instantaneous, and table rebuilds should not occur very often. I would be interested in seeing how much the innodb_dblwr_writes is reduced (and performance is improved) on generic DML workloads. Maybe it is enough justification to change the parameter to the safe default.

Marko Mäkelä added a comment - 2019-09-17 08:20

I plan to deprecate and ignore the parameter innodb_log_optimize_ddl=OFF as part of the ~~MDEV-12353~~ implementation. The different redo log format should alleviate any performance regressions.

Marko Mäkelä added a comment - 2019-09-17 08:20 I plan to deprecate and ignore the parameter innodb_log_optimize_ddl=OFF as part of the MDEV-12353 implementation. The different redo log format should alleviate any performance regressions.

Marko Mäkelä added a comment - 2020-02-11 12:29

I repeated the microbenchmark after implementing some optimization for ~~MDEV-12353~~. The physical format is generating slightly more redo log.

I amended the test case with a few reads of innodb_lsn_current (log_sys.lsn):

--source include/have_innodb.inc

--source include/have_sequence.inc

SET @saved_frequency = @@GLOBAL.innodb_purge_rseg_truncate_frequency;

SET GLOBAL innodb_purge_rseg_truncate_frequency=1;

CREATE TABLE t1 (

 a BIGINT PRIMARY KEY,

 b CHAR(255) NOT NULL DEFAULT '',

 c CHAR(255) NOT NULL DEFAULT '',

 d CHAR(255) NOT NULL DEFAULT '',

 e CHAR(255) NOT NULL DEFAULT '',

 f CHAR(255) NOT NULL DEFAULT '',

 g CHAR(255) NOT NULL DEFAULT '',

 h CHAR(255) NOT NULL DEFAULT ''

) ENGINE=InnoDB;

INSERT INTO t1 (a) SELECT seq FROM seq_1_to_100000;

SET profiling = 1;

show status like 'innodb_lsn_current';

--enable_info

ALTER TABLE t1 ADD INDEX(b),ADD INDEX(c),ADD INDEX(d),

ADD INDEX(e(10)),ADD INDEX(f(4)),ADD INDEX(g(5)),ADD INDEX(h(5));

show status like 'innodb_lsn_current';

ALTER TABLE t1 FORCE;

--disable_info

SHOW PROFILES;

SET profiling = 0;

show status like 'innodb_lsn_current';

show status like 'innodb_d%_writes';

FLUSH TABLES t1 FOR EXPORT;

show status like 'innodb_d%_writes';

UNLOCK TABLES;

DROP TABLE t1;

SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency;

The execution time with the original settings

./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=1} innodb.benchmark

./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=0} innodb.benchmark

did not change much. However, the default redo log size must be so small that it is forcing page flushes. Let us fix that:

./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-file-size=512m,log-optimize-ddl=1} innodb.benchmark

./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-file-size=512m,log-optimize-ddl=1} innodb.benchmark

This will shrink the test time as follows:

step	log_optimize_ddl=1	log_optimize_ddl=0	~~MDEV-12353~~
ADD INDEX	3.84s	2.94s	2.73s
FORCE	3.21s	3.44s	3.45s
entire test	8.68s	8.01s	7.87s

I also collected statistics on the redo log bytes written:

step	log_optimize_ddl=1	log_optimize_ddl=0	~~MDEV-12353~~
setup	197,240,744	197,250,640	198,351,963
ADD INDEX	3,512,609	25,724,276	28,557,329
FORCE	1,419,512	219,959,416	220,527,231

So, it looks like with a reasonably large innodb_log_file_size, setting innodb_log_file_size=OFF may slightly improve performance. In ~~MDEV-12353~~, the setting is deprecated and ignored. We can also note that the redo log records in the physical redo log format of ~~MDEV-12353~~ are slightly larger than the old physiological format. The situation could slightly improve after btr_page_reorganize() has been rewritten to write less log. There should be some room to reduce the log record volume further without changing the format.

Marko Mäkelä added a comment - 2020-02-11 12:29 I repeated the microbenchmark after implementing some optimization for MDEV-12353 . The physical format is generating slightly more redo log. I amended the test case with a few reads of innodb_lsn_current ( log_sys.lsn ): --source include/have_innodb.inc --source include/have_sequence.inc SET @saved_frequency = @@ GLOBAL .innodb_purge_rseg_truncate_frequency; SET GLOBAL innodb_purge_rseg_truncate_frequency=1; CREATE TABLE t1 ( a BIGINT PRIMARY KEY , b CHAR (255) NOT NULL DEFAULT '' , c CHAR (255) NOT NULL DEFAULT '' , d CHAR (255) NOT NULL DEFAULT '' , e CHAR (255) NOT NULL DEFAULT '' , f CHAR (255) NOT NULL DEFAULT '' , g CHAR (255) NOT NULL DEFAULT '' , h CHAR (255) NOT NULL DEFAULT '' ) ENGINE=InnoDB; INSERT INTO t1 (a) SELECT seq FROM seq_1_to_100000; SET profiling = 1; show status like 'innodb_lsn_current' ; --enable_info ALTER TABLE t1 ADD INDEX (b), ADD INDEX (c), ADD INDEX (d), ADD INDEX (e(10)), ADD INDEX (f(4)), ADD INDEX (g(5)), ADD INDEX (h(5)); show status like 'innodb_lsn_current' ; ALTER TABLE t1 FORCE ; --disable_info SHOW PROFILES; SET profiling = 0; show status like 'innodb_lsn_current' ; show status like 'innodb_d%_writes' ; FLUSH TABLES t1 FOR EXPORT; show status like 'innodb_d%_writes' ; UNLOCK TABLES; DROP TABLE t1; SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency; The execution time with the original settings ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=1} innodb.benchmark ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-optimize-ddl=0} innodb.benchmark did not change much. However, the default redo log size must be so small that it is forcing page flushes. Let us fix that: ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-file-size=512m,log-optimize-ddl=1} innodb.benchmark ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=64m,log-file-size=512m,log-optimize-ddl=1} innodb.benchmark This will shrink the test time as follows: step log_optimize_ddl=1 log_optimize_ddl=0 MDEV-12353 ADD INDEX 3.84s 2.94s 2.73s FORCE 3.21s 3.44s 3.45s entire test 8.68s 8.01s 7.87s I also collected statistics on the redo log bytes written: step log_optimize_ddl=1 log_optimize_ddl=0 MDEV-12353 setup 197,240,744 197,250,640 198,351,963 ADD INDEX 3,512,609 25,724,276 28,557,329 FORCE 1,419,512 219,959,416 220,527,231 So, it looks like with a reasonably large innodb_log_file_size , setting innodb_log_file_size=OFF may slightly improve performance. In MDEV-12353 , the setting is deprecated and ignored. We can also note that the redo log records in the physical redo log format of MDEV-12353 are slightly larger than the old physiological format. The situation could slightly improve after btr_page_reorganize() has been rewritten to write less log. There should be some room to reduce the log record volume further without changing the format.

Marko Mäkelä added a comment - 2020-02-11 16:28

Given that there does not appear to be any performance regression when using a reasonably sized redo log file, I think that we can go one step further and deprecate and ignore the parameter. In ~~MDEV-12353~~ there is no replacement for the MLOG_INDEX_LOAD record, which was written when some redo log was skipped.

Marko Mäkelä added a comment - 2020-02-11 16:28 Given that there does not appear to be any performance regression when using a reasonably sized redo log file, I think that we can go one step further and deprecate and ignore the parameter. In MDEV-12353 there is no replacement for the MLOG_INDEX_LOAD record, which was written when some redo log was skipped.

Marko Mäkelä added a comment - 2020-04-23 19:19

I ran a bigger benchmark, with 10× the rows (seq_1_to_1000000) and the data directory on PCIe NVMe:

./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=1g,log-file-size=1g,log-files-in-group=1,flush-log-at-trx-commit=1,log-optimize-ddl=1} innodb.benchmark1m

./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=1g,log-file-size=1g,log-files-in-group=1,flush-log-at-trx-commit=1,log-optimize-ddl=0} innodb.benchmark1m

Before ADD INDEX, the t1.ibd file size is 2.5 GiB, and before the FORCE it is 3.3GiB. The tested revisions were as follows:

10.4 9398c3dfa5f8c2b2b5bc51dbf079e3edd343ae9e
10.5 2655984ed4d22cf6eeec59637e1bf31d78cac679

step	10.4 log_optimize_ddl=1	10.4 log_optimize_ddl=0	10.5
`ADD INDEX`	76.9s	39.7s	30.6s
`FORCE`	45.5s	43.9s	37.7s
entire test	154.9s	122.2s	100.3s

Even with a larger test, we can observe that innodb_log_optimize_ddl=0 (reverting the change in MySQL 5.7 to write less log on ALTER TABLE) is actually reducing performance, when the redo log file is reasonably large. The reason for this ought to be the extra flushing of the buffer pool right before the redo logging is re-enabled.

The variable innodb_lsn_current is not available before 10.5. On 10.5, I tried one more variant: performing FORCE, ALGORITHM=COPY as the last step. That should write roughly the same amount of pages, but the pre-sorting of index records will be disabled.

10.5 step	LSN,INPLACE	time	LSN,COPY	time
setup	1,927,534,933	32.0s	1,927,534,904	31.8s
`ADD INDEX`	283,453,793	30.6s	283,453,765	30.4s
`FORCE`	2,198,868,500	37.7s	2,088,222,782	64.9s

Only the last step should differ. The first two steps are identical, but there is minor fluctuation, likely due to purge (~~MDEV-12288~~).
Interestingly, for the non-native rebuild with ALGORITHM=COPY we are writing slightly less redo log than for the faster native rebuild. We might want to check if PageBulk::finishPage() could emit the higher-level redo log records that were introduced in ~~MDEV-21724~~.

Conclusion: 10.5 with a properly configured redo log file size is faster than 10.4, and that innodb_log_optimize_ddl=0 (disabling the supposed optimization that was introduced in MySQL 5.7 and inherited to MariaDB 10.2.2) is likely improving performance in MariaDB Server 10.2, 10.3, and 10.4.

Marko Mäkelä added a comment - 2020-04-23 19:19 I ran a bigger benchmark, with 10× the rows ( seq_1_to_1000000 ) and the data directory on PCIe NVMe: ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=1g,log-file-size=1g,log-files-in-group=1,flush-log-at-trx-commit=1,log-optimize-ddl=1} innodb.benchmark1m ./mtr --mysqld=--innodb-{page-size=4k,buffer-pool-size=1g,log-file-size=1g,log-files-in-group=1,flush-log-at-trx-commit=1,log-optimize-ddl=0} innodb.benchmark1m Before ADD INDEX , the t1.ibd file size is 2.5 GiB, and before the FORCE it is 3.3GiB. The tested revisions were as follows: 10.4 9398c3dfa5f8c2b2b5bc51dbf079e3edd343ae9e 10.5 2655984ed4d22cf6eeec59637e1bf31d78cac679 step 10.4 log_optimize_ddl=1 10.4 log_optimize_ddl=0 10.5 ADD INDEX 76.9s 39.7s 30.6s FORCE 45.5s 43.9s 37.7s entire test 154.9s 122.2s 100.3s Even with a larger test, we can observe that innodb_log_optimize_ddl=0 (reverting the change in MySQL 5.7 to write less log on ALTER TABLE ) is actually reducing performance, when the redo log file is reasonably large. The reason for this ought to be the extra flushing of the buffer pool right before the redo logging is re-enabled. The variable innodb_lsn_current is not available before 10.5. On 10.5, I tried one more variant: performing FORCE, ALGORITHM=COPY as the last step. That should write roughly the same amount of pages, but the pre-sorting of index records will be disabled. 10.5 step LSN,INPLACE time LSN,COPY time setup 1,927,534,933 32.0s 1,927,534,904 31.8s ADD INDEX 283,453,793 30.6s 283,453,765 30.4s FORCE 2,198,868,500 37.7s 2,088,222,782 64.9s Only the last step should differ. The first two steps are identical, but there is minor fluctuation, likely due to purge ( MDEV-12288 ). Interestingly, for the non-native rebuild with ALGORITHM=COPY we are writing slightly less redo log than for the faster native rebuild. We might want to check if PageBulk::finishPage() could emit the higher-level redo log records that were introduced in MDEV-21724 . Conclusion: 10.5 with a properly configured redo log file size is faster than 10.4, and that innodb_log_optimize_ddl=0 (disabling the supposed optimization that was introduced in MySQL 5.7 and inherited to MariaDB 10.2.2) is likely improving performance in MariaDB Server 10.2, 10.3, and 10.4.

Marko Mäkelä added a comment - 2020-04-23 19:43

I accidentally ran the 10.5 benchmark on /dev/shm instead of NVMe (ext4fs). For a true apples-to-apples comparison, here are the tables with updated numbers for 10.5:

step	10.4 log_optimize_ddl=1	10.4 log_optimize_ddl=0	10.5
`ADD INDEX`	76.9s	39.7s	49.1s
`FORCE`	45.5s	43.9s	44.7s
entire test	154.9s	122.2s	130.4s

For ADD INDEX, 10.5 is clearly slower than 10.4 with innodb_log_optimize_ddl=0. This might be improved by reducing the log volume of PageBulk::finishPage().

Here is the comparison between FORCE and ALGORITHM=COPY again, on NVMe instead of RAM disk:

10.5 step	LSN,INPLACE	time	LSN,COPY	time
setup	1,927,534,817	36.6s	1,927,534,788	36.0s
`ADD INDEX`	283,454,354	49.1s	283,454,513	48.9s
`FORCE`	2,198,868,733	44.7s	2,088,222,581	65.5s

Marko Mäkelä added a comment - 2020-04-23 19:43 I accidentally ran the 10.5 benchmark on /dev/shm instead of NVMe (ext4fs). For a true apples-to-apples comparison, here are the tables with updated numbers for 10.5: step 10.4 log_optimize_ddl=1 10.4 log_optimize_ddl=0 10.5 ADD INDEX 76.9s 39.7s 49.1s FORCE 45.5s 43.9s 44.7s entire test 154.9s 122.2s 130.4s For ADD INDEX , 10.5 is clearly slower than 10.4 with innodb_log_optimize_ddl=0 . This might be improved by reducing the log volume of PageBulk::finishPage() . Here is the comparison between FORCE and ALGORITHM=COPY again, on NVMe instead of RAM disk: 10.5 step LSN,INPLACE time LSN,COPY time setup 1,927,534,817 36.6s 1,927,534,788 36.0s ADD INDEX 283,454,354 49.1s 283,454,513 48.9s FORCE 2,198,868,733 44.7s 2,088,222,581 65.5s

People

Assignee:: Marko Mäkelä

Reporter:: Marko Mäkelä

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2019-06-13 11:25

Updated:: 2025-03-14 09:46

Resolved:: 2020-02-12 05:28

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.