Currently, if the parameters of the redo log change, InnoDB in the MariaDB Community Server will rebuild the redo log at server startup.
MariaDB Enterprise Server 10.5 and 10.6 allow dynamic tuning of the redo log parameters, rebuilding the redo log in a crash-safe manner without restarting the server.
SET GLOBAL innodb_log_file_size=…;
Before MDEV-14425 changed the redo log file format in 10.8, we were unable to enable or disable innodb_encrypt_log without a server restart, because starting with MDEV-12041 in MariaDB 10.4, encrypted redo log blocks have 4 bytes less payload per 512-byte log block.
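To illustrate why the 4-byte difference matters, here is a minimal sketch of how the same position in the log record stream lands in different places depending on encryption. The 512-byte block size and the 4-byte difference come from the text above; the 16 bytes of per-block framing overhead and both function names are assumptions made only for this illustration.

```python
BLOCK_SIZE = 512        # from the text: 512-byte log blocks
ASSUMED_OVERHEAD = 16   # assumption: header plus checksum bytes per block
ENCRYPTION_EXTRA = 4    # from the text: encrypted blocks carry 4 bytes less payload


def payload_per_block(encrypted):
    """Usable payload bytes in one log block (hypothetical helper)."""
    payload = BLOCK_SIZE - ASSUMED_OVERHEAD
    return payload - ENCRYPTION_EXTRA if encrypted else payload


def block_and_offset(stream_pos, encrypted):
    """Map a byte position in the record stream to (block number, payload offset)."""
    return divmod(stream_pos, payload_per_block(encrypted))


# The same stream position falls at different offsets in the two layouts,
# so toggling encryption cannot be done by rewriting blocks in place.
plain = block_and_offset(1000, encrypted=False)
enc = block_and_offset(1000, encrypted=True)
```

Under these assumptions, every block boundary after the first shifts when encryption is toggled, which is why the two layouts cannot share one file without a conversion step.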
Log resizing is tied to checkpoints. We can start writing a second redo log ib_logfile101 with the requested new size, starting from something close to the last written log sequence number. On log checkpoint completion, we can switch files, provided that the checkpoint LSN was not earlier than the start LSN of the resized log file. When resizing to a small log file during a heavy write workload, multiple checkpoints may be necessary.
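The switch condition described above can be sketched as a small model. This is not the InnoDB code; the function names and the plain-integer LSNs are hypothetical, and the only rule taken from the text is that a completed checkpoint permits the switch when its LSN is not earlier than the start LSN of the resized log file.

```python
def can_switch(checkpoint_lsn, resize_start_lsn):
    """True if a completed checkpoint allows switching to the resized log."""
    return checkpoint_lsn >= resize_start_lsn


def checkpoints_until_switch(resize_start_lsn, checkpoint_lsns):
    """Count completed checkpoints up to and including the one that permits
    the switch. Returns None if no checkpoint in the sequence qualifies."""
    for count, lsn in enumerate(checkpoint_lsns, start=1):
        if can_switch(lsn, resize_start_lsn):
            return count
    return None


# Resizing started at LSN 1000; the first two checkpoints are still behind it,
# so the switch can only happen on the third one.
needed = checkpoints_until_switch(1000, [800, 950, 1200])
```

In this model, a checkpoint that lags behind the start of the new file simply leaves both files in use until a later checkpoint catches up.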
While technically it would be possible to rebuild the log for changing innodb_encrypt_log, this task does not implement it, because it would require a non-trivial transformation between the log record streams that are being written to the current log file (ib_logfile0) and the future log file (ib_logfile101 that will replace ib_logfile0).
Rebuilding the log file will obviously cause disruption to mariadb-backup --backup, because the old log file will stop receiving writes once the server has switched to another log file. This could be addressed in MDEV-14992 by letting the server provide a log record stream directly.
Issue Links
causes
MDEV-31311 The test innodb.log_file_size_online occasionally hangs (Closed)
MDEV-33361 Excessive delays in SET GLOBAL innodb_log_file_size (Closed)
MDEV-34446 SIGSEGV on SET GLOBAL innodb_log_file_size with memory-mapped log file (Closed)
MDEV-34909 DDL during SET GLOBAL innodb_log_file_size may hang when using PMEM (Closed)
MDEV-35802 Race condition between log_t::persist() and log_t::write_checkpoint() on log resizing (Closed)
MDEV-35810 Missing error handling in log resizing around ib_logfile101 (Closed)
MDEV-36082 Race condition between log_t::resize_start() and log_t::resize_abort()
Marko Mäkelä
added a comment (edited) - wlad, please review.
Edit: The MoveFileEx() failure on Windows that prevented log resizing from succeeding was fixed by closing both file handles before the rename operation, and reopening ib_logfile0 afterwards (on Windows only). We already did something similar when resizing the log on server startup.
I thought that it was 30×1000, which would have taken a few hours. I aborted it here. To run it on /dev/shm while having libpmem installed, I applied the following patch:
Marko Mäkelä
added a comment - As a byproduct, log resizing will perform ‘log scrubbing’, an old MariaDB feature that was removed in MDEV-21870 because it did not work correctly.
Marko Mäkelä
added a comment - wlad, thank you. Since your review, I had to refine the code a bit in the SET GLOBAL innodb_log_file_size update callback. The mtr-based test that I conducted a few days ago only ensured that the resized log is restart-safe and crash-safe.
mleich produced rr replay traces and some core dumps for several race conditions that occurred when multiple threads were attempting to change the size concurrently, while some of the connections were being killed. I will wait for his final verdict before pushing this.
Marko Mäkelä
added a comment - There were some problems with the log file wrap-around. I developed the following test to exercise that code. The idea of this test is to generate varying-length mini-transaction log from 2 DML connections while another connection is alternating the log file size between the two smallest allowed values, to maximize the probability of log buffer wrap-around events.
--source include/have_innodb.inc
CREATE TABLE t1(a TINYINT PRIMARY KEY, b INT NOT NULL) ENGINE=InnoDB;
INSERT INTO t1 VALUES(1,1);
delimiter //;
create procedure uproc(repeat_count int)
begin
  declare current_num int;
  set current_num = 0;
  while current_num < repeat_count do
    update t1 set b=0;
    update t1 set b=256;
    update t1 set b=65536;
    update t1 set b=16777216;
    set current_num = current_num + 1;
  end while;
end//
create procedure sproc(repeat_count int)
begin
  declare current_num int;
  set current_num = 0;
  while current_num < repeat_count do
    SET GLOBAL innodb_log_file_size=4096*1024;
    SET GLOBAL innodb_log_file_size=4096*1025;
    set current_num = current_num + 1;
  end while;
end//
delimiter ;//
connect (u,localhost,root);
send call uproc(1000000);
connect (v,localhost,root);
send call uproc(1000000);
connection default;
call sproc(100000);
connection u;
reap;
disconnect u;
connection v;
reap;
disconnect v;
connection default;
drop table t1;
This test would be killed by mtr after 15 minutes (900 seconds) both with and without PMEM. I was only able to repeat the problems when running multiple instances of the test concurrently, and without using rr record:
./mtr --parallel=auto innodb.MDEV-27812{,,,,,,,,,,,,,,,}
After some fixes, the implementation survived my tests for 2×15 minutes with PMEM, and another 2×15 minutes without PMEM.
Marko Mäkelä
added a comment - Starting, aborting and finishing the log resizing must all be protected by flush_lock, write_lock, and an exclusive log_sys.latch to avoid race conditions with concurrent log_write_up_to(). Sufficient locking has been in place in log_sys.resize_abort() for quite some time. The race conditions in starting and finishing the resizing were fixed today.
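The locking rule above can be modelled roughly as follows. This is only a sketch: Python locks stand in for the three InnoDB synchronization objects named in the comment, while the acquisition order and the helper name with_resize_locks are assumptions for illustration, not taken from the server code.

```python
import threading
from contextlib import ExitStack

flush_lock = threading.Lock()
write_lock = threading.Lock()
log_latch = threading.Lock()  # stand-in for holding log_sys.latch exclusively

# Assumption: one fixed acquisition order, shared by start, abort and finish,
# so that the three resize operations cannot deadlock against each other.
RESIZE_LOCKS = (flush_lock, write_lock, log_latch)


def with_resize_locks(action):
    """Run action() while holding all three locks; release them afterwards."""
    with ExitStack() as stack:
        for lock in RESIZE_LOCKS:
            stack.enter_context(lock)
        return action()


# While the action runs, a concurrent writer needing any of the locks would block.
all_held = with_resize_locks(lambda: all(lock.locked() for lock in RESIZE_LOCKS))
```

The point of the model is that a resize step holding only a subset of the locks would leave a window in which a concurrent log write could observe a half-updated state.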
Matthias Leich
added a comment -
The tree
origin/bb-10.9-MDEV-27812 05d1faec3661176b039db6beee60bcdbb3bc00d8 2022-03-02T14:14:47+02:00
behaved well in RQG testing.
Marko Mäkelä
added a comment - For the record, MySQL 8.0.30 includes a conceptually similar change to the MariaDB one (fixup 1, 2):
WL#12527 InnoDB: Dynamic configuration of space occupied by redo log files
People
Marko Mäkelä
Marko Mäkelä