MariaDB Server, MDEV-27812

Allow innodb_log_file_size to change without server restart

Details

    Description

      Currently, if the parameters of the redo log change, InnoDB in the MariaDB Community Server will rebuild the redo log at server startup.

      MariaDB Enterprise Server 10.5 and 10.6 allow dynamic tuning of the redo log parameters, rebuilding the redo log in a crash-safe manner without restarting the server.

      SET GLOBAL innodb_log_file_size=…;
      

      Before MDEV-14425 changed the redo log file format in 10.8, we were unable to enable or disable innodb_encrypt_log without a server restart, because starting with MDEV-12041 in MariaDB 10.4, encrypted redo log blocks have 4 bytes less payload per 512-byte log block.
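      The effect of the 4-byte difference can be illustrated with a small calculation. This is a minimal sketch, not InnoDB code: the 12-byte block header and 4-byte checksum are assumptions about the pre-10.8 block layout, and the function names are hypothetical. Only the 512-byte block size and the 4 bytes lost to encryption come from the description above.

```python
# Sketch of the pre-10.8 block framing. HEADER_SIZE and CHECKSUM_SIZE are
# assumptions; BLOCK_SIZE and the 4-byte encryption overhead are from the text.
BLOCK_SIZE = 512
HEADER_SIZE = 12        # assumed block header size
CHECKSUM_SIZE = 4       # assumed block trailer (checksum) size
KEY_VERSION_SIZE = 4    # the 4 bytes of payload lost in an encrypted block


def payload_per_block(encrypted: bool) -> int:
    """Return the number of log record bytes that fit in one block."""
    framing = HEADER_SIZE + CHECKSUM_SIZE
    if encrypted:
        framing += KEY_VERSION_SIZE
    return BLOCK_SIZE - framing


def locate(stream_offset: int, encrypted: bool) -> tuple:
    """Map a byte offset in the log record stream to
    (block number, byte position inside that block)."""
    block, within = divmod(stream_offset, payload_per_block(encrypted))
    return block, HEADER_SIZE + within
```

      Under these assumptions, stream offset 495 is the last payload byte of block 0 in an unencrypted log, but it already falls into block 1 of an encrypted log. The same record stream therefore lands at different positions in the two layouts, which is why toggling encryption is more than a flag change.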

      Log resizing is tied to checkpoints. We can start writing a second redo log ib_logfile101 with the requested new size, starting from something close to the last written log sequence number. On log checkpoint completion, we can switch files, provided that the checkpoint LSN was not earlier than the start LSN of the resized log file. When resizing to a small log file during a heavy write workload, multiple checkpoints may be necessary.
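      The switch condition can be sketched as follows. This is a simplified model under stated assumptions, not the server's implementation: the function names are hypothetical, and checkpoints are reduced to a plain sequence of checkpoint LSNs.

```python
from typing import Iterable, Optional


def can_switch(checkpoint_lsn: int, resize_start_lsn: int) -> bool:
    """The resized file may replace the old one only if the checkpoint LSN
    is not earlier than the LSN at which the resized file started."""
    return checkpoint_lsn >= resize_start_lsn


def checkpoints_until_switch(resize_start_lsn: int,
                             checkpoint_lsns: Iterable[int]) -> Optional[int]:
    """Count completed checkpoints until the files can be switched.
    Returns None if no checkpoint in the sequence qualifies."""
    for count, lsn in enumerate(checkpoint_lsns, start=1):
        if can_switch(lsn, resize_start_lsn):
            return count
    return None
```

      For example, if the resized file starts at LSN 1000 and checkpoints complete at LSN 900, 950 and 1200, only the third checkpoint allows the switch.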

      While technically it would be possible to rebuild the log for changing innodb_encrypt_log, this task does not implement it, because it would require a non-trivial transformation between the log record streams that are being written to the current log file (ib_logfile0) and the future log file (ib_logfile101 that will replace ib_logfile0).

      Rebuilding the log file will obviously cause disruption to mariadb-backup --backup, because the old log file will stop receiving writes once the server has switched to another log file. This could be addressed in MDEV-14992 by letting the server provide a log record stream directly.

          Activity

            wlad, thank you. Since your review, I had to refine the code a bit in the SET GLOBAL innodb_log_file_size update callback. The mtr based test that I conducted a few days ago was only ensuring that the resized log is restart-safe and crash-safe.

            mleich produced rr replay traces and some core dumps for several race conditions that occurred when multiple threads were attempting to change the size concurrently, while some of the connections were being killed. I will wait for his final verdict before pushing this.

            marko Marko Mäkelä added a comment

            There were some problems with the log file wrap-around. I developed the following test to exercise that code. The idea of this test is to generate varying-length mini-transaction log from 2 DML connections while another connection is alternating the log file size between the two smallest allowed values, to maximize the probability of log buffer wrap-around events.

            --source include/have_innodb.inc
             
            CREATE TABLE t1(a TINYINT PRIMARY KEY, b INT NOT NULL) ENGINE=InnoDB;
            INSERT INTO t1 VALUES(1,1);
             
            delimiter //;
            create procedure uproc(repeat_count int)
            begin
              declare current_num int;
              set current_num = 0;
              while current_num < repeat_count do
                update t1 set b=0;
                update t1 set b=256;
                update t1 set b=65536;
                update t1 set b=16777216;
                set current_num = current_num + 1;
              end while;
            end//
             
            create procedure sproc(repeat_count int)
            begin
              declare current_num int;
              set current_num = 0;
              while current_num < repeat_count do
                SET GLOBAL innodb_log_file_size=4096*1024;
                SET GLOBAL innodb_log_file_size=4096*1025;
                set current_num = current_num + 1;
              end while;
            end//
             
            delimiter ;//
             
            connect (u,localhost,root);
            send call uproc(1000000);
            connect (v,localhost,root);
            send call uproc(1000000);
             
            connection default;
            call sproc(100000);
             
            connection u;
            reap;
            disconnect u;
            connection v;
            reap;
            disconnect v;
             
            connection default;
             
            drop table t1;
            

            This test would be killed by mtr after 15 minutes (900 seconds) both with and without PMEM. I was only able to repeat the problems when running multiple instances of the test concurrently, and without using rr record:

            ./mtr --parallel=auto innodb.MDEV-27812{,,,,,,,,,,,,,,,}
            

            After some fixes, the implementation survived my tests for 2×15 minutes with PMEM, and another 2×15 minutes without PMEM.
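            As an editorial illustration of why the two smallest sizes make wrap-around frequent, the mapping from LSN to file offset in a circular log can be modelled as below. This is a sketch under assumptions, not the server code: the 12288-byte reserved header area (START_OFFSET) and the function names are assumptions, and first_lsn stands for the LSN stored at the first payload byte of the file.

```python
START_OFFSET = 12288  # assumed size of the header area at the start of the file


def capacity(file_size: int) -> int:
    """Bytes of the file that are available for log records."""
    return file_size - START_OFFSET


def lsn_to_offset(lsn: int, first_lsn: int, file_size: int) -> int:
    """File offset at which the byte with the given LSN is stored."""
    return START_OFFSET + (lsn - first_lsn) % capacity(file_size)


def wrap_count(lsn_from: int, lsn_to: int, first_lsn: int, file_size: int) -> int:
    """How many times writing from lsn_from to lsn_to passes the end of the file."""
    cap = capacity(file_size)
    return (lsn_to - first_lsn) // cap - (lsn_from - first_lsn) // cap


SMALL = 4096 * 1024   # the two sizes that the test alternates between
LARGE = 4096 * 1025
```

            In this model, a log of size 4096*1024 wraps after about 4 MiB of log records, and the same LSN maps to different offsets in the two file sizes, so each resize exercises the offset calculation again.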

            marko Marko Mäkelä added a comment

            Starting, aborting and finishing the log resizing must each be protected by all of flush_lock, write_lock, and exclusive log_sys.latch to avoid race conditions with concurrent log_write_up_to(). Sufficient locking has been in place in log_sys.resize_abort() for quite some time. The race conditions in starting and finishing the resizing were fixed today.
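            The locking rule can be sketched with ordinary mutexes. This is an editorial, heavily simplified model: the three Python locks are hypothetical stand-ins for flush_lock, write_lock and log_sys.latch, the acquisition order is an assumption, and the writer here takes only one of the locks, which is not how log_write_up_to() is implemented.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-ins for flush_lock, write_lock and log_sys.latch.
flush_lock = threading.Lock()
write_lock = threading.Lock()
latch = threading.Lock()

# Resize state that must only change while all three locks are held.
state = {"in_progress": False, "target_size": None}


@contextmanager
def all_log_locks():
    """Acquire all three locks in one fixed order; release in reverse order."""
    with flush_lock, write_lock, latch:
        yield


def resize_start(size: int) -> bool:
    """Begin a resize unless one is already running."""
    with all_log_locks():
        if state["in_progress"]:
            return False
        state["in_progress"] = True
        state["target_size"] = size
        return True


def resize_finish_or_abort() -> None:
    """Finishing and aborting both reset the state under the same locks."""
    with all_log_locks():
        state["in_progress"] = False
        state["target_size"] = None


def writer_sees_consistent_state() -> bool:
    """A writer holding any one of the locks can never observe a
    half-updated resize state."""
    with write_lock:
        return state["in_progress"] == (state["target_size"] is not None)
```

            Because every state transition holds all three locks, a thread that holds any one of them is excluded from the transition, whichever lock its code path happens to use.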

            marko Marko Mäkelä added a comment

            The tree
            origin/bb-10.9-MDEV-27812 05d1faec3661176b039db6beee60bcdbb3bc00d8 2022-03-02T14:14:47+02:00
            behaved well in RQG testing.
            

            mleich Matthias Leich added a comment

            For the record, MySQL 8.0.30 includes a conceptually similar change to the MariaDB one (fixup 1, 2):
            WL#12527 InnoDB: Dynamic configuration of space occupied by redo log files

            marko Marko Mäkelä added a comment

            People

              marko Marko Mäkelä
              marko Marko Mäkelä