MariaDB Server / MDEV-36159

mariabackup failed after upgrade to 10.11.10

Details

    • Type: Bug
    • Status: In Progress
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 10.11.10
    • Fix Version/s: 10.11
    • Component/s: Backup

    Description

      Hi Team,

      We upgraded from 10.11.8 to 10.11.10 two weeks ago, and since then mariabackup keeps failing with the following log output.

      [00] 2025-02-24 00:41:42 Waiting for log copy thread to read lsn 160568393706001
      [00] 2025-02-24 00:41:43 Retrying read of log at LSN=160515356496404
      [00] 2025-02-24 00:41:44 Retrying read of log at LSN=160515356496404
      [00] 2025-02-24 00:41:45 Retrying read of log at LSN=160515356496404
      [00] 2025-02-24 00:41:46 Retrying read of log at LSN=160515356496404
      [00] 2025-02-24 00:41:47 Retrying read of log at LSN=160515356496404
      [00] 2025-02-24 00:41:47 Was only able to copy log from 160487092534834 to 160515356496404, not 160568393706001; try increasing innodb_log_file_size
      mariabackup: Stopping log copying thread.
      

      I judged that this was caused by a small innodb_log_file_size value, so I tested changing it to larger values and got the following results.

      • MariaDB engine ver : 10.11.10 / mariabackup engine ver : 10.11.10
        innodb_log_file_size = 1G - failed
        innodb_log_file_size = 4G - failed
        innodb_log_file_size = 8G - failed
      • MariaDB engine ver : 10.11.8 / mariabackup engine ver : 10.11.8
        innodb_log_file_size = 1G - success
        innodb_log_file_size = 4G - success
      • MariaDB engine ver : 10.11.10 / mariabackup engine ver : 10.11.8
        innodb_log_file_size = 1G - success

      Is this a new bug, different from MDEV-34062?
      I would also like to know whether it is acceptable in production environments to back up a 10.11.10 server with mariabackup 10.11.8.

      Thanks and regards.

      Attachments

        Issue Links

          Activity

            supbaek baek seung ho created issue -
            supbaek baek seung ho made changes -
            Summary changed from "mariaback failed after upgrade 10.11.10" to "mariabackup failed after upgrade 10.11.10"

            marko Marko Mäkelä added a comment -

            According to the numbers in the message, the backup would need to copy 81,301,171,167 bytes (81 GB, 75.7 GiB) of log since the latest checkpoint at the time the backup was started. It only managed to copy 28,263,961,570 bytes, a bit less than a third of that. That is actually a rather good achievement, because the circular ib_logfile0 that you configured must have been overwritten over 9 times (over 28 times if you used innodb_log_file_size=1g) while the backup was in progress.
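
            For reference, the arithmetic behind those figures, using the LSNs from the log excerpt in the description:

            160568393706001 - 160487092534834 = 81,301,171,167 bytes of log to copy
            160515356496404 - 160487092534834 = 28,263,961,570 bytes actually copied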

            If you had configured a large enough log file size, then this failure should occur only when the log is corrupted, possibly due to a file system error. MDEV-35791 might be such a case.

            Given that the amount of log that needs to be copied is much larger than the configured log file size, I do not think that an attempt to force more frequent checkpoints (as discussed in MDEV-30000) would help. What would definitely help would be to have some form of server-assisted log copying (MDEV-14992) or log archiving. In that way, the server would automatically throttle its write activity to ensure that the log for the backup is not missing anything.


            marko Marko Mäkelä added a comment -

            Can you test if forcing an InnoDB log checkpoint immediately before starting the backup, as discussed in MDEV-30000, would improve the chances of completing the backup?

            marko Marko Mäkelä made changes -
            Status changed from Open to Needs Feedback
            supbaek baek seung ho added a comment -

            We ran a backup on 10.11.10 after forcing a checkpoint as follows, but the backup failed in the same way.

            -- temporarily lower the dirty-page low-water mark to force page flushing
            -- (so the checkpoint can advance), then restore the default of 0 (disabled)
            SET GLOBAL innodb_max_dirty_pages_pct_lwm=0.01;
            SET GLOBAL innodb_max_dirty_pages_pct_lwm=0;
            

            Is there any other way besides this?

            If we need to increase innodb_log_file_size, how much should we increase it?

            elenst Elena Stepanova made changes -
            Status changed from Needs Feedback to Open
            elenst Elena Stepanova made changes -
            Assignee Marko Mäkelä [ marko ]
            supbaek baek seung ho added a comment -

            First, we will change the operating environment as follows and check whether the backup succeeds.

            • innodb_log_file_size : 16GB
            • innodb_log_buffer_size : 64MB
            Sagbo Agbo Steven added a comment -

            Hi,

            Same behavior here with the version: mariadb-backup 1:10.11.11+maria~deb12

            With these specs:

            • innodb_log_file_size = 8024M
            • innodb_log_buffer_size = 32M

            ==> Failed

            • innodb_log_file_size = 16048M
            • innodb_log_buffer_size = 64M

            ==> Failed


            hydrapolic Tomáš Mózes added a comment -

            Same here with MariaDB 10.11.11.

            # mariadb-backup --user=root --backup --stream=xbstream
            ...
            [00] 2025-02-27 21:19:52 Finished backing up non-InnoDB tables and files
            [00] 2025-02-27 21:19:52 Waiting for log copy thread to read lsn 14369606880777
            [00] 2025-02-27 21:19:53 Retrying read of log at LSN=14369590462614
            [00] 2025-02-27 21:19:54 Retrying read of log at LSN=14369590462614
            [00] 2025-02-27 21:19:55 Retrying read of log at LSN=14369590462614
            [00] 2025-02-27 21:19:56 Retrying read of log at LSN=14369590462614
            [00] 2025-02-27 21:19:57 Was only able to copy log from 14369570090711 to 14369590462614, not 14369606880777; try increasing innodb_log_file_size
            mariabackup: Stopping log copying thread.[00] 2025-02-27 21:19:57 Retrying read of log at LSN=14369590462614
            

            # mariadb-backup --prepare --target-dir=.
            [00] 2025-02-28 12:18:00 cd to /var/lib/mysql/
            [00] 2025-02-28 12:18:00 open files limit requested 0, set to 1024
            [00] FATAL ERROR: 2025-02-28 12:18:00 Can't open backup-my.cnf for reading
            

            supbaek baek seung ho added a comment -

            @Tomas Mozes
            Can you tell me what your innodb_log_file_size setting is?


            hydrapolic Tomáš Mózes added a comment -

            MariaDB [(none)]> show variables like 'innodb_log_file_size';
            +----------------------+-----------+
            | Variable_name        | Value     |
            +----------------------+-----------+
            | innodb_log_file_size | 100663296 |
            +----------------------+-----------+
            1 row in set (0.001 sec)
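
            (For reference, 100663296 bytes is 96 MiB, the default innodb_log_file_size in these versions.)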
            

            quulah Miika Kankare added a comment -

            Our backups have also been failing intermittently for a while, roughly every two weeks.

            This started happening some time last year, I think after the upgrade from 10.11.7 to 10.11.9. Since the problem didn't start immediately, I can't rule out other differences, but perhaps that gives some indication of when a potential bug may have appeared. We are now running 10.11.11 and the problem still persists.

            The database is fairly small with a limited amount of traffic, and if I've learned anything from this thread and the linked ones, our innodb_log_file_size of 1G is plenty for this (probably too much):

            [00] 2025-02-20 06:16:04 Was only able to copy log from 297374420802 to 297377118180, not 297377133957; try increasing innodb_log_file_size
            

            According to the logs mmap is used and since the amount of data is rather small, I think this isn't a performance problem.

            To debug, I've already added a new disk to the server (it's a cloud VM) for the destination of the backups. The source database is still on the same disk it has been since the beginning. I could perhaps take the server offline next and run a check on the filesystem, but somehow I feel like there's something else going on here.


            marko Marko Mäkelä added a comment -

            For those failures where the amount of data to be copied is small compared to the server’s innodb_log_file_size, MDEV-36201 might share a root cause with this. Unfortunately, to analyze this, I would need a copy of all data and logs, at the very least including an affected server’s ib_logfile0 file and the output of mariadb-backup --backup. If someone can reproduce this with some dummy data that can be shared, that would be great.

            supbaek baek seung ho added a comment -

            After changing innodb_log_file_size and innodb_log_buffer_size, the full backup on March 2 was successful, but the incremental backup on March 3 failed.
            We believe this is most likely a bug in mariabackup, and would like to verify whether backing up MariaDB 10.11.10 with mariabackup 10.11.8 causes any problems.

            supbaek baek seung ho added a comment -

            The mariabackup test was conducted with sysbench.
            The table size was about 40GB, and the backup was tested while running updates with 100 threads using sysbench.
            During the test, after MariaDB was restarted to change innodb_log_file_size, the first backup was almost successful, but the second and subsequent backups failed.


            marko Marko Mäkelä added a comment -

            supbaek, thank you for the updates. Thanks to MDEV-27812, you can actually invoke SET GLOBAL innodb_log_file_size while the server is running. However, if you concurrently run mariadb-backup --backup, it will hang as soon as the to-be-new log file ib_logfile101 replaces the original ib_logfile0.
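
            A minimal sketch of the sequencing this implies (the credentials, the 16G size and the target directory are placeholders, not values from this issue): resize the log first, then start the backup only once the resize has finished, so the log copy thread never sees ib_logfile101 replace ib_logfile0 mid-backup.

            # resize the redo log online (MDEV-27812), verify, then back up
            mariadb -u root -e "SET GLOBAL innodb_log_file_size = 16 * 1024 * 1024 * 1024;"
            mariadb -u root -e "SHOW GLOBAL VARIABLES LIKE 'innodb_log_file_size';"
            mariadb-backup --user=root --backup --target-dir=/backups/full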

            Can you share your scripts that you use for reproducing this?

            supbaek baek seung ho added a comment -

            I've attached the log files from when mariabackup succeeded and failed, and I'll share the commands I used during the test.

            sysbench /usr/share/sysbench/oltp_update_index.lua \
            --threads=50 \
            --mysql-host=192.168.6.28 \
            --mysql-port=3306 \
            --mysql-db=test \
            --mysql-user=growin \
            --mysql-password=growin \
            --db-driver=mysql \
            --tables=5 \
            --table-size=10000000 \
            --time=7200 \
            $CMD
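            # $CMD is the sysbench subcommand supplied by the wrapper script
            # sysbench_for_mysql.sh (here "run", as in the invocations below)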
             
            sh sysbench_for_mysql.sh run &
            sh sysbench_for_mysql.sh run &
            

            And I have a few questions.
            We upgraded from 10.4.15 -> 10.11.3 -> 10.11.8 -> 10.11.10.
            1. After the upgrade, the number of InnoDB log files (innodb_log_files_in_group) was reduced from 3 to 1. Are there any side effects?
            2. I changed innodb_log_file_size (1.5G -> 16G) and innodb_log_buffer_size (32MB -> 64MB). Is there a monitoring indicator that can compare DB performance before and after the change?
            3. Is there any problem if I back up MariaDB 10.11.10 with mariabackup 10.11.8?


            marko Marko Mäkelä added a comment -

            Before innodb_log_files_in_group was hard-wired to 1 in MariaDB Server 10.5, it was possible to treat multiple log files as a single one. To retain the same total log capacity, you need to ensure that innodb_log_files_in_group multiplied by innodb_log_file_size remains unchanged.
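
            As an illustration, using the figures mentioned earlier in this thread (and assuming the 1.5G value was the per-file size under 10.4): 3 files × 1.5G is about 4.5G of total log capacity before the upgrade, so a single ib_logfile0 would need innodb_log_file_size of roughly 4.5G, not 1.5G, to preserve that capacity; the 16G configured now is well above it.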

            The log format is not supposed to change within a major release. mariadb-backup 10.11.8 will lack some performance fixes, such as MDEV-34062. Some recovery or backup bugs that we find and fix based on our internal testing are on the "write" side (the server instance that was being backed up or that was killed), some on the "recovery" or mariadb-backup --prepare side.


            marko Marko Mäkelä added a comment -

            Thank you for the test script. I will need to return to this. The script might also be useful for testing MDEV-34070, which I did not get back to after fixing MDEV-34062.

            I checked the change history since 10.11.10, and I don’t think that anything could possibly have fixed this reported problem (which I have not reproduced yet).

            marko Marko Mäkelä made changes -
            Labels performance
            marko Marko Mäkelä made changes -
            Priority changed from Major to Critical
            supbaek baek seung ho added a comment -

            If we need to back up the database safely, should we downgrade to MariaDB 10.11.8, or can we just use mariadb-backup 10.11.8 against a MariaDB 10.11.10 database?
            Even though some performance fixes are missing in mariadb-backup 10.11.8, we are wondering whether there is any problem using it to back up a later minor version within the same major version.

            hydrapolic Tomáš Mózes added a comment - - edited

            After some experimenting, I've decided to switch back my replicas from 10.11.11 to 10.6.21 (primary on 10.6.20).

            I've tried the backups on 3 different databases, ranging from 100GB to 1TB. On the smallest database, setting innodb_log_file_size = 10G helped and a mariadb-backup with 10.11.11 was successful. However, on the bigger data sets even 10G wasn't enough; I had to raise it to 50G for the backup to work. On the biggest database (1TB), however, even setting it to 300G didn't help.

            I've also tried using mariadb-backup from 10.11.8, but it didn't help either.

            Today, after downgrading one of the replicas from 10.11.11 -> 10.6.21, the backup works fine (tested 5 times) even with the default innodb_log_file_size value.

            JIraAutomate JiraAutomate made changes -
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.6 [ 24028 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.11 [ 27614 ]
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.6 [ 24028 ]

            hydrapolic Tomáš Mózes added a comment -

            By the way, the same problem occurs on 10.11 and 11.4. It does NOT happen on 10.6.

            supbaek baek seung ho made changes -
            Attachment backup_success.log [ 74810 ]
            Attachment backup_failed.log [ 74811 ]
            supbaek baek seung ho added a comment - - edited

            Yesterday I successfully backed up my staging database with mariabackup 10.11.10.
            There are some options that are not mentioned on the mariabackup options page of the documentation; I think they are from the enterprise edition of mariabackup, not community, namely innodb-log-file-buffering and innodb-log-file-mmap.

            I have the following questions:
            1. When running mariabackup with the --skip-innodb-log-file-buffering option, the backup completes normally (see the sketch at the end of this comment). Is there any side effect on the server or the backup when running a backup with innodb-log-file-buffering turned off?

            2. If a backup succeeds when run with innodb-log-file-buffering disabled, what is the reason for this? In our tests we noticed that ib_logfile0 in the backup did not grow when the backup failed, but it continued to grow when the backup was performed with innodb-log-file-buffering disabled.

            3. What is the exact meaning of the innodb_log_file_buffering parameter? Its description says it controls whether the file system cache is enabled for ib_logfile0, but I'm wondering what that means exactly.

            4. Can you tell me whether this problem will be fixed in the next release, 10.11.12, and when it will be released?

            I will upload mariabackup logs from both a failed and a successful run with the --verbose option.
            backup_failed.log
            backup_success.log
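
            A minimal sketch of the invocation described in question 1 above (the credentials and target directory are placeholders, not values from this issue):

            # --skip-innodb-log-file-buffering sets innodb_log_file_buffering=OFF for
            # mariadb-backup, i.e. it attempts to open the server's ib_logfile0 with O_DIRECT
            mariadb-backup --user=root --backup \
              --skip-innodb-log-file-buffering \
              --target-dir=/backups/full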


            marko Marko Mäkelä added a comment -

            The fundamental difference between 10.6 and 10.11 is that until MDEV-14425 was implemented, the write-ahead log ib_logfile0 was divided into 512-byte blocks. Backup would copy these log blocks and validate the CRC-32C. It would not try to parse individual log records. This format was slow to write, because InnoDB would hold log_sys.mutex while copying data into log blocks, optionally encrypting the blocks (innodb_encrypt_log=ON) and computing the CRC-32C. The new format makes each individual mini-transaction a ‘block’ on its own. This allows any threads that modify persistent data to perform the encryption and CRC-32C concurrently. Also the actual memcpy() into the log buffer log_sys.buf is concurrent. Concurrency will be improved even further after the bottleneck MDEV-21923 has been removed.

            While the server has gotten faster at writing the log, backup has gotten slower, because it copies and parses ib_logfile0 in only one thread, and it now has to parse individual log records in order to find the mini-transaction boundaries and to be able to validate the CRC-32C for each mini-transaction. This creates a producer-consumer buffer overflow problem. The fix of MDEV-30000 could alleviate this a little, by forcing a checkpoint at the start of the backup, so that less log would have to be copied. Another possible help is to configure a larger innodb_log_file_size.
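
            A rough sketch for sizing that, assuming the LOG section of SHOW ENGINE INNODB STATUS reports a "Log sequence number" line on your build: sample the LSN twice to estimate how many bytes of redo log the server generates per minute, then pick an innodb_log_file_size comfortably above what a whole backup run would accumulate.

            # estimate redo log generation rate (bytes per minute)
            lsn() {
              mariadb -u root -Nse "SHOW ENGINE INNODB STATUS\G" |
                awk '/^Log sequence number/ { print $4 }'
            }
            a=$(lsn); sleep 60; b=$(lsn)
            echo "redo bytes per minute: $((b - a))"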

            A better fix would be to integrate the backup in the server in some way (MDEV-14992) or to make the server responsible for producing a log for backups (something like log archiving). If the server were writing the log for backup in sync with the recovery log, it would naturally slow down. This is a large change that will take time to implement, and it would only appear in a new major release of MariaDB Server, and possibly in the MariaDB Enterprise Server 11.4 release.

            The options in mariadb-backup are somewhat of a mess. The only part where innodb_log_file_buffering could make a difference is when reading the server’s ib_logfile0. innodb_log_file_buffering=OFF means that an attempt is made to open the log with O_DIRECT. Reading or writing the backed-up ib_logfile0 will not use O_DIRECT. The parameter was introduced in MDEV-30136 when innodb_flush_method was deprecated. I made some tests in May 2024 in MDEV-34062. The column "server innodb_log_file_mmap" in the tables there refers to a prototype that would allow the server to write the log via mmap(). In the final version, this parameter only has an effect during crash recovery or in backup, when the server’s log is being read. Those tests suggested that disabling O_DIRECT on the server for the log file, or enabling memory-mapped access for parsing the file, would enable the Linux kernel block cache. Of course, the results could vary between file systems and kernel versions. I tested it only on one system.
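
            A hedged sketch of the server-side experiment implied above (this assumes innodb_log_file_buffering is dynamic on your build; verify before relying on it). Setting it to ON corresponds to the "disabling O_DIRECT on the server for the log file" case from those tests, i.e. the kernel is allowed to cache ib_logfile0.

            # let the file system cache the server's ib_logfile0 (no O_DIRECT)
            mariadb -u root -e "SET GLOBAL innodb_log_file_buffering=ON;"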


            marko Marko Mäkelä added a comment -

            axel, can you please verify the claim that backup got more failure-prone between 10.11.8 and 10.11.10?

            marko Marko Mäkelä made changes -
            Assignee changed from Marko Mäkelä to Axel Schwenke
            axel Axel Schwenke made changes -
            Status changed from Open to In Progress
            supbaek baek seung ho added a comment -

            I would like to know how the mariabackup testing is going, and whether there is a way to keep backups working while staying on the current DB version.

            As an additional workaround, I would like to run backups on a replica; if I back up using the --safe-slave-backup and --slave-info options, will there be any issues backing up the current version?

            Also, I would like to know the release schedule for MariaDB 10.11.12.


            People

              Assignee: axel Axel Schwenke
              Reporter: supbaek baek seung ho
              Votes: 3
              Watchers: 11

