  1. MariaDB Server
  2. MDEV-35517

mariadb-backup failing at --backup stage: try increasing innodb_log_file_size

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 11.4.4
    • Fix Version/s: None
    • Component/s: Backup
    • Labels: None
    • Environment: RHEL-9
      Using the repositories from mariadb.com

    Description

      We recently upgraded a three-node master-slave cluster from 10.11 to 11.4. The cluster runs 100+ databases with ~25,000 tables, but the overall data size is only about 100 GB. As the workload is mostly business operations, the cluster is busier during the daytime and sees less load at night and on weekends. Each node takes one backup with mariadb-backup per day, staggered by 8 hours (i.e. node A takes a backup at midnight, node B at 8 AM, node C at 4 PM).

      Until the upgrade this worked without any issues. After the upgrade, backups frequently (up to 50% of the time) fail during the --backup stage with a message like this:

      [00] 2024-11-28 00:12:53 Finished backing up non-InnoDB tables and files
      [00] 2024-11-28 00:12:53 Waiting for log copy thread to read lsn 7676153454539
      [00] 2024-11-28 00:12:53 Retrying read of log at LSN=7676152544983
      [00] 2024-11-28 00:12:54 Retrying read of log at LSN=7676152544983
      [00] 2024-11-28 00:12:55 Retrying read of log at LSN=7676152544983
      [00] 2024-11-28 00:12:57 Retrying read of log at LSN=7676152544983
      [00] 2024-11-28 00:12:58 Retrying read of log at LSN=7676152544983
      [00] 2024-11-28 00:12:58 Was only able to copy log from 7675927450389 to 7676152544983, not 7676153454539; try increasing innodb_log_file_size
      [00] 2024-11-28 00:12:58 Retrying read of log at LSN=7676152544983
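The numbers in the failure message can be turned into sizes with a little arithmetic. The sketch below is only an illustration: it assumes that InnoDB LSNs can be treated as byte offsets in the redo log stream, and the helper name `lsn_gap` is made up for this example. It reports how much log was copied before the failure and how much was still missing.

```python
import re

# Failure line copied verbatim from the backup output above.
LINE = ("[00] 2024-11-28 00:12:58 Was only able to copy log from 7675927450389 "
        "to 7676152544983, not 7676153454539; try increasing innodb_log_file_size")

PATTERN = re.compile(r"copy log from (\d+) to (\d+), not (\d+)")


def lsn_gap(line):
    """Return (bytes_copied, bytes_missing) from the failure message.

    Assumption: LSN differences are interpreted as byte counts.
    """
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like the log-copy failure message")
    start, reached, wanted = (int(group) for group in match.groups())
    return reached - start, wanted - reached


copied, missing = lsn_gap(LINE)
print(f"copied : {copied} bytes ({copied / 2**20:.1f} MiB)")
print(f"missing: {missing} bytes ({missing / 2**10:.1f} KiB)")
```

Under that assumption, the log copy thread had read roughly 215 MiB and stopped less than 1 MiB short of the target LSN, well below the 1 GB configured for the redo log.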

      What we did so far:

      • Increased innodb_log_file_size four-fold, from its previous value of 256 MB to 1 GB. This did not help.
      • Checked MDEV-34850, but it is marked as "fixed in 11.4.4", which is exactly the version we run:

        [root@cgdcpsql1 mysql]# rpm -qa | grep MariaDB
        MariaDB-shared-11.4.4-1.el9.x86_64
        MariaDB-common-11.4.4-1.el9.x86_64
        MariaDB-client-11.4.4-1.el9.x86_64
        MariaDB-server-11.4.4-1.el9.x86_64
        MariaDB-backup-11.4.4-1.el9.x86_64

      On a side note: we also have the binlog size set to 1 GB and we typically see 2 rotations per 24 hours. As the whole --backup stage prior to the error takes 10-15 minutes, it is hard to believe that a full gigabyte of changes landed during this small time window; and even if it did, it would have caused a binlog rotation, which we don't see. To match the above excerpt from the mariadb-backup output, here are the binlogs on the same machine; the failed backup was started at exactly midnight (00:00) on Nov 28, while binlog 1536 was opened at 11:10 on Nov 27 and rotated at 08:07 on Nov 28.

      [root@cgdcpsql1 mysql]# ls -l cgdcpsql1-bin*
      -rw-rw----. 1 mysql mysql 1073741927 Nov 25 12:59 cgdcpsql1-bin.001532
      -rw-rw----. 1 mysql mysql     327680 Nov 25 12:59 cgdcpsql1-bin.001532.idx
      -rw-rw----. 1 mysql mysql 1073741995 Nov 26 09:15 cgdcpsql1-bin.001533
      -rw-rw----. 1 mysql mysql     323584 Nov 26 09:15 cgdcpsql1-bin.001533.idx
      -rw-rw----. 1 mysql mysql 1073809451 Nov 27 01:25 cgdcpsql1-bin.001534
      -rw-rw----. 1 mysql mysql     307200 Nov 27 01:25 cgdcpsql1-bin.001534.idx
      -rw-rw----. 1 mysql mysql 1074324069 Nov 27 11:10 cgdcpsql1-bin.001535
      -rw-rw----. 1 mysql mysql     237568 Nov 27 11:10 cgdcpsql1-bin.001535.idx
      -rw-rw----. 1 mysql mysql 1097485064 Nov 28 08:07 cgdcpsql1-bin.001536
      -rw-rw----. 1 mysql mysql     335872 Nov 28 08:07 cgdcpsql1-bin.001536.idx
      -rw-rw----. 1 mysql mysql   94644419 Nov 28 09:07 cgdcpsql1-bin.001537
      -rw-rw----. 1 mysql mysql      24576 Nov 28 09:02 cgdcpsql1-bin.001537.idx
      -rw-rw----. 1 mysql mysql        138 Nov 28 08:07 cgdcpsql1-bin.index
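The rotation times in this listing give a rough average binlog write rate. The sketch below is a back-of-the-envelope estimate, not a measurement: it assumes each binlog was open from the previous file's modification time until its own, takes the year 2024 from the backup log timestamps, and uses a made-up helper name, `fill_rates`. Binlog volume is also only a proxy for redo log volume.

```python
from datetime import datetime

# (binlog name, size in bytes, last-modified time) copied from the listing above.
# Assumption: the year is 2024, as in the backup log timestamps.
BINLOGS = [
    ("cgdcpsql1-bin.001532", 1073741927, "2024-11-25 12:59"),
    ("cgdcpsql1-bin.001533", 1073741995, "2024-11-26 09:15"),
    ("cgdcpsql1-bin.001534", 1073809451, "2024-11-27 01:25"),
    ("cgdcpsql1-bin.001535", 1074324069, "2024-11-27 11:10"),
    ("cgdcpsql1-bin.001536", 1097485064, "2024-11-28 08:07"),
]


def fill_rates(binlogs):
    """Return (name, hours_open, MiB_per_hour) for each binlog after the first.

    Assumption: a file was open from the previous file's mtime until its own.
    """
    rates = []
    for (_, _, prev_time), (name, size, closed) in zip(binlogs, binlogs[1:]):
        opened_at = datetime.strptime(prev_time, "%Y-%m-%d %H:%M")
        closed_at = datetime.strptime(closed, "%Y-%m-%d %H:%M")
        hours = (closed_at - opened_at).total_seconds() / 3600
        rates.append((name, hours, size / 2**20 / hours))
    return rates


RATES = fill_rates(BINLOGS)
for name, hours, rate in RATES:
    print(f"{name}: open {hours:.1f} h, about {rate:.0f} MiB/h")
```

For binlog 001536, the one open during the failed backup, this comes to about 50 MiB per hour averaged over roughly 21 hours, which is far from a gigabyte in a 10-15 minute window.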

      We run mariadb-backup via cron with a minimal set of options like this:

      /usr/bin/mariabackup --open-files-limit=131072 --user root --target-dir=... --backup

      Is there anything we need to add to mariadb-backup for 11.x? (E.g., I see MDEV-34850 mentioning innodb_log_file_mmap from MDEV-34062, but I don't see it in either "mariadb-backup --help" or the InnoDB system variables page at https://mariadb.com/kb/en/innodb-system-variables ). Even if mmap is not used, the system seems to have plenty of free memory:

      [root@cgdcpsql1 mysql]#  free -m
                     total        used        free      shared  buff/cache   available
      Mem:           31840       17849        1173         494       13765       13990
      Swap:           2047           1        2046

      The platform is RHEL-9; we use the RPM packages provided by mariadb.com.

          People

            Assignee: Unassigned
            Reporter: Assen Totin
            Votes: 0
            Watchers: 3