Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29581

mariadb-backup prepare lists wrong binary log position

Details

    • Bug
    • Status: Confirmed (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.6.9, 10.6.10
    • 10.4(EOL)
    • mariabackup
    • Linux, Ubuntu 22.04, 20.04, n.a.

    Description

      MariaDB Backup provides IMHO wrong binary log position in output:

      IMHO this is correct:

      shell> mariadb-backup --backup ...

      [00] 2022-09-20 19:00:23 >> log scanned up to (1876443359)
      [00] 2022-09-20 19:00:23 Executing BACKUP STAGE END
      [00] 2022-09-20 19:00:23 All tables unlocked
      [00] 2022-09-20 19:00:23 Backup created in directory '/tmp/backup/'
      [00] 2022-09-20 19:00:23 MySQL binlog position: filename 'laptop5_mariadb-106_binlog.000165', position '522187', GTID of the last change '0-3337-32300'
      [00] 2022-09-20 19:00:23 Writing backup-my.cnf
      [00] 2022-09-20 19:00:23 ...done
      [00] 2022-09-20 19:00:23 Writing xtrabackup_info
      [00] 2022-09-20 19:00:23 ...done
      [00] 2022-09-20 19:00:23 Redo log (from LSN 1875941778 to 1876443359) was copied.
      [00] 2022-09-20 19:00:23 completed OK!

      This is still correct:

      SQL> SHOW MASTER STATUS;
      -----------------------------------------------------------------------+

      File Position Binlog_Do_DB Binlog_Ignore_DB

      -----------------------------------------------------------------------+

      laptop5_mariadb-106_binlog.000165 522187    

      -----------------------------------------------------------------------+

      But this is wrong! After prepare we get an lower/older position in the output of the prepare.

      shell> mariadb-backup --prepare

      2022-09-20 19:01:15 0 [Note] InnoDB: Starting final batch to recover 175 pages from redo log.
      2022-09-20 19:01:15 0 [Note] InnoDB: Last binlog file '/home/mysql/database/mariadb-106/binlog/laptop5_mariadb-106_binlog.000165', position 493513
      [00] 2022-09-20 19:01:15 Last binlog file /home/mysql/database/mariadb-106/binlog/laptop5_mariadb-106_binlog.000165, position 493513
      [00] 2022-09-20 19:01:15 completed OK!

      The output of xtrabackup_binlog_info is correct again:

      shell> cat xtrabackup_binlog_info
      laptop5_mariadb-106_binlog.000165 522187 0-3337-32300

      So the output of prepare is misleading and thus dangerous! Either suppress it or print it correctly. Or I have a complete wrong understanding of the output...

      Attachments

        Activity

          danblack Daniel Black added a comment - acceptable - https://github.com/MariaDB/server/pull/2756 ?

          The problem here is a regression introduced by MDEV-22351: "InnoDB may recover wrong information after RESET MASTER"

          The correct fix thus is to revert MDEV-22351 (and find a correct fix for the original problem in that bug, but the regression introduced by MDEV-22351 is much worse than the problem it's trying to fix, which is if RESET MASTER is done around the time of the backup).

          Additionally, MDEV-18917 introduces another regression, that the position is no longer written to xtrabackup_binlog_pos_innodb . This makes it necessary for users to parse the prepare logfile to get this information (as described here). This patch should also be reverted.

          I have attached a test case that demonstrates the problem.
          It starts a backup load, then takes a mariabackup --no-lock and uses it to provision a slave from the InnoDB binlog position provided during the prepare.
          This test fails due to the patch for MDEV-22351 (0e10d7ea14cf795ada8aee7fe1afc590ef6de32c). Note that the problem appears non-deterministically, so it is necessary to run it until it fails, eg. --repeat=1000.

          Also note that the test case uses the xtrabackup_binlog_pos_innodb file. So the patch for MDEV-18917 must be reverted to run the test (cd805a5f1c2aabede411e6732d7047088f61233e and 396cf60ac064b47ca0406bf7f870df9880d3ad94).

          Reverting these patches makes the attached test case pass deterministically, demonstrating that the position is correct for provisioning a new slave without locking the master, as long as DDL is avoided during the backup.

          knielsen Kristian Nielsen added a comment - The problem here is a regression introduced by MDEV-22351 : "InnoDB may recover wrong information after RESET MASTER" The correct fix thus is to revert MDEV-22351 (and find a correct fix for the original problem in that bug, but the regression introduced by MDEV-22351 is much worse than the problem it's trying to fix, which is if RESET MASTER is done around the time of the backup). Additionally, MDEV-18917 introduces another regression, that the position is no longer written to xtrabackup_binlog_pos_innodb . This makes it necessary for users to parse the prepare logfile to get this information (as described here). This patch should also be reverted. I have attached a test case that demonstrates the problem. It starts a backup load, then takes a mariabackup --no-lock and uses it to provision a slave from the InnoDB binlog position provided during the prepare. This test fails due to the patch for MDEV-22351 (0e10d7ea14cf795ada8aee7fe1afc590ef6de32c). Note that the problem appears non-deterministically, so it is necessary to run it until it fails, eg. --repeat=1000. Also note that the test case uses the xtrabackup_binlog_pos_innodb file. So the patch for MDEV-18917 must be reverted to run the test (cd805a5f1c2aabede411e6732d7047088f61233e and 396cf60ac064b47ca0406bf7f870df9880d3ad94). Reverting these patches makes the attached test case pass deterministically, demonstrating that the position is correct for provisioning a new slave without locking the master, as long as DDL is avoided during the backup.

          So to summarize, no do not remove this information from the output.

          Instead, revert the patch that makes the output incorrect, and restore the functionality to write the output to file xtrabackup_binlog_pos_innodb so users don't have to rely on parsing log files.

          knielsen Kristian Nielsen added a comment - So to summarize, no do not remove this information from the output. Instead, revert the patch that makes the output incorrect, and restore the functionality to write the output to file xtrabackup_binlog_pos_innodb so users don't have to rely on parsing log files.

          knielsen, thank you for the feedback. Without looking at the details yet, I can imagine that some rollback segment header pages could be updated for other reasons than writing the binlog offset. One example might be a ROLLBACK of a transaction. Maybe there really is no other feasible option than to revert MDEV-22351 and possibly fix the "write side" of it (by letting RESET MASTER update each rollback segment header page).

          marko Marko Mäkelä added a comment - knielsen , thank you for the feedback. Without looking at the details yet, I can imagine that some rollback segment header pages could be updated for other reasons than writing the binlog offset. One example might be a ROLLBACK of a transaction. Maybe there really is no other feasible option than to revert MDEV-22351 and possibly fix the "write side" of it (by letting RESET MASTER update each rollback segment header page).
          danblack Daniel Black added a comment -

          Thanks knielsen, I assume you are most of the way through untangling this and wish to continue?

          danblack Daniel Black added a comment - Thanks knielsen , I assume you are most of the way through untangling this and wish to continue?

          danblack Sure, I can do that.

          knielsen Kristian Nielsen added a comment - danblack Sure, I can do that.

          People

            knielsen Kristian Nielsen
            oli Oli Sennhauser
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.