[MDEV-29581] mariadb-backup prepare lists wrong binary log position Created: 2022-09-20  Updated: 2023-09-14

Status: Stalled
Project: MariaDB Server
Component/s: mariabackup
Affects Version/s: 10.6.9, 10.6.10
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Oli Sennhauser Assignee: Kristian Nielsen
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Linux, Ubuntu 22.04, 20.04, n.a.


Attachments: File slave_provision_nolock.cnf     File slave_provision_nolock.test    

 Description   

MariaDB Backup provides IMHO wrong binary log position in output:

IMHO this is correct:

shell> mariadb-backup --backup ...

[00] 2022-09-20 19:00:23 >> log scanned up to (1876443359)
[00] 2022-09-20 19:00:23 Executing BACKUP STAGE END
[00] 2022-09-20 19:00:23 All tables unlocked
[00] 2022-09-20 19:00:23 Backup created in directory '/tmp/backup/'
[00] 2022-09-20 19:00:23 MySQL binlog position: filename 'laptop5_mariadb-106_binlog.000165', position '522187', GTID of the last change '0-3337-32300'
[00] 2022-09-20 19:00:23 Writing backup-my.cnf
[00] 2022-09-20 19:00:23 ...done
[00] 2022-09-20 19:00:23 Writing xtrabackup_info
[00] 2022-09-20 19:00:23 ...done
[00] 2022-09-20 19:00:23 Redo log (from LSN 1875941778 to 1876443359) was copied.
[00] 2022-09-20 19:00:23 completed OK!

This is still correct:

SQL> SHOW MASTER STATUS;
-----------------------------------------------------------------------+

File Position Binlog_Do_DB Binlog_Ignore_DB

-----------------------------------------------------------------------+

laptop5_mariadb-106_binlog.000165 522187    

-----------------------------------------------------------------------+

But this is wrong! After prepare we get an lower/older position in the output of the prepare.

shell> mariadb-backup --prepare

2022-09-20 19:01:15 0 [Note] InnoDB: Starting final batch to recover 175 pages from redo log.
2022-09-20 19:01:15 0 [Note] InnoDB: Last binlog file '/home/mysql/database/mariadb-106/binlog/laptop5_mariadb-106_binlog.000165', position 493513
[00] 2022-09-20 19:01:15 Last binlog file /home/mysql/database/mariadb-106/binlog/laptop5_mariadb-106_binlog.000165, position 493513
[00] 2022-09-20 19:01:15 completed OK!

The output of xtrabackup_binlog_info is correct again:

shell> cat xtrabackup_binlog_info
laptop5_mariadb-106_binlog.000165 522187 0-3337-32300

So the output of prepare is misleading and thus dangerous! Either suppress it or print it correctly. Or I have a complete wrong understanding of the output...



 Comments   
Comment by Daniel Black [ 2023-09-13 ]

acceptable - https://github.com/MariaDB/server/pull/2756 ?

Comment by Kristian Nielsen [ 2023-09-13 ]

The problem here is a regression introduced by MDEV-22351: "InnoDB may recover wrong information after RESET MASTER"

The correct fix thus is to revert MDEV-22351 (and find a correct fix for the original problem in that bug, but the regression introduced by MDEV-22351 is much worse than the problem it's trying to fix, which is if RESET MASTER is done around the time of the backup).

Additionally, MDEV-18917 introduces another regression, that the position is no longer written to xtrabackup_binlog_pos_innodb . This makes it necessary for users to parse the prepare logfile to get this information (as described here). This patch should also be reverted.

I have attached a test case that demonstrates the problem.
It starts a backup load, then takes a mariabackup --no-lock and uses it to provision a slave from the InnoDB binlog position provided during the prepare.
This test fails due to the patch for MDEV-22351 (0e10d7ea14cf795ada8aee7fe1afc590ef6de32c). Note that the problem appears non-deterministically, so it is necessary to run it until it fails, eg. --repeat=1000.

Also note that the test case uses the xtrabackup_binlog_pos_innodb file. So the patch for MDEV-18917 must be reverted to run the test (cd805a5f1c2aabede411e6732d7047088f61233e and 396cf60ac064b47ca0406bf7f870df9880d3ad94).

Reverting these patches makes the attached test case pass deterministically, demonstrating that the position is correct for provisioning a new slave without locking the master, as long as DDL is avoided during the backup.

Comment by Kristian Nielsen [ 2023-09-13 ]

So to summarize, no do not remove this information from the output.

Instead, revert the patch that makes the output incorrect, and restore the functionality to write the output to file xtrabackup_binlog_pos_innodb so users don't have to rely on parsing log files.

Comment by Marko Mäkelä [ 2023-09-13 ]

knielsen, thank you for the feedback. Without looking at the details yet, I can imagine that some rollback segment header pages could be updated for other reasons than writing the binlog offset. One example might be a ROLLBACK of a transaction. Maybe there really is no other feasible option than to revert MDEV-22351 and possibly fix the "write side" of it (by letting RESET MASTER update each rollback segment header page).

Comment by Daniel Black [ 2023-09-14 ]

Thanks knielsen, I assume you are most of the way through untangling this and wish to continue?

Comment by Kristian Nielsen [ 2023-09-14 ]

danblack Sure, I can do that.

Generated at Thu Feb 08 10:09:42 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.