[MDEV-31687] Mariabackup Incremental backup does not advance LSN Created: 2023-07-14  Updated: 2023-09-19  Resolved: 2023-08-21

Status: Closed
Project: MariaDB Server
Component/s: mariabackup
Affects Version/s: 10.5, 10.6, 10.9, 10.10, 10.11, 11.0, 11.1
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Edward Stoever Assignee: Marko Mäkelä
Resolution: Not a Bug Votes: 0
Labels: None

Attachments: File QUERY_for_redo_occupancy.sql     PNG File incremental_backup_does_not_increase_LSN_10.11.png     PNG File incremental_backup_does_not_increase_LSN_10.6_enterprise.png     Text File steps_taken_incremental_backups.txt     Zip Archive thousand_rows_inserts.zip    

 Description   

# CREATE COMPLETELY NEW INSTANCE:
systemctl stop mariadb; rm -fr /var/lib/mysql/*; rm -fr /var/log/mysql/* 
/usr/bin/mariadb-install-db 1>/dev/null 2>&1 && echo OK
chown -R mysql:mysql /var/lib/mysql; chown -R mysql:mysql /var/log/mysql
systemctl start mariadb && echo OK
 
mariadb -ABNe "select version();"
 
mariadb -Ae "DROP SCHEMA IF EXISTS TBK; CREATE SCHEMA TBK;CREATE TABLE TBK.bktiming (c1 int(11) NOT NULL AUTO_INCREMENT,c2 datetime DEFAULT current_timestamp(),c3 varchar(100) DEFAULT NULL,PRIMARY KEY (c1)) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;CREATE TABLE TBK.thousand_rows (c1 int(11) NOT NULL, c2_ipsum varchar(500) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;CREATE USER 'mariabackup'@'localhost' IDENTIFIED BY 'mypassword'; GRANT RELOAD, PROCESS, LOCK TABLES, BINLOG MONITOR ON *.* TO 'mariabackup'@'localhost';"
 
mariadb -Ae "insert into TBK.bktiming(c3) values ('before any backup');"
rm -fr /var/mariadb; mkdir -p /var/mariadb
mariabackup --backup --user=mariabackup --password=mypassword --target-dir=/var/mariadb/full 1>/dev/null 2>&1 && echo OK
mariadb -Ae "insert into TBK.bktiming(c3) values ('after full backup');"
cat /var/mariadb/full/xtrabackup_checkpoints
 
# ADD 1,000 rows to a table, do it 100 times
for ii in {1..100}; do mariadb TBK < thousand_rows.sql; done;
mariadb TBK -ABNe "select format(count(*),0) as row_count from thousand_rows;"
mariadb -Ae "select (DATA_LENGTH+INDEX_LENGTH+DATA_FREE)/1024/1024 AS thousand_rows_IN_MB from information_schema.tables where table_schema='TBK' and table_name='thousand_rows';"
 
mariadb -Ae "insert into TBK.bktiming(c3) values ('before incremental backup');"
mariabackup --backup  --user=mariabackup --password=mypassword  --incremental-basedir=/var/mariadb/full  --target-dir=/var/mariadb/inc 1>/dev/null 2>&1 && echo OK
mariadb -Ae "insert into TBK.bktiming(c3) values ('after incremental backup');"
 
cat /var/mariadb/inc/xtrabackup_checkpoints

You will see that xtrabackup_checkpoints will look like this:

backup_type = incremental
from_lsn = 46574
to_lsn = 46574
last_lsn = 57945480

Which means it does not backup changes since previous full backup.
Tested on these two versions:
10.11.1-MariaDB-1:10.11.1+maria~deb11
10.6.12-7-MariaDB-enterprise-log



 Comments   
Comment by Elena Stepanova [ 2023-07-16 ]

edward, thanks for the report with a reproducer.

Which means it does not backup changes since previous full backup.

Did you observe an actual data loss, or is it a conclusion based on the numbers?

I can reproduce the effect as described, but I'm not sure whether it is a problem in itself. At least in my case the inserted data was still there after restoring from the backup.

For the LSNs, according to the file name, it is supposed to contain checkpoints, and that's what it does.
If we add show engine innodb status to the steps after the second mariabackup, we'll see that the checkpoint is indeed still the same, even although LSN has gone far ahead.

10.5 841e905f

++ bin/mariadb --protocol=tcp -uroot -e 'show engine innodb status\G'
++ grep -Ei 'checkpoint|sequence number'
Log sequence number 45544709
Last checkpoint at  45079
++ bin/mariadb --protocol=tcp -uroot test -e 'insert into TBK.bktiming(c3) values ('\''after incremental backup'\'');'
++ cat ./full/xtrabackup_checkpoints
backup_type = full-backuped
from_lsn = 0
to_lsn = 45079
last_lsn = 49328
++ cat ./inc/xtrabackup_checkpoints
backup_type = incremental
from_lsn = 45079
to_lsn = 45079
last_lsn = 45544709

It is so in all of 10.5+. In 10.4, the checkpoint does change, and again, it corresponds the one in innodb status:

10.4 e146940a

++ bin/mariadb --protocol=tcp -uroot -e 'show engine innodb status\G'
++ grep -Ei 'checkpoint|lsn|sequence number'
Log sequence number 48648838
Last checkpoint at  35874685
++ bin/mariadb --protocol=tcp -uroot test -e 'insert into TBK.bktiming(c3) values ('\''after incremental backup'\'');'
++ cat ./full/xtrabackup_checkpoints
backup_type = full-backuped
from_lsn = 0
to_lsn = 60943
last_lsn = 66905
++ cat ./inc/xtrabackup_checkpoints
backup_type = incremental
from_lsn = 60943
to_lsn = 35874685
last_lsn = 48648838

I'll leave it to InnoDB people to decide what needs to be done here, if anything.

Comment by Edward Stoever [ 2023-07-17 ]

Hello elenst
Thank you for a quick reply. You asked if I observed an actual data loss. In one way, yes. When I restore the database using the appropriate command to include the FULL and INCREMENTAL, the restore only includes data up to the moment of the FULL backup. That means the changes that were supposed to be saved by the incremental are not restored.
I am curious what causes this. Is the incremental backup taken too soon? Is a command or checkpoint or log switch missing? If that is the case, then we should update our documentation. If it is not the case, then this is a bug.
– Edward

Comment by Marko Mäkelä [ 2023-08-03 ]

As far as I can tell, in the file xtrabackup_checkpoints (or mariadb_backup_checkpoints in some later versions) there are three numbers. In the data directory that I just uploaded to MDEV-30100 I happened to encounter the following mariadb_backup_checkpoints contents:

backup_type = full-backuped
from_lsn = 0
to_lsn = 46554143
last_lsn = 51638168

Without looking at the code, I would assume the following:

field explanation
from_lsn 0, or the checkpoint LSN from which the last preceding incremental backup started
to_lsn the checkpoint LSN from which the last backup started
last_lsn the LSN up to which log was copied

In the case of the MDEV-30100 dataset, the size of the ib_logfile0 file from the backup is 5096313 bytes = 12288 + 5084025 bytes, or 12288 + (last_lsn - to_lsn) bytes. This simple mapping between LSN and file sizes holds for the MDEV-14425 log format.

If you execute an incremental backup in a rather quick succession, it is possible that the server will not execute any checkpoints in between. Please check MDEV-30000.

Would this claimed issue be resolved if you tell InnoDB to execute a log checkpoint before executing each incremental backup?

SET GLOBAL innodb_max_dirty_pages_pct_lwm=0.001;
SET GLOBAL innodb_max_dirty_pages_pct_lwm=0;

Comment by Edward Stoever [ 2023-08-20 ]

Thank you to elenst and marko for helping with this issue. You have led to a better understanding of the mechanics of the innodb redo log and its effect on mariabackup incremental backup. If the redo log is not yet flushed, the new blocks added by running thousand_rows.sql (or any changed blocks) are saved in redo until the thread that flushes them starts. This will not happen for a while with the default configuration of MariaDB. You can see it with the attached query Query_for_redo_occupancy.sql. Sample output here:

+--------------------+--------------------+--------------------------------+----------------------------+
| REDO_OCCUPANCY_PCT | INNODB_LOG_FILE_MB | INNODB_MAX_DIRTY_PAGES_PCT_LWM | INNODB_MAX_DIRTY_PAGES_PCT |
+--------------------+--------------------+--------------------------------+----------------------------+
| 63.49              |                 96 | 0.000000                       | 90.000000                  |
+--------------------+--------------------+--------------------------------+----------------------------+

The default is 90% occupancy to start the flushing, and I had not yet reached that point. This is valuable and helpful!
– Edward

Comment by Edward Stoever [ 2023-08-20 ]

Final note, if you add in the command to start the process that flushes redo, you get the expected result proving that the incremental includes new blocks:

mariadb -Ae "SET GLOBAL innodb_max_dirty_pages_pct_lwm=0.001;"
sleep 60 # pausing 1 minute to let redo flush complete
mariadb -Ae "insert into TBK.bktiming(c3) values ('before incremental backup');"
mariabackup --backup  --user=mariabackup --password=mypassword  --incremental-basedir=/var/mariadb/full  --target-dir=/var/mariadb/inc 1>/dev/null 2>&1 && echo OK
mariadb -Ae "insert into TBK.bktiming(c3) values ('after incremental backup');"

And, review the xtrabackup_checkpoints from the incremental backup:

backup_type = incremental
from_lsn = 42120
to_lsn = 63953645
last_lsn = 63953967

After a restore that includes the incremental, I see 100,000 rows on the table thousand_rows and the table bktiming is:

MariaDB [TBK]> select * from bktiming;
+----+---------------------+---------------------------+
| c1 | c2                  | c3                        |
+----+---------------------+---------------------------+
|  2 | 2023-08-20 17:30:59 | before any backup         |
|  3 | 2023-08-20 17:31:02 | after full backup         |
|  4 | 2023-08-20 18:08:06 | before incremental backup |
+----+---------------------+---------------------------+

Comment by Marko Mäkelä [ 2023-08-21 ]

I’m closing this as “not a bug”. In MDEV-14992, we might want to support a special form of incremental backup that will only read and copy the log, if the log checkpoint LSN has not been advanced beyond the end LSN of the previously finished incremental backup.

Generated at Thu Feb 08 10:25:43 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.