[MDEV-18963] Mariabackup 10.2.22 Restore Intermittent Failures Created: 2019-03-19  Updated: 2023-12-04  Resolved: 2023-12-04

Status: Closed
Project: MariaDB Server
Component/s: mariabackup
Affects Version/s: 10.2.22
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Stephen Hames Assignee: Sergei Golubchik
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

CentOS Linux 7.6.1810 (core)



 Description   

We recently upgraded from MariaDB 10.2.17 to 10.2.22; since this update, we are hitting intermittent issues with mariadb server failing to start on the mariabackup cloned host, with the following error logged:

2019-03-18  0:18:02 139848147691712 [Note] Found 1 prepared transaction(s) in InnoDB
2019-03-18  0:18:02 139848147691712 [ERROR] Found 1 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start mysqld with --tc-heuristic-recover switch to commit or rollback pending transactions.
2019-03-18  0:18:02 139848147691712 [ERROR] Aborting

We use mariabackup (streaming backup over SSH) to clone a server, several times a day. The source server is a slave, all tables are InnoDB.
I reviewed the updates between 10.2.17 and 10.2.22, and could not find anything obviously incompatible in how we are using mariabackup; I have included the options below in case I have missed something.

We use following options when performing the backup:

mariabackup \
    --backup \
    --no-lock \
    --host=127.0.0.1 \
    --user=redacted \
    --password=redacted \
    --safe-slave-backup \
    --stream=xbstream \
    --target-dir=./ 2> $LOGFILE \
    | ssh -o StrictHostKeyChecking=no hostname-redacted "mbstream -x -C /mysql-import/mysql/"

We then prepare the backup on the cloned server using the following options:

mariabackup \
  --prepare \
  --apply-log-only \
  --target-dir /mysql-import/mysql > $LOGFILE 2>&1

Prepare logs as below, on a job, where server subsequently fails to start; no obvious errors shown - prepare job reported success.

[00] 2019-03-18 00:17:55 mariabackup based on MariaDB server 10.2.22-MariaDB Linux (x86_64)
[00] 2019-03-18 00:17:55 cd to /mysql-import/mysql/
[00] 2019-03-18 00:17:55 This target seems to be not prepared yet.
[00] 2019-03-18 00:17:55 mariabackup: using the following InnoDB configuration for recovery:
[00] 2019-03-18 00:17:55 innodb_data_home_dir = .
[00] 2019-03-18 00:17:55 innodb_data_file_path = ibdata1:500M:autoextend
[00] 2019-03-18 00:17:55 innodb_log_group_home_dir = .
[00] 2019-03-18 00:17:55 InnoDB: Using Linux native AIO
[00] 2019-03-18 00:17:55 Starting InnoDB instance for recovery.
[00] 2019-03-18 00:17:55 mariabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Uses event mutexes
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Compressed tables use zlib 1.2.7
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Using Linux native AIO
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Number of pools: 1
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Using SSE2 crc32 instructions
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Initializing buffer pool, total size = 100M, instances = 1, chunk size = 100M
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Completed initialization of buffer pool
2019-03-18  0:17:55 139977107965696 [Note] InnoDB: page_cleaner coordinator priority: -20
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Highest supported file format is Barracuda.
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Starting crash recovery from checkpoint LSN=839911059558
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Starting final batch to recover 643 pages from redo log.
2019-03-18  0:17:55 139977393436864 [Note] InnoDB: Last binlog file '/var/lib/mysql/logs/bin.000768', position 228369213
[00] 2019-03-18 00:17:56 Last binlog file /var/lib/mysql/logs/bin.000768, position 228369213
[00] 2019-03-18 00:17:56 completed OK!

Note: We remove the empty ib_logfile0 file after mariabackup prepare due to server refusing to start with the empty mariabackup generated ib_logfile.



 Comments   
Comment by Vladislav Vaintroub [ 2019-03-19 ]

Assigning to Marko for evaluation.

Comment by Marko Mäkelä [ 2023-12-04 ]

I only know that --tc-heuristic-recover has something to do with the binlog’s internal use of XA transactions for keeping the binlog in sync with the InnoDB redo log.

Comment by Sergei Golubchik [ 2023-12-04 ]

10.2 is EOL. The earliest still maintained version is 10.4, in 10.4 you can run mariabackup --rollback-xa, like we do in this test in our test harness.

from 10.5 there is no --rollback-xa, you need to start the server with --tc-heuristic-recover=ROLLBACK there. Like we do in the same test in 10.5

Generated at Thu Feb 08 08:48:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.