[MDEV-28974] mariadb-backup --prepare fails to prepare a backup that involves DDL Created: 2022-06-28  Updated: 2023-04-13  Resolved: 2022-08-01

Status: Closed
Project: MariaDB Server
Component/s: Backup, Storage Engine - InnoDB
Affects Version/s: 10.8.3, 10.6, 10.7
Fix Version/s: 10.6.9, 10.7.5, 10.8.4, 10.9.2, 10.10.1

Type: Bug Priority: Critical
Reporter: Nuno Assignee: Marko Mäkelä
Resolution: Fixed Votes: 1
Labels: rr-profile

Attachments: File data.tar.xz     Text File encryption_keys.txt    
Issue Links:
Problem/Incident
causes MDEV-29282 atomic.rename_trigger fails occasionally Closed
is caused by MDEV-24626 Remove synchronous write of page0 and... Closed

 Description   

Hi

Last Saturday, I migrated from 10.5 to 10.8.3, and since then, the backups created by mariabackup daily seem to always fail when trying to prepare them.

The destination server (tried Docker and an actual server with similar specs to the original live server) also runs 10.8.3.

This worries me because if I have an actual crash and need to restore from a backup, I may not be able to recover.

These new table extensions listed below are all tables that have been re-created and then swapped with the old ones, during the day.

At the end of these logs, you can see it crashes due to "Attempted to open a previously opened tablespace. Previous tablespace:"

On the actual server, the logs are very similar, except that I used 3G memory (instead of 10GB) and if failed on a later table (also in the list of renames).
But otherwise, same symptoms.

This never happened before, with 10.5.

Docker logs:

mariabackup based on MariaDB server 10.8.3-MariaDB Linux (x86_64)
[00] 2022-06-28 08:02:43 cd to /var/lib/mysql/
[00] 2022-06-28 08:02:43 open files limit requested 0, set to 1048576
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl1_data1.new to ./db_mydb1/tbl1_data1.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl1_level2.new to ./db_mydb1/tbl1_level2.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl1_webclient.new to ./db_mydb1/tbl1_webclient.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl1_tbl2.new to ./db_mydb1/tbl1_tbl2.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl_c_data4.new to ./db_mydb1/tbl_c_data4.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/aws_data4.new to ./db_mydb1/aws_data4.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl_a_data4.new to ./db_mydb1/tbl_a_data4.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl_d_data3.new to ./db_mydb1/tbl_d_data3.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl_d_data5.new to ./db_mydb1/tbl_d_data5.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl3.new to ./db_mydb1/tbl3.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl1_webserver.new to ./db_mydb1/tbl1_webserver.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl1_level3.new to ./db_mydb1/tbl1_level3.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl_a_data3.new to ./db_mydb1/tbl_a_data3.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl_d_data4.new to ./db_mydb1/tbl_d_data4.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl1_level1.new to ./db_mydb1/tbl1_level1.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl_highrisk.new to ./db_mydb1/tbl_highrisk.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl_data2.new to ./db_mydb1/tbl_data2.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tor_ips.new to ./db_mydb1/tor_ips.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl_c_data3.new to ./db_mydb1/tbl_c_data3.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/tbl1_data6_1d.new to ./db_mydb1/tbl1_data6_1d.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb1/aws_data3.new to ./db_mydb1/aws_data3.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb2_archive/tbl10_internal.new to ./db_mydb2_archive/tbl10_internal.ibd
[00] 2022-06-28 08:02:44 Renaming ./db_mydb2_archive/tbl10.new to ./db_mydb2_archive/tbl10.ibd
[00] 2022-06-28 08:02:44 This target seems to be not prepared yet.
[00] 2022-06-28 08:02:44 mariabackup: using the following InnoDB configuration for recovery:
[00] 2022-06-28 08:02:44 innodb_data_home_dir = .
[00] 2022-06-28 08:02:44 innodb_data_file_path = ibdata1:12M:autoextend
[00] 2022-06-28 08:02:44 innodb_log_group_home_dir = .
[00] 2022-06-28 08:02:44 InnoDB: Using Linux native AIO
[00] 2022-06-28 08:02:44 Starting InnoDB instance for recovery.
[00] 2022-06-28 08:02:44 mariabackup: Using 10737418240 bytes for buffer pool (set by --use-memory parameter)
2022-06-28  8:02:44 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
2022-06-28  8:02:44 0 [Note] InnoDB: Number of transaction pools: 1
2022-06-28  8:02:44 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2022-06-28  8:02:44 0 [Note] InnoDB: Using Linux native AIO
2022-06-28  8:02:44 0 [Note] InnoDB: Initializing buffer pool, total size = 10.000GiB, chunk size = 10.000GiB
2022-06-28  8:02:44 0 [Note] InnoDB: Completed initialization of buffer pool
2022-06-28  8:02:44 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=10124335601183
2022-06-28  8:02:59 0 [Note] InnoDB: Read redo log up to LSN=10125491182709
2022-06-28  8:03:14 0 [Note] InnoDB: Read redo log up to LSN=10126769017887
2022-06-28  8:03:29 0 [Note] InnoDB: Read redo log up to LSN=10125566511156
2022-06-28  8:03:44 0 [Note] InnoDB: Read redo log up to LSN=10127351889951
2022-06-28  8:03:46 0 [Note] InnoDB: Starting a batch to recover 249655 pages from redo log.
2022-06-28  8:05:44 0 [Note] InnoDB: Read redo log up to LSN=10127544782324
2022-06-28  8:05:45 0 [ERROR] InnoDB: Attempted to open a previously opened tablespace. Previous tablespace: ./db_mydb1/tbl_a_data4.ibd uses space ID: 283158. Cannot open filepath: db_mydb1/tbl_a_data4.ibd which uses the same space ID.
2022-06-28  8:05:45 0 [Warning] InnoDB: We do not continue the crash recovery, because the table may become corrupt if we cannot apply the log records in the InnoDB log to it. To fix the problem and start mariadbd:
2022-06-28  8:05:45 0 [Note] InnoDB: 1) If there is a permission problem in the file and mysqld cannot open the file, you should modify the permissions.
2022-06-28  8:05:45 0 [Note] InnoDB: 2) If the tablespace is not needed, or you can restore an older version from a backup, then you can remove the .ibd file, and use --innodb_force_recovery=1 to force startup without this file.
2022-06-28  8:05:45 0 [Note] InnoDB: 3) If the file system or the disk is broken, and you cannot remove the .ibd file, you can set --innodb_force_recovery.
[00] 2022-06-28 08:05:45 mariadb-backup: srv_start() returned 11 (Generic error).

Actual server (slightly different):

2022-06-28 16:16:57 0 [Note] InnoDB: Initializing buffer pool, total size = 3.000GiB, chunk size = 3.000GiB
2022-06-28 16:16:57 0 [Note] InnoDB: Completed initialization of buffer pool
2022-06-28 16:16:57 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=10124335601183
2022-06-28 16:17:12 0 [Note] InnoDB: Read redo log up to LSN=10125664839619
2022-06-28 16:17:12 0 [Note] InnoDB: Starting a batch to recover 155833 pages from redo log.
2022-06-28 16:18:18 0 [Note] InnoDB: Read redo log up to LSN=10125853282335
2022-06-28 16:18:23 0 [Note] InnoDB: Starting a batch to recover 88042 pages from redo log.
2022-06-28 16:19:06 0 [Note] InnoDB: Read redo log up to LSN=10126641181182
2022-06-28 16:19:06 0 [ERROR] InnoDB: Attempted to open a previously opened tablespace. Previous tablespace: ./db_mydb1/tbl_a_data4.ibd uses space ID: 283140. Cannot open filepath: db_mydb1/tbl_a_data4.ibd which uses the same space ID.
2022-06-28 16:19:06 0 [Warning] InnoDB: We do not continue the crash recovery, because the table may become corrupt if we cannot apply the log records in the InnoDB log to it. To fix the problem and start mariadbd:
2022-06-28 16:19:06 0 [Note] InnoDB: 1) If there is a permission problem in the file and mysqld cannot open the file, you should modify the permissions.
2022-06-28 16:19:06 0 [Note] InnoDB: 2) If the tablespace is not needed, or you can restore an older version from a backup, then you can remove the .ibd file, and use --innodb_force_recovery=1 to force startup without this file.
2022-06-28 16:19:06 0 [Note] InnoDB: 3) If the file system or the disk is broken, and you cannot remove the .ibd file, you can set --innodb_force_recovery.
[00] 2022-06-28 16:19:06 mariadb-backup: srv_start() returned 11 (Generic error).



 Comments   
Comment by Marko Mäkelä [ 2022-06-28 ]

nunop, a fundamentally new thing in 10.8 is MDEV-14425 and MDEV-27774. That is, multiple threads can write to the shared log_sys.buf concurrently, and each mini-transaction is a logical log ‘block’ on its own. There is also a mmap based interface that works on persistent memory (mount -o dax).

The format and the recovery of the log records was rewritten already in 10.5 by MDEV-12353 and related tickets. I think that this problem could be possible in older versions as well, but the probability could be much lower due to different timing characteristics in 10.8.

Could you try to construct a reproducible test case for this? For test data, you could use some large dump, say, Wikipedia or an OpenStreetMap planet extract. Recovery (or preparing a backup) with a smaller buffer pool (or --use-memory is supposed to work ever since MDEV-21351 was implemented in 10.5. There could be a glitch between small innodb_open_files and multi-batch recovery.

Comment by Nuno [ 2022-06-29 ]

Hi marko

I spent a lot of time on this, but wasn't able to reproduce this on a new installation, with the same data tables (the ones problematic at least) and same my.cnf, and same server specs.

Basically what I did:

  • New installation of mariadb in a CentOS 7
  • Replace /etc/my.cnf

$ mysql < db_mydb.sql
(basically this is a SQL dump of the database that contains a lot of data)
(I ran this twice)

$ mysql < db_mydb_swap.sql

(this does the following, for each of the 10+ tables)

CREATE TABLE `mytable_new` LIKE `mytable`;
 
INSERT INTO `mytable_new`
SELECT * FROM `mytable`;
 
RENAME TABLE `mytable` TO `mytable_old`, `mytable_new` TO `mytable`;
DROP TABLE `mytable_old`;

$ rm -rf /backup/mysql/

$ /usr/bin/mariabackup --backup --parallel=4 --target-dir=/backup/mysql/

(should see MANY "DDL tracking" and "log scanned up to", in a loop that seems to take "forever / repeating")

$ /usr/bin/mariabackup --prepare --use-memory=3G --target-dir=/backup/mysql/

I was hoping that prepare was going to fail for the same reason.

Instead, the output was this:

[00] 2022-06-29 00:07:07 cd to /backup/mysql/
[00] 2022-06-29 00:07:07 open files limit requested 0, set to 1048576
[00] 2022-06-29 00:07:07 Renaming ./db_mydb/tbl1_ipv6.new to ./db_mydb/tbl1_ipv6.ibd
 
[00] 2022-06-29 00:07:07 Renaming ./db_mydb/tbl2_ipv6.new to ./db_mydb/tbl2_ipv6.ibd
 
[00] 2022-06-29 00:07:07 Renaming ./db_mydb/tbl4_ips.new to ./db_mydb/tbl4_ips.ibd
 
[00] 2022-06-29 00:07:07 Renaming ./db_mydb/tbl3.new to ./db_mydb/tbl3.ibd
 
[00] 2022-06-29 00:07:07 Renaming ./db_mydb/tbl2_ipv4.new to ./db_mydb/tbl2_ipv4.ibd
 
[00] 2022-06-29 00:07:07 This target seems to be not prepared yet.
[00] 2022-06-29 00:07:07 mariabackup: using the following InnoDB configuration for recovery:
[00] 2022-06-29 00:07:07 innodb_data_home_dir = .
[00] 2022-06-29 00:07:07 innodb_data_file_path = ibdata1:12M:autoextend
[00] 2022-06-29 00:07:07 innodb_log_group_home_dir = .
[00] 2022-06-29 00:07:07 InnoDB: Using Linux native AIO
[00] 2022-06-29 00:07:07 Starting InnoDB instance for recovery.
[00] 2022-06-29 00:07:07 mariabackup: Using 3221225472 bytes for buffer pool (set by --use-memory parameter)
2022-06-29  0:07:07 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
2022-06-29  0:07:07 0 [Note] InnoDB: Number of transaction pools: 1
2022-06-29  0:07:07 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2022-06-29  0:07:07 0 [Note] InnoDB: Using Linux native AIO
2022-06-29  0:07:07 0 [Note] InnoDB: Initializing buffer pool, total size = 3.000GiB, chunk size = 3.000GiB
2022-06-29  0:07:07 0 [Note] InnoDB: Completed initialization of buffer pool
2022-06-29  0:07:07 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=6577439278
2022-06-29  0:07:08 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:08 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:08 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:08 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:08 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:08 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:10 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv4_new.ibd' with space ID 151. Another data file called db_mydb/tbl1_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:11 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv4_new.ibd' with space ID 151. Another data file called db_mydb/tbl1_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:12 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv4_new.ibd' with space ID 151. Another data file called db_mydb/tbl1_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:12 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv4_new.ibd' with space ID 151. Another data file called db_mydb/tbl1_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:12 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv4_new.ibd' with space ID 151. Another data file called db_mydb/tbl1_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:12 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv4_new.ibd' with space ID 151. Another data file called db_mydb/tbl1_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:12 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv4_new.ibd' with space ID 151. Another data file called db_mydb/tbl1_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:12 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:12 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:12 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:12 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:13 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:13 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:13 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:13 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:13 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:14 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:14 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:14 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:14 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:14 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:14 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:14 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:14 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:14 0 [Note] InnoDB: Starting a batch to recover 48964 pages from redo log.
2022-06-29  0:07:35 0 [Note] InnoDB: Read redo log up to LSN=7256753619
2022-06-29  0:07:35 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl1_ipv6_new.ibd' with space ID 152. Another data file called db_mydb/tbl1_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:35 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:35 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:35 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:35 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:35 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:35 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:36 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:36 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:36 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:36 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:37 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:37 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:37 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:37 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:37 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv4_new.ibd' with space ID 153. Another data file called db_mydb/tbl2_ipv4.ibd exists with the same space ID.
2022-06-29  0:07:37 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:37 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:37 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:37 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:38 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:38 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:38 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:38 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:07:38 0 [Note] InnoDB: Starting a batch to recover 58421 pages from redo log.
2022-06-29  0:08:09 0 [Note] InnoDB: Read redo log up to LSN=7900423702
2022-06-29  0:08:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:08:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:08:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:08:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:08:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:08:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:08:09 0 [Note] InnoDB: Ignoring data file './db_mydb/tbl2_ipv6_new.ibd' with space ID 154. Another data file called db_mydb/tbl2_ipv6.ibd exists with the same space ID.
2022-06-29  0:08:10 0 [Note] InnoDB: Starting final batch to recover 28181 pages from redo log.
[00] 2022-06-29 00:08:26 Last binlog file , position 0
[00] 2022-06-29 00:08:26 completed OK!

If it helps, these are some of the my.cnf configs I have:

performance_schema              = 0
innodb_stats_on_metadata        = 0
#log_bin (disabled)
sync_binlog                     = 0
innodb_page_size = 16384
table_open_cache                = 3072
open_files_limit                = 10000
innodb_file_per_table           = 1
innodb_open_files               = 6000
innodb_read_io_threads          = 64
innodb_write_io_threads         = 64
innodb_buffer_pool_size         = 80G
innodb_log_file_size            = 5G
innodb_doublewrite              = 0
innodb_flush_method             = O_DIRECT
innodb_flush_neighbors          = 0
innodb_io_capacity              = 1000
innodb_flush_log_at_trx_commit  = 2
innodb_lock_wait_timeout        = 15
optimizer_use_condition_selectivity = 1

I don't have any MyISAM tables.

Please let me know if anything else would be helpful here.
(or if you suggest some change)

Thank you!

Comment by Marko Mäkelä [ 2022-06-29 ]

nunop, thank you. Perhaps some internal writes, mainly from the purge of committed transaction history, could play a role here. On a fresh setup, you would not have them.

Our best bet might be that mleich will try to reproduce this with the Random Query Generator and much smaller parameters. Preparing a backup shares code with the crash recovery. I think that also recovery should be affected. My suggested recipe is as follows:

  1. Create lots of tables (or tables with lots of partitions).
  2. Kill the server while those files are being written to.
  3. Start crash recovery with a much smaller buffer pool and innodb_open_files=10.
Comment by Marko Mäkelä [ 2022-06-29 ]

Side note: Using innodb_doublewrite=0 might be safe with innodb_page_size=4k. With larger page sizes, back in 2015, Linux could truncate any write to a multiple of 4096 bytes. I do not know how atomic writes would be nowadays, and it could depend on the file system and the type of the block device.

While using innodb_doublewrite=0 may cause a failure of the database to recover from a crash, it should not affect data copied from a running server by mariadb-backup --backup. Backup never consults the doublewrite buffer; it will retry reads of failed pages. If you are using the (MDEV-19534 default) innodb_checksum_algorithm=full_crc32 format, the checksums should be reliable.

nunop, you might want to check if innodb_doublewrite=1 performs acceptably for you. As part of MDEV-23855, the doublewrite mechanism was refactored. Asynchronous writes in batches of up to 128 pages will be used, and they should clash less with fsync() or fdatasync(). Before that, the doublewrite buffer was written synchronously, up to 64 pages at a time.

Comment by Nuno [ 2022-06-29 ]

Thanks marko

Actually, since you mentioned innodb_page_size, mine is:

innodb_page_size = 16384
#innodb_checksum_algorithm -- not defined at all, so I'm using the default

Before making the innodb_doublewrite=1 change,
I prepared the backup I took after changing to innodb_flush_method = fsync as discussed in MDEV-28772, and it went fine. ("completed OK")
But again, I don't see all the table renames and stuff like that, most likely because the checkpoints were already made as part of the restart.

I've just changed now to innodb_doublewrite=1 and restarted again.

I'll see how things go tomorrow.

Comment by Marko Mäkelä [ 2022-06-30 ]

nunop, I had a quick look at the code. Do you use the --close-files parameter when preparing the backup? It might be that xb_fil_cur_close() is not using the correct interface. The mechanism for closing and opening files is different from the server; I know it because I just fixed MDEV-26293.

Would preparing the same backup (a copy of it, so that the previous failed attempt cannot ruin anything) succeed with a larger --use-memory?

Comment by Nuno [ 2022-06-30 ]

Hi marko
Thank you for looking into that.

No, this is my command:

$ /usr/bin/mariabackup --prepare --use-memory=3G --target-dir=/backup/mysql/

Now - I suspect that either these changes helped fixing the problem:

innodb_doublewrite              = 1
innodb_flush_method             = fsync

or the fact that in the past 2 times I had run the backup twice,
because the prepare hasn't failed since.

Tomorrow, I'm going to run the backup just 1 time (using --log-copy-interval=1 so I don't get the long lock), and I'll see if the prepare fails.
If it doesn't, then it means that one of the options above has accommodated this issue.
If it does, then it means I have to run the backup twice...

Thanks!

Comment by Matthias Leich [ 2022-06-30 ]

Scenario:
1. Start the source server and create some initial data
2. One session runs a DDL/DML mix
3. During 2. is ongoing run within a loop
    mariabackup --backup
    mariabackup --prepare
   ....
   and abort the complete test if something fails.
 
origin/10.8 5ab9265677b161f5ac61c12bc51486db3776a6ac 2022-06-29T19:44:29+02:00
mariabackup --prepare fails with
...
# 2022-06-30T17:45:36 [280784] | [rr 282123 18122]2022-06-30 17:43:48 0 [Note] InnoDB: To recover: 360 pages from log
# 2022-06-30T17:45:36 [280784] | [rr 282123 18525]2022-06-30 17:43:56 0 [ERROR] InnoDB: Attempted to open a previously opened tablespace. Previous tablespace: ./test/t7.ibd uses space ID: 16. Cannot open filepath: test/t7.ibd which uses the same space ID.
# 2022-06-30T17:45:36 [280784] | [rr 282123 18529]2022-06-30 17:43:56 0 [Warning] InnoDB: We do not continue the crash recovery, 
...
 
RQG
===
# git clone https://github.com/mleich1/rqg --branch experimental RQG
#
# GIT_SHOW: HEAD -> experimental, origin/experimental a32979c6d789ddf0982e6a56af687565be00590c 2022-06-14T17:21:56+02:00
# rqg.pl  : Version 4.0.6 (2022-05)
#
# $RQG_HOME/rqg.pl \
# --grammar=conf/mariadb/innodb_compression_encryption.yy \
# --gendata=conf/mariadb/innodb_compression_encryption.zz \
# --max_gd_duration=1800 \
# --reporters=Mariabackup_linux \
# --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
# --mysqld=--loose-idle_write_transaction_timeout=0 \
# --mysqld=--loose-idle_transaction_timeout=0 \
# --mysqld=--loose-idle_readonly_transaction_timeout=0 \
# --mysqld=--connect_timeout=60 \
# --mysqld=--interactive_timeout=28800 \
# --mysqld=--slave_net_timeout=60 \
# --mysqld=--net_read_timeout=30 \
# --mysqld=--net_write_timeout=60 \
# --mysqld=--loose-table_lock_wait_timeout=50 \
# --mysqld=--wait_timeout=28800 \
# --mysqld=--lock-wait-timeout=86400 \
# --mysqld=--innodb-lock-wait-timeout=50 \
# --no-mask \
# --queries=10000000 \
# --seed=random \
# --reporters=Backtrace \
# --reporters=ErrorLog \
# --reporters=Deadlock1 \
# --validators=None \
# --mysqld=--log_output=none \
# --mysqld=--log_bin_trust_function_creators=1 \
# --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
# --engine=InnoDB \
# --restart_timeout=240 \
# --mysqld=--plugin-load-add=file_key_management.so \
# --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
# --mysqld=--plugin-load-add=provider_lzo.so \
# --mysqld=--plugin-load-add=provider_bzip2.so \
# --mysqld=--plugin-load-add=provider_lzma.so \
# --mysqld=--plugin-load-add=provider_snappy.so \
# --mysqld=--plugin-load-add=provider_lz4.so \
# --duration=300 \
# --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
# --mysqld=--loose-innodb_read_only_compressed=OFF \
# --mysqld=--loose-innodb-sync-debug \
# --mysqld=--innodb_stats_persistent=off \
# --mysqld=--innodb_adaptive_hash_index=off \
# --mysqld=--log-bin \
# --mysqld=--sync-binlog=1 \
# --mysqld=--loose-innodb_evict_tables_on_commit_debug=on \
# --mysqld=--loose-max-statement-time=30 \
# --threads=1 \
# --mysqld=--innodb-use-native-aio=0 \
# --mysqld=--loose-gdb \
# --mysqld=--loose-debug-gdb \
# --rr=Extended \
# --rr_options=--wait \
# --mysqld=--loose_innodb_change_buffering=none \
# --mysqld=--innodb_rollback_on_timeout=ON \
# --vardir_type=fast \
# --mysqld=--innodb_page_size=32K \
# --mysqld=--innodb-buffer-pool-size=24M \
# --no_mask \
# <local settings>
 
pluto:/data/results/1656606789/TBR-1038$ _RR_TRACE_DIR=./1/rr/ rr replay  # The source DB server till getting killed
pluto:/data/results/1656606789/TBR-1038$ _RR_TRACE_DIR=./1_clone/rr/ rr replay  # mariabackup --prepare
pluto:/data/results/1656606789/TBR-1038/1_clone/fbackup  # Filebackup of the backuped data after "--backup" and before "--prepare"

Comment by Nuno [ 2022-07-01 ]

Thank you mleich - looks like you were able to reproduce it!

marko - seems this is no longer happening to me. I've run my usual automated process for creating the backup and restore the snapshot, without running twice, so one of these 2 configs have definitely helped resolving this:

innodb_doublewrite              = 1
innodb_flush_method             = fsync

I'm also using --log-copy-interval=1 to backup.

Comment by Marko Mäkelä [ 2022-08-01 ]

In the failure that mleich reproduced, there are 2 copies of the file test/t7.ibd in the directory fbackup, corresponding to the server that was running and to the backup that was made. One of the files contains the tablespace ID 11 and the other the ID 16. The file with ID 16 was initially created by a table-rebuilding operation as test/#sql-alter-39c22-12.ibd:

ALTER TABLE `t7` ROW_FORMAT=DYNAMIC /* E_R Thread1 QNO 27 CON_ID 18 */;

In the failed --prepare run, the problem occurs when fil_ibd_load() is invoked on tablespace 11. At that point, we complain that test/t7.ibd is already known as tablespace 16.

There was a deferred recovery (MDEV-24626) of tablespace 16.

A possible cause is that we fail to track renamed files correctly. This might be a regression in 10.6 already, due to MDEV-24626.

Comment by Marko Mäkelä [ 2022-08-01 ]

Preparing data.tar.xz takes a very long time (MDEV-25909?) and fails differently for me when my attempted fix to fil_name_process() and its caller recv_sys_t::parse() is present:

10.8 f79cebb4d02a7b5151ac617bc762c3e094436562 with an attempted fix

2022-08-01 13:05:23 0 [Note] InnoDB: Starting a batch to recover 469 pages from redo log.
2022-08-01 13:05:32 0 [ERROR] InnoDB: Cannot apply log to [page id: space=14, page number=0] of corrupted file './test/t5.ibd'

If I revert the fix, it fails like this:

10.8 f79cebb4d02a7b5151ac617bc762c3e094436562

2022-08-01 13:07:27 0 [Note] InnoDB: Starting a batch to recover 496 pages from redo log.
2022-08-01 13:07:36 0 [ERROR] InnoDB: Cannot apply log to [page id: space=14, page number=0] of corrupted file './test/t5.ibd'
2022-08-01 13:07:36 0 [ERROR] InnoDB: Cannot apply log to [page id: space=15, page number=0] of corrupted file './test/t3.ibd'
[00] 2022-08-01 13:07:36 mariadb-backup: srv_start() returned 11 (Generic error).

Curiously, t7.ibd was not even mentioned in the output.

The file name test/#sql-alter-39c22-12.ibd should only occur in ib_logfile0, because mariadb-backup --backup will avoid copying files with such names.

Comment by Marko Mäkelä [ 2022-08-01 ]

The long execution time is only an anomaly of a debug build. With CMAKE_BUILD_TYPE=RelWithDebInfo, the bottleneck is crc32_3way, seemingly for validating PAGE_CHECKSUM records (MDEV-18976) that are only written by debug builds. The mariadb-backup --prepare would finish in significantly less than a second.

Comment by Marko Mäkelä [ 2022-08-01 ]

My attempt to prepare the backup failed, because the encryption plugin could not be loaded. fil_space_crypt_t::is_key_found() would not hold in deferred_spaces.create(). I must correct that configuration issue before I can test my fix.

Comment by Nuno [ 2022-08-01 ]

Thank you for your time on this, marko. Have a great day.

Comment by Marko Mäkelä [ 2022-08-01 ]

--- backup-my.cnf	2022-08-01 13:50:20.196789945 +0300
+++ backup-my.cnf	2022-06-30 20:41:51.000000000 +0300
@@ -10,10 +10,10 @@
 innodb_undo_tablespaces=0
 innodb_compression_level=6
 
-plugin_dir="/data/Server_bin/10.8F_asan/lib/plugin/"
+plugin_dir="/dev/shm/10.8/plugin/file_key_management/"
 plugin_load=file_key_management=file_key_management;provider_lzo=provider_lzo;provider_bzip2=provider_bzip2;provider_lzma=provider_lzma;provider_snappy=provider_snappy;provider_lz4=provider_lz4
 file_key_management_encryption_algorithm=aes_cbc
 file_key_management_filekey=
-file_key_management_filename=/data/RQG_neu/conf/mariadb/encryption_keys.txt
+file_key_management_filename=/dev/shm/encryption_keys.txt
 innodb_encrypt_log=0
 

This and encryption_keys.txt allows me to repeat the problem when the fix is not present:

10.8 f79cebb4d02a7b5151ac617bc762c3e094436562

2022-08-01 13:50:51 0 [Note] InnoDB: Starting a batch to recover 496 pages from redo log.
2022-08-01 13:50:51 0 [ERROR] InnoDB: Attempted to open a previously opened tablespace. Previous tablespace: ./test/t7.ibd uses space ID: 16. Cannot open filepath: test/t7.ibd which uses the same space ID.

If I try to re-prepare that directory without restoring and adjusting data.tar.xz as above, the recovery with the fixed code would fail like this:

10.8 f79cebb4d02a7b5151ac617bc762c3e094436562 RelWithDebInfo with fix

2022-08-01 13:51:23 0 [Note] InnoDB: Starting a batch to recover 470 pages from redo log.
2022-08-01 13:51:23 0 [ERROR] InnoDB: Cannot apply log to [page id: space=14, page number=0] of corrupted file './test/t5.ibd'

On a pristine backup directory, it succeeded:

10.8 f79cebb4d02a7b5151ac617bc762c3e094436562 RelWithDebInfo with fix

2022-08-01 13:53:16 0 [Note] InnoDB: Starting a batch to recover 469 pages from redo log.
2022-08-01 13:53:17 0 [Note] InnoDB: Starting final batch to recover 240 pages from redo log.
2022-08-01 13:53:17 0 [Note] InnoDB: Last binlog file './mysql-bin.000001', position 2851658
[00] 2022-08-01 13:53:18 Last binlog file ./mysql-bin.000001, position 2851658
[00] 2022-08-01 13:53:18 completed OK!

The same on a debug build of the same source code:

10.8 f79cebb4d02a7b5151ac617bc762c3e094436562 Debug with fix

2022-08-01 13:56:36 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=6286181
2022-08-01 13:56:36 0 [Note] InnoDB: Starting a batch to recover 469 pages from redo log.
2022-08-01 13:56:45 0 [Note] InnoDB: Starting final batch to recover 240 pages from redo log.
2022-08-01 13:56:46 0 [Note] InnoDB: Last binlog file './mysql-bin.000001', position 2851658
[00] 2022-08-01 13:56:47 Last binlog file ./mysql-bin.000001, position 2851658
[00] 2022-08-01 13:56:47 completed OK!

Comment by Nuno [ 2022-08-01 ]

Just putting here some logs in my side.

I thought this issue was "resolved"/accommodated for me, but seems it does keep failing, more or less randomly. (well, usually after the GeoIP or ip2location updates, which do a table swap)

Here are the recent events.
All the Prepare logs are from my Docker container on the Snapshot server.
The backup happens every day at 7:00, finishes at 7:01, and the rsync transfer to the Snapshot server takes ~40-50 mins.

----------------------------------------


2022-07-25 06:40 - ip2location update in Prod

2022-07-25  7:49:09 0 [Note] InnoDB: Starting final batch to recover 141441 pages from redo log.
[00] 2022-07-25 07:50:01 Last binlog file , position 0
[00] 2022-07-25 07:50:01 completed OK!


2022-07-26 06:30 - GeoIP update in Prod

2022-07-26  7:48:27 0 [Note] InnoDB: Starting final batch to recover 77145 pages from redo log.
[00] 2022-07-26 07:48:46 Last binlog file , position 0
[00] 2022-07-26 07:48:46 completed OK!

2022-07-27  7:49:24 0 [Note] InnoDB: Starting final batch to recover 210083 pages from redo log.
[00] 2022-07-27 07:50:56 Last binlog file , position 0
[00] 2022-07-27 07:50:57 completed OK!


2022-07-27 12:16 - RESTART in Production (due to too much swap - see MDEV-29097)

2022-07-28  7:48:36 0 [Note] InnoDB: To recover: 38532 pages from log
[00] 2022-07-28 07:49:14 Last binlog file , position 0
[00] 2022-07-28 07:49:14 completed OK!


2022-07-29 06:30 - GeoIP update in Prod

2022-07-29  7:53:52 0 [ERROR] InnoDB: Attempted to open a previously opened tablespace. Previous tablespace: ./geo/geoip2_city_ipv4.ibd uses space ID: 301523. Cannot open filepath: geo/geoip2_city_ipv4.ibd which uses the same space ID.
2022-07-29  7:53:52 0 [Warning] InnoDB: We do not continue the crash recovery, because the table may become corrupt if we cannot apply the log records in the InnoDB log to it. To fix the problem and start mariadbd:
2022-07-29  7:53:52 0 [Note] InnoDB: 1) If there is a permission problem in the file and mysqld cannot open the file, you should modify the permissions.
2022-07-29  7:53:52 0 [Note] InnoDB: 2) If the tablespace is not needed, or you can restore an older version from a backup, then you can remove the .ibd file, and use --innoforce_recovery=1 to force startup without this file.
2022-07-29  7:53:52 0 [Note] InnoDB: 3) If the file system or the disk is broken, and you cannot remove the .ibd file, you can set --innoforce_recovery.
[00] 2022-07-29 07:53:52 mariadb-backup: srv_start() returned 11 (Generic error).

2022-07-30  7:49:25 0 [Note] InnoDB: To recover: 50630 pages from log
[00] 2022-07-30 07:50:43 Last binlog file , position 0
[00] 2022-07-30 07:50:44 completed OK!


2022-07-30 12:27 - RESTART in Production - increased log file from 5GB to 24GB (only in Prod, not the Snapshot server config)

2022-07-31  7:51:16 0 [Note] InnoDB: To recover: 8907 pages from log
[00] 2022-07-31 07:53:32 Last binlog file , position 0
[00] 2022-07-31 07:53:33 completed OK!


2022-08-01 06:40 - ip2location update in Prod

2022-08-01  8:01:50 0 [ERROR] InnoDB: Attempted to open a previously opened tablespace. Previous tablespace: ./geo/ip2location_ipv4.ibd uses space ID: 303305. Cannot open filepath: geo/ip2location_ipv4.ibd which uses the same space ID.
2022-08-01  8:01:50 0 [Warning] InnoDB: We do not continue the crash recovery, because the table may become corrupt if we cannot apply the log records in the InnoDB log to it. To fix the problem and start mariadbd:
2022-08-01  8:01:50 0 [Note] InnoDB: 1) If there is a permission problem in the file and mysqld cannot open the file, you should modify the permissions.
2022-08-01  8:01:50 0 [Note] InnoDB: 2) If the tablespace is not needed, or you can restore an older version from a backup, then you can remove the .ibd file, and use --innoforce_recovery=1 to force startup without this file.
2022-08-01  8:01:50 0 [Note] InnoDB: 3) If the file system or the disk is broken, and you cannot remove the .ibd file, you can set --innoforce_recovery.
[00] 2022-08-01 08:01:50 mariadb-backup: srv_start() returned 11 (Generic error).

Comment by Marko Mäkelä [ 2022-08-01 ]

Theoretically, this might also affect the crash recovery of DDL operations. I would not be surprised if this only affects preparing a backup. After a crash, the state of the data directory should be more deterministic than after making a backup. The "end LSN" of recovery is the last durable log write before the server was killed. In backup (at least until MDEV-14992 changes it), it is a little fuzzier.

Comment by Marko Mäkelä [ 2022-08-23 ]

Unfortunately, my fix caused DDL recovery failures (MDEV-29282). thiru developed a better fix that would allow this backup to be restored while not causing such a regression.

Generated at Thu Feb 08 10:04:53 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.