Actually, the reason this does not fail during normal XtraDB or InnoDB crash recovery is that files cannot be extended during redo log apply. In xtrabackup they can be, and the problem is that open_or_create_data_files() reads the file sizes as they were before the files were extended.
We should either extend the files before this point, by calling fil_extend_space_to_desired_size(), or we should relax the check. For the relaxation, I was thinking about the following:
diff --git a/storage/xtradb/srv/srv0start.cc b/storage/xtradb/srv/srv0start.cc
index b65366a4863..a6d38263ffd 100644
--- a/storage/xtradb/srv/srv0start.cc
+++ b/storage/xtradb/srv/srv0start.cc
@@ -2958,7 +2958,16 @@ innobase_start_or_create_for_mysql(void)
 
 	if (!srv_read_only_mode
 	    && srv_auto_extend_last_data_file
-	    && sum_of_data_file_sizes < tablespace_size_in_header) {
+	    && sum_of_data_file_sizes < tablespace_size_in_header
+	    /* In xtrabackup, open_or_create_data_files() initialized the
+	    srv_data_file_sizes[] before the redo log was applied.
+	    Because data files can be extended during xtrabackup
+	    redo log apply, we must relax the check. The files will
+	    be extended later in xtrabackup_prepare_func(), which
+	    invokes fil_extend_space_to_desired_size(). */
+	    && (!IS_XTRABACKUP()
+		|| (srv_last_file_size_max != 0/* :max: was specified */
+		    && srv_last_file_size_max < tablespace_size_in_header))) {
 
 		ut_print_timestamp(stderr);
 		fprintf(stderr,
|
However, this does not look correct to me. I think that we should first extend the files, then apply all of the redo log to those pages (at which point data may actually be written back to the files), and only after that start generating new redo log (the rollback of incomplete transactions, etc.).
In problem.tar.gz the last nonzero page of ibdata1 is 523, and the last page number for which recv_add_to_hash_table() is invoked on space=0 is page_no=522. (It is a bit strange that ibdata1 was extended from 768 to 4864 pages in the first place, with so many empty pages at the end. Maybe still a bug in the InnoDB page allocation?)
I think that we need a test case with innodb_file_per_table=0 that would cause ibdata1 to be extended such that there are redo log records to be applied for the extended area.
I fear that we may be losing redo log records in the current implementation, because by the time fil_extend_space_to_desired_size() is called, recv_apply_hashed_log_recs(TRUE) would already have applied all redo log records from recv_sys->addr_hash, including any for the ‘missing’ pages that would only be created by fil_extend_space_to_desired_size().
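To make the ordering concern concrete, here is a toy model in plain C. All names in it (apply_all_records(), changes_surviving(), the page arrays) are hypothetical; it only imitates recv_apply_hashed_log_recs() and fil_extend_space_to_desired_size(), using the 768→4864 page sizes from problem.tar.gz:

```c
#include <assert.h>
#include <string.h>

/* Toy model: the file is an array of pages; a redo record for a page
   beyond the current end of file cannot be applied and is discarded,
   imitating the feared behaviour when recv_apply_hashed_log_recs()
   runs before fil_extend_space_to_desired_size(). */

enum { OLD_PAGES = 768, NEW_PAGES = 4864 };

static int page_has_change[NEW_PAGES];	/* the "file" contents */
static int record_for_page[NEW_PAGES];	/* stand-in for recv_sys->addr_hash */

/* Imitation of recv_apply_hashed_log_recs(): every record is consumed,
   but a change only lands if the page already exists in the file. */
static void apply_all_records(int pages_in_file)
{
	for (int page_no = 0; page_no < NEW_PAGES; page_no++) {
		if (!record_for_page[page_no])
			continue;
		record_for_page[page_no] = 0;	/* consumed either way */
		if (page_no < pages_in_file)
			page_has_change[page_no] = 1;
	}
}

/* Returns how many redo changes survive for a given ordering. */
static int changes_surviving(int extend_first)
{
	memset(page_has_change, 0, sizeof page_has_change);
	memset(record_for_page, 0, sizeof record_for_page);
	record_for_page[522] = 1;	/* inside the old 768-page file */
	record_for_page[800] = 1;	/* only exists after the extension */

	if (extend_first)
		apply_all_records(NEW_PAGES);	/* extend, then apply */
	else
		apply_all_records(OLD_PAGES);	/* apply; extension is too late */

	int n = 0;
	for (int page_no = 0; page_no < NEW_PAGES; page_no++)
		n += page_has_change[page_no];
	return n;
}
```

In this model, changes_surviving(1) keeps both changes, while changes_surviving(0) silently loses the one for page 800: the record was consumed while the page did not yet exist, and extending the file afterwards only produces zero-filled pages.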
We can find out the total size of the system tablespace already when scanning the redo log, like this:
diff --git a/storage/xtradb/log/log0recv.cc b/storage/xtradb/log/log0recv.cc
index 10dbfdfae6b..cdf166fbdbc 100644
--- a/storage/xtradb/log/log0recv.cc
+++ b/storage/xtradb/log/log0recv.cc
@@ -2253,6 +2253,7 @@ recv_parse_log_rec(
 	}
 #endif /* UNIV_LOG_LSN_DEBUG */
 
+	byte*	old_ptr = new_ptr;
 	new_ptr = recv_parse_or_apply_log_rec_body(*type, new_ptr, end_ptr,
 						   NULL, NULL, *space);
 	if (UNIV_UNLIKELY(new_ptr == NULL)) {
@@ -2260,6 +2261,13 @@ recv_parse_log_rec(
 		return(0);
 	}
 
+	if (*space == 0 && *page_no == 0 && *type == MLOG_4BYTES
+	    && mach_read_from_2(old_ptr) == FSP_HEADER_OFFSET + FSP_SIZE) {
+		ulint	size;
+		mach_parse_compressed(old_ptr + 2, end_ptr, &size);
+		fprintf(stderr, "size: %lu\n", (ulong) size);
+	}
+
 	if (*page_no > recv_max_parsed_page_no) {
 		recv_max_parsed_page_no = *page_no;
 	}
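For reference, the compressed integer that mach_parse_compressed() reads after the 2-byte page offset uses InnoDB's usual variable-length encoding (1 to 5 bytes, selected by the high bits of the first byte, cf. mach_read_compressed() in mach0data.ic). A standalone sketch of the decoder — read_compressed_sketch() is a hypothetical name, not the InnoDB function, and it omits the end_ptr bounds checking that the real parser does:

```c
#include <stddef.h>

/* Decode InnoDB's compressed ulint format:
 *   first byte < 0x80 : 1 byte,  value = b[0]
 *   first byte < 0xC0 : 2 bytes, value = big-endian 16 bits & 0x7FFF
 *   first byte < 0xE0 : 3 bytes, value = big-endian 24 bits & 0x3FFFFF
 *   first byte < 0xF0 : 4 bytes, value = big-endian 32 bits & 0x1FFFFFFF
 *   first byte = 0xF0 : 5 bytes, value = following big-endian 32 bits
 * Returns the decoded value; stores the encoded length in *len. */
static unsigned long read_compressed_sketch(const unsigned char *b, size_t *len)
{
	if (b[0] < 0x80) {
		*len = 1;
		return b[0];
	} else if (b[0] < 0xC0) {
		*len = 2;
		return (((unsigned long) b[0] << 8) | b[1]) & 0x7FFF;
	} else if (b[0] < 0xE0) {
		*len = 3;
		return (((unsigned long) b[0] << 16)
			| ((unsigned long) b[1] << 8) | b[2]) & 0x3FFFFF;
	} else if (b[0] < 0xF0) {
		*len = 4;
		return (((unsigned long) b[0] << 24) | ((unsigned long) b[1] << 16)
			| ((unsigned long) b[2] << 8) | b[3]) & 0x1FFFFFFF;
	} else {
		*len = 5;
		return ((unsigned long) b[1] << 24) | ((unsigned long) b[2] << 16)
			| ((unsigned long) b[3] << 8) | b[4];
	}
}
```

For example, an FSP_SIZE of 4864 pages (0x1300) fits the 2-byte form: it is written as 0x1300 | 0x8000 = bytes 0x93 0x00, and the decoder masks the tag bit back off.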
I think that we should extend the last system tablespace file before actual redo log application starts (that is, before recv_read_in_area() is called).
Could you also attach the compressed data directory and backup directory from the test? Thanks!