[MDEV-11556] InnoDB redo log apply fails to adjust data file sizes
| Created: 2016-12-13 | Updated: 2022-11-22 | Resolved: 2016-12-30 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Backup, Storage Engine - InnoDB |
| Affects Version/s: | 10.1.20 |
| Fix Version/s: | 10.1.21, 10.2.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Andrii Nikitin (Inactive) | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Attachments: |
| Issue Links: |
| Sprint: | 10.2.4-5 |
| Description |
|
Preparing a partial backup (with the --tables option) sporadically fails with the following error:
InnoDB: Error: tablespace size stored in header is 4864 pages, but
Attached are:
An interesting observation is that all tables are backed up. This should eventually be filed as a separate bug if it is not fixed as part of this one. |
| Comments |
| Comment by Vladislav Vaintroub [ 2016-12-13 ] |
|
Could you also attach the compressed data directory and backup directory from the test? Thanks! |
| Comment by Marko Mäkelä [ 2016-12-15 ] |
|
I looked at this.
The immediate problem is that backup-my.cnf contains the following line, which is handled incorrectly:
This is interpreted as a hard limit of 768 pages, while we should notice that there is no :max:12M or similar appended.
Another problem is that the XtraDB startup is done in the wrong order. By the time we wrongly decide to report the size mismatch, we have already applied the redo log and started the rollback of incomplete transactions. |
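The distinction described above can be sketched as follows. This is only an illustration, not the server's parser: the names `parse_data_file_spec` and `header_size_is_acceptable` are hypothetical, the example input is the documented default value of innodb_data_file_path rather than the actual line from backup-my.cnf (which is not shown in this report), and a 16 KiB page size is assumed.

```python
import re
from dataclasses import dataclass
from typing import Optional

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text: str) -> int:
    """Convert a size with a unit suffix, such as '12M', to bytes."""
    match = re.fullmatch(r"(\d+)([KMG])", text.strip().upper())
    if match is None:
        raise ValueError(f"bad size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


@dataclass
class DataFileSpec:
    name: str
    initial_pages: int
    autoextend: bool
    max_pages: Optional[int]  # None means "no upper limit was given"


def parse_data_file_spec(spec: str, page_size: int = 16384) -> DataFileSpec:
    """Parse one 'name:size[:autoextend[:max:size]]' entry (hypothetical helper)."""
    parts = spec.split(":")
    if len(parts) not in (2, 3, 5):
        raise ValueError(f"bad data file spec: {spec!r}")
    initial_pages = parse_size(parts[1]) // page_size
    autoextend = len(parts) >= 3
    if autoextend and parts[2].lower() != "autoextend":
        raise ValueError(f"expected 'autoextend' in {spec!r}")
    max_pages = None
    if len(parts) == 5:
        if parts[3].lower() != "max":
            raise ValueError(f"expected 'max' in {spec!r}")
        max_pages = parse_size(parts[4]) // page_size
    return DataFileSpec(parts[0], initial_pages, autoextend, max_pages)


def header_size_is_acceptable(spec: DataFileSpec, header_pages: int) -> bool:
    """The configured size is a hard limit only if the file cannot grow."""
    if not spec.autoextend:
        return header_pages == spec.initial_pages
    if header_pages < spec.initial_pages:
        return False
    return spec.max_pages is None or header_pages <= spec.max_pages
```

With these assumptions, `ibdata1:12M:autoextend` parses to 768 initial pages with no maximum, so a header size of 4864 pages would be accepted, while `ibdata1:12M:autoextend:max:12M` would reject it.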
| Comment by Marko Mäkelä [ 2016-12-15 ] |
|
Actually, the reason why this does not fail for normal XtraDB or InnoDB crash recovery is that files cannot be extended during redo log apply. In xtrabackup, this is possible, and the problem is that open_or_create_data_files() reads the file sizes as they were before the files were extended. We should either extend the files before this point, by calling fil_extend_space_to_desired_size(), or we should relax the check. For the relaxation, I was thinking about the following:
However, this does not look correct to me. I think that we should first extend the files, then apply all redo log to those pages (at which point it becomes possible that data is actually written back to the files), and only after that start generating more redo log (start the rollback of incomplete transactions, etc.).
In problem.tar.gz the last nonzero page of ibdata1 is 523, and the last page number for which recv_add_to_hash_table() is invoked on space=0 is page_no=522. (It is a bit strange that ibdata1 was extended from 768 to 4864 pages in the first place, with so many empty pages at the end. Maybe there is still a bug in the InnoDB page allocation?)
I think that we need a test case with innodb_file_per_table=0 that would cause ibdata1 to be extended such that there are redo log records to be applied for the extended area. I fear that we may be losing redo log records in the current implementation, because by the time fil_extend_space_to_desired_size() is called, recv_apply_hashed_log_recs(TRUE) would already have applied all redo logs from recv_sys->addr_hash, including any for the ‘missing’ pages that would be created by fil_extend_space_to_desired_size().
We can find out the total size of the system tablespace already when scanning the redo log, like this:
I think that we should extend the last system tablespace file before actual redo log application starts (before recv_read_in_area() is called). |
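The ordering proposed above (scan the log, learn the new size, extend the file, and only then apply the log) can be modelled in a few lines. This is a toy model under stated assumptions, not the snippet from the original comment, which is not preserved here: redo records are simplified to hypothetical `(space, page_no, offset, value)` tuples that each write a 4-byte value, and the offset of the size field on page 0 is assumed to be FSP_HEADER_OFFSET + FSP_SIZE = 38 + 8.

```python
import os
import tempfile

# Assumed byte offset of the tablespace size field on page 0; illustrative only.
FSP_SIZE_OFFSET = 38 + 8


def size_from_redo(records, space_id):
    """Return the last size (in pages) written to page 0, or None if absent."""
    size = None
    for space, page_no, offset, value in records:
        if space == space_id and page_no == 0 and offset == FSP_SIZE_OFFSET:
            size = value  # a later record overrides an earlier one
    return size


def extend_file(path, pages, page_size):
    """Grow the file to the given number of pages; never shrink it."""
    wanted = pages * page_size
    if os.path.getsize(path) < wanted:
        with open(path, "r+b") as f:
            f.truncate(wanted)


def recover(path, records, space_id=0, page_size=16384):
    """Extend first, then apply every record; return the final size in pages."""
    size = size_from_redo(records, space_id)
    if size is not None:
        extend_file(path, size, page_size)
    file_size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for space, page_no, offset, value in records:
            if space != space_id:
                continue
            pos = page_no * page_size + offset
            if pos + 4 > file_size:
                raise IOError(f"record for page {page_no} is beyond end of file")
            f.seek(pos)
            f.write(value.to_bytes(4, "big"))
    return file_size // page_size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ibdata1")
        with open(path, "wb") as f:
            f.write(b"\0" * (8 * 1024))  # 8 pages of 1 KiB each
        log = [(0, 0, FSP_SIZE_OFFSET, 16), (0, 12, 100, 7)]
        print(recover(path, log, page_size=1024))
```

If the extension step were skipped, the record for page 12 would fall beyond the end of the 8-page file, which is the kind of lost update the comment is worried about.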
| Comment by Marko Mäkelä [ 2016-12-20 ] |
|
I got an idea for how to write a test case for this. It would also make the InnoDB/XtraDB crash recovery properly support file extension.
After this, skip the test if the redo log checkpoint is after the checkpoint that we recorded before the CREATE TABLE, or if the ibdata1 file starting from $page is not zero-initialized. If the preconditions are met, truncate the ibdata1 file at $page * innodb_page_size bytes and start the server to execute the following:
The DROP TABLE will access the B-tree root page, which should have been in the truncated portion of the ibdata1 file. Without any patch, InnoDB should crash in some way when trying to access a nonexistent page in the system tablespace file. |
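The file-level part of that procedure (check that the tail is zero-initialized, then cut the file at `$page * innodb_page_size` bytes) might look like the sketch below. The helper names are hypothetical, the checkpoint comparison is left out, and the actual test is a mysql-test case rather than Python.

```python
import os
import tempfile


def tail_is_zero(path, page, page_size, chunk=1 << 20):
    """Return True if every byte from page * page_size to end of file is zero."""
    with open(path, "rb") as f:
        f.seek(page * page_size)
        while True:
            block = f.read(chunk)
            if not block:
                return True
            if any(block):
                return False


def truncate_at_page(path, page, page_size):
    """Cut the file at page * page_size bytes if the tail is all zero.

    Returns False (meaning: skip the test) when the precondition fails.
    """
    if not tail_is_zero(path, page, page_size):
        return False
    with open(path, "r+b") as f:
        f.truncate(page * page_size)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ibdata1")
        with open(path, "wb") as f:
            f.write(b"\x01" * 4096 + b"\0" * 8192)  # one used page, two empty
        print(truncate_at_page(path, 1, 4096), os.path.getsize(path))
```

Refusing to truncate when the tail contains data keeps the test from discarding pages that were already flushed, for which no redo log would be replayed.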
| Comment by Marko Mäkelä [ 2016-12-20 ] |
|
Actually, the same test should also be applicable for innodb_file_per_table=1 when we truncate the file to 3 pages so that the clustered index root page (page number 3) will be missing. Something like this:
Then, check the LSN. If it is OK, truncate the ibd.ibd file. If the page number is also OK, truncate ibdata1. Finally, start up the server and continue:
If an unwanted log checkpoint occurred during the two CREATE TABLE statements (the checkpoint LSN in the redo log is too new), after the above restart and DROP TABLE, we will report that the test was skipped. |
| Comment by Marko Mäkelä [ 2016-12-23 ] |
|
We should truncate .ibd files to 0, 1, 2, or 3 pages, and also to some fractional number of pages (an incomplete page). |
| Comment by Marko Mäkelä [ 2016-12-27 ] |
|
I developed a mysql-test case that truncates the data files as suggested above. If we simply truncate the ibdata1 file which had already been enlarged to the desired size, we cannot expect the revised recovery to work, because there would not be any redo log record for updating the file size in the tablespace header. With single-table tablespaces, it is easier to construct the necessary scenario. Currently, startup would actually trigger |
| Comment by Marko Mäkelä [ 2016-12-28 ] |
|
Please see bb-10.1-mdev-11556. With this patch applied, bb-10.1-wlad-xtrabackup was able to recover from the problematic dataset (it did extend the system tablespace). |
| Comment by Jan Lindström (Inactive) [ 2016-12-29 ] |
|
Based on your comment "Currently, it is theoretically possible that some file pages are read from a tablespace file whose size has not been adjusted yet. The transaction system state is being restored concurrently with redo log apply. It could theoretically mean that the logic for extending the system tablespace or the undo log tablespaces is broken under some scenario, such as when undo log pages were written to the end of a file just after resizing. |
| Comment by Marko Mäkelä [ 2016-12-29 ] |
|
I am going to try a revised fix that would introduce a field fil_space_t::recv_size. This field would normally be 0, and it would be set by recv_parse_log_rec(). When this field is nonzero for a tablespace that fil_io() is about to access, fil_io() would extend the tablespace on the spot, before posting the read or write request. This mechanism should guarantee that each file is resized before any redo log is applied to it. |
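The idea of extending on the spot can be illustrated with a small model. Only the field name `recv_size` comes from the comment above; the `Tablespace` class, its methods, and the choice to reset the field to 0 after extending are assumptions made for this sketch and do not describe the actual patch.

```python
import os
import tempfile


class Tablespace:
    """Toy stand-in for a tablespace backed by a single file."""

    def __init__(self, path, page_size=16384):
        self.path = path
        self.page_size = page_size
        self.recv_size = 0  # size in pages seen in the log; 0 = nothing pending

    def size_in_pages(self):
        return os.path.getsize(self.path) // self.page_size

    def note_recovered_size(self, pages):
        """What log parsing would do on seeing a new tablespace size."""
        self.recv_size = pages

    def _extend_if_pending(self):
        # Called at the start of every I/O, before the request is issued.
        if self.recv_size:
            if self.size_in_pages() < self.recv_size:
                with open(self.path, "r+b") as f:
                    f.truncate(self.recv_size * self.page_size)
            self.recv_size = 0  # assumption: clear once handled

    def write_page(self, page_no, data):
        self._extend_if_pending()
        if len(data) != self.page_size:
            raise ValueError("data must be exactly one page")
        if page_no >= self.size_in_pages():
            raise IndexError(f"page {page_no} is beyond end of file")
        with open(self.path, "r+b") as f:
            f.seek(page_no * self.page_size)
            f.write(data)

    def read_page(self, page_no):
        self._extend_if_pending()
        if page_no >= self.size_in_pages():
            raise IndexError(f"page {page_no} is beyond end of file")
        with open(self.path, "rb") as f:
            f.seek(page_no * self.page_size)
            return f.read(self.page_size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ibdata1")
        with open(path, "wb") as f:
            f.write(b"\0" * (4 * 512))
        space = Tablespace(path, page_size=512)
        space.note_recovered_size(8)
        space.write_page(6, b"\x07" * 512)
        print(space.size_in_pages())
```

Because the check sits in the common I/O path, no caller can reach a page in the extended area before the file has grown, regardless of the order in which recovery touches the tablespaces.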
| Comment by Marko Mäkelä [ 2016-12-29 ] |
|
Please see the updated bb-10.1-mdev-11556 branch. The problem when extending the system tablespace with innodb_page_size=4k or 8k remains. It looks unrelated to my changes, but it must definitely be fixed before this patch is committed. |
| Comment by Marko Mäkelä [ 2016-12-29 ] |
|
The remaining problem was solved by removing some inappropriate rounding of the InnoDB system tablespace size:
|
| Comment by Jan Lindström (Inactive) [ 2016-12-29 ] |
|
OK to push. |
| Comment by Marko Mäkelä [ 2016-12-29 ] |
|
I repushed once more to bb-10.1-mdev-11556. There was a bug in my test file that caused it to fail when run against a non-debug server. |
| Comment by Marko Mäkelä [ 2016-12-30 ] |
|
I pushed this to 10.1 and am now working on the merge to 10.2. |
| Comment by Marko Mäkelä [ 2019-01-24 ] |
|
I did not merge this correctly to MariaDB 10.2. A follow-up adjustment will fix |