[MDEV-15110] InnoDB crash on recovery Created: 2018-01-29 Updated: 2020-08-25 Resolved: 2018-03-13 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.1.22, 10.1.30 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Hartmut Holzgraefe | Assignee: | Marko Mäkelä |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Environment: | linux, backup taken by Zmanda |
| Issue Links: |
| Description |
|
Startup error log:
GDB backtrace:
|
| Comments |
| Comment by Elena Stepanova [ 2018-01-31 ] |
|
Might be related to |
| Comment by Marko Mäkelä [ 2018-02-02 ] |
|
I do not think that this is related to
This could be a transaction that is writing undo log to page 0:301 and then modifying two tables, in tablespaces 3 and 4. The table in tablespace 3 should be almost empty, because the data appears to have been in the root page. The table name is mysql.gtid_slave_pos.
Now, the problem is that the MLOG_COMP_PAGE_INSERT record (type 38) refers to the insert position 0x179 from the start of the page, but as you can see, the page is empty; it does not contain anything after the page supremum record at 0x70. The next-record pointer of the page infimum is pointing to the page supremum (the 16-bit big endian field at 0x61 contains the relative offset 0x0d).
I see only one set of ib_logfile* in the provided mysql.zip, so I do not think that the wrong redo logs could have been used for the data files.
Above, we see that the 64-bit field FIL_PAGE_LSN at offset 0x10 is 0x18a12f, or 1,614,127. It is much less than the redo log LSN of 685,398,305. These numbers are roughly the count of bytes written to the redo log. It is possible that the LSN 1,614,127 was written when the database was originally initialized. All the nonempty (first 4) pages of the file contain the same FIL_PAGE_LSN.
It seems to me that something must have gone wrong when creating the backup. It looks like a much too old version of the data files was copied or restored. It is also possible that a file system snapshot was used and that it did not work correctly.
This does not look like a bug to me, but rather incorrect usage. My wild guess is that someone tried to replace the gtid_slave_pos.ibd with an empty file from a newly initialized instance, to work around some problem. In that case, it would be a better idea to delete the file altogether, and let InnoDB skip any redo log for it. |
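The two checks described in the comment above (comparing FIL_PAGE_LSN with the redo log LSN, and testing whether the page infimum points straight at the supremum) can be sketched in Python. This is only an illustration, not a tool used in the ticket. The offsets 0x10, 0x61 and 0x70 come from the comment; the infimum record origin 0x63 is an assumption derived from 0x70 minus the relative offset 0x0d, the 16 KiB page size is the InnoDB default and is likewise assumed, and the helper names are hypothetical.

```python
import struct

# Offsets within an InnoDB index page in COMPACT row format.
FIL_PAGE_LSN = 0x10      # 64-bit big-endian LSN of the latest page change
INFIMUM_NEXT = 0x61      # 16-bit big-endian next-record pointer of the infimum
INFIMUM_ORIGIN = 0x63    # assumption: infimum origin, i.e. 0x70 - 0x0d
SUPREMUM_ORIGIN = 0x70   # page supremum record, as stated in the comment
PAGE_SIZE = 16384        # assumption: default innodb_page_size


def read_page(path, page_no, page_size=PAGE_SIZE):
    """Return the raw bytes of one page of a tablespace file."""
    with open(path, "rb") as f:
        f.seek(page_no * page_size)
        return f.read(page_size)


def page_lsn(page):
    """Decode FIL_PAGE_LSN from the page header."""
    return struct.unpack_from(">Q", page, FIL_PAGE_LSN)[0]


def infimum_points_to_supremum(page):
    """True if the record list is empty (infimum is followed by supremum)."""
    (rel,) = struct.unpack_from(">h", page, INFIMUM_NEXT)
    # The next-record pointer is relative to the origin of the current record.
    return (INFIMUM_ORIGIN + rel) & 0xFFFF == SUPREMUM_ORIGIN


def looks_stale(page, redo_lsn):
    """Heuristic: an empty page whose LSN is far behind the redo log."""
    return infimum_points_to_supremum(page) and page_lsn(page) < redo_lsn
```

For example, `looks_stale(read_page("gtid_slave_pos.ibd", 3), 685398305)` would apply the heuristic to page 3 of the file; the page number 3 is a placeholder, since the comment does not state which page number the root page has.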