[MDEV-32103] InnoDB ALTER TABLE is not crash safe Created: 2023-09-05 Updated: 2023-10-03 Resolved: 2023-09-11 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data Definition - Alter Table, Storage Engine - InnoDB |
| Affects Version/s: | N/A |
| Fix Version/s: | 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Matthias Leich | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | corruption, recovery, regression, rr-profile-analyzed | ||
| Issue Links: |
|
||||||||||||||||||||
| Description |
|
|
| Comments |
| Comment by Marko Mäkelä [ 2023-09-06 ] | |||||||||||||||||||||||||||
|
The first server instance after bootstrap executed an ALTER TABLE slightly before it was killed:
It looks like the last persistent write (update of log_sys.flushed_to_disk_lsn) was some time earlier in this thread:
The later call to ha_innobase::commit_inplace_alter_table() returned false (indicating success), while log_sys.lsn=19710260 and log_sys.flushed_to_disk_lsn=19691339 (at the value of the above write). This is definitely wrong. These values remained until the SIGKILL. Because there was no call to innodb_check_version on the subsequent server startup, I suppose that the SQL layer had replaced the .frm file and cleared the ddl_recovery.log. That subsequent recovery reported the following:
I believe that this is a regression due to | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-09-06 ] | |||||||||||||||||||||||||||
|
The atomic DDL commit that fails to be persisted is here:
| |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-09-06 ] | |||||||||||||||||||||||||||
|
The reason why the DDL commit was not written is related to
| |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-09-07 ] | |||||||||||||||||||||||||||
|
It looks like this must have been broken for a longer time. wlad confirmed to me that we would either need a call to thd->async_state.wait_for_pending_ops(); after the DDL operation returns from InnoDB, or a synchronous call of log_write_up_to() (without a completion callback) is needed, to ensure the correct durability on DDL. | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-09-07 ] | |||||||||||||||||||||||||||
|
After all, this does look like a regression due to this preparatory change of In fact, mleich did catch this type of failure while testing
We can make use of the field trx_t::flush_log_later in all DDL operations. The only DDL operation that does not invoke trx_t::commit(std::vector<pfs_os_file_t> &) is ha_innobase::rename_table(). |