[MDEV-20198] Instant ALTER TABLE is not crash-safe Created: 2019-07-29 Updated: 2022-03-29 Resolved: 2022-03-28 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data Definition - Alter Table |
| Affects Version/s: | 10.4.6, 10.5, 10.6, 10.7, 10.8 |
| Fix Version/s: | 10.6.8, 10.7.4, 10.8.3, 10.9.1 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Nicolas Reynis (Inactive) | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | affects-tests, recovery, regression-10.4, rr-profile-analyzed | ||
| Environment: |
Official docker image. Running on multiple hosts, docker for desktop (windows 10 pro) and rancherOS |
||
| Attachments: |
|
||||||||||||||||
| Issue Links: |
|
||||||||||||||||
| Description |
|
I'm using the official MariaDB mariadb:10.4.6-bionic image and I have a problem: my tables corrupt themselves. This had happened at least four times this month:
Everytime the symptom is that a table exists in the information schema but it can't be read or altered at all. Any operation leads to a 'Table does not exists in engine' error. The container logs contain this error at startup: [ERROR] InnoDB: Table sfdb.specialist contains unrecognizable instant ALTER metadata I've checked the server system variables with SHOW VARIABLES; And alter_algorithm has the value DEFAULT which means (if I understand the documentation correctly) that it should not use the INSTANT algorithm, only INPLACE and COPY. Trying to dig into the MariaDB codebase led me to this check: But I do not have enough knowledge of the inner workings of MariaDB to understand what is going on. What am I doing wrong? Should I set some flag or configuration to have a stable database in docker? PS: if this JIRA is not the correct place to ask this can you point me to a more suitable place to post my question ? |
| Comments |
| Comment by Anel Husakovic [ 2019-07-31 ] | ||||||||||||||||||||||||||||||||||||
|
Hi nreynisama, | ||||||||||||||||||||||||||||||||||||
| Comment by Nicolas Reynis (Inactive) [ 2019-08-01 ] | ||||||||||||||||||||||||||||||||||||
|
Sorry I don't have any test case, we have a hard time isolating the problem. It just happens sporadicaly for no apparent reason. I was hoping they may be some misconfiguration on my side or some best practice I failed to follow. | ||||||||||||||||||||||||||||||||||||
| Comment by Anel Husakovic [ 2019-08-02 ] | ||||||||||||||||||||||||||||||||||||
|
Hi nreynisama, | ||||||||||||||||||||||||||||||||||||
| Comment by Judah Wright [ 2019-09-06 ] | ||||||||||||||||||||||||||||||||||||
|
I am seeing the same issue with our database. We are using version 10.4.7 on both Ubuntu Bionic and Disco, as well as CentOS. On server restart there is maybe a 20-30% chance that one of two specific tables will get the error described above. It is always one of these two tables. This is reproducible locally, on a few co-workers machines, and on a few different cloud providers. It is not just with production data either, if we wipe and re-seed the database with test data the issue persists. I cannot seem to reproduce the problem independent of our specific schema, but I would be happy to share it privately if that might help. | ||||||||||||||||||||||||||||||||||||
| Comment by Juan Gabriel Covas [ 2019-09-06 ] | ||||||||||||||||||||||||||||||||||||
|
Hi, I'm affected by The only way I've found to run 10.4.7 making ALTERs that avoid the "instant" stuff is setting alter_algorithm=COPY (sadly), so DEFAULT is not safe (it will use Instant on some cases). I just want to point this in case this is a solution for Nicolas to achieve stability with his mariadb servers and this reported bug. Perhaps some of us can have time on trying to reproduce, but for me it's for sure a thing of doing ALTERs to those tables (with DEFAULT alter_algorithm) and get the metadata corrupted as mentioned in this bug report (and perhaps involving a mariadb instance restart, dunno...). | ||||||||||||||||||||||||||||||||||||
| Comment by Nicolas Reynis (Inactive) [ 2019-09-09 ] | ||||||||||||||||||||||||||||||||||||
|
Thanks Juan, I'll give it a shot. | ||||||||||||||||||||||||||||||||||||
| Comment by Christopher v D [ 2019-11-26 ] | ||||||||||||||||||||||||||||||||||||
|
So, we are having the exact same problem in our company with mariadb 10.4.10-1 currently only affecting Linux and MacOS environments. It seems to us, that putting a machine to sleep mode and waking it corrupts the InnoDB tablespace data in some way. Not only productive tables are being corrupted, but internal InnoDB tables as well. We haven't fiddled with any config settings other than logsizes and are using lower_case_table_names = 1. | ||||||||||||||||||||||||||||||||||||
| Comment by Nicolas Reynis (Inactive) [ 2019-11-26 ] | ||||||||||||||||||||||||||||||||||||
|
@Christopher: | ||||||||||||||||||||||||||||||||||||
| Comment by Christopher v D [ 2019-11-26 ] | ||||||||||||||||||||||||||||||||||||
|
@Nicolas | ||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-11-26 ] | ||||||||||||||||||||||||||||||||||||
|
I suspect that this ticket reports what was fixed in MariaDB 10.4.7 in jgcovas, you could prevent instantaneous operation and request a table rebuild with the FORCE keyword. That is supported even with ALGORITHM=INPLACE. Once the table is corrupted, I am afraid that it is very hard to fix it. Restoring from a SQL dump or a backup should work. | ||||||||||||||||||||||||||||||||||||
| Comment by Juan Gabriel Covas [ 2019-11-26 ] | ||||||||||||||||||||||||||||||||||||
|
Thanks Marko. I've tested If mariadb server is run using alter_algorithm=COPY, then it does NOT matter if you put ALGORITHM=INSTANT on the ALTER statement: the table is OK afterwards (no Instant bug there). If you restart the server with alter_algorithm=DEFAULT, then the ALTER gets the table metadata corrupted (Instant bug). This confuses me, but just to follow-up if this leads to something (shouldn't the ALTER honor the ALGORITHM keyword?). Setting the session alter_algorithm to INSTANT also works (makes the ALTER to corrupt the table metadata). | ||||||||||||||||||||||||||||||||||||
| Comment by Oren Bissick (Inactive) [ 2019-12-02 ] | ||||||||||||||||||||||||||||||||||||
|
Having the same issue it started when i updated from 10.4.8 to 10.4.10 i downgraded back to 10.4.8 but the issue still exist. It also seems to only be happening to table with a dash | ||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-01-12 ] | ||||||||||||||||||||||||||||||||||||
|
mleich reproduced this error in internal testing of something else. In that case, the server was killed during an instant reorder column (
At the point of the SIGKILL, in the durably written state of the page contains a metadata record stub for which the metadata BLOB had not been written out yet. Therefore, btr_cur_instant_init_low() would fail due to the following:
I think that the correct fix is that when the metadata BLOB pointer is all-zero, btr_cur_instant_init_low() will assume that the metadata record is in uncommitted state and pretend that it does not exist. In fact, at that point of time we should have access to the transaction metadata and could easily check if DB_TRX_ID points to an incomplete transaction. | ||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-01-12 ] | ||||||||||||||||||||||||||||||||||||
|
Here is the start of a fix.
The fix is incomplete, and the ut_a() assertion would fail for the data directory that I have. According to the rr replay of the killed server, an instant ADD COLUMN as well as some column reordering had already been executed on the table before the column-reordering ALTER TABLE started to execute. I think that we would have to fetch a previous version of the metadata record where the metadata is available. If a previous version is not found (the server was killed during the very first column-reordering ALTER TABLE), then !index->is_instant() should hold and we can return DB_SUCCESS. I seem to remember that dict_load_indexes() is using the READ UNCOMMITTED isolation level. If we have to refer to the preceding version of the metadata record, I think that we would also have to refer to the corresponding version of the data dictionary tables when loading the metadata. The correct fix might be that we return a special error code from btr_cur_instant_init_low() and then retry loading the data dictionary definition using the earlier version, and on a subsequent call to btr_cur_instant_init_low() pass some state that tells that the previous version of the metadata record needs to be used. For fault injection testing, I think that we must let the server hang at this point:
To test the scenario where the server is being during the very first | ||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-01-12 ] | ||||||||||||||||||||||||||||||||||||
|
Something like this should cover the scenario that the server is killed during first-time
This test did not deliver the expected result for me: the table was recovered as (b,a)=(NULL,1) instead of (a)=(1). | ||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-03-28 ] | ||||||||||||||||||||||||||||||||||||
|
This was fixed by Before 10.6, DDL operations are not crash-safe, not even if we ignore the mismatch between the two data dictionaries .frm files and what is stored inside InnoDB. The .frm file mismatch mostly affected ALTER TABLE and was fixed in |