[MDEV-19783] Random crashes and corrupt data in INSTANT-added columns Created: 2019-06-17 Updated: 2022-06-09 Resolved: 2019-11-15 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, Server, Storage Engine - InnoDB |
| Affects Version/s: | 10.3.13, 10.3.14, 10.3.15 |
| Fix Version/s: | 10.3.17, 10.4.7 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Bernard Grymonpon | Assignee: | Marko Mäkelä |
| Resolution: | Duplicate | Votes: | 1 |
| Labels: | corruption, instant | ||
| Environment: |
Centos 7 |
||
| Attachments: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
This will be a bit vague bug report, but I'll file it anyhow, as I think it relates to a couple of other reports and might give additional insight. We run a 5 node galera cluster, 1.1TB of data, quite heavy read/write during usage and high parallellism. This runs on Centos7. The core of the issue are random crashes, giving either terminations on signal 6 or signal 11. The signal 6 ones are always intentionally crashed by innodb as it detects some sort of corruption, the signal 11s mostly come after it reports on a corrupted index, and then crashes without info a couple of seconds later. In retrospect, we believe it started while we were at version 10.3.14. The first manifestation was with this stacktrace on one of our nodes:
Server restarted and continued happily after that. We didn't find a cause, and blamed it as a freak accident. However, after a while, we started seeing issues on our backups. We take the backups with mariabackup, and part of the backup process is to actually restore the full backup again and verify that it load and runs fine on a test instance. Part of the verification is running a mysqlcheck. We had a couple of tables reporting that they were corrupt, but an optimize fixed this (secondary indexes being corrupt). We blamed this on running an older version of mariabackup on the recovery node to apply the logs, but think this might be unrelated. Things got worse last week. We had a crash on 3 of our nodes, exactly at the same time, with exactly the same error on all three nodes (this is probably galera pushing a change to the same table on all nodes):
We recovered from this issue, and further investigation gave some tables which had the index corruption (redacted some names):
We recovered eventually from this by mysqldump-ing the data, dropping the table and reload the content. The dumping needed in one case the startup of the server with force_innodb_recovery set to 6. There is one symptom which made us raise some eyebrows, and maybe this might help in debugging. The tables which had corruption had a field appended (alter table add column varchar...), using the new "instant" method (https://mariadb.com/kb/en/library/instant-add-column-for-innodb/). When mysqldumping the content from the corrupted tables, this field was actually filled with corrupted data (while it should have been NULL, we added it but didn't use/fill it yet). The data in the field looked like binary content of the table itself, as we could recognise parts of the other varchar columns in there. All other fields seemed correct. My apologies if this is vague. We have the binary datafiles still on disk in our backups, so if needed we can "restore" the corrupted situation. We thought our case has similarities to https://jira.mariadb.org/browse/MDEV-19666 and https://jira.mariadb.org/browse/MDEV-18223. |
| Comments |
| Comment by Marko Mäkelä [ 2019-06-18 ] | |||||||||||||||||||||||||||
|
wonko, can you please provide the table schema? Preferrably, CREATE TABLE statement, then ALTER TABLE to add the columns. You can obfuscate the table and column names, but it has to be valid SQL. That would greatly help us reproduce the problem. Thanks! | |||||||||||||||||||||||||||
| Comment by Bernard Grymonpon [ 2019-06-18 ] | |||||||||||||||||||||||||||
|
I only have the current create statement (`show create table`), I don't have the exact statement used to add the column which was added through the instant-add method. But we didn't do anything special, just an "alter table add column ... ". So I would say the original table was what is below, without the instant-added-field, and the instant-added-field is added later.
| |||||||||||||||||||||||||||
| Comment by Bernard Grymonpon [ 2019-06-18 ] | |||||||||||||||||||||||||||
|
Btw, if it might help debugging the case, I can provide that frm/ibd with the corruption, but I can't just post it here (there is customer data in there). | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-08-21 ] | |||||||||||||||||||||||||||
|
wonko, I am very sorry for the delay, due other assignments, and holidays. The .ibd file would help a lot, if you still have it. You can upload it to ftp://ftp.mariadb.com/uploads and let me know the file name. This case looks very similar to | |||||||||||||||||||||||||||
| Comment by Bernard Grymonpon [ 2019-08-22 ] | |||||||||||||||||||||||||||
|
Uploaded the two ibd/frm files which caused the issues. Filename is You can clearly see the problem when you do a simple select (with a limit 10), and you'll notice the uuid field has data which actually maps to the other fields. There is customer data in these files, please discard of them when they are no longer needed. | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-08-23 ] | |||||||||||||||||||||||||||
|
wonko, thank you. It looks like for this table, something has wrongly reset the FIL_PAGE_TYPE of the root page from FIL_PAGE_TYPE_INSTANT to FIL_PAGE_INDEX. This corruption would not take effect until the table definition is being re-loaded (after server restart, or after the table had been evicted from the InnoDB data dictionary cache). This is a very nasty bug that should affect all usage of instant ADD COLUMN (in 10.4, also instant DROP COLUMN and changing the order of columns). To work around this, the tables should be rebuilt to the canonical format:
For both .ibd files that are contained in the archive For the second file (with 3-digit number in the directory name), we likewise have FIL_PAGE_TYPE=FIL_PAGE_INDEX and PAGE_LEVEL=1 for the clustered index root page., and again the leftmost leaf page number is 8, and again we have the hidden metadata record at the logical start of that page. | |||||||||||||||||||||||||||
| Comment by Bernard Grymonpon [ 2019-08-23 ] | |||||||||||||||||||||||||||
|
We fixed this by disallowing the instant-add logic in our framework which keeps our schemas in a good shape; and did an optimize on all tables. So we no longer have this corruption. | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-08-23 ] | |||||||||||||||||||||||||||
|
wonko, there must be a glitch somewhere in the InnoDB B-tree logic that allows this to happen. It should not be in the root page split logic, because in both your tables, it looked like the instant ADD COLUMN was executed after the root page had already been split. The root page could have been reorganized, but on reorganize we should preserve the FIL_PAGE_TYPE. The next step is that I will add debug instrumentation that allows our Random Query Generator (RQG) based tests to evict tables from the InnoDB table more frequently, or at each transaction commit. With that, it should be possible to repeat the bug using RQG, and then apply the grammar simplifier to end up with a simple test case that I can use for repeating the bug. Until we have found and fixed this bug, it is not safe to use the instant ADD COLUMN feature. | |||||||||||||||||||||||||||
| Comment by Bernard Grymonpon [ 2019-08-23 ] | |||||||||||||||||||||||||||
|
Thanks for the confirmation. Is there a general flag in the config we can set to disable all the `INSTANT` stuff at the server leve (or change the default behaviour)l? | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-08-23 ] | |||||||||||||||||||||||||||
|
wonko, there is only the crude old_alter_table=1 or alter_algorithm=copy, which will cause even the simplest ALTER TABLE to use ALGORITHM=COPY (copy the data record by record), like it was before the InnoDB Plugin in MySQL 5.1 introduced "fast index creation". You might not want that. elenst observed that | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-08-23 ] | |||||||||||||||||||||||||||
|
mleich, please test 10.3 with innodb_evict_tables_on_commit_debug.patch
A single client should be sufficient, and possibly necessary. With a single client connection, --innodb-evict-tables-on-commit-debug should have much better chances to evict table definitions during transaction commit. | |||||||||||||||||||||||||||
| Comment by Matthias Leich [ 2019-08-23 ] | |||||||||||||||||||||||||||
|
ok | |||||||||||||||||||||||||||
| Comment by Pavel Znamensky [ 2019-09-12 ] | |||||||||||||||||||||||||||
|
Hi, | |||||||||||||||||||||||||||
| Comment by Geoff Montee (Inactive) [ 2019-09-12 ] | |||||||||||||||||||||||||||
|
Hi kompastver,
We do have some documentation here: https://mariadb.com/kb/en/library/innodb-online-ddl-operations-with-algorithminstant/#known-bugs I believe the only configuration option that could transparently prevent this is to set the following in a configuration file:
Setting this to COPY would likely cause terrible performance for most users. Setting this to INPLACE wouldn't be a sufficient workaround, because InnoDB can still use the INSTANT algorithm if the algorithm is set to any value other than COPY. This is due to the hierarchical nature of MariaDB's alter algorithms. See here:
https://mariadb.com/kb/en/library/innodb-online-ddl-overview/#specifying-an-alter-algorithm
If you set the algorithm to INPLACE, then you would also have to add the FORCE option. e.g.:
| |||||||||||||||||||||||||||
| Comment by Pavel Znamensky [ 2019-09-13 ] | |||||||||||||||||||||||||||
|
Geoff, thanks for explanations! | |||||||||||||||||||||||||||
| Comment by Geoff Montee (Inactive) [ 2019-09-13 ] | |||||||||||||||||||||||||||
|
Hi kompastver, No problem! I know that marko was considering whether a better workaround is needed to avoid known bugs like this. I created We haven't been able to reproduce this issue yet, so if you happen to find a way to reproduce it, please let us know. Luckily, the bug seems relatively rare. | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-09-25 ] | |||||||||||||||||||||||||||
|
We should ensure that CHECK TABLE will report corruption if:
Note: REC_INFO_MIN_REC_FLAG did not occur in any leaf-page records until It would also help if normal SQL operations noticed these forms of corruption.
were not successful yet. I think that in order to be able to reproduce this bug, we must try to evict and reload table definitions as often as possible. If the table remains in the InnoDB dict_sys cache, any changes to the FIL_PAGE_TYPE of the clustered index root page should not matter at all. | |||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2019-09-30 ] | |||||||||||||||||||||||||||
|
Please review CHECK TABLE improvements. | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-10-08 ] | |||||||||||||||||||||||||||
|
I pushed the CHECK TABLE improvements and improved checks to a staging tree of 10.2 after testing it WITH_INNODB_EXTRA_DEBUG enabled. I don’t think that it hurts to have better checks in 10.2 already. | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-11-12 ] | |||||||||||||||||||||||||||
|
I believe that I may have found an explanation why mleich failed to repeat this corruption with recent 10.3 or 10.4. | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-11-15 ] | |||||||||||||||||||||||||||
|
I think that this report duplicates |