[MDEV-28349] Provide "crash safe" options for CHECK TABLE and ALTER TABLE ... CHECK PARTITION ... Created: 2022-04-19 Updated: 2023-07-11 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Data Definition - Alter Table, Storage Engine - InnoDB |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major |
| Reporter: | Valerii Kravchuk | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | check, corruption, innochecksum | ||
| Description |
|
We need a safe way to run table checks on InnoDB tables in production, with statements like CHECK TABLE or ALTER TABLE ... CHECK PARTITION, that will NOT cause any deliberate assertion failures. Something like the innodb_corrupt_table_action option that was deprecated and has been removed since 10.3 (the name may be different), applied to these statements (or to all access), with values like "assert" (current behaviour), "warn" (report the details of the corruption found and continue if possible, or stop and state that the table/partition is corrupted), etc. This should apply NOT only to page checksums, but to all other kinds of assertions we may hit in InnoDB in the process. |
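To make the requested "assert" versus "warn" semantics concrete, here is a minimal Python model of the two policies. It is only a sketch and not InnoDB code: `CorruptAction`, `CheckResult` and `check_pages` are hypothetical names invented for illustration, and a CRC-32 mismatch stands in for any consistency check that currently ends in an assertion failure.

```python
import zlib
from dataclasses import dataclass, field
from enum import Enum


class CorruptAction(Enum):
    """Hypothetical policy values, named after those suggested in the description."""
    ASSERT = "assert"  # current behaviour: abort on the first inconsistency
    WARN = "warn"      # requested behaviour: record the problem and keep going


@dataclass
class CheckResult:
    ok: bool = True
    warnings: list = field(default_factory=list)


def check_pages(pages, action):
    """Verify each (payload, stored_checksum) pair under the given policy.

    Only a checksum mismatch is modelled here; the request is for the same
    treatment of every other consistency check as well.
    """
    result = CheckResult()
    for page_no, (payload, stored) in enumerate(pages):
        if zlib.crc32(payload) == stored:
            continue
        message = f"page {page_no}: checksum mismatch"
        if action is CorruptAction.ASSERT:
            # Stand-in for a deliberate assertion failure that takes the server down.
            raise AssertionError(message)
        # "warn": mark the table as corrupted, but let the caller continue.
        result.ok = False
        result.warnings.append(message)
    return result


good = (b"row data", zlib.crc32(b"row data"))
bad = (b"row dat?", zlib.crc32(b"row data"))  # payload no longer matches its checksum
report = check_pages([good, bad, good], CorruptAction.WARN)
print(report.ok, report.warnings)  # False ['page 1: checksum mismatch']
```

Under the "warn" policy the check reports the damaged page and returns control to the caller, which is the property wanted for production use; under "assert" the same input raises at the first damaged page.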
| Comments |
| Comment by Marko Mäkelä [ 2022-04-19 ] |
|
I am afraid that crashes in CHECK TABLE due to change buffer corruption (see |
| Comment by Marko Mäkelä [ 2022-05-23 ] |
|
I believe that I implemented most of this in Perhaps mleich could take fault injection to the next level and play a "crazy DBA" who would attempt to back up a running database with rsync instead of proper mariadb-backup or file system snapshots. If the corrupted backup does not cause InnoDB to crash, we should have a winner. |
| Comment by Marko Mäkelä [ 2022-06-07 ] |
|
|
| Comment by Marko Mäkelä [ 2022-08-01 ] |
|
I am afraid that avoiding all crashes in CHECK TABLE requires avoiding crashes in all of InnoDB. CHECK TABLE shares a lot of code with normal multi-versioned (MVCC) repeatable read. valerii, the obvious sources of crashes (including many in CHECK TABLE) should be fixed in the upcoming 10.6.9 release. Can you (or anyone else) provide examples of remaining crashes on corrupted data? |
| Comment by Marko Mäkelä [ 2022-08-01 ] |
|
|
| Comment by Marko Mäkelä [ 2022-11-02 ] |
|
The CHECK TABLE record-counting code was rewritten in |
| Comment by Marko Mäkelä [ 2022-11-02 ] |
|
valerii, have any crashes been observed with MariaDB Server 10.6.9 or later? ( |
| Comment by Valerii Kravchuk [ 2022-11-07 ] |
|
Are we sure that there is a version (which one, 10.6.9?) where CHECK TABLE and ALTER TABLE ... CHECK PARTITION ... statements are entirely safe, in the sense that when the statement finds any corruption or problem, it reports it, maybe does something else, but lets the server (other threads) continue working? If so, the task can be closed IMHO. I doubt we are at this stage already, though. |
| Comment by Marko Mäkelä [ 2022-11-07 ] |
|
valerii, I agree that it is better to keep this ticket open for a few more months, to find practical examples where CHECK TABLE would crash. Just today, related to |
| Comment by Marko Mäkelä [ 2023-03-06 ] |
|
valerii, a few months have passed. Since CHECK TABLE shares quite some code with the rest of InnoDB even after |