[MDEV-24287] mysqlcheck unexpectedly causes table to be marked as corrupt
Created: 2020-11-26  Updated: 2023-03-15  Resolved: 2021-06-05

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Affects Version/s: 10.4.13
Fix Version/s: N/A

Type: Bug
Priority: Minor
Reporter: Richard
Assignee: Sergei Golubchik
Resolution: Not a Bug
Votes: 0
Labels: None
Environment: Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-52-generic x86_64)

Attachments: image-2020-11-26-10-43-29-157.png, image-2020-11-26-10-48-33-725.png, image-2020-11-26-10-49-30-829.png, image-2020-11-26-10-51-15-103.png
Issue Links:
Relates
relates to MDEV-13542 Crashing on a corrupted page is unhel... Closed

 Description   

Summary

  • mysqlcheck found some tables with defective secondary indexes
  • The tables were marked as corrupt by mysqlcheck and could not be accessed
  • This resulted in a one-hour production outage on a public-facing service

It is not clear from the documentation (https://mariadb.com/kb/en/mysqlcheck/) that this is intended behaviour. The implication is that a check is just a check and takes no action.

Expectation: A check should flag the problem but take no action OR the defective secondary indexes should be marked as unusable. Either way the table should remain accessible pending a planned response.

Detail

Server version: 10.4.13-MariaDB-1:10.4.13+maria~focal-log mariadb.org binary distribution

All tables are InnoDB.

This is a replication topology with one master and two slaves.

Initially both slaves showed the same replication errors:

2020-11-24 10:01:44 44 [ERROR] InnoDB: Record in index `idlocation_deleted_packetgw1_senddate` of table `xxx`.`yyy` was not found on update: TUPLE (info_bits=0, 5 fields):

{[2] p(0x0670),[1] (0x01),NULL,[5] (0x99A7F0A000),[4] @ (0x1340A38C)}

at: COMPACT RECORD(info_bits=0, 5 fields):

{[2] o(0x066F),[1] (0x01),[1]2(0x32),[5] (0x99A7F08000),[4] i f(0x1369C166)}

2020-11-24 10:01:44 44 [ERROR] InnoDB: Record in index `idloc_del_sentviagw_deliverystdone_senddate_packetgw1` of table `xxx`.`yyy` was not found on update: TUPLE (info_bits=0, 7 fields):

{[2] p(0x0670),[1] (0x01),[4] (0x80000001),[1] (0x01),[5] (0x99A7F0A000),NULL,[4] @ (0x1340A38C)}

at: COMPACT RECORD(info_bits=0, 7 fields):

{[2] p(0x0670),[1] (0x01),[4] (0x80000001),[1] (0x01),[5] (0x99A7EEA000),[1]2(0x32),[4] M!G(0x134D2147)}

2020-11-24 10:01:44 4 [ERROR] InnoDB: Unable to find a record to delete-mark
InnoDB: tuple DATA TUPLE: 7 fields;
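
The two "was not found on update" errors above each name the index and table involved. As a rough sketch, a few lines of Python can pull those names out of a replica's error log to build a list of indexes to investigate. The regular expression is inferred only from the two log lines quoted in this report and is an assumption, not an official description of the InnoDB log format; the function name `affected_indexes` is hypothetical.

```python
import re

# Pattern inferred from the two error lines quoted in this report (assumption:
# other server versions may word or punctuate the message differently).
NOT_FOUND = re.compile(
    r"InnoDB: Record in index `(?P<index>[^`]+)` of table "
    r"`(?P<schema>[^`]+)`\.`(?P<table>[^`]+)` was not found on update"
)


def affected_indexes(log_lines):
    """Return a sorted list of (schema, table, index) tuples named in the log."""
    found = set()
    for line in log_lines:
        match = NOT_FOUND.search(line)
        if match:
            found.add((match["schema"], match["table"], match["index"]))
    return sorted(found)


# The three [ERROR] lines from this report; the last one names no index.
sample = [
    "2020-11-24 10:01:44 44 [ERROR] InnoDB: Record in index "
    "`idlocation_deleted_packetgw1_senddate` of table `xxx`.`yyy` "
    "was not found on update: TUPLE (info_bits=0, 5 fields):",
    "2020-11-24 10:01:44 44 [ERROR] InnoDB: Record in index "
    "`idloc_del_sentviagw_deliverystdone_senddate_packetgw1` of table "
    "`xxx`.`yyy` was not found on update: TUPLE (info_bits=0, 7 fields):",
    "2020-11-24 10:01:44 4 [ERROR] InnoDB: Unable to find a record to delete-mark",
]

for schema, table, index in affected_indexes(sample):
    print(f"{schema}.{table}: {index}")
```

The "Unable to find a record to delete-mark" line is skipped on purpose because it does not identify an index by name.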

This led me to run mysqlcheck on the production master:

mysqlcheck --check --verbose --databases aaa bbb ccc

6 tables had errors of this type:

schema.tablex
Warning : InnoDB: Index 'xxxx_id' contains 77545619 entries, should be 77545621.
Warning : InnoDB: Index 'session_id' contains 77545620 entries, should be 77545621.
error : Corrupt
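
To gauge how far each secondary index has drifted, the warnings above can be reduced to a count of missing entries per index. The following is a minimal sketch that assumes the warning text has exactly the shape shown in this report; the function name `index_shortfalls` is hypothetical, and the sample input is the output quoted above.

```python
import re

# Pattern inferred from the warning lines quoted in this report (assumption:
# the wording may differ in other server versions).
WARNING = re.compile(
    r"InnoDB: Index '(?P<index>[^']+)' contains (?P<actual>\d+) entries, "
    r"should be (?P<expected>\d+)\."
)


def index_shortfalls(output):
    """Map each reported index name to (expected - actual) entry count."""
    shortfalls = {}
    for line in output.splitlines():
        match = WARNING.search(line)
        if match:
            expected = int(match["expected"])
            actual = int(match["actual"])
            shortfalls[match["index"]] = expected - actual
    return shortfalls


report = """schema.tablex
Warning : InnoDB: Index 'xxxx_id' contains 77545619 entries, should be 77545621.
Warning : InnoDB: Index 'session_id' contains 77545620 entries, should be 77545621.
error : Corrupt"""

for index, missing in index_shortfalls(report).items():
    print(f"{index}: {missing} entries missing")
```

For the output quoted above, this reports two missing entries for 'xxxx_id' and one for 'session_id', out of roughly 77.5 million rows.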

We then saw problems of this type in the log:

[ERROR] Got error 128 when reading table './schema/tablex'
[ERROR] Got error 180 when reading table './schema/tablex'

The affected tables were fixed using OPTIMIZE TABLE.
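
The report does not say how OPTIMIZE TABLE was invoked for the six tables. As a hedged sketch, the statements could be generated from a list of (schema, table) pairs so that they can be reviewed before being run during a planned window. On InnoDB, OPTIMIZE TABLE is normally carried out as a table rebuild, which would explain why the secondary indexes were consistent again afterwards. The helper names below are hypothetical.

```python
def quote_identifier(name):
    """Quote an identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def optimize_statements(tables):
    """Build one OPTIMIZE TABLE statement per (schema, table) pair."""
    return [
        f"OPTIMIZE TABLE {quote_identifier(schema)}.{quote_identifier(table)};"
        for schema, table in tables
    ]


# Placeholder names taken from the anonymized output in this report.
for statement in optimize_statements([("schema", "tablex")]):
    print(statement)
```

The script only prints the statements; running them is left to the DBA, since a rebuild of a table this size takes time and affects availability.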

Proposed resolution

At the very least, the documentation should be modified to warn that this may happen.

Preferably:

Default to checking only, and take no action even for corrupt tables or primary keys. In a production environment the DBA must be able to make any decision that impacts availability.

Add switches:

--mark_corrupt_table
--mark_corrupt_sec_idx



 Comments   
Comment by Marko Mäkelä [ 2021-03-22 ]

Do these corruptions occur after upgrading to a release that includes a fix of MDEV-24449, and after rebuilding the affected tables?

Comment by Marko Mäkelä [ 2021-04-26 ]

The description in https://dev.mysql.com/doc/refman/5.7/en/check-table.html#check-table-innodb seems to mostly apply to the InnoDB implementation in MariaDB as well.
That documentation mentions SPATIAL INDEX, which was introduced in MySQL 5.7 and copied to MariaDB 10.2.2. Spatial indexes are fundamentally broken; see e.g. MDEV-15284.

My understanding is that the ability of InnoDB to permanently mark indexes or tables as corrupted was a misguided attempt to fix the infamous counterpart of MDEV-13542. Unfortunately, MariaDB Server does include that code.

Comment by Sergei Golubchik [ 2021-06-05 ]

Documentation clearly says

If CHECK TABLE finds an error in an InnoDB table, MariaDB might shutdown to prevent the error propagation. [...] Otherwise, since MariaDB 5.5, the table or an index might be marked as corrupted, to prevent use.

The way I understand it, "marking table as corrupt" is a documented behavior, and cannot be considered unexpected.

Comment by Marko Mäkelä [ 2023-03-15 ]

As of MDEV-24402, the clustered index will never be marked as corrupted, to allow data to be extracted from the table, or the table to be rebuilt.
