[MDEV-31107] InnoDB tables deleted Created: 2023-04-21  Updated: 2023-04-21

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.8.3
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Nuno Ventura Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Rocky Linux Operating System running on VMware Workstation



 Description   

Hello.
Using MariaDB for a few months without any problem, last Thursday all tables except one disappeared from one of the databases.
I didn't have binary log active. (sad, I know)
Trying to repair the tables didn’t work with error that the tables don’t exist.
Trying to innodb_force_recover until 6 (all tables are in innodb format), the system recovered two other tables, but without data.

From the logs I’m almost sure there was no hacking attack and I’m sure I don’t have any drop command from any application, and I also check HeidiSQL logs for any command like a drop or something that could do this and I didn’t found nothing.
I recovered most of the data from last backup and some data from my application logs, and in the next morning the plant could restart the production. (one problem during restore is the disable keys command doesn’t work on innodb tables and so some data was not restored)

Now I need your tips/help for the following:

  • I need to explain to the client and to my boss what occurred. (using mySQL for 2 decades never had nothing like this). There is any issue that can affect this version? What log files should I check?
  • I need to take some measures to ensure that it will not happen again in the future. Any advice?
    Thank you.
    Nuno Ventura


 Comments   
Comment by Marko Mäkelä [ 2023-04-21 ]

For reasons explained in MDEV-27199, the setting innodb_force_recovery=6 should really never be used; it is a sure way to corrupt your data in unpredictable ways.

No REPAIR functionality has been implemented for InnoDB yet. Heikki Tuuri said to me at least a couple of times that he does not believe in it; users should have periodic backups (with the procedure being tested), and InnoDB should be crash-safe to begin with. Maybe in MariaDB, we are approaching that (by finally fixing MDEV-13542 and related crash-on-corruption bugs), but there are cases where OPTIMIZE TABLE or a SELECT from a corrupted table would not work.

Coincidentally, some advice that was given by a former MySQL employee in https://blog.jcole.us/2023/04/18/the-customer-is-always-wrong/ might apply in cases like this.

Comment by Nuno Ventura [ 2023-04-21 ]

@Mako Mäkelä thanks for your answer.

I was (more or less) aware of the dangers of using innodb_force_recovery with high values, but I just did it after making a copy of ibdata1 file and because I wanted to try to recover the data that was in database after the last backup (that is done daily) and restoring the backup would loose the data since the backup (as I don't have binary logs).
I was told that the innoDB was crash safe, and the ibdata1 file was with the same size as before, so all the tables should be there will all the data.

The article from Geremy Cole has good advice, I feel I'm not too far from his point of view, I didn't make a bigger mess, I applied first aid, they are working, but now I really need to find the route cause, so I can apply a definitive solution. I made a copy of the full rocky linux VM to my test server so I can investigate without disturbing the costumer or cause any risk to the real data.

Any further suggestions on what to search for?
How to check if it was a hardware error, disk, network, etc?
How to be sure if there was no hacker connected?
How to check if it was a drop command, from what IP or what user?
Any tool to read the ib_logfile0 ?

Comment by Marko Mäkelä [ 2023-04-21 ]

If you had a backup of the broken state right before any attempted repair, restoring that could allow the root cause to be analyzed better. This is a bug reporting system, and I am an InnoDB developer, not a support engineer. Our redo log format after my last revision (MDEV-14425) is pretty simple to read. Perhaps some tweaks to log0recv.cc could be applied to try to parse all data that is there, even before the latest recorded checkpoints. There should be FILE_DELETE records when InnoDB data files are deleted. But, the InnoDB redo log does not include any SQL statements, it is at a very low level.

Comment by Nuno Ventura [ 2023-04-21 ]

Sorry, @Marko Mäkelä, maybe this is not even a bug. I still don't know.
If someone can help or maybe tell me a better forum where I can get some ideas....
Thank you

Generated at Thu Feb 08 10:21:20 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.