Details
- Bug
- Status: In Testing
- Critical
- Resolution: Unresolved
- 10.11.3, 10.6.12
- None
Description
After upgrading to Ubuntu 22.04, and with it to MariaDB 10.6, we started experiencing table corruption during our nightly restores. Because these restores go to staging servers, loss of data was acceptable, so we worked around the problem by simply removing the corrupted table. That obviously won't be acceptable should we ever have to do a production restore (which happens very rarely).
The database portion of the restores is done using mydumper with the default of 4 threads.
What makes the problem particularly nasty is that it cannot be reproduced consistently: it seemed to happen in roughly 1 out of every 7 restores (each restore takes about 40-50 minutes).
This made us suspect that it was related to parallelism, so we tried running mydumper single-threaded, but that did not solve the problem.
We have also tried upgrading to various versions of MariaDB and mydumper, most notably:
| mariadb | mydumper |
| 10.6.12 | 0.15 |
| 10.11.3 | 0.10 |
| 10.11.3 | 0.15 |
With all of the above version combinations, the problem still occurred.
Eventually we found that we could no longer reproduce the problem when running MariaDB 10.5.21 with mydumper 0.10, but we are still unsure of the underlying cause.
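To put the 10.5.21 result in perspective, the sketch below estimates how many consecutive clean restores are needed before a "can no longer reproduce" observation is meaningful. It assumes that each restore fails independently with the roughly 1-in-7 rate mentioned above, and it uses 45 minutes as the midpoint of the 40-50 minute restore time. The function name `clean_runs_needed` and the 95%/99% confidence levels are illustrative choices, not something we measured.

```python
import math


def clean_runs_needed(failure_rate: float, confidence: float) -> int:
    """Smallest n such that (1 - failure_rate)**n <= 1 - confidence.

    In other words: if the bug were still present at `failure_rate`,
    seeing n clean runs in a row would happen with probability of at
    most 1 - confidence. Assumes independent runs (an assumption).
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


FAILURE_RATE = 1 / 7      # "roughly 1 out of every 7 restores"
MINUTES_PER_RESTORE = 45  # assumed midpoint of the 40-50 minute range

for confidence in (0.95, 0.99):
    runs = clean_runs_needed(FAILURE_RATE, confidence)
    hours = runs * MINUTES_PER_RESTORE / 60
    print(f"{confidence:.0%}: {runs} clean restores (~{hours:.1f} hours)")
# prints:
# 95%: 20 clean restores (~15.0 hours)
# 99%: 30 clean restores (~22.5 hours)
```

Under these assumptions, a handful of clean restores says little: about 20 consecutive clean runs are needed before the absence of corruption is unlikely to be chance.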
Provided files:
The included table structure is just one table of our database; we were able to reproduce the problem by restoring backups of this table alone.
Because the problem is quite time-consuming to reproduce and we would have to generate a large amount of dummy data (existing dumps all contain customer data), we have not included a database dump for now.
But we still wanted to create this bug report just in case you might already see something strange based on the included table structure and error log.
In case there is no immediate apparent problem and you still want to look further into this, we would of course be happy to provide a database dump.
Update:
After running on 10.6.15, I was again able to reproduce it. I generated a stack trace from the core dump (mariadbd_full_bt_all_threads.txt).
Attachments
Issue Links
- relates to
  - MDEV-16281 Implement parallel CREATE INDEX, ALTER TABLE, or bulk load (Open)
  - MDEV-31441 BLOB corruption on UPDATE of PRIMARY KEY with FOREIGN KEY (Closed)
  - MDEV-31817 SIGSEGV after btr_page_get_father_block() returns nullptr on corrupted data (Closed)
  - MDEV-30882 Crash on ROLLBACK of DELETE or UPDATE in a ROW_FORMAT=COMPRESSED table (Closed)