  MariaDB Server / MDEV-32174

ROW_FORMAT=COMPRESSED table corruption due to ROLLBACK

Details

    • Type: Bug
    • Status: Needs Feedback
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 10.11.3, 10.6.12
    • Fix Version/s: 10.6
    • Component/s: None

    Description

      After upgrading to Ubuntu 22.04, and alongside that to MariaDB 10.6, we started experiencing table corruption during our nightly restores. Because these restores are to staging servers, loss of data was acceptable, so we fixed the problem by simply removing the corrupted table. But this obviously won't fly should we ever have to do a production restore (which is very rarely the case).
      The database portion of the restores is done using mydumper with the default of 4 threads.
      What makes the problem particularly nasty is that it cannot be consistently reproduced; we noticed it seemed to happen in roughly 1 out of every 7 restores (where each restore takes about 40-50 minutes).
      This made us think it was related to parallelism, so we tried running mydumper single-threaded, which did not solve the problem.
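      For reference, an illustrative mydumper/myloader invocation along the lines of what our restores use (host names, user names, directories and the password are placeholders; the exact command lines are not included here):

      # Dump with the default of 4 threads (we also tried running single-threaded, which did not help).
      mydumper --host=db-host --user=backup --password=*** \
               --database=ourdb --outputdir=/backups/ourdb --threads=4
      # Restore onto the staging server.
      myloader --host=staging-host --user=restore --password=*** \
               --database=ourdb --directory=/backups/ourdb --threads=4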
      We have also tried upgrading to various versions of mariadb/mydumper, most notably:

      MariaDB   mydumper
      10.6.12   0.15
      10.11.3   0.10
      10.11.3   0.15

      But with all of the above version combinations the problem still occurred.
      Eventually we found that we could no longer reproduce the problem when running MariaDB 10.5.21 with mydumper 0.10, but we are still unsure of the underlying cause.

      Provided files:
      The included table structure is just one table of our database, as we were able to reproduce the problem by restoring backups of only this table.
      Because it is quite time-consuming to reproduce and we would have to generate a large amount of dummy data (existing dumps all contain customer data), we have not included a database dump for now.
      But we still wanted to create this bug report in case you already see something strange based on the included table structure and error log.
      In case there is no immediately apparent problem and you still want to look further into this, we would of course be happy to provide a database dump.

      Update:
      After running on 10.6.15 I was again able to reproduce it. I generated a stacktrace from the core dump (mariadbd_full_bt_all_threads.txt).
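      A full backtrace of all threads, like the attached one, can be produced from a core dump roughly as follows (the exact command used is an assumption; the binary and core file paths are placeholders):

      gdb /usr/sbin/mariadbd /var/lib/mysql/core.12345 --batch \
          -ex "set pagination off" -ex "thread apply all bt full" \
          > mariadbd_full_bt_all_threads.txt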

      Attachments

        1. configuration.txt
          78 kB
        2. mariadbd_full_bt_all_threads.txt
          149 kB
        3. MDEV-32174_ps.txt
          335 kB
        4. restore-failure-error-log.txt
          20 kB
        5. syslog-restore-10.6.15.txt
          24 kB
        6. table_structure.txt
          1 kB


          Activity

            Roel Van de Paar added a comment

            No corruption issues thus far on recent & current 10.6 tests:

            CS 10.6.21 c05e7c4e0eff174a1f2b5e6efd53d80954f9e34b (Optimized)

            2025-01-28 15:19:35 0 [Note] /test/MD210125-mariadb-10.6.21-linux-x86_64-opt/bin/mariadbd: ready for connections.
            Version: '10.6.21-MariaDB-log'  socket: '/test/MD210125-mariadb-10.6.21-linux-x86_64-opt/socket.sock'  port: 12427  MariaDB Server
            2025-01-28 15:20:32 1511 [Note] InnoDB: Number of pools: 2
            2025-01-28 15:20:45 3015 [Note] InnoDB: Number of pools: 3
            2025-01-28 15:21:01 4524 [Note] InnoDB: Number of pools: 4
            2025-01-28 15:21:20 6027 [Note] InnoDB: Number of pools: 5
            2025-01-28 15:22:11 7533 [Note] InnoDB: Number of pools: 6
            2025-01-28 15:25:26 1804 [Note] Detected table cache mutex contention at instance 1: 55% waits. Additional table cache instance activated. Number of instances after activation: 2.
            

            I did, however, see a few duplicate key errors again:

            MDEV-32174 CS 10.6.21 831f5bc66f8d2147edd7991caf69e34058566b67 (Debug)

            ERROR 1062 (23000) at line 1: Duplicate entry '908-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '869-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '969-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '924-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1062-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1227-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1195-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1166-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1290-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1275-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1416-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1461-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1753-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1861-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1853-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '878-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1325-2025-01-28 15:57:58' for key 'PRIMARY'
            ERROR 1062 (23000) at line 1: Duplicate entry '1897-2025-01-28 15:57:58' for key 'PRIMARY'
            

            The scarcity/infrequency of these errors (they are almost never seen, even when running 8k threads), combined with the fact that they should not really happen as far as I can see, makes me believe there is a highly sporadic race condition bug involving LAST_INSERT_ID. The query that produced these errors is:

            INSERT INTO t2 VALUES (CURRENT_TIMESTAMP, 0, (SELECT LAST_INSERT_ID()))
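
            For context, a minimal schema sketch that would match this INSERT and the 'id-timestamp' duplicate-key pattern (hypothetical; the actual t1/t2 definitions used in the test are not shown in this comment):

            -- Hypothetical sketch only; column names are placeholders.
            CREATE TABLE t1 (id INT AUTO_INCREMENT PRIMARY KEY, payload INT) ENGINE=InnoDB;
            CREATE TABLE t2 (
              created_at TIMESTAMP NOT NULL,
              flag INT NOT NULL,
              ref_id BIGINT NOT NULL,            -- taken from LAST_INSERT_ID() of the preceding t1 insert
              PRIMARY KEY (ref_id, created_at)
            ) ENGINE=InnoDB ROW_FORMAT=COMPRESSED;
            -- Each connection first inserts into t1, then into t2:
            INSERT INTO t1 (payload) VALUES (1);
            INSERT INTO t2 VALUES (CURRENT_TIMESTAMP, 0, (SELECT LAST_INSERT_ID()));

            Under such a sketch, LAST_INSERT_ID() is per-connection and each t1 insert yields a unique id, so duplicates on (ref_id, created_at) should not normally be possible; their presence is what points to the suspected race condition.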
            

            Roel Van de Paar added a comment - edited

            An overnight run did not produce any new occurrences. Please do have a look at the duplicate key issue mentioned in the last comment. It happens only very sporadically.


            Marko Mäkelä added a comment

            The duplicate key errors might be related to the corner that I cut while fixing MDEV-30882. Does CHECK TABLE…EXTENDED (MDEV-24402) report any errors for these ROW_FORMAT=COMPRESSED tables?


            Roel Van de Paar added a comment

            Had an interesting run this morning with this. This time I used:

            threads=8000   # Number of concurrent threads
            queries=100    # Number of t1/t2 INSERTs per thread/per test round
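
            For context, a minimal sketch of what such a driver (script1.sh) could look like; this is hypothetical, as the actual script is not attached, and the t1 INSERT is a placeholder:

            #!/bin/bash
            # Hypothetical sketch of the test driver; client path, socket and schema are placeholders.
            threads=8000; queries=100          # as configured above
            client=./bin/mariadb; user=root; socket=./socket.sock; db=test
            SQL="INSERT INTO t1 () VALUES (); INSERT INTO t2 VALUES (CURRENT_TIMESTAMP, 0, (SELECT LAST_INSERT_ID()));"
            for ((t = 1; t <= threads; t++)); do
              ( for ((q = 1; q <= queries; q++)); do
                  ${client} -A --skip-ssl --force --binary-mode -u ${user} -S ${socket} -D ${db} -e "${SQL}"
                done ) &
              (( t % 100 == 0 )) && echo "Count: ${t}"
            done
            wait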
            

            And here is what I saw:

            MDEV-32174 CS 10.6.21 831f5bc66f8d2147edd7991caf69e34058566b67 (Debug)

            /test/MDEV-32174_mariadb-10.6.21-linux-x86_64$ ./script1.sh 
            Count: 100
            Count: 200
            Count: 300
            Count: 400
            Count: 500
            ...normal count operation continues...
            Count: 2600
            Count: 2700
            Count: 2800
            Count: 2900
            Count: 3000
            --------------
            INSERT INTO t2 VALUES (CURRENT_TIMESTAMP, 0, (SELECT LAST_INSERT_ID()))
            --------------
             
            ERROR 1062 (23000) at line 1: Duplicate entry '2198-2025-02-10 06:38:40' for key 'PRIMARY'
            --------------
            INSERT INTO t2 VALUES (CURRENT_TIMESTAMP, 0, (SELECT LAST_INSERT_ID()))
            --------------
             
            ERROR 1062 (23000) at line 1: Duplicate entry '1931-2025-02-10 06:38:40' for key 'PRIMARY'
            --------------
            INSERT INTO t2 VALUES (CURRENT_TIMESTAMP, 0, (SELECT LAST_INSERT_ID()))
            --------------
            ...additional similar duplicate entries...
            ERROR 1062 (23000) at line 1: Duplicate entry '2781-2025-02-10 06:38:40' for key 'PRIMARY'
            --------------
            INSERT INTO t2 VALUES (CURRENT_TIMESTAMP, 0, (SELECT LAST_INSERT_ID()))
            --------------
             
            ERROR 1062 (23000) at line 1: Duplicate entry '3123-2025-02-10 06:38:40' for key 'PRIMARY'
            --------------
            INSERT INTO t2 VALUES (CURRENT_TIMESTAMP, 0, (SELECT LAST_INSERT_ID()))
            --------------
             
            ERROR 1062 (23000) at line 1: Duplicate entry '3179-2025-02-10 06:38:40' for key 'PRIMARY'
            Count: 3100
            Count: 3200
            Count: 3300
            Count: 3400
            Count: 3500
            ...normal count operation continues...
            Count: 4600
            Count: 4700
            Count: 4800
            Count: 4900
            Count: 5000
            --------------
            INSERT INTO t2 VALUES (CURRENT_TIMESTAMP, 0, (SELECT LAST_INSERT_ID()))
            --------------
             
            ERROR 1062 (23000) at line 1: Duplicate entry '858-2025-02-10 06:39:03' for key 'PRIMARY'
            Count: 5100
            Count: 5200
            Count: 5300
            Count: 5400
            Count: 5500
            Count: 5600
            Count: 5700
            Count: 5800
            Count: 5900
            Count: 6000
            ./script1.sh: line 23: 2193819 Killed                  ${client} -A --skip-ssl-verify-server-cert --skip-ssl --force --binary-mode -u ${user} -S ${socket} -D ${db} -e "${SQL}"
            ./script1.sh: line 23: 2193829 Killed                  ${client} -A --skip-ssl-verify-server-cert --skip-ssl --force --binary-mode -u ${user} -S ${socket} -D ${db} -e "${SQL}"
            ...many similar such messages...
            ./script1.sh: line 23: 2194480 Killed                  ${client} -A --skip-ssl-verify-server-cert --skip-ssl --force --binary-mode -u ${user} -S ${socket} -D ${db} -e "${SQL}"
            ./script1.sh: line 23: 2194481 Killed                  ${client} -A --skip-ssl-verify-server-cert --skip-ssl --force --binary-mode -u ${user} -S ${socket} -D ${db} -e "${SQL}"
            Count: 6100
            Count: 6200
            Count: 6300
            Count: 6400
            Count: 6500
            ...script continues...
            

            Firstly, it seems interesting that the single '858-2025-02-10 06:39:03' duplicate entry occurrence happened well apart from the earlier batch of duplicate entries, again suggesting some sort of (highly sporadic) race condition bug, as we were discussing. Immediately after this (let's say around count 5500), I used CHECK TABLE and the table was fine:

            MDEV-32174 CS 10.6.21 831f5bc66f8d2147edd7991caf69e34058566b67 (Debug)

            10.6.21>CHECK TABLE t2 EXTENDED;
            +---------+-------+----------+----------+
            | Table   | Op    | Msg_type | Msg_text |
            +---------+-------+----------+----------+
            | test.t2 | check | status   | OK       |
            +---------+-------+----------+----------+
            1 row in set (0.037 sec)
            

            However, a little later the client kills happened, likely as a result of the automated server resource monitoring we use.
            Immediately after this, CHECK TABLE reported a warning:

            MDEV-32174 CS 10.6.21 831f5bc66f8d2147edd7991caf69e34058566b67 (Debug)

            10.6.21>CHECK TABLE t2 EXTENDED;
            +---------+-------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | Table   | Op    | Msg_type | Msg_text                                                                                                                                                                                                                                           |
            +---------+-------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            | test.t2 | check | Warning  | InnoDB: Clustered index record with stale history in table `test`.`t2`: COMPACT RECORD(info_bits=0, 5 fields): {[4]    (0x80000002),[5]   i (0x99B5D469D5),[6]    \ (0x000000015CBF),[7]      T(0xD000000C0D0554),[8]        (0x0000000000000000)} |
            | test.t2 | check | status   | OK                                                                                                                                                                                                                                                 |
            +---------+-------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
            2 rows in set (0.655 sec)
            

            And checking the error log, we see it sits right amid the killed connections:

            MDEV-32174 CS 10.6.21 831f5bc66f8d2147edd7991caf69e34058566b67 (Debug)

            ...more aborted connections before...
            2025-02-10  6:39:32 5829 [Warning] Aborted connection 5829 to db: 'test' user: 'root' host: 'localhost' (Got an error writing communication packets)
            2025-02-10  6:39:32 5832 [Warning] Aborted connection 5832 to db: 'test' user: 'root' host: 'localhost' (Got an error writing communication packets)
            2025-02-10  6:39:33 4052 [Warning] InnoDB: Clustered index record with stale history in table `test`.`t2`: COMPACT RECORD(info_bits=0, 5 fields): {[4]    (0x80000002),[5]   i (0x99B5D469D5),[6]    \ (0x000000015CBF),[7]      T(0xD000000C0D0554),[8]        (0x0000000000000000)}
            2025-02-10  6:39:33 5848 [Warning] Aborted connection 5848 to db: 'test' user: 'root' host: 'localhost' (Got an error writing communication packets)
            2025-02-10  6:39:33 5912 [Warning] Aborted connection 5912 to db: 'test' user: 'root' host: 'localhost' (Got an error writing communication packets)
            2025-02-10  6:39:34 5924 [Warning] Aborted connection 5924 to db: 'test' user: 'root' host: 'localhost' (Got an error writing communication packets)
            ...more aborted connections after...
            

            Could client kills (or flaky connections) be the cause not just of this warning but also of the earlier observed corruptions?

            Roel Van de Paar added a comment - edited

            The "Clustered index record with stale history in table" warning is readily reproducible, even at different locations:

            MDEV-32174 CS 10.6.21 831f5bc66f8d2147edd7991caf69e34058566b67 (Debug)

            | test.t2 | check | Warning  | InnoDB: Clustered index record with stale history in table `test`.`t2`: COMPACT RECORD(info_bits=0, 5 fields): {[4]   l(0x8000006C),[5]   z (0x99B5D47AE8),[6]      (0x0000000C86B7),[7]   "   (0xD00000220716A0),[8]        (0x0000000000000000)} |
            ...
            | test.t2 | check | Warning  | InnoDB: Clustered index record with stale history in table `test`.`t2`: COMPACT RECORD(info_bits=0, 5 fields): {[4]    (0x80000016),[5]   { (0x99B5D47B0C),[6]     a(0x0000000C8661),[7]      8(0xFA00000CF30938),[8]        (0x0000000000000000)} |
            ...
            | test.t2 | check | Warning  | InnoDB: Clustered index record with stale history in table `test`.`t2`: COMPACT RECORD(info_bits=0, 5 fields): {[4]    (0x80000020),[5]   {G(0x99B5D47B47),[6]     h(0x0000000DDD68),[7]   $ ) (0xC5000024D6299C),[8]        (0x0000000000000000)} |
            

            It seems these are "repaired" without problems; the table shows OK on a subsequent check. However, when many threads are running, these warnings are observed near-constantly.

            I also tried reproducing corruption by running a kill script which randomly kills connections, but no corruption was observed, which seems consistent with your findings.
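
            For context, the kill script was along these lines (a hypothetical sketch; the actual script is not attached, and client path and socket are placeholders):

            #!/bin/bash
            # Randomly kill one currently-running INSERT every 0.2 seconds.
            client=./bin/mariadb; socket=./socket.sock
            while true; do
              id=$(${client} -N -u root -S ${socket} -e "SELECT id FROM information_schema.processlist WHERE command='Query' AND info LIKE 'INSERT%' ORDER BY RAND() LIMIT 1")
              [ -n "${id}" ] && ${client} -u root -S ${socket} -e "KILL ${id}"
              sleep 0.2
            done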


            People

              Assignee: Marko Mäkelä (marko)
              Reporter: Bart-Jan Hilbrands (bjhilbrands)
              Votes: 0
              Watchers: 8

