  1. MariaDB Server
  2. MDEV-32174

ROW_FORMAT=COMPRESSED table corruption due to ROLLBACK

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Duplicate
    • 10.11.3, 10.6.12
    • None

    Description

      After upgrading to Ubuntu 22.04, and with it to MariaDB 10.6, we started experiencing table corruption during our nightly restores. Because these restores go to staging servers, data loss was acceptable, so we worked around the problem by simply dropping the corrupted table. That obviously won't fly should we ever have to do a production restore (which happens very rarely).
      The database portion of the restore is done using mydumper with the default of 4 threads.
      What makes the problem particularly nasty is that it cannot be reproduced consistently: it seemed to happen in roughly 1 out of every 7 restores (each restore takes about 40-50 minutes).
      This made us suspect that it was related to parallelism, so we tried running mydumper single-threaded, but that did not solve the problem.
      We have also tried upgrading to various versions of mariadb/mydumper, most notably:

      mariadb   mydumper
      10.6.12   0.15
      10.11.3   0.10
      10.11.3   0.15

      With all of the above version combinations, the problem still occurred.
      Eventually we found that we could no longer reproduce the problem when running MariaDB 10.5.21 with mydumper 0.10, but we are still unsure of the underlying cause.

      Provided files:
      The included table structure covers just one table of our database, because we were able to reproduce the problem by restoring backups of this table alone.
      Reproducing the problem is quite time-consuming, and we would have to generate a large amount of dummy data (all existing dumps contain customer data), so we have not included a database dump for now.
      We still wanted to file this bug report in case you already see something strange in the included table structure and error log.
      If nothing stands out immediately and you want to look into this further, we would of course be happy to provide a database dump.

      Update:
      While running 10.6.15, I was able to reproduce it again. I generated a stack trace from the core dump (mariadbd_full_bt_all_threads.txt).

      Attachments

        1. configuration.txt
          78 kB
        2. mariadbd_full_bt_all_threads.txt
          149 kB
        3. MDEV-32174_ps.txt
          335 kB
        4. restore-failure-error-log.txt
          20 kB
        5. syslog-restore-10.6.15.txt
          24 kB
        6. table_structure.txt
          1 kB

          Activity

            marko Marko Mäkelä added a comment - bjhilbrands, MariaDB Server 10.6.18 should be affected by the race condition that was fixed in MDEV-34453 (MariaDB Server 10.6.20). Can you reproduce the corruption with 10.6.20 or 10.6.21 or a later release?

            bjhilbrands Bart-Jan Hilbrands added a comment - I set up a pipeline that restores database backups 24/7. I'll report back next week if the issue hasn't been reproduced by then.
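
            The pipeline itself is not shown in this report. A minimal sketch of such a restore loop might look like the following. It assumes the dump is loaded with myloader (the restore tool shipped with mydumper) and that corruption is detected with CHECK TABLE through the mariadb command-line client. The function names, the dump directory, and the table name are hypothetical, and connection options are omitted.

```python
import subprocess
from typing import Callable, Optional, Sequence

Runner = Callable[[Sequence[str]], str]


def run(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def table_is_ok(check_output: str) -> bool:
    """Interpret tab-separated CHECK TABLE output (Table, Op, Msg_type, Msg_text).

    A healthy table ends with a row whose last two fields are 'status' and 'OK'.
    """
    rows = [line.split("\t") for line in check_output.splitlines() if line.strip()]
    return bool(rows) and rows[-1][-2:] == ["status", "OK"]


def restore_until_corrupt(dump_dir: str, table: str, max_runs: int,
                          runner: Runner = run) -> Optional[int]:
    """Restore repeatedly; return the first attempt that fails, or None."""
    for attempt in range(1, max_runs + 1):
        try:
            # Assumption: myloader restores the dump, dropping existing tables first.
            runner(["myloader", "--directory", dump_dir,
                    "--overwrite-tables", "--threads", "4"])
            out = runner(["mariadb", "--batch", "--skip-column-names",
                          "-e", f"CHECK TABLE {table}"])
        except subprocess.CalledProcessError:
            # A failed restore or a crashed server also counts as a reproduction.
            return attempt
        if not table_is_ok(out):
            return attempt
    return None
```

            For example, restore_until_corrupt("/backups/latest", "mydb.mytable", 250) would stop at the first restore that leaves the table in a state CHECK TABLE does not report as OK. The runner parameter exists only so the loop can be exercised without a database server.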

            bjhilbrands Bart-Jan Hilbrands added a comment - It is still looking promising so far! I have another question about which other versions of MariaDB this might affect or have affected. We are upgrading to Ubuntu 24.04 this year, and the current MariaDB version in the Ubuntu package repository is 10.11.8. Do you know whether we might also run into this problem with that version, or whether we would have to install a later 10.11 release?
            bjhilbrands Bart-Jan Hilbrands added a comment (edited) - Alright, it has been a week of nonstop restores now, and the problem has not reproduced with version 10.6.21. I think we can pretty safely say that the problem has been resolved. Thanks for the effort!
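
            As a rough sanity check on that conclusion, the figures from the description can be plugged into a simple estimate. The sketch below assumes that each restore fails independently with probability 1/7 (the rate reported for the affected versions), that a restore takes 45 minutes (the midpoint of the reported 40-50 minutes), and that restores ran back to back for 7 days. These are simplifying assumptions, not measurements from the pipeline.

```python
# Assumed inputs, taken from the figures quoted in the description.
failure_rate = 1 / 7          # roughly 1 corrupted restore out of every 7
minutes_per_restore = 45      # midpoint of the reported 40-50 minutes
minutes_per_week = 7 * 24 * 60

# Number of back-to-back restores that fit into one week.
restores_per_week = minutes_per_week // minutes_per_restore

# Probability of seeing no failure at all if the bug were still present.
p_no_failure = (1 - failure_rate) ** restores_per_week

print(f"restores in a week: {restores_per_week}")
print(f"chance of a clean week with the bug present: {p_no_failure:.2e}")
```

            Under these assumptions a week holds 224 restores, and the chance of all of them succeeding by luck is on the order of 1e-15, which supports treating the clean week as strong evidence that the fix is effective.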

            marko Marko Mäkelä added a comment - Thank you for the testing. I think that we can declare this bug a duplicate of MDEV-34453.

            People

              marko Marko Mäkelä
              bjhilbrands Bart-Jan Hilbrands