MariaDB Server / MDEV-30715

Upgrade from 10.3 to 10.6 corrupts tables

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 10.6.12
    • Fix Version/s: N/A
    • Component/s: Encryption, Server
    • Labels: None
    • Environment: Ubuntu 20.04.4 LTS
      10.6.12-MariaDB-1:10.6.12+maria~ubu2004-log

    Description

      Hi,

      I have tried several times to upgrade our 10.3.x machines to 10.6.12, but each time the tables end up corrupted.

      It starts with many of these [Note] log lines for random databases and tables:

      2023-02-23  0:22:54 18 [Note] InnoDB: Cannot close file ./database/table.ibd because of 8 pending operations and pending fsync
      

      Later it transitions into a flood of the following error, lasting about 2-3 seconds:

      2023-02-23  0:23:01 19 [ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=30301, page number=3], should be [page id: space=67449, page number=3]
      

      which leads to the corruption error:

      2023-02-23  0:23:02 19 [ERROR] InnoDB: Table `database`.`table` /* Partition `2023_01_25` */ is corrupted. Please drop the table and recreate.
      2023-02-23  0:23:02 19 [ERROR] Failed to open table database/table#P#2023_01_25.
       
      2023-02-23  0:23:02 19 [ERROR] Slave SQL: Error 'Table 'database.table' doesn't exist in engine' on query. Default database: 'database'. Query: 'ALTER TABLE table DROP PARTITION 2023_01_24', Gtid 0-10-11605127429, Internal MariaDB error code: 1932
      2023-02-23  0:23:02 19 [Warning] Slave: Table 'database.table' doesn't exist in engine Error_code: 1932
      2023-02-23  0:23:02 19 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.035937' position 990531724
      2023-02-23  0:23:02 16 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
      ...
      2023-02-23  0:23:02 16 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
      ...
      2023-02-23  0:23:02 16 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.035937' position 990531724
      2023-02-23  0:23:02 17 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
      
      

      Then there are multiple database and table combinations that give the same error:

      2023-02-23  2:33:30 1005741 [ERROR] InnoDB: Table `databaseX`.`tableX` is corrupted. Please drop the table and recreate.
      

      along with other errors:

      2023-02-23  3:01:29 1008412 [Warning] InnoDB: 16384 bytes should have been read at 393216 from ./database2/table2.ibd, but got only 0. Retrying.
      2023-02-23  3:01:29 1008412 [Warning] InnoDB: Retry attempts for reading partial data failed.
      

      That is a lot of log lines. I hope it is clear that I changed the database and table names to keep them private.
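
Since the excerpts above are only a sample of a much larger log, a small script can help tally which tablespace IDs and tables are affected. The following is a minimal sketch and not part of the original report: the function name `summarize` and both regular expressions are assumptions modeled on the log lines quoted here, and other MariaDB versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns modeled on the log excerpts quoted in this report (assumption:
# the message wording is the same throughout the error log).
PAGE_ID_MISMATCH = re.compile(
    r"\[ERROR\] InnoDB: Space id and page no stored in the page, read in are "
    r"\[page id: space=(\d+), page number=(\d+)\], should be "
    r"\[page id: space=(\d+), page number=(\d+)\]"
)
CORRUPTED_TABLE = re.compile(
    r"\[ERROR\] InnoDB: Table `([^`]+)`\.`([^`]+)`.* is corrupted"
)


def summarize(lines):
    """Count page-id mismatches and corrupted-table errors in log lines.

    Returns two Counters:
      mismatches: keyed by (space id found in page, space id expected)
      corrupted:  keyed by "database.table"
    """
    mismatches = Counter()
    corrupted = Counter()
    for line in lines:
        match = PAGE_ID_MISMATCH.search(line)
        if match:
            found_space, _, expected_space, _ = map(int, match.groups())
            mismatches[(found_space, expected_space)] += 1
            continue
        match = CORRUPTED_TABLE.search(line)
        if match:
            corrupted[f"{match.group(1)}.{match.group(2)}"] += 1
    return mismatches, corrupted
```

Passing an open file object for the server error log to `summarize` yields the two counters. Seeing which expected tablespace IDs recur may help narrow down whether the mismatches cluster around particular tables or partitions.
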

      This specific upgrade was from 10.3.36+maria~ubu2004 to 10.6.12+maria~ubu2004, on a slave replicating from a master running version 10.3.36-MariaDB-1:10.3.36+maria~ubu2004-log.

      We use InnoDB encryption with the file_key_management plugin.

      my.cnf is attached.

      Attachments

        Activity

          Hi!

          I would suggest that you run mysqlcheck --extended (see MDEV-30129) immediately after upgrading from 10.3.

          I’d suspect that the corruption is related to encryption in some way. How many times was the server restarted before the corruption started to appear? Could you attach the complete server error log?

          The warning about being unable to close a file is most likely not directly related to the corruption. See MDEV-25215 for some explanation on that.
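
The `mysqlcheck --extended` suggestion above could be scripted so that it runs right after each upgrade. This is a minimal sketch under stated assumptions: the helper names `build_mysqlcheck_command` and `run_extended_check` are hypothetical, connection settings are assumed to come from an option file such as `~/.my.cnf`, and `mysqlcheck` is assumed to be on the `PATH`.

```python
import subprocess


def build_mysqlcheck_command(databases=None):
    """Build the argument list for an extended table check.

    With no databases given, every database on the server is checked.
    """
    cmd = ["mysqlcheck", "--check", "--extended"]
    if databases:
        cmd += ["--databases", *databases]
    else:
        cmd.append("--all-databases")
    return cmd


def run_extended_check(databases=None):
    """Run mysqlcheck and return the CompletedProcess for inspection.

    No credentials are passed here (assumption: they are read from an
    option file), and a non-zero exit status does not raise.
    """
    return subprocess.run(
        build_mysqlcheck_command(databases),
        capture_output=True,
        text=True,
        check=False,
    )
```

Building the argument list separately from running it makes the exact command easy to log or review before it touches a production replica.
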

          Marko Mäkelä (marko) added the comment above.

          People

            Assignee: Unassigned
            Reporter: Martijn Smidt (Hemera)
            Votes: 0
            Watchers: 3

