[MDEV-30715] upgrade from 10.3 -> 10.6 corrupts tables Created: 2023-02-23  Updated: 2023-05-02  Resolved: 2023-05-02

Status: Closed
Project: MariaDB Server
Component/s: Encryption, Server
Affects Version/s: 10.6.12
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Martijn Smidt Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: None
Environment:

Ubuntu 20.04.4 LTS
10.6.12-MariaDB-1:10.6.12+maria~ubu2004-log


Attachments: File my.cnf    

 Description   

Hi,

I have tried several times to upgrade our 10.3.x machines to 10.6.12, but each time the tables end up corrupted.

It starts with many of these [Note] log lines for seemingly random databases and tables:

2023-02-23  0:22:54 18 [Note] InnoDB: Cannot close file ./database/table.ibd because of 8 pending operations and pending fsync

Later it transitions into a flood of the following error, lasting about 2-3 seconds:

2023-02-23  0:23:01 19 [ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=30301, page number=3], should be [page id: space=67449, page number=3]

This leads to the corruption error:

2023-02-23  0:23:02 19 [ERROR] InnoDB: Table `database`.`table` /* Partition `2023_01_25` */ is corrupted. Please drop the table and recreate.
2023-02-23  0:23:02 19 [ERROR] Failed to open table database/table#P#2023_01_25.
 
2023-02-23  0:23:02 19 [ERROR] Slave SQL: Error 'Table 'database.table' doesn't exist in engine' on query. Default database: 'database'. Query: 'ALTER TABLE table DROP PARTITION 2023_01_24', Gtid 0-10-11605127429, Internal MariaDB error code: 1932
2023-02-23  0:23:02 19 [Warning] Slave: Table 'database.table' doesn't exist in engine Error_code: 1932
2023-02-23  0:23:02 19 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.035937' position 990531724
2023-02-23  0:23:02 16 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
...
2023-02-23  0:23:02 16 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
...
2023-02-23  0:23:02 16 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.035937' position 990531724
2023-02-23  0:23:02 17 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964

After that, multiple other database and table combinations give the same error:

2023-02-23  2:33:30 1005741 [ERROR] InnoDB: Table `databaseX`.`tableX` is corrupted. Please drop the table and recreate.

Other errors appear as well:

2023-02-23  3:01:29 1008412 [Warning] InnoDB: 16384 bytes should have been read at 393216 from ./database2/table2.ibd, but got only 0. Retrying.
2023-02-23  3:01:29 1008412 [Warning] InnoDB: Retry attempts for reading partial data failed.

That is a lot of log lines. Note that I changed the database and table names to keep them private.
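
To condense a log like this for triage, the page id mismatch errors can be tallied by (stored space, expected space) pair. The following is a minimal sketch in Python, assuming the error lines have exactly the format shown in the excerpt above. The name `tally_mismatches` and the sample input are illustrative only and are not part of any MariaDB tool.

```python
import re
from collections import Counter

# Matches the InnoDB page id mismatch error as it appears in the excerpt above.
# Assumption: the message format is the same for every such line in the log.
MISMATCH = re.compile(
    r"\[ERROR\] InnoDB: Space id and page no stored in the page, read in are "
    r"\[page id: space=(?P<found_space>\d+), page number=(?P<found_page>\d+)\], "
    r"should be \[page id: space=(?P<want_space>\d+), page number=(?P<want_page>\d+)\]"
)


def tally_mismatches(lines):
    """Count mismatch errors per (space id stored in page, expected space id)."""
    counts = Counter()
    for line in lines:
        m = MISMATCH.search(line)
        if m:
            counts[(int(m["found_space"]), int(m["want_space"]))] += 1
    return counts


# Sample input copied from the log excerpt in this report.
sample = [
    "2023-02-23  0:23:01 19 [ERROR] InnoDB: Space id and page no stored in the "
    "page, read in are [page id: space=30301, page number=3], should be "
    "[page id: space=67449, page number=3]",
    "2023-02-23  0:22:54 18 [Note] InnoDB: Cannot close file ./database/table.ibd "
    "because of 8 pending operations and pending fsync",
]

for (found, wanted), n in tally_mismatches(sample).items():
    print(f"stored space={found} expected space={wanted}: {n} occurrence(s)")
```

In practice the lines would come from the server error log file rather than the inline sample; lines that do not match the pattern are simply skipped.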

This specific upgrade was from 10.3.36+maria~ubu2004 to 10.6.12+maria~ubu2004, performed on a slave replicating from a master running version 10.3.36-MariaDB-1:10.3.36+maria~ubu2004-log.

We use InnoDB encryption with the file_key_management plugin.

The my.cnf is attached.



 Comments   
Comment by Marko Mäkelä [ 2023-03-27 ]

Hi!

I would suggest that you run mysqlcheck --extended (see MDEV-30129) immediately after upgrading from 10.3.
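
As an illustration of the check suggested above, here is a minimal sketch in Python that only assembles the command line and does not run it. It assumes the standard mysqlcheck options `--check`, `--extended`, `--databases` and `--all-databases`, and that connection credentials come from an option file. The helper name `mysqlcheck_command` is hypothetical.

```python
import shlex


def mysqlcheck_command(databases=None, extended=True):
    """Build an argv list for mysqlcheck; nothing is executed here."""
    argv = ["mysqlcheck", "--check"]
    if extended:
        # --extended requests the most thorough (and slowest) check.
        argv.append("--extended")
    if databases:
        argv += ["--databases", *databases]
    else:
        argv.append("--all-databases")
    return argv


# Print the command for review before running it by hand.
print(shlex.join(mysqlcheck_command()))
```

An extended check of every table can take a long time on a large instance, so passing a list of databases to limit the scope may be preferable for a first pass.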

I’d suspect that the corruption is related to encryption in some way. How many times was the server restarted before the corruption started to appear? Could you attach the complete server error log?

The warning about being unable to close a file is most likely not directly related to the corruption. See MDEV-25215 for some explanation on that.
