We upgraded 5 slaves and 1 master from MariaDB 10.3.32 to MariaDB 10.6.8.
Here are the errors, issues, and documentation suggestions that came out of it.
4 of 5 of the slaves had the following error:
This prevented startup after the upgrade, so mariadb_upgrade could not be run and the server could not be started.
Therefore we downgraded back to 10.3.32 and ran MySQL commands
then shut down and upgraded to 10.6.8 again, with the same problem.
We had to delete the ib_logfile0 file, upgrade to 10.6.8, and restart to complete the upgrade.
Here are related logs:
Here are related my.cnf settings:
What we did not try: stopping slave replication manually before shutdown and then upgrading.
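For reference, the untried sequence could look roughly like the sketch below. It only builds the SQL statements that would be run on a slave before the package upgrade. The helper name `pre_upgrade_statements` is made up for illustration, and we have not verified that this avoids the startup error.

```python
def pre_upgrade_statements(stop_replication=True):
    """Return the SQL statements to run on a slave before upgrading packages.

    Hypothetical helper: it only builds the statement list and does not
    connect to any server.
    """
    statements = []
    if stop_replication:
        # Stop the replication threads so nothing is being applied at shutdown.
        statements.append("STOP SLAVE;")
        # Record the replication position for comparison after the upgrade.
        statements.append("SHOW SLAVE STATUS;")
    # Shut the server down from the client session.
    statements.append("SHUTDOWN;")
    return statements


if __name__ == "__main__":
    for statement in pre_upgrade_statements():
        print(statement)
```

Passing `stop_replication=False` gives the sequence we actually used, which shuts down without stopping replication first.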
Other notes: We had major problems in the past, before we changed innodb_change_buffering to none. Restarting a slave broke some index in the database nearly every time, so we ran scripts to check all tables and rebuild the broken indexes using OPTIMIZE TABLE. On the other hand, one of the servers that did not get this error had recently been rebuilt from mariabackup, yet the backup server that runs mariabackup had the same error when upgrading. We mention this in case some recovery we did damaged the database before the upgrade, but we don't think that is the issue.
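The check-and-rebuild scripts mentioned above worked along the lines of the sketch below. This is not the actual script: the function names are hypothetical, and it assumes the final status text returned by CHECK TABLE has already been collected for each table.

```python
def quote_identifier(name):
    """Quote a schema or table name with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def check_statements(tables):
    """Build one CHECK TABLE statement per (schema, table) pair."""
    return [
        "CHECK TABLE {}.{};".format(quote_identifier(schema), quote_identifier(table))
        for schema, table in tables
    ]


def rebuild_statements(check_results):
    """Build OPTIMIZE TABLE statements for tables whose check was not OK.

    check_results maps (schema, table) to the final status text reported
    by CHECK TABLE for that table (assumed to be "OK" for a healthy table).
    """
    return [
        "OPTIMIZE TABLE {}.{};".format(quote_identifier(schema), quote_identifier(table))
        for (schema, table), status in sorted(check_results.items())
        if status != "OK"
    ]
```

Only tables with a non-OK status get an OPTIMIZE TABLE statement, so healthy tables are not rebuilt unnecessarily on a database of this size.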
Other notes: The MariaDB database is about 1.5 TB. On the master, peak load is about 50,000 selects, 50 deletes, 500 inserts, and 500 updates per second, with averages over half the peak.
On four out of five of the slaves after upgrade, we got
One server had it twenty times in the last month, one had it four times, one had it twice, and one had it once.
CHECK TABLE TABLENAMEREDACTED shows no error.
Percona checksum showed no data differences.
There were no other slave replication problems.
TABLENAMEREDACTED is the same table on all servers, and it is neither a particularly big nor a particularly busy table.
We ran OPTIMIZE TABLE TABLENAMEREDACTED on the master and the slaves to recreate the table, just in case.
However, we are still getting errors on the slaves for the same table, even after running OPTIMIZE TABLE TABLENAMEREDACTED.
Also, Percona checksum still shows no data differences between servers.
CHECK TABLE TABLENAMEREDACTED still shows no error.
There are still no other slave replication problems.
Other notes: All servers are running with the following setting, which enables parallel replication.
The MariaDB documentation suggests running mariadb_upgrade after updating MariaDB.
I would suggest that the documentation also recommend restarting MariaDB after running mariadb_upgrade, as that seems to remove some problems that can occur when the server was started without the fixes mariadb_upgrade applies.
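To make the suggested order concrete, here is a minimal sketch of the sequence as a list of commands. It assumes a systemd host with a service named `mariadb` and uses `mysql_upgrade` as the binary name for the upgrade tool. Both are assumptions rather than details from our setup, and the helper name is hypothetical.

```python
def post_upgrade_commands(service="mariadb", upgrade_tool="mysql_upgrade"):
    """Return the commands to run, in order, after the new packages are installed.

    Hypothetical helper: the service name and upgrade tool name are
    assumptions and may differ per distribution.
    """
    return [
        # Start the upgraded server so the upgrade tool can connect to it.
        ["systemctl", "start", service],
        # Apply the system table fixes for the new version.
        [upgrade_tool],
        # Restart so the server comes up with those fixes already in place.
        ["systemctl", "restart", service],
    ]


if __name__ == "__main__":
    for command in post_upgrade_commands():
        print(" ".join(command))
```

The final restart is the step this suggestion would add to the documented procedure.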