[MDEV-15912] InnoDB: Failing assertion: purge_sys.tail.commit <= purge_sys.rseg->last_commit upon upgrade from 10.0 or 10.1 to 10.3 Created: 2018-04-17 Updated: 2023-09-27 Resolved: 2021-06-21 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.3, 10.4 |
| Fix Version/s: | 10.3.30, 10.4.20, 10.5.11, 10.6.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | affects-tests, compat56, compat57, regression, upgrade |
| Description |
|
10.3 server crashes with a non-debug assertion failure when it starts on the attached datadir.
Then the current 10.3 server is started on the same datadir. It starts, but crashes immediately afterwards as below.
The current test was run with --innodb-page-size=8K --loose-innodb_log_compressed_pages=on --loose-innodb-change-buffering=none; I'm not sure whether any of them are important. Naturally, to reproduce the crash on the attached datadir, the server also needs to be started with --innodb-page-size=8K; the other two options don't make a difference. Otherwise all defaults. The ib_logfile-s are compressed and attached separately just to overcome the 10M limitation in JIRA. I don't know if they are needed; the crash happens with and without them. Similar-looking crashes upon upgrade from 10.1 have also been observed before. |
| Comments |
| Comment by Ben Anderson [ 2018-07-25 ] |
|
We're also seeing this issue under similar circumstances. We've got a 10.0 data dir which should have been shut down cleanly but may not have been in certain situations. Installing mariadb-server 10.3 and running mysql_upgrade works fine, and the server will boot. If the server runs long enough or a certain action is performed (I'm not sure which, I don't know enough), a purge will start. This will cause the crash. With some help from `dragonheart` on IRC, I installed debug symbols and gdb'd. Here is some clumsy GDB info that may help.
I then did some reading around and noticed we had missed a step in the MariaDB upgrade notes: for one of the versions (I've lost the page now, but it might have been 10.0 -> 10.1) they recommend doing a shutdown with innodb_fast_shutdown=0 before doing the upgrade. I tried this, booted back up with 10.3, ran mysql_upgrade and then booted 10.3, left it running under some (very minor) workloads, and it appears to be stable. So this is a case of user error, but I don't think MariaDB should be crashing quite this badly in this case. I'm keen to help debug further if it means a better error message or a handled failure. I've watched this thread, in case anyone needs more info. |
| Comment by Marko Mäkelä [ 2018-07-25 ] |
|
ben.anderson, MariaDB upgrade is supposed to work even if the undo logs are not empty. Tests for that were implemented in
In MariaDB 10.3, there are two major changes to the undo log format.
There was also a minor change to the undo log format, in
Note: A shutdown after SET GLOBAL innodb_fast_shutdown=0 may fail to empty some undo logs. This was fixed in MariaDB 10.3.6 only: |
| Comment by Marko Mäkelä [ 2018-07-25 ] |
|
In the mdev15912_data.bgz
If I start with innodb_force_recovery=3, then the server will not crash. But this will prevent not only the rollback of recovered transactions but also the purge of old history (which is where the assertion would fail). With a source code modification we can disable only the rollback. This will let the server start up successfully:
So, the failure must be somehow related to the rollback of the recovered transaction. |
| Comment by Marko Mäkelä [ 2018-07-25 ] |
|
ben.anderson reported:
Also in mdev15912_data.bgz
The two-component last_commit field (trx->no << 1 | !old_insert) was introduced in |
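For readers unfamiliar with this kind of packed field, here is a minimal sketch of how a value of the form (trx->no << 1 | !old_insert) packs and orders. Only that expression comes from the comment above; the helper names below are hypothetical and are not taken from the InnoDB source, and the real field may be handled differently.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not the InnoDB API.
// Packs a transaction number and an old_insert flag into one integer.
constexpr std::uint64_t pack_commit(std::uint64_t trx_no, bool old_insert)
{
  // << binds tighter than |, so this is (trx_no << 1) | !old_insert.
  // The low bit is 0 for an old insert_undo log and 1 otherwise.
  return trx_no << 1 | static_cast<std::uint64_t>(!old_insert);
}

// Recovers the transaction number from the packed value.
constexpr std::uint64_t commit_trx_no(std::uint64_t packed)
{
  return packed >> 1;
}

// Recovers the old_insert flag from the packed value.
constexpr bool commit_is_old_insert(std::uint64_t packed)
{
  return (packed & 1) == 0;
}

// For the same transaction number, the old_insert entry sorts first.
static_assert(pack_commit(5, true) < pack_commit(5, false),
              "old_insert entry precedes the other entry of the same trx");
// Entries of a later transaction always sort after both entries of an earlier one.
static_assert(pack_commit(5, false) < pack_commit(6, true),
              "ordering is dominated by the transaction number");
```

Under this encoding a single integer comparison, such as the one in the failing assertion, compares the transaction number first and uses the flag only as a tie-breaker.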
| Comment by Marko Mäkelä [ 2018-07-25 ] |
|
The last hunk is not emitting any message, and the dataset is still crashing. (Note: it is necessary to restore the data directory from mdev15912_data.bgz |
| Comment by Marko Mäkelä [ 2018-08-31 ] |
|
We may have to introduce trx_rseg_t::old_insert_cached and some separate mechanism that will guarantee that old insert_undo pages from before the upgrade to 10.3 (or later) will eventually be freed, without interfering with the purge of transaction history. |
| Comment by Marko Mäkelä [ 2021-05-25 ] |
|
As described in
It might be possible to remove the old_insert data members altogether. |
| Comment by Marko Mäkelä [ 2021-06-21 ] |
|
We will refuse an upgrade from older versions than MariaDB 10.3 if a clean shutdown was not performed, so that the undo logs would be emptied:
There will be only one persistent undo log per transaction. No main-memory data structures related to the partitioned persistent undo log before |
| Comment by Hans Borresen [ 2021-06-21 ] |
|
It looks like this just got pulled into the 10.3 branch, but it breaks upgrades for anyone trying to go from 5.7 -> 10.3.30, or even 5.7 -> 10.3.29 -> 10.3.30. I started a topic for this in the Zulip chat. Log snippet, just in case it is helpful:
Edit: I went ahead and filed |
| Comment by Marko Mäkelä [ 2021-06-23 ] |
|
hborresen, thank you. I should have actually tested the upgrade. As I expected, fixing |