[MDEV-30134] buf_page_t::unfix(): Assertion `!((f ^ (f - 1)) & LRU_MASK)' failed Created: 2022-11-30  Updated: 2023-02-21  Resolved: 2023-02-16

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6, 10.7, 10.8, 10.9, 10.10, 10.11
Fix Version/s: 10.11.3, 11.0.1, 10.6.13, 10.7.8, 10.8.8, 10.9.6, 10.10.4

Type: Bug Priority: Critical
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: crash

Attachments: File TBR-1258.opt     File TBR-1258.result     File TBR-1258.test    
Issue Links:
Relates
relates to MDEV-26917 InnoDB: Clustered record for sec rec ... Closed
relates to MDEV-27058 Buffer page descriptors are too large Closed
relates to MDEV-27734 Set innodb_change_buffering=none by d... Closed
relates to MDEV-27735 Deprecate the parameter innodb_change... Closed
relates to MDEV-29694 Remove the InnoDB change buffer Closed

 Description   

The attached test case provided by mleich rather easily reproduces an assertion failure related to the change buffer, on all server versions 10.6 through 10.11.

The involved code was refactored as part of MDEV-27058, but it is not yet clear whether this failure is a regression starting with 10.6.

The change buffer was disabled by default in MDEV-27734, deprecated in MDEV-27735, and it is scheduled for removal in MDEV-29694. With MDEV-29694 present, the test will not crash.

The assertion fails with various stack traces, related to buffering purge operations (not inserts or delete-mark operations).



 Comments   
Comment by Marko Mäkelä [ 2023-01-20 ]

This can sometimes be reproduced with

./mtr --mysqld=--loose-innodb-change-buffering{=all,-debug=1} innodb.innodb_defragment

Comment by Marko Mäkelä [ 2023-01-23 ]

This is also occasionally reproducible with

./mtr --mysqld=--loose-innodb-change-buffering{=all,-debug=1} innodb.ibuf_delete

Comment by Marko Mäkelä [ 2023-02-09 ]

I just tried to reproduce this on 10.6 39f46745995939e17678d3c2f030f625d5bc41c2 (one commit before MDEV-30400), but have failed so far. The only thing that I reproduced was a server hang due to a bug in innodb_change_buffering_debug=1 that was fixed in MDEV-30400.

Comment by Marko Mäkelä [ 2023-02-10 ]

Because mleich informed me that he last reproduced this on a development branch of MDEV-30148 on December 2, I created a fix based on its 10.6 parent commit for testing. I was not able to reproduce the failure myself today, either with that supposed fix or its 10.6 parent commit.

As far as I can tell, this is a bug specific to the debug-only parameter innodb_change_buffering_debug=1. However, there is some room for simplifying some code around this. To my understanding, a similar bug affects 10.5 as well, but the assertion expression would be count != 0 before this data structure was refactored in MDEV-27058.

It might be much harder to reproduce the failure in 10.5 or older versions. While I developed a 10.5 version of the fix, I don’t think that it is feasible to apply it, if this really only affects a debug parameter, and if we are unable to reproduce the failure in the first place. Starting with 10.6, MDEV-21452, MDEV-24142 and similar changes could dramatically change some timing around this.
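
For readers unfamiliar with the assertion in the title, the following is a minimal sketch of what the expression !((f ^ (f - 1)) & LRU_MASK) appears to check, and why it would correspond to count != 0 in the older layout. It assumes that the 32-bit state word keeps the page state in its top 3 bits and the buffer-fix count in the low 29 bits; the mask value, the toy_page type and the helper unfix_is_safe() are illustrative assumptions, not the actual InnoDB code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// ASSUMPTION for illustration: top 3 bits = page state, low 29 bits = fix count.
constexpr uint32_t LRU_MASK = 7U << 29;

// Hypothetical helper. f is the word *before* the decrement.
// f ^ (f - 1) sets every bit up to and including the lowest set bit of f.
// If the count in the low bits is non-zero, that lowest bit lies below
// LRU_MASK and the result is true. If the count is zero, the decrement
// would borrow from the state bits and the result is false.
constexpr bool unfix_is_safe(uint32_t f)
{
  return !((f ^ (f - 1)) & LRU_MASK);
}

// Toy stand-in for a page descriptor; NOT the real buf_page_t.
struct toy_page
{
  std::atomic<uint32_t> state;

  explicit toy_page(uint32_t s) : state(s) {}

  // Take one reference on the page.
  void fix() { state.fetch_add(1, std::memory_order_relaxed); }

  // Release one reference. Returns false where a debug build would
  // fail the assertion (more unfix() calls than fix() calls).
  bool unfix()
  {
    const uint32_t f = state.fetch_sub(1, std::memory_order_release);
    return unfix_is_safe(f);
  }
};
```

Under these assumptions, a toy_page constructed with state 1U << 29 accepts one unfix() after one fix(), while a second unfix() without a matching fix() reports failure, which is the unbalanced-release condition that the assertion would catch.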
