[MDEV-20139] innodb.innodb_buffer_pool_dump_pct failed in buildbot with timeout in wait_condition.inc Created: 2019-07-23 Updated: 2023-12-12 Resolved: 2023-12-12 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB, Tests |
| Affects Version/s: | 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1, 11.2 |
| Fix Version/s: | 10.4.33, 10.5.24, 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Description |
|
http://buildbot.askmonty.org/buildbot/builders/kvm-bintar-xenial-amd64/builds/1311
|
| Comments |
| Comment by Alice Sherepa [ 2021-06-24 ] |
|
on 10.6 101da87228f11a1d http://buildbot.askmonty.org/buildbot/builders/win32-debug/builds/19914/steps/test/logs/stdio win32
| Comment by Marko Mäkelä [ 2023-11-29 ] |
|
This test still fails regularly, several times per month. A simple fix would be to not use a separate thread for the buffer pool resizing, but simply have the thread that executes SET GLOBAL innodb_buffer_pool_size resize the buffer pool itself and return when done. ralf.gebhardt, would that be an acceptable change of behaviour in GA releases? |
| Comment by Ralf Gebhardt [ 2023-12-11 ] |
|
If this is a way to fix the bug without breaking compatibility, why not? Is there anything special about this bug fix compared to others? Are there any user-visible changes that would require changes in an application? |
| Comment by Marko Mäkelä [ 2023-12-12 ] |
|
Sorry, I meant that SET GLOBAL innodb_buffer_pool_dump_now=ON could block until the operation has been completed. There is some similar logic around SET GLOBAL innodb_buffer_pool_size as well. Given that the failures are rather rare and possibly occur when the CI workers are under heavy load, I suspect that the failure probability could be reduced by the following:
That is, under some circumstances the time stamp in the status message (measured in seconds) does not advance even though the client attempted to sleep for 1 second. The sleep statement is implemented by do_sleep(), which invokes my_sleep(). That function in turn invokes the Microsoft Windows Sleep() or POSIX select(2), ignoring any errors. On GNU/Linux, the timeout interval will be rounded up to the system clock granularity, so presumably the sleep should not be any shorter than requested. However, EINTR is documented as a possible error, and an interrupted call would return early. On Microsoft Windows (where this test has failed as well), it is documented that the actual sleep duration may be less than the requested delay. I am not convinced that increasing the delay is a good approach. I think that it is better to first wait until the file ib_buffer_pool has been created, and then wait until the status message indicates completion. The actual ‘interesting part’ of the test is to show that the size of buffer pool dumps depends on the parameter innodb_buffer_pool_dump_pct. |
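The two-stage wait described above (first the file, then the status message) can be illustrated outside the test framework. The following is only a minimal Python sketch of the polling idea, not the actual mysql-test change: `wait_until`, `wait_for_dump`, the `read_status` callback and the `"completed"` substring check are all hypothetical names and assumptions made for illustration. It measures its deadline with a monotonic clock, so it does not rely on any single sleep lasting as long as requested.

```python
import os
import tempfile
import time


def wait_until(condition, timeout=30.0, interval=0.01):
    """Poll condition() until it returns true or `timeout` seconds elapse.

    The deadline uses a monotonic clock, so a sleep that returns early
    (for example after a signal) only causes one extra poll.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def wait_for_dump(dump_path, read_status, timeout=30.0):
    """Wait for the dump file to exist, then for a completion status.

    `read_status` is a hypothetical callback returning the current status
    text; matching on the substring "completed" is an assumption.
    """
    if not wait_until(lambda: os.path.exists(dump_path), timeout):
        return False
    return wait_until(lambda: "completed" in read_status(), timeout)
```

Because each stage checks an observable condition rather than comparing time stamps with one-second resolution, the outcome does not depend on how long an individual sleep actually lasted.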
| Comment by Marko Mäkelä [ 2023-12-12 ] |
|
The test got more than 100× faster on Microsoft Windows Debug:
On my Debian GNU/Linux system, the test would at best complete in 14 or 19 milliseconds, for RelWithDebInfo and Debug builds respectively. |