[MDEV-32544] Setting innodb_buffer_pool_size to the maximum value can cause drastic performance degradation Created: 2023-10-23 Updated: 2023-11-22 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.4.32, 10.5.23, 10.6.16, 10.11.5 |
| Fix Version/s: | 10.4, 10.5, 10.6, 10.11 |
| Type: | Bug | Priority: | Major |
| Reporter: | fuying | Assignee: | Daniel Black |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
docker mariadb:10.11 |
||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Description |
|
docker run --restart=always --name mariadb1011enginevalidate -e MYSQL_ROOT_PASSWORD=123456 -d mariadb:10.11 > create database test; When executing source select.sql normally, you will get that the empty set takes about 0.002s. https://mariadb.com/kb/en/innodb-system-variables/#innodb_buffer_pool_size |
| Comments |
| Comment by Marko Mäkelä [ 2023-10-23 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
It would seem to me that the maximum allowed value of innodb_buffer_pool_size on 64-bit systems is too large. I believe that the virtual address size on typical AMD64 or ARMv8 implementations is only 48 bits (256 TiB). I do not know of other architectures, such as POWER, s390x or RISC-V. I think that the buffer pool resizing will have to be refactored as part of MDEV-29445. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-10-23 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Were any messages written to the server error log about the buffer pool resizing? I would expect the virtual memory allocation to fail as soon as the virtual memory size of the process approaches 256 TiB. A design constraint is that innodb_buffer_pool_size should not exceed about 80% of the available physical memory. This is not being enforced. The slowness could simply be because the system started to swap memory. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by fuying [ 2023-10-23 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
It seems that no server error log... | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2023-11-20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
danblack, where should be an error log be when running a docker image (as in the bug description)? | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2023-11-20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The server default is stderr, so that ends up in the container log for the container retrievable with:
ying you've changed your init_tables.sql file in the collations so there are invalid collations there.
I tried the following (but the last is wrong for charset binary)
In addition to what's already been said, the TLB of CPUs only support having certain amount of mapping inside the MMU of the CPU. Where more virtual space than the CPU supports has been allocated, any access to that memory, hits a memory fault, the context switch goes to the kernel and a memory mapping looking on RAM occurs. Taken to the extreme this access via RAM being so many cycles slower than MMU accessing lookups could add up to the additional time. You could look as some perf measurements on TLB counters and context switchers while doing the SELECT in both cases. It would still be interesting to see where CPU time is spent. Can you please update the init_table.sql file to something that can be executed. Also related to buffer chunk size, this is based on the initial innodb-buffer-pool size. The larger the pool, the larger the chunksize (for quicker resizing). This might change the execution. If you start the container with command --innodb_buffer_pool_size=9223372036854775807 (after the image name on the docker run execution). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by fuying [ 2023-11-21 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thank you both very much for your replies! Here are my creation steps After creating the initial database state, I tested and still have this problem: >SET GLOBAL innodb_buffer_pool_size = 9223372036854775807; | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2023-11-21 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Much easier thanks. The logs from SET GLOBAL innodb_buffer_pool_size = 9223372036854775807
Do it again
and:
So the reason the query takes so long is the 60 seconds on retry in which there's buffer pool lock on the innodb buffer pool while resizing to prevent the query. However, there's obviously a bunch of contradictory state about the the resize failing "Cannot allocate", "Completed", but "Resizing failed". | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2023-11-21 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
10.4 doesn't delay on the the select as it doesn't keep trying. However the output message around "Completed to resize" needs correction..
Try behaviour starts at 10.5:
so its just that 10.11 gets nicer units and avoids negative sizes on memory. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2023-11-21 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
So:
10.5:
10.11:
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
danblack, thank you, these suggestions sound reasonable to me. I agree that the ut_allocator::allocate() retry logic does not make any sense when allocating the buffer pool. If the small allocation of a buffer pool chunk fails, then the large allocation of the additional buffer pool memory is very likely to fail as well. This code would be rewritten in MDEV-29445, where I would like to move to a single contiguous range of virtual memory addresses for the buffer pool. |