[MDEV-31953] madvise(frame, srv_page_size, MADV_FREE) is causing a performance regression Created: 2023-08-18 Updated: 2023-11-20 Resolved: 2023-11-18 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.11, 11.0, 11.1, 11.2, 11.3 |
| Fix Version/s: | 10.11.7, 11.0.5, 11.1.4, 11.2.3 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Marko Mäkelä | Assignee: | Daniel Black |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | performance, regression | ||
| Environment: |
GNU/Linux, various multi socket Intel CPU systems (NUMA) |
||
| Attachments: |
|
||||||||||||||||
| Issue Links: |
|
||||||||||||||||
| Description |
|
In If the call in buf_LRU_block_free_non_file_page() is committed out, the throughput in an I/O bound HammerDB benchmark would improve by 6% to 18%, depending on the CPU microarchitecture. It could be best to revert this code for now. A better solution might be doable as part of a larger refactoring (MDEV-25340, MDEV-29445). Instead of invoking madvise(MADV_FREE) on individual buffer pages, we might invoke it on larger chunks at a time. There could also be an explicit SQL statement for defragmenting or freeing the buffer pool. |
| Comments |
| Comment by Marko Mäkelä [ 2023-09-04 ] | ||||||||||||||||||
|
axel has confirmed a regression that the following patch would address:
| ||||||||||||||||||
| Comment by Axel Schwenke [ 2023-09-12 ] | ||||||||||||||||||
|
I tested the patch with TPC-C workload and 3 different sets of UNDO log specific settings. Baseline was MariaDB 10.11 @ commit 725bd568346 no undo log settings
innodb_max_purge_lag = 500000
innodb_undo_log_truncate = ON
So it was faster with the patch in any case. The UNDO log growth was not affected by this patch, nor was the general "jumpiness" of the troughput-over-time. For another workload, INSERT w/ AUTOCOMMIT the patch showed no effect. | ||||||||||||||||||
| Comment by Axel Schwenke [ 2023-09-12 ] | ||||||||||||||||||
|
The time series for the second scenario (where the performance difference was greatest): | ||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-09-15 ] | ||||||||||||||||||
|
https://github.com/MariaDB/server/commit/c1fd082e9c7369f4511eb5a52e58cb15489caa74 | ||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-09-15 ] | ||||||||||||||||||
|
I think that in any case, it could be useful to have some interface to ask InnoDB to collect any garbage from the buffer pool. Garbage would be left behind in particular by DDL operations. The thinking is that the LRU replacement mechanism will reuse those garbage pages later. This "laziness" allows operations like DROP, TRUNCATE or any table rebuild (OPTIMIZE, ALTER TABLE) to finish faster. But, it generally leads to "InnoDB buffer pool cannot shrink", which Possibly, we can reuse existing user interface for this:
That is, when you execute a "null" buffer pool resizing, InnoDB would scan the buf_pool.LRU list, evict any pages that belong to not-longer-existing tablespaces, and invoke madvise(MADV_FREE) or its Microsoft Windows equivalent on all page frame addresses that got freed. To keep it simple, this could be similar to the current approach (individual calls for each page frame, even when they are adjacent). During normal operation, there would be no madvise(MADV_FREE) calls. A refinement of this could be to "defragment" the buffer pool and invoke madvise(MADV_FREE) on the larger areas that got freed. This might be easier to do after the MDEV-29445 refactoring. | ||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-10-23 ] | ||||||||||||||||||
|
Note that the logic of the added function buf_pool_t::garbage_collect() will need to be revised based on | ||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-10-23 ] | ||||||||||||||||||
|
Hi marko. Maybe in the released versions we just remove the madvise? | ||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-10-23 ] | ||||||||||||||||||
|
wlad, I would prefer not to introduce new parameters whenever it can be avoided, because it is hard or impossible to remove parameters when they become obsolete. We may consider adding a new interface for triggering this garbage collection later, once we have implemented MDEV-29445. It might become feasible to trigger the garbage collection automatically as well. | ||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-10-23 ] | ||||||||||||||||||
|
Well, until we figured out the interface , maybe we should not do that at all, and introduce it in when we know what we want to do with it. | ||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-10-24 ] | ||||||||||||||||||
|
wlad, are you proposing to essentially revert the new 10.11 feature I believe that SET GLOBAL already requires a privileged user. The new behaviour of setting the parameter to the old value can be documented. The old behaviour is not that lightweight either: it would always spawn buf_resize_task and wait for it to complete. | ||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-10-24 ] | ||||||||||||||||||
|
marko, revert without replacement, yep, that's the proposal. If we determine that a feature was determined to be not good enough, performance-wise or otherwise, why not? It does not have visible interface. The server returns to being as impolite as any other program. Maybe Linux kernel would learn to shrink process working set, instead of killing processes one day. | ||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-10-24 ] | ||||||||||||||||||
|
danblack suggested a better solution: Make Let us do that, and leave the logic around SET GLOBAL innodb_buffer_pool_size alone. This will miss the scheduled quarterly releases, but will hopefully be finished in the next quarter. | ||||||||||||||||||
| Comment by Daniel Black [ 2023-11-13 ] | ||||||||||||||||||
|
Needs advice on improving reliability of the test case, but otherwise complete. (Hope I didn't miss anything). | ||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-17 ] | ||||||||||||||||||
|
OK to push after addressing my review comment, which includes a more stable version of the test innodb.mem_pressure. |