Details
Bug
Status: Closed
Major
Resolution: Not a Bug
12.2
debian sid
Not for Release Notes
Description
More perf data files are attached to the closed issue https://jira.mariadb.org/browse/MDEV-36035
Possibly related: https://jira.mariadb.org/browse/MDEV-32067
Performance Overhead Analysis: Functions Related to Folio in Kernel (Top 10)
Overview
This report analyzes the CPU overhead of kernel functions related to folio operations. It is based on perf sampling data for the one_connection command; each percentage is the share of samples attributed to that symbol. The focus is on kernel memory-management paths: page-fault handling for anonymous memory, LRU list maintenance, and folio reference counting. Knowing where these samples go can help narrow down memory-management overhead and resource contention.
Top 10 Folio-Related Functions by CPU Time Overhead:
Percentage  Command         Shared Object      Symbol
7.24%       one_connection  [kernel.kallsyms]  folio_add_new_anon_rmap
4.95%       one_connection  [kernel.kallsyms]  folio_add_lru
3.15%       one_connection  [kernel.kallsyms]  __folio_batch_add_and_move
2.20%       one_connection  [kernel.kallsyms]  folio_mark_accessed
1.21%       one_connection  [kernel.kallsyms]  folios_put_refs
1.01%       one_connection  [kernel.kallsyms]  folio_batch_move_lru
0.70%       one_connection  [kernel.kallsyms]  free_unref_folios
0.66%       one_connection  [kernel.kallsyms]  page_counter_try_charge
0.32%       one_connection  [kernel.kallsyms]  uncharge_folio
0.19%       one_connection  [kernel.kallsyms]  vma_alloc_folio_noprof
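To see how much these ten symbols add up to, the rows above can be parsed and summed. The following is a minimal sketch, not part of the original report: the names PERF_ROWS, parse_rows and total_overhead are made up for illustration, and the regular expression assumes the four-column layout shown in the table.

```python
import re

# The ten rows from the table above, copied verbatim.
PERF_ROWS = """\
7.24% one_connection [kernel.kallsyms] folio_add_new_anon_rmap
4.95% one_connection [kernel.kallsyms] folio_add_lru
3.15% one_connection [kernel.kallsyms] __folio_batch_add_and_move
2.20% one_connection [kernel.kallsyms] folio_mark_accessed
1.21% one_connection [kernel.kallsyms] folios_put_refs
1.01% one_connection [kernel.kallsyms] folio_batch_move_lru
0.70% one_connection [kernel.kallsyms] free_unref_folios
0.66% one_connection [kernel.kallsyms] page_counter_try_charge
0.32% one_connection [kernel.kallsyms] uncharge_folio
0.19% one_connection [kernel.kallsyms] vma_alloc_folio_noprof
"""

# Assumed layout: "<pct>% <command> <shared object> <symbol>".
ROW_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s+(\S+)\s*$")


def parse_rows(text):
    """Return (percent, command, shared_object, symbol) tuples for matching lines."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match:
            pct, command, dso, symbol = match.groups()
            rows.append((float(pct), command, dso, symbol))
    return rows


def total_overhead(rows):
    """Sum the percentage column."""
    return sum(row[0] for row in rows)


rows = parse_rows(PERF_ROWS)
print(f"{len(rows)} symbols, {total_overhead(rows):.2f}% of samples")
```

For the rows listed, the sum comes to about 21.63% of samples.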
Detailed Analysis of Top 5 Folio-Related Functions
folio_add_new_anon_rmap (7.24%)
Function Overview: This function sets up the reverse mapping (rmap) for a newly allocated anonymous folio, so the kernel can later find the page tables that map it. It runs in the page-fault path when anonymous memory is touched for the first time.
Impact: As the largest single entry, it suggests that faulting in new anonymous pages consumes a noticeable share of CPU. This would be consistent with frequent allocation and first touch of fresh memory, although the perf data alone does not show which allocations are responsible.
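The kind of activity that reaches this code path can be illustrated from user space: touching each page of a fresh private anonymous mapping causes a page fault on first access. The sketch below is illustrative only and assumes a Unix system; the function name minor_faults_for_touch and the 16 MiB size are arbitrary choices, and the fault count will be lower than the page count if transparent huge pages are in use.

```python
import mmap
import resource


def minor_faults_for_touch(size_bytes):
    """Map private anonymous memory, write one byte per page, return the minor-fault delta."""
    before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    buf = mmap.mmap(
        -1,
        size_bytes,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    try:
        # First write to each page forces the kernel to allocate and map it.
        for offset in range(0, size_bytes, mmap.PAGESIZE):
            buf[offset] = 1
    finally:
        buf.close()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    return after - before


size = 16 * 1024 * 1024  # arbitrary example size
faults = minor_faults_for_touch(size)
print(f"touched {size // mmap.PAGESIZE} pages, {faults} minor faults")
```

Running this under perf while it loops would be one way to check whether the same kernel symbols appear, independent of the database workload.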
folio_add_lru (4.95%)
Function Overview: Queues a folio for insertion into a Least Recently Used (LRU) list, which the kernel uses to decide what to reclaim. Newly faulted-in pages pass through this function.
Impact: LRU insertion is a routine part of bringing a page into use, so this overhead most likely tracks the rate at which new pages are faulted in. It does not by itself show that the system is under memory pressure.
__folio_batch_add_and_move (3.15%)
Function Overview: Adds a folio to a per-CPU batch and, once the batch is full, moves the whole batch onto the LRU list. Batching exists to reduce how often the LRU lock is taken.
Impact: Batching is meant to lower cost, not raise it, so the time seen here is more likely a consequence of the sheer number of folios passing through than of batching itself. Any lock contention during the batch flush would also be charged to this path.
folio_mark_accessed (2.20%)
Function Overview: Marks a folio as recently accessed, which promotes it through the referenced and active states used for LRU aging and reclaim decisions.
Impact: At 2.20% the cost is modest. It reflects the bookkeeping the kernel does to track page access, and it scales with how often pages are touched through paths that call it.
folios_put_refs (1.21%)
Function Overview: Drops references on a batch of folios and frees those whose reference count reaches zero.
Impact: Its presence suggests that folios are being released frequently, which is consistent with short-lived allocations: memory that is mapped, touched, and unmapped again in quick succession.
Other Notable Functions
folio_batch_move_lru (1.01%): Drains a batch of folios onto the LRU list under the LRU lock. It is the flush step for the per-CPU batches and runs on ordinary allocation paths as well as during reclaim.
free_unref_folios (0.70%): Returns a batch of folios with no remaining references to the page allocator. Frequent calls indicate high memory churn.
page_counter_try_charge (0.66%): Charges pages against the memory cgroup page counters, walking up the cgroup hierarchy. It is part of memory accounting rather than folio handling proper.
uncharge_folio (0.32%): Uncharges a folio from its memory cgroup when the folio is freed, which is the counterpart of the charge above.
vma_alloc_folio_noprof (0.19%): Allocates a folio for a virtual memory area. The _noprof suffix refers to the kernel's memory allocation profiling wrappers, not to perf profiling.
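Grouping the ten symbols by what they do makes the split between fault handling, LRU maintenance, freeing, and cgroup accounting easier to see. The grouping below is an assumption made for illustration and is not taken from the perf output; the percentages are the ones listed in the table.

```python
# Percentages as reported in the table.
OVERHEAD = {
    "folio_add_new_anon_rmap": 7.24,
    "folio_add_lru": 4.95,
    "__folio_batch_add_and_move": 3.15,
    "folio_mark_accessed": 2.20,
    "folios_put_refs": 1.21,
    "folio_batch_move_lru": 1.01,
    "free_unref_folios": 0.70,
    "page_counter_try_charge": 0.66,
    "uncharge_folio": 0.32,
    "vma_alloc_folio_noprof": 0.19,
}

# Assumed grouping, chosen by hand for this sketch.
GROUPS = {
    "fault path": ["folio_add_new_anon_rmap", "vma_alloc_folio_noprof"],
    "LRU": [
        "folio_add_lru",
        "__folio_batch_add_and_move",
        "folio_mark_accessed",
        "folio_batch_move_lru",
    ],
    "free": ["folios_put_refs", "free_unref_folios"],
    "memcg": ["page_counter_try_charge", "uncharge_folio"],
}


def group_totals(overhead, groups):
    """Sum the per-symbol percentages for each group, rounded to two decimals."""
    return {
        name: round(sum(overhead[symbol] for symbol in symbols), 2)
        for name, symbols in groups.items()
    }


totals = group_totals(OVERHEAD, GROUPS)
for name, pct in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<10} {pct:5.2f}%")
```

Under this grouping, LRU maintenance is the largest bucket at 11.31%, followed by the fault path at 7.43%.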
Conclusion and Recommendations
The analysis shows that the kernel's folio management functions account for a noticeable share of CPU time on this system: the ten functions listed add up to about 21.6% of samples. Setting up mappings for new anonymous pages, LRU management, and folio reference counting take most of that time. Here are some potential next steps:
Optimization of Memory Management: The workload interacts frequently with LRU lists and folio reference counts. Reducing how often this happens, for example by changing memory allocation patterns so that memory is reused instead of being returned to the kernel, or by adjusting kernel memory settings, could lower the overhead.
Reducing Folio Allocation/Deallocation Churn: The presence of both folio_add_new_anon_rmap and free_unref_folios suggests that pages are being faulted in and freed repeatedly. Reviewing how memory is allocated and released may help, particularly under heavy load.
Profiling and Further Optimization: Further profiling with call graphs is recommended to identify which user-space code paths trigger these kernel functions. The data here shows where the time goes in the kernel but not which allocations cause it.
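One inexpensive follow-up measurement is the system-wide page-fault rate while the workload runs, taken from the pgfault counter in /proc/vmstat. The following is a sketch under the assumption of a Linux system; the helper names parse_vmstat, fault_rate and sample_fault_rate are hypothetical, and the counter covers the whole machine, not only the one_connection threads.

```python
import time


def parse_vmstat(text):
    """Parse '<name> <value>' lines, as found in /proc/vmstat, into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def fault_rate(before, after, seconds):
    """Page faults per second between two counter snapshots."""
    return (after["pgfault"] - before["pgfault"]) / seconds


def sample_fault_rate(interval=1.0, path="/proc/vmstat"):
    """Read the counters twice, 'interval' seconds apart, and return faults per second."""
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return fault_rate(before, after, interval)
```

Calling sample_fault_rate() during a sysbench run and again on an idle system would give a rough idea of how many faults the workload generates per second.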
This report aims to give insights into potential areas for kernel optimization and guide future performance tuning efforts.
Issue Links
blocks
MDEV-36035 sysbench read performance regression by over 20 times against 4.12.2024 git master (Open)