[MDEV-12213] NUMA support Created: 2017-03-09 Updated: 2023-09-19 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major |
| Reporter: | Daniel Black | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | gsoc17 | ||
| Issue Links: |
|
||||||||
| Description |
|
NUMA hardware is becoming more common. Accessing RAM that is not local to a CPU's node is more expensive than accessing local RAM. MariaDB should implement mechanisms to optimize the workload so that the CPUs of a node keep accessing their local memory.
Example NUMA architecture:
Components of the implementation include:
(Marko, Jan, et al. please edit with important design/implementation details) I'm willing to mentor this (with help). |
| Comments |
| Comment by Sergei Golubchik [ 2017-03-09 ] |
|
Thanks! This sounds interesting and useful, a good project. A couple of thoughts:
|
| Comment by Daniel Black [ 2017-03-10 ] |
|
Implementation plan from a configuration point of view.
Best guesses so far:
Unsure how to handle
Threads generally - will have sched_setaffinity/SetThreadAffinityMask set to the cpu set corresponding to the numa node.
NUMA implementation will be abstracted and support Windows equivalent NUMA functions - https://msdn.microsoft.com/en-us/library/windows/desktop/aa363804(v=vs.85).aspx
Eventually - persistent table of mappings
Out of scope:
|
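The thread-affinity step from the plan above (sched_setaffinity with the CPU set of a NUMA node) might look roughly like the following on Linux. This is only a sketch: the helper names bind_current_thread_to_cpus and current_thread_cpus are made up for illustration, and the list of CPUs belonging to a node is assumed to come from elsewhere (for example from libnuma or sysfs). The Windows path via SetThreadAffinityMask is not shown.

```cpp
// Sketch only (Linux): pin the calling thread to a given set of CPUs.
// Helper names are hypothetical, not existing MariaDB functions.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>

#include <cassert>
#include <vector>

// Restrict the calling thread to the CPUs listed in `cpus`
// (assumed to be the CPUs of one NUMA node). Returns true on success.
bool bind_current_thread_to_cpus(const std::vector<int>& cpus)
{
  assert(!cpus.empty());
  cpu_set_t set;
  CPU_ZERO(&set);
  for (int cpu : cpus)
  {
    if (cpu < 0 || cpu >= CPU_SETSIZE)
      return false;               // CPU number does not fit in a cpu_set_t
    CPU_SET(cpu, &set);
  }
  // A pid of 0 means "the calling thread".
  return sched_setaffinity(0, sizeof(set), &set) == 0;
}

// Return the CPUs the calling thread is currently allowed to run on
// (empty on error).
std::vector<int> current_thread_cpus()
{
  std::vector<int> cpus;
  cpu_set_t set;
  CPU_ZERO(&set);
  if (sched_getaffinity(0, sizeof(set), &set) != 0)
    return cpus;
  for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
    if (CPU_ISSET(cpu, &set))
      cpus.push_back(cpu);
  return cpus;
}
```

A worker thread would call bind_current_thread_to_cpus() once at startup with the CPU list of the node it is assigned to. Binding only controls where the thread runs; memory placement would have to be handled separately.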
| Comment by Sergey Vojtovich [ 2017-05-05 ] |
|
Eventually we'll have to bind table cache instances and MDL instances (not yet implemented) to NUMA nodes. As well as PFS counters and some status variables. Please keep this in mind. |
| Comment by Daniel Black [ 2017-05-05 ] |
|
Thanks svoj. All tips gratefully received. GSOC approved with Sumit Lakra as the student. Mentors: jplindst and me. |
| Comment by Daniel Black [ 2017-05-29 ] |
|
Tip from irc worthy of consideration at some stage: memory engine is a good candidate |
| Comment by Daniel Black [ 2021-02-01 ] |
| Comment by Marko Mäkelä [ 2021-02-16 ] |
|
innodb_mtflush_threads and its replacement were removed, and in I think that it would be very challenging to make all users of the buffer pool aware of NUMA (say, actively migrate execution threads to the NUMA node that owns most of the data that is likely to be addressed). I wonder if it could make sense to partition the buf_pool.page_hash in such a way that pages would be mapped to NUMA nodes by some simple formula like page_id.raw()%N_NUMA. All entries of a buf_pool_numa[i].page_hash would point to buffer pool block descriptors and blocks that reside in that NUMA node. I think that we should keep a global buf_pool.LRU and buf_pool.flush_list in any case. |
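To illustrate the page_id.raw()%N_NUMA idea from the comment above, here is a minimal standalone sketch. None of these types are the real InnoDB ones: page_id, block_descriptor and numa_partitioned_page_hash are hypothetical stand-ins, the layout of raw() (tablespace id in the high 32 bits, page number in the low 32 bits) is an assumption, and std::unordered_map merely takes the place of a per-node page_hash. No actual NUMA memory placement happens here.

```cpp
// Sketch only: partition a page hash by NUMA node using raw() % N_NUMA.
// All types are hypothetical stand-ins, not InnoDB code.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Stand-in page identifier; the bit layout of raw() is an assumption.
struct page_id
{
  uint32_t space;
  uint32_t page_no;
  uint64_t raw() const { return uint64_t{space} << 32 | page_no; }
};

// Placeholder for a buffer pool block descriptor; it only records the
// node it is supposed to reside on.
struct block_descriptor
{
  int numa_node;
};

class numa_partitioned_page_hash
{
public:
  explicit numa_partitioned_page_hash(size_t n_numa) : parts_(n_numa)
  {
    assert(n_numa > 0);
  }

  // The "simple formula": which partition (NUMA node) owns this page.
  size_t node_of(page_id id) const { return id.raw() % parts_.size(); }

  // Register a page in the partition of its owning node.
  void insert(page_id id)
  {
    const size_t node = node_of(id);
    parts_[node][id.raw()] = block_descriptor{static_cast<int>(node)};
  }

  // Look the page up in its owning partition only; nullptr if absent.
  const block_descriptor* find(page_id id) const
  {
    const auto& part = parts_[node_of(id)];
    const auto it = part.find(id.raw());
    return it == part.end() ? nullptr : &it->second;
  }

private:
  // One hash table per NUMA node, like buf_pool_numa[i].page_hash.
  std::vector<std::unordered_map<uint64_t, block_descriptor>> parts_;
};
```

With the layout assumed here, a power-of-two node count means that only the page number decides the partition, so consecutive pages of a tablespace rotate across the nodes.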