MariaDB Server: MDEV-37557

Issues with new Buffer Pool Implementation in MariaDB Q2 Minors (10.11.12 and 11.4.6)


Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 11.8.0, 11.8.1, 10.11.12, 10.11.13, 10.11.14, 11.4.6, 11.4.7, 11.4.8, 11.8.2, 11.8.3, 11.8.4

    Description

      The new buffer pool implementation in MariaDB 10.11.12/13 and 11.4.6/7 exhibits inconsistent behavior across different operating systems, with the most critical issue being engine crashes during resizing operations under specific memory conditions.


      The MariaDB Q2 minor releases introduced a significant redesign of the buffer pool architecture, replacing the chunk-based system with a contiguous memory model. Three new parameters were added: innodb_buffer_pool_size_max, innodb_buffer_pool_size_auto_min, and innodb_log_checkpoint_now. Testing has revealed several issues with this implementation.

      Memory Allocation Behavior Differences

      This background information is referred to in the sections that follow.

      During initialization and resize operations, the engine uses fundamentally different approaches to memory allocation:

      • During initialization, the engine reserves a contiguous virtual memory address range without immediately committing physical memory.
      • During a resize, the engine commits additional memory from the previously reserved range. Assuming the new value is greater than the previous value, this immediately allocates (new value - previous value) bytes of physical memory.

      Critical Issue: Engine Crashes During Buffer Pool Resizing

      Scenario for Crash

      • MariaDB 11.4.7 installed
      • Buffer pool size (innodb_buffer_pool_size) = 10GB
      • Maximum buffer pool size (innodb_buffer_pool_size_max) = 15GB
      • Total system memory = 15GB

      First, we reduce the buffer pool to minimum size:

      SET GLOBAL innodb_buffer_pool_size=6M;
      

      Then we increase it back to 10GB:

      SET GLOBAL innodb_buffer_pool_size=10G;
      

      This works fine, and the engine commits 10GB of physical memory. The purpose of first shrinking the pool to the minimum was to force the engine to actually allocate the full 10GB, since physical memory is committed during the resize.

      Meanwhile, another process starts and consumes 4GB of memory.
      Now only 1GB of free system memory remains.

      We try to increase buffer pool size further:

      SET GLOBAL innodb_buffer_pool_size=12G;
      

      This is technically allowed, since the maximum is 15GB, but only 1GB of memory is free. Depending on the OS (and kernel version), the statement either crashes the server or fails with an error.

      Failure Example:

      MariaDB [(none)]> show global variables like "innodb_buffer_pool_size%";
      +----------------------------------+-------------+
      | Variable_name                    | Value       |
      +----------------------------------+-------------+
      | innodb_buffer_pool_size          | 134217728   |
      | innodb_buffer_pool_size_auto_min | 20501757952 |
      | innodb_buffer_pool_size_max      | 20501757952 |
      +----------------------------------+-------------+
      3 rows in set (0.001 sec)
       
      MariaDB [(none)]> set global innodb_buffer_pool_size=20501757952;
      ERROR 5 (HY000): Out of memory (Needed 3187671040 bytes)
      MariaDB [(none)]> set global innodb_buffer_pool_size=8501757952;
      Query OK, 0 rows affected, 1 warning (2.748 sec)
       
      $ hostnamectl
         Static hostname: xxxxxx.us-west-2.compute.internal
               Icon name: computer-vm
                 Chassis: vm
              Machine ID: xxxxxx
                 Boot ID: xxxxxx
          Virtualization: amazon
        Operating System: Amazon Linux 2
             CPE OS Name: cpe:2.3:o:amazon:amazon_linux:2
                  Kernel: Linux 4.14.355-276.618.amzn2.x86_64
            Architecture: x86-64
      

      Crash Example:

      MariaDB [(none)]> show global variables like "innodb_buffer_pool_size%";
      +----------------------------------+-------------+
      | Variable_name                    | Value       |
      +----------------------------------+-------------+
      | innodb_buffer_pool_size          | 134217728   |
      | innodb_buffer_pool_size_auto_min | 33646706688 |
      | innodb_buffer_pool_size_max      | 33646706688 |
      +----------------------------------+-------------+
      3 rows in set (0.001 sec)
       
      MariaDB [(none)]> set global innodb_buffer_pool_size=28672000000;
      ERROR 2026 (HY000): TLS/SSL error: unexpected eof while reading
      MariaDB [(none)]> exit
      Bye
      [1]-  Killed                  mariadbd --defaults-file=/home/linuxbrew/.linuxbrew/etc/my.cnf --innodb_buffer_pool_size_max=33653706688
       
      $ hostnamectl
         Static hostname: xxxxxx.us-west-2.compute.internal
               Icon name: computer-vm
                 Chassis: vm
              Machine ID: xxxxxx
                 Boot ID: xxxxxx
          Virtualization: xen
        Operating System: Amazon Linux 2
             CPE OS Name: cpe:2.3:o:amazon:amazon_linux:2
                  Kernel: Linux 5.10.235-227.919.amzn2.x86_64
            Architecture: x86-64
      

      Memory information just before the crash

      $ cat /proc/meminfo
      MemTotal:       32859952 kB
      MemFree:        21227528 kB
      MemAvailable:   21520972 kB
      Buffers:               0 kB
      Cached:           632672 kB
      SwapCached:            0 kB
      Active:           329668 kB
      Inactive:       10993976 kB
      Active(anon):        280 kB
      Inactive(anon): 10691148 kB
      Active(file):     329388 kB
      Inactive(file):   302828 kB
      Unevictable:           0 kB
      Mlocked:               0 kB
      SwapTotal:             0 kB
      SwapFree:              0 kB
      Dirty:               172 kB
      Writeback:             0 kB
      AnonPages:      10691072 kB
      Mapped:           176412 kB
      Shmem:               456 kB
      KReclaimable:      48392 kB
      Slab:             101312 kB
      SReclaimable:      48392 kB
      SUnreclaim:        52920 kB
      KernelStack:        4016 kB
      PageTables:        81632 kB
      NFS_Unstable:          0 kB
      Bounce:                0 kB
      WritebackTmp:          0 kB
      CommitLimit:    16429976 kB
      Committed_AS:   45714748 kB
      VmallocTotal:   34359738367 kB
      VmallocUsed:       14928 kB
      VmallocChunk:          0 kB
      Percpu:            60480 kB
      HardwareCorrupted:     0 kB
      AnonHugePages:  10483712 kB
      ShmemHugePages:        0 kB
      ShmemPmdMapped:        0 kB
      FileHugePages:         0 kB
      FilePmdMapped:         0 kB
      HugePages_Total:       0
      HugePages_Free:        0
      HugePages_Rsvd:        0
      HugePages_Surp:        0
      Hugepagesize:       2048 kB
      Hugetlb:               0 kB
      DirectMap4k:      231424 kB
      DirectMap2M:    22837248 kB
      DirectMap1G:    11534336 kB
       
      $ sudo cat /proc/10457/smaps_rollup
      55dedcb1b000-7ffd9b3f2000 ---p 00000000 00:00 0                          [rollup]
      Rss:              105984 kB
      Pss:              104254 kB
      Pss_Anon:          80332 kB
      Pss_File:          23922 kB
      Pss_Shmem:             0 kB
      Shared_Clean:       3456 kB
      Shared_Dirty:          0 kB
      Private_Clean:     22196 kB
      Private_Dirty:     80332 kB
      Referenced:       105984 kB
      Anonymous:         80332 kB
      LazyFree:              0 kB
      AnonHugePages:         0 kB
      ShmemPmdMapped:        0 kB
      FilePmdMapped:         0 kB
      Shared_Hugetlb:        0 kB
      Private_Hugetlb:       0 kB
      Swap:                  0 kB
      SwapPss:               0 kB
      Locked:                0 kB
      

      Crash Analysis

      The OOM killer terminated the engine:

      [159031.223998] containerd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-999
      [159031.235188] CPU: 1 PID: 3872 Comm: containerd Not tainted 5.10.235-227.919.amzn2.x86_64 #1
      [159031.244367] Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006
      [159031.447288] Call Trace:
      [159031.450605]  dump_stack+0x57/0x70
      [159031.454473]  dump_header+0x4c/0x20e
      [159031.458462]  oom_kill_process.cold+0xb/0x10
      [159031.462755]  out_of_memory+0xed/0x2d0
      [159031.467939]  __alloc_pages_slowpath.constprop.0+0x93d/0xa00
      [159031.474058]  __alloc_pages_nodemask+0x2de/0x310
      [159031.478904]  pagecache_get_page+0x17e/0x310
      [159031.483305]  filemap_fault+0x4e4/0x6b0
      [159031.487205]  __xfs_filemap_fault.constprop.0+0x45/0x150
      [159031.492764]  __do_fault+0x3a/0x150
      [159031.496521]  do_fault+0x9c/0x240
      [159031.500133]  __handle_mm_fault+0x499/0x640
      [159031.504550]  handle_mm_fault+0xbe/0x2a0
      [159031.508708]  do_user_addr_fault+0x1b3/0x3f0
      [159031.512949]  exc_page_fault+0x68/0x130
      [159031.516885]  ? asm_exc_page_fault+0x8/0x30
      [159031.521201]  asm_exc_page_fault+0x1e/0x30
      [159031.525416] RIP: 0033:0x55bb3619a4fc
      [159031.530823] Code: Unable to access opcode bytes at RIP 0x55bb3619a4d2.
      [159031.537431] RSP: 002b:00007fcbcb4b4d80 EFLAGS: 00010202
      [159031.544665] RAX: ffffffffffffff92 RBX: 0000000000000000 RCX: 000055bb3620e8e3
      [159031.554288] RDX: 000055bb377b6c28 RSI: 0000000000000080 RDI: 000055bb38cde2a0
      [159031.563911] RBP: 00007fcbcb4b4db0 R08: 0000000000000000 R09: 0000000000000000
      [159031.573130] R10: 00007fcbcb4b4d60 R11: 0000000000000206 R12: 00007fcbcb4b4d60
      [159031.582649] R13: 00007fff4322224f R14: 000000c000006a80 R15: 00007fff43222340
      [159031.592084] Mem-Info:
      [159031.597127] active_anon:70 inactive_anon:8118236 isolated_anon:0
                       active_file:53 inactive_file:0 isolated_file:32
                       unevictable:0 dirty:5 writeback:0
                       slab_reclaimable:5661 slab_unreclaimable:10990
                       mapped:64 shmem:114 pagetables:17304 bounce:0
                       free:48448 free_pcp:271 free_cma:0
      [159031.633853] Node 0 active_anon:280kB inactive_anon:32472944kB active_file:212kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:256kB dirty:20kB writeback:0kB shmem:456kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 10483712kB writeback_tmp:0kB kernel_stack:4032kB all_unreclaimable? yes
      [159031.668854] Node 0 DMA free:11808kB min:32kB low:44kB high:56kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
      [159031.700790] lowmem_reserve[]: 0 3692 32055 32055
      [159031.707453] Node 0 DMA32 free:121060kB min:7780kB low:11560kB high:15340kB reserved_highatomic:0KB active_anon:4kB inactive_anon:3665640kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3915776kB managed:3799540kB mlocked:0kB pagetables:7228kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
      [159031.743425] lowmem_reserve[]: 0 0 28363 28363
      [159031.747789] Node 0 Normal free:61332kB min:61816kB low:90860kB high:119904kB reserved_highatomic:0KB active_anon:276kB inactive_anon:28807484kB active_file:236kB inactive_file:0kB unevictable:0kB writepending:0kB present:29622272kB managed:29044508kB mlocked:0kB pagetables:62032kB bounce:0kB free_pcp:1084kB local_pcp:0kB free_cma:0kB
      [159031.775760] lowmem_reserve[]: 0 0 0 0
      [159031.779809] Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11808kB
      [159031.793777] Node 0 DMA32: 199*4kB (UE) 104*8kB (UME) 24*16kB (UE) 10*32kB (UME) 6*64kB (UME) 2*128kB (UE) 1*256kB (U) 3*512kB (UME) 2*1024kB (UE) 2*2048kB (ME) 27*4096kB (M) = 121500kB
      [159031.810138] Node 0 Normal: 3815*4kB (UME) 1963*8kB (UE) 624*16kB (UME) 402*32kB (UME) 89*64kB (UME) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 59508kB
      [159031.824059] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
      [159031.833793] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
      [159031.843530] 140 total pagecache pages
      [159031.848726] 0 pages in swap cache
      [159031.854087] Swap cache stats: add 0, delete 0, find 0/0
      [159031.860197] Free swap  = 0kB
      [159031.864319] Total swap = 0kB
      [159031.868364] 8388509 pages RAM
      [159031.872546] 0 pages HighMem/MovableOnly
      [159031.877302] 173521 pages reserved
      [159031.881714] 0 pages hwpoisoned
      [159031.885759] Tasks state (memory values in pages):
      [159031.891346] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
      [159031.900920] [   2212]     0  2212    40478       85   376832        0             0 systemd-journal
      [159031.910766] [   2230]     0  2230    11142      326   118784        0         -1000 systemd-udevd
      [159031.923298] [   3294]     0  3294    14414      111   135168        0         -1000 auditd
      [159031.932867] [   3334]     0  3334     6600       95    98304        0             0 systemd-logind
      [159031.942809] [   3335]    81  3335    14042      137   159744        0          -900 dbus-daemon
      [159031.952567] [   3336]    32  3336    16815      135   167936        0             0 rpcbind
      [159031.961565] [   3339]     0  3339     1068       26    53248        0             0 acpid
      [159031.971894] [   3342]     0  3342    25464       82   102400        0             0 irqbalance
      [159031.982834] [   3364]   999  3364    29525      111   135168        0             0 chronyd
      [159031.992895] [   3385]   998  3385    24087      216   221184        0             0 rngd
      [159032.003079] [   3404]     0  3404    52999      119   180224        0             0 gssproxy
      [159032.013589] [   3740]     0  3740    24672      516   208896        0             0 dhclient
      [159032.024176] [   3787]     0  3787    24672      509   212992        0             0 dhclient
      [159032.034701] [   3832]     0  3832   307848      256    98304        0             0 amazon-ecs-volu
      [159032.045594] [   3839]     0  3839   589160     4608   389120        0          -999 containerd
      [159032.055335] [   3946]     0  3946    22058      262   200704        0             0 master
      [159032.064793] [   3948]    89  3948    22096      257   208896        0             0 qmgr
      [159032.073513] [   3993]     0  3993    27192      257   249856        0         -1000 sshd
      [159032.082389] [   3996]     0  3996   310145     2284   143360        0             0 amazon-ssm-agen
      [159032.091763] [   3999]     0  3999    82618      487   405504        0             0 rsyslogd
      [159032.101107] [   4032]     0  4032    33250      155   102400        0             0 crond
      [159032.111248] [   4034]     0  4034    29203       27    77824        0             0 agetty
      [159032.119910] [   4036]     0  4036    29291       27    61440        0             0 agetty
      [159032.127796] [   4038]     0  4038   620252    10511   569344        0          -500 dockerd
      [159032.135742] [   4126]     0  4126     1057       17    49152        0             0 bpfilter_umh
      [159032.144106] [   4274]     0  4274   312997     3312   163840        0             0 ssm-agent-worke
      [159032.152506] [   4530]     0  4530   484713     1025   249856        0             0 amazon-ecs-init
      [159032.161217] [  30465]    89 30465    22079      253   204800        0             0 pickup
      [159032.169391] [   9203]     0  9203    37136      330   335872        0             0 sshd
      [159032.177423] [   9205]  1000  9205    37136      330   323584        0             0 sshd
      [159032.186774] [   9206]  1000  9206    30532      114    81920        0             0 bash
      [159032.194777] [  10457]  1000 10457  8502153  5465048 44118016        0             0 mariadbd
      [159032.202960] [  11306]     0 11306    30565      114    86016        0             0 log4j-cve-2021-
      [159032.211981] [  25269]  1000 25269  2902682  2624250 21147648        0             0 python3
      [159032.220621] [  25693]  1000 25693     6358      879    90112        0             0 mariadb
      [159032.229078] [  25826]     0 25826    28662       16    65536        0             0 sleep
      [159032.237481] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice,task=mariadbd,pid=10457,uid=1000
      [159032.251593] Out of memory: Killed process 10457 (mariadbd) total-vm:34008612kB, anon-rss:21860192kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:43084kB oom_score_adj:0
      

      strace output showing the failing system call

      [pid 16956] mmap(0x7f4402800000, 28536995840, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_POPULATE, -1, 0) = ?
      [pid 16956] +++ killed by SIGKILL +++
      [pid 16861] +++ killed by SIGKILL +++
      [pid 16854] +++ killed by SIGKILL +++
      [pid 16849] +++ killed by SIGKILL +++
      [pid 16839] +++ killed by SIGKILL +++
      [pid 16838] +++ killed by SIGKILL +++
      [pid 16840] +++ killed by SIGKILL +++
      +++ killed by SIGKILL +++
      

      Lack of documentation

      Users upgrading to these versions have no way to understand the purpose, behavior, or proper configuration of these parameters. This documentation gap is particularly problematic for innodb_buffer_pool_size_max, which fundamentally changes how buffer pool resizing works in ways that contradict user expectations.

      Documentation was requested in https://jira.mariadb.org/browse/MDEV-37176, but there have been few updates since, and this information was also missing from the release notes.

      Most users familiar with MariaDB would expect "innodb_buffer_pool_size" to be fully dynamic, allowing both increases and decreases within constraints. However, the undocumented reality is that "innodb_buffer_pool_size_max" acts as a hard ceiling that defaults to the initial buffer pool size and cannot be changed without a restart. This creates a confusing situation in which users can reduce the buffer pool size but cannot increase it beyond its initial value without restarting the server.

      Another confusing aspect is that when users attempt to increase the buffer pool size beyond "innodb_buffer_pool_size_max", the system issues only a warning rather than an error. Due to the lack of documentation, users have no way of knowing that the request will be capped at the maximum, and that in the default case, where the current size already equals the maximum, it will not increase the buffer pool size at all.

      MariaDB [(none)]> set global innodb_buffer_pool_size=99999999999999999;
      Query OK, 0 rows affected, 1 warning (0.696 sec)
       
      MariaDB [(none)]> show warnings;
      +---------+------+------------------------------------------------------------------------+
      | Level   | Code | Message                                                                |
      +---------+------+------------------------------------------------------------------------+
      | Warning | 1292 | Truncated incorrect innodb_buffer_pool_size value: '99999999999999999' |
      +---------+------+------------------------------------------------------------------------+
       
      MariaDB [(none)]> show global variables like "innodb_buffer_pool_size%";
      +----------------------------------+-----------+
      | Variable_name                    | Value     |
      +----------------------------------+-----------+
      | innodb_buffer_pool_size          | 452984832 |
      | innodb_buffer_pool_size_auto_min | 452984832 |
      | innodb_buffer_pool_size_max      | 452984832 |
      +----------------------------------+-----------+
      

      People

        Assignee: Unassigned
        Reporter: Akshat Nehra (anehra)