Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25340

Server startup with large innodb_buffer_pool_size takes a long time

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Won't Do
    • 10.1(EOL), (13)
      10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL)
    • N/A

    Description

      The initialization of innodb buffer pool chunks/blocks on allocation has a number of consequences.

      • on very large memory systems this can take significant time
      • the initialization of data in the allocated memory forces this to paged into the process resulting in a real allocation out of the available OS space.

      The second becomes a problem when a large number of processes are on the system, potentially a large number of mariadb instances all competing for memory. A claim on use approach leaves more RAM available for other processes if mariadb doesn't end up using it.

      Sizing innodb buffer pool is a hard task so lets try to make it easier, or just less consequences if its wrong.

      The proposal is buf_pool_t::chunk_t::create removed the initialization (buf_block_init) and this becomes detected when blocks/chunks are used and then initialized there.

      Attachments

        Issue Links

          Activity

            danblack Daniel Black added a comment - - edited

            Was getting 3 seconds on 20G with memlock, only some of it huge pages.

            10.6-4bd94afbd3cecdcca9edb87834b41de5494fbbea

            # echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
            # cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
            201
            # ulimit -l 20000000
            #  sql/mariadbd --no-defaults --datadir=/tmp/${PWD##*/}-datadir --socket=/tmp/${PWD##*/}.sock --plugin-dir=${PWD}/mysql-test/var/plugins/ --verbose --skip-networking   --large-pages=1  --innodb-buffer-pool-size=20G --memlock --user=root
            2023-12-01 13:16:42 0 [Note] Starting MariaDB 10.6.17-MariaDB source revision 4bd94afbd3cecdcca9edb87834b41de5494fbbea as process 185716
            2023-12-01 13:16:42 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
            2023-12-01 13:16:42 0 [Note] InnoDB: Number of pools: 1
            2023-12-01 13:16:42 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
            2023-12-01 13:16:42 0 [Note] InnoDB: Using liburing
            2023-12-01 13:16:42 0 [Note] InnoDB: Initializing buffer pool, total size = 21474836480, chunk size = 134217728
            2023-12-01 13:16:42 0 [Note] InnoDB: Completed initialization of buffer pool
            2023-12-01 13:16:42 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed
            2023-12-01 13:16:42 0 [Note] InnoDB: 128 rollback segments are active.
            2023-12-01 13:16:42 0 [Note] InnoDB: Creating shared tablespace for temporary tables
            2023-12-01 13:16:42 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
            2023-12-01 13:16:42 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
            2023-12-01 13:16:42 0 [Note] InnoDB: 10.6.17 started; log sequence number 55985; transaction id 23
            2023-12-01 13:16:42 0 [Note] Plugin 'FEEDBACK' is disabled.
            2023-12-01 13:16:42 0 [Note] InnoDB: Loading buffer pool(s) from /tmp/build-mariadb-server-10.6-datadir/ib_buffer_pool
            2023-12-01 13:16:42 0 [Note] InnoDB: Buffer pool(s) load completed at 231201 13:16:42
            2023-12-01 13:16:45 0 [Note] sql/mariadbd: ready for connections.
            Version: '10.6.17-MariaDB'  socket: '/tmp/build-mariadb-server-10.6.sock'  port: 0  Source distribution
            

            With 38ee8129d85e867c86b16404a3906030ddab7129

            #  sql/mariadbd --no-defaults --datadir=/tmp/${PWD##*/}-datadir --socket=/tmp/${PWD##*/}.sock --plugin-dir=${PWD}/mysql-test/var/plugins/ --verbose --skip-networking   --large-pages=1  --innodb-buffer-pool-size=20G --memlock --user=root
            2023-12-01 13:32:49 0 [Note] Starting MariaDB 10.6.17-MariaDB source revision 4bd94afbd3cecdcca9edb87834b41de5494fbbea as process 186718
            2023-12-01 13:32:49 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
            2023-12-01 13:32:49 0 [Note] InnoDB: Number of pools: 1
            2023-12-01 13:32:49 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
            2023-12-01 13:32:49 0 [Note] InnoDB: Using liburing
            2023-12-01 13:32:49 0 [Note] InnoDB: Initializing buffer pool, total size = 21474836480, chunk size = 134217728
            2023-12-01 13:32:49 0 [Note] InnoDB: Completed initialization of buffer pool
            2023-12-01 13:32:49 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed
            2023-12-01 13:32:49 0 [Note] InnoDB: 128 rollback segments are active.
            2023-12-01 13:32:49 0 [Note] InnoDB: Creating shared tablespace for temporary tables
            2023-12-01 13:32:49 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
            2023-12-01 13:32:49 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
            2023-12-01 13:32:49 0 [Note] InnoDB: 10.6.17 started; log sequence number 56081; transaction id 23
            2023-12-01 13:32:49 0 [Note] InnoDB: Loading buffer pool(s) from /tmp/build-mariadb-server-10.6-datadir/ib_buffer_pool
            2023-12-01 13:32:49 0 [Note] Plugin 'FEEDBACK' is disabled.
            2023-12-01 13:32:49 0 [Note] InnoDB: Buffer pool(s) load completed at 231201 13:32:49
            2023-12-01 13:32:51 0 [Note] sql/mariadbd: ready for connections.
            Version: '10.6.17-MariaDB'  socket: '/tmp/build-mariadb-server-10.6.sock'  port: 0  Source distribution
            

            Gained a second (1/3) on smallish 20G pools. Good gain. Thanks.

            danblack Daniel Black added a comment - - edited Was getting 3 seconds on 20G with memlock, only some of it huge pages. 10.6-4bd94afbd3cecdcca9edb87834b41de5494fbbea # echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages # cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages 201 # ulimit -l 20000000 # sql/mariadbd --no-defaults --datadir=/tmp/${PWD##*/}-datadir --socket=/tmp/${PWD##*/}.sock --plugin-dir=${PWD}/mysql-test/var/plugins/ --verbose --skip-networking --large-pages=1 --innodb-buffer-pool-size=20G --memlock --user=root 2023-12-01 13:16:42 0 [Note] Starting MariaDB 10.6.17-MariaDB source revision 4bd94afbd3cecdcca9edb87834b41de5494fbbea as process 185716 2023-12-01 13:16:42 0 [Note] InnoDB: Compressed tables use zlib 1.2.13 2023-12-01 13:16:42 0 [Note] InnoDB: Number of pools: 1 2023-12-01 13:16:42 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions 2023-12-01 13:16:42 0 [Note] InnoDB: Using liburing 2023-12-01 13:16:42 0 [Note] InnoDB: Initializing buffer pool, total size = 21474836480, chunk size = 134217728 2023-12-01 13:16:42 0 [Note] InnoDB: Completed initialization of buffer pool 2023-12-01 13:16:42 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed 2023-12-01 13:16:42 0 [Note] InnoDB: 128 rollback segments are active. 2023-12-01 13:16:42 0 [Note] InnoDB: Creating shared tablespace for temporary tables 2023-12-01 13:16:42 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... 2023-12-01 13:16:42 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB. 2023-12-01 13:16:42 0 [Note] InnoDB: 10.6.17 started; log sequence number 55985; transaction id 23 2023-12-01 13:16:42 0 [Note] Plugin 'FEEDBACK' is disabled. 2023-12-01 13:16:42 0 [Note] InnoDB: Loading buffer pool(s) from /tmp/build-mariadb-server-10.6-datadir/ib_buffer_pool 2023-12-01 13:16:42 0 [Note] InnoDB: Buffer pool(s) load completed at 231201 13:16:42 2023-12-01 13:16:45 0 [Note] sql/mariadbd: ready for connections. Version: '10.6.17-MariaDB' socket: '/tmp/build-mariadb-server-10.6.sock' port: 0 Source distribution With 38ee8129d85e867c86b16404a3906030ddab7129 # sql/mariadbd --no-defaults --datadir=/tmp/${PWD##*/}-datadir --socket=/tmp/${PWD##*/}.sock --plugin-dir=${PWD}/mysql-test/var/plugins/ --verbose --skip-networking --large-pages=1 --innodb-buffer-pool-size=20G --memlock --user=root 2023-12-01 13:32:49 0 [Note] Starting MariaDB 10.6.17-MariaDB source revision 4bd94afbd3cecdcca9edb87834b41de5494fbbea as process 186718 2023-12-01 13:32:49 0 [Note] InnoDB: Compressed tables use zlib 1.2.13 2023-12-01 13:32:49 0 [Note] InnoDB: Number of pools: 1 2023-12-01 13:32:49 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions 2023-12-01 13:32:49 0 [Note] InnoDB: Using liburing 2023-12-01 13:32:49 0 [Note] InnoDB: Initializing buffer pool, total size = 21474836480, chunk size = 134217728 2023-12-01 13:32:49 0 [Note] InnoDB: Completed initialization of buffer pool 2023-12-01 13:32:49 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed 2023-12-01 13:32:49 0 [Note] InnoDB: 128 rollback segments are active. 2023-12-01 13:32:49 0 [Note] InnoDB: Creating shared tablespace for temporary tables 2023-12-01 13:32:49 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... 2023-12-01 13:32:49 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB. 2023-12-01 13:32:49 0 [Note] InnoDB: 10.6.17 started; log sequence number 56081; transaction id 23 2023-12-01 13:32:49 0 [Note] InnoDB: Loading buffer pool(s) from /tmp/build-mariadb-server-10.6-datadir/ib_buffer_pool 2023-12-01 13:32:49 0 [Note] Plugin 'FEEDBACK' is disabled. 2023-12-01 13:32:49 0 [Note] InnoDB: Buffer pool(s) load completed at 231201 13:32:49 2023-12-01 13:32:51 0 [Note] sql/mariadbd: ready for connections. Version: '10.6.17-MariaDB' socket: '/tmp/build-mariadb-server-10.6.sock' port: 0 Source distribution Gained a second (1/3) on smallish 20G pools. Good gain. Thanks.

            echo 96|sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
            echo 1024|sudo tee/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
            time sql/mariadbd --large-pages=1 --datadir /dev/shm/data --innodb-log-file-size=5m --bootstrap --innodb-buffer-pool-size{,-max}=96g < /dev/null
            

            10.6-MDEV-29445 07513d4faba65a074ce64d308f5327cd5954a324

            2025-01-30 15:15:17 0 [Note] Starting MariaDB 10.6.21-MariaDB source revision 07513d4faba65a074ce64d308f5327cd5954a324 server_uid +iaoNO+gS3c9MVoT2uqEAeNeOOc= as process 43849
            2025-01-30 15:15:17 0 [Note] InnoDB: The first data file './ibdata1' did not exist. A new tablespace will be created!
            2025-01-30 15:15:17 0 [Note] InnoDB: Compressed tables use zlib 1.3.1
            2025-01-30 15:15:17 0 [Note] InnoDB: Using transactional memory
            2025-01-30 15:15:17 0 [Note] InnoDB: Number of pools: 1
            2025-01-30 15:15:17 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
            2025-01-30 15:15:17 0 [Note] InnoDB: Using liburing
            2025-01-30 15:15:17 0 [Note] InnoDB: innodb_buffer_pool_size_max=98304m, innodb_buffer_pool_size=98304m
            2025-01-30 15:15:17 0 [Note] InnoDB: Completed initialization of buffer pool
            2025-01-30 15:15:17 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 5242880 bytes
            2025-01-30 15:15:17 0 [Note] InnoDB: Setting file './ibdata1' size to 12 MB. Physically writing the file full; Please wait ...
            2025-01-30 15:15:17 0 [Note] InnoDB: File './ibdata1' size is now 12 MB.
            2025-01-30 15:15:17 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
            2025-01-30 15:15:17 0 [Note] InnoDB: New log file created, LSN=10313
            2025-01-30 15:15:17 0 [Note] InnoDB: Doublewrite buffer not found: creating new
            2025-01-30 15:15:17 0 [Note] InnoDB: Doublewrite buffer created
            2025-01-30 15:15:17 0 [Note] InnoDB: 128 rollback segments are active.
            2025-01-30 15:15:17 0 [Note] InnoDB: Creating shared tablespace for temporary tables
            2025-01-30 15:15:17 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
            2025-01-30 15:15:17 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
            2025-01-30 15:15:17 0 [Note] InnoDB: 10.6.21 started; log sequence number 0; transaction id 3
            2025-01-30 15:15:17 0 [Note] Plugin 'FEEDBACK' is disabled.
             
            real	0m0.455s
            user	0m0.062s
            sys	0m0.102s
            

            I repeated the same under perf record. In the perf report, the Linux kernel is dominating:

            10.6-MDEV-29445 07513d4faba65a074ce64d308f5327cd5954a324

              39.87%  mariadbd       [kernel.kallsyms]     [k] clear_page_erms
              12.46%  mariadbd       libc.so.6             [.] __memset_avx2_unaligned_erms_rtm
               3.90%  mariadbd       [vdso]                [.] __vdso_clock_gettime
               1.80%  mariadbd       mariadbd              [.] my_timer_milliseconds
               1.66%  mariadbd       ld-linux-x86-64.so.2  [.] _dl_relocate_object
               1.53%  mariadbd       ld-linux-x86-64.so.2  [.] do_lookup_x
               1.38%  mariadbd       libc.so.6             [.] clock_gettime@@GLIBC_2.17
               1.30%  mariadbd       [kernel.kallsyms]     [k] __mem_cgroup_charge
               1.14%  mariadbd       mariadbd              [.] create_log_file(bool, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
               0.97%  mariadbd       [kernel.kallsyms]     [k] native_irq_return_iret
               0.79%  mariadbd       [kernel.kallsyms]     [k] next_uptodate_folio
               0.72%  mariadbd       [kernel.kallsyms]     [k] __mod_memcg_lruvec_state
               0.66%  mariadbd       [kernel.kallsyms]     [k] xas_find_conflict
               0.59%  mariadbd       [kernel.kallsyms]     [k] __list_del_entry_valid_or_report
               0.58%  mariadbd       [kernel.kallsyms]     [k] folios_put_refs
               0.50%  mariadbd       mariadbd              [.] crc32c_3way
               0.50%  mariadbd       [kernel.kallsyms]     [k] __rmqueue_pcplist
               0.49%  mariadbd       [kernel.kallsyms]     [k] do_user_addr_fault
               0.49%  mariadbd       [kernel.kallsyms]     [k] try_charge_memcg
               0.49%  mariadbd       [kernel.kallsyms]     [k] shmem_add_to_page_cache
               0.48%  mariadbd       [kernel.kallsyms]     [k] __rcu_read_unlock
               0.46%  mariadbd       [kernel.kallsyms]     [k] free_unref_folios
               0.45%  mariadbd       ld-linux-x86-64.so.2  [.] strcmp
               0.45%  mariadbd       [kernel.kallsyms]     [k] memset_orig
               0.45%  mariadbd       mariadbd              [.] clock_gettime@plt
               0.45%  mariadbd       mariadbd              [.] my_timer_init
               0.43%  mariadbd       [kernel.kallsyms]     [k] unmap_page_range
               0.41%  mariadbd       mariadbd              [.] buf_flush_list_holding_mutex(unsigned long, unsigned long)
               0.36%  mariadbd       [kernel.kallsyms]     [k] __free_one_page
               0.33%  mariadbd       [kernel.kallsyms]     [k] prep_new_page
               0.33%  mariadbd       mariadbd              [.] buf_page_get_low(page_id_t, unsigned long, rw_lock_type_t, buf_block_t*, unsigned long, mtr_t*, dberr_t*, bool)
               0.33%  mariadbd       [kernel.kallsyms]     [k] alloc_pages_mpol_noprof
               0.33%  mariadbd       [kernel.kallsyms]     [k] lru_add
               0.31%  mariadbd       [kernel.kallsyms]     [k] folio_alloc_mpol_noprof
               0.31%  mariadbd       [kernel.kallsyms]     [k] do_anonymous_page
               0.31%  mariadbd       libc.so.6             [.] pthread_mutex_lock@@GLIBC_2.2.5
               0.31%  iou-wrk-44164  [kernel.kallsyms]     [k] rep_movs_alternative
            

            This was on Debian GNU/Linux with

            cat /proc/version

            Linux version 6.12.10-amd64 (debian-kernel@lists.debian.org) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-14) 14.2.0, GNU ld (GNU Binutils for Debian) 2.43.50.20250108) #1 SMP PREEMPT_DYNAMIC Debian 6.12.10-1 (2025-01-18)
            

            Compared to what I had reported on 2023-08-18, the start-up time was cut in less than half, on the same hardware.

            marko Marko Mäkelä added a comment - echo 96| sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages echo 1024| sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages time sql /mariadbd --large-pages=1 --datadir /dev/shm/data --innodb-log- file -size=5m --bootstrap --innodb-buffer-pool-size{,-max}=96g < /dev/null 10.6-MDEV-29445 07513d4faba65a074ce64d308f5327cd5954a324 2025-01-30 15:15:17 0 [Note] Starting MariaDB 10.6.21-MariaDB source revision 07513d4faba65a074ce64d308f5327cd5954a324 server_uid +iaoNO+gS3c9MVoT2uqEAeNeOOc= as process 43849 2025-01-30 15:15:17 0 [Note] InnoDB: The first data file './ibdata1' did not exist. A new tablespace will be created! 2025-01-30 15:15:17 0 [Note] InnoDB: Compressed tables use zlib 1.3.1 2025-01-30 15:15:17 0 [Note] InnoDB: Using transactional memory 2025-01-30 15:15:17 0 [Note] InnoDB: Number of pools: 1 2025-01-30 15:15:17 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions 2025-01-30 15:15:17 0 [Note] InnoDB: Using liburing 2025-01-30 15:15:17 0 [Note] InnoDB: innodb_buffer_pool_size_max=98304m, innodb_buffer_pool_size=98304m 2025-01-30 15:15:17 0 [Note] InnoDB: Completed initialization of buffer pool 2025-01-30 15:15:17 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 5242880 bytes 2025-01-30 15:15:17 0 [Note] InnoDB: Setting file './ibdata1' size to 12 MB. Physically writing the file full; Please wait ... 2025-01-30 15:15:17 0 [Note] InnoDB: File './ibdata1' size is now 12 MB. 2025-01-30 15:15:17 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0 2025-01-30 15:15:17 0 [Note] InnoDB: New log file created, LSN=10313 2025-01-30 15:15:17 0 [Note] InnoDB: Doublewrite buffer not found: creating new 2025-01-30 15:15:17 0 [Note] InnoDB: Doublewrite buffer created 2025-01-30 15:15:17 0 [Note] InnoDB: 128 rollback segments are active. 2025-01-30 15:15:17 0 [Note] InnoDB: Creating shared tablespace for temporary tables 2025-01-30 15:15:17 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... 2025-01-30 15:15:17 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB. 2025-01-30 15:15:17 0 [Note] InnoDB: 10.6.21 started; log sequence number 0; transaction id 3 2025-01-30 15:15:17 0 [Note] Plugin 'FEEDBACK' is disabled.   real 0m0.455s user 0m0.062s sys 0m0.102s I repeated the same under perf record . In the perf report , the Linux kernel is dominating: 10.6-MDEV-29445 07513d4faba65a074ce64d308f5327cd5954a324 39.87% mariadbd [kernel.kallsyms] [k] clear_page_erms 12.46% mariadbd libc.so.6 [.] __memset_avx2_unaligned_erms_rtm 3.90% mariadbd [vdso] [.] __vdso_clock_gettime 1.80% mariadbd mariadbd [.] my_timer_milliseconds 1.66% mariadbd ld-linux-x86-64.so.2 [.] _dl_relocate_object 1.53% mariadbd ld-linux-x86-64.so.2 [.] do_lookup_x 1.38% mariadbd libc.so.6 [.] clock_gettime@@GLIBC_2.17 1.30% mariadbd [kernel.kallsyms] [k] __mem_cgroup_charge 1.14% mariadbd mariadbd [.] create_log_file(bool, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) 0.97% mariadbd [kernel.kallsyms] [k] native_irq_return_iret 0.79% mariadbd [kernel.kallsyms] [k] next_uptodate_folio 0.72% mariadbd [kernel.kallsyms] [k] __mod_memcg_lruvec_state 0.66% mariadbd [kernel.kallsyms] [k] xas_find_conflict 0.59% mariadbd [kernel.kallsyms] [k] __list_del_entry_valid_or_report 0.58% mariadbd [kernel.kallsyms] [k] folios_put_refs 0.50% mariadbd mariadbd [.] crc32c_3way 0.50% mariadbd [kernel.kallsyms] [k] __rmqueue_pcplist 0.49% mariadbd [kernel.kallsyms] [k] do_user_addr_fault 0.49% mariadbd [kernel.kallsyms] [k] try_charge_memcg 0.49% mariadbd [kernel.kallsyms] [k] shmem_add_to_page_cache 0.48% mariadbd [kernel.kallsyms] [k] __rcu_read_unlock 0.46% mariadbd [kernel.kallsyms] [k] free_unref_folios 0.45% mariadbd ld-linux-x86-64.so.2 [.] strcmp 0.45% mariadbd [kernel.kallsyms] [k] memset_orig 0.45% mariadbd mariadbd [.] clock_gettime@plt 0.45% mariadbd mariadbd [.] my_timer_init 0.43% mariadbd [kernel.kallsyms] [k] unmap_page_range 0.41% mariadbd mariadbd [.] buf_flush_list_holding_mutex(unsigned long, unsigned long) 0.36% mariadbd [kernel.kallsyms] [k] __free_one_page 0.33% mariadbd [kernel.kallsyms] [k] prep_new_page 0.33% mariadbd mariadbd [.] buf_page_get_low(page_id_t, unsigned long, rw_lock_type_t, buf_block_t*, unsigned long, mtr_t*, dberr_t*, bool) 0.33% mariadbd [kernel.kallsyms] [k] alloc_pages_mpol_noprof 0.33% mariadbd [kernel.kallsyms] [k] lru_add 0.31% mariadbd [kernel.kallsyms] [k] folio_alloc_mpol_noprof 0.31% mariadbd [kernel.kallsyms] [k] do_anonymous_page 0.31% mariadbd libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5 0.31% iou-wrk-44164 [kernel.kallsyms] [k] rep_movs_alternative This was on Debian GNU/Linux with cat /proc/version Linux version 6.12.10-amd64 (debian-kernel@lists.debian.org) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-14) 14.2.0, GNU ld (GNU Binutils for Debian) 2.43.50.20250108) #1 SMP PREEMPT_DYNAMIC Debian 6.12.10-1 (2025-01-18) Compared to what I had reported on 2023-08-18, the start-up time was cut in less than half, on the same hardware.

            In a development version of MDEV-29445, I had included lazy page initialization, but I am testing removing it because it was complicating some logic. Not doing the lazy initialization does slow down the startup, but the positive side is that there can’t be any regression during workload. Today, I am measuring a start-up time of around 3.5 seconds, on the same system that reported 0.455 seconds on my previous test run. There is some 0.2 seconds of variance in the numbers between runs, and large_pages might be slightly slower. Here is an example:

              81,78%  mariadbd       [kernel.kallsyms]     [k] clear_page_erms
               2,18%  mariadbd       mariadbd              [.] buf_pool_t::create()
               1,11%  mariadbd       [kernel.kallsyms]     [k] __rmqueue_pcplist
               0,94%  mariadbd       [kernel.kallsyms]     [k] __rcu_read_unlock
               0,88%  mariadbd       libc.so.6             [.] __memset_avx2_unaligned_erms_rtm
               0,88%  mariadbd       [kernel.kallsyms]     [k] unmap_page_range
               0,75%  mariadbd       [kernel.kallsyms]     [k] __mem_cgroup_charge
               0,60%  mariadbd       [kernel.kallsyms]     [k] native_irq_return_iret
               0,51%  mariadbd       [kernel.kallsyms]     [k] try_charge_memcg
               0,44%  mariadbd       [kernel.kallsyms]     [k] prep_new_page
            

            I think that a buffer-pool initialization time of almost 30 GiB/s might be acceptable, more acceptable than a potential workload-time performance regression that the lazy initialization could incur.

            marko Marko Mäkelä added a comment - In a development version of MDEV-29445 , I had included lazy page initialization, but I am testing removing it because it was complicating some logic. Not doing the lazy initialization does slow down the startup, but the positive side is that there can’t be any regression during workload. Today, I am measuring a start-up time of around 3.5 seconds, on the same system that reported 0.455 seconds on my previous test run. There is some 0.2 seconds of variance in the numbers between runs, and large_pages might be slightly slower. Here is an example: 81,78% mariadbd [kernel.kallsyms] [k] clear_page_erms 2,18% mariadbd mariadbd [.] buf_pool_t::create() 1,11% mariadbd [kernel.kallsyms] [k] __rmqueue_pcplist 0,94% mariadbd [kernel.kallsyms] [k] __rcu_read_unlock 0,88% mariadbd libc.so.6 [.] __memset_avx2_unaligned_erms_rtm 0,88% mariadbd [kernel.kallsyms] [k] unmap_page_range 0,75% mariadbd [kernel.kallsyms] [k] __mem_cgroup_charge 0,60% mariadbd [kernel.kallsyms] [k] native_irq_return_iret 0,51% mariadbd [kernel.kallsyms] [k] try_charge_memcg 0,44% mariadbd [kernel.kallsyms] [k] prep_new_page I think that a buffer-pool initialization time of almost 30 GiB/s might be acceptable, more acceptable than a potential workload-time performance regression that the lazy initialization could incur.

            Today, I’m measuring a start-up time of 2.6±0.1 seconds for the same test, this time running Linux 6.12.13 and a slightly different development build of MDEV-29445. I get roughly the same time on each CPU scaling governor that I tested (powersave, balanced, performance).

            marko Marko Mäkelä added a comment - Today, I’m measuring a start-up time of 2.6±0.1 seconds for the same test, this time running Linux 6.12.13 and a slightly different development build of MDEV-29445 . I get roughly the same time on each CPU scaling governor that I tested (powersave, balanced, performance).

            During the development of MDEV-29445, I experimented with the deferred initialization of the buf_pool.free list, hoping to save some time at startup with large innodb_buffer_pool_size. Some failures in the additional logic was caught during stress tests. Based on my testing, the additional overhead of initializing all of buf_pool.free is not that big.

            To shorten up the startup time, one could specify a larger innodb_buffer_pool_size_max and a small initial innodb_buffer_pool_size, and issue SET GLOBAL innodb_buffer_pool_size to increase the size later. With the MDEV-29445 changes, that should be a quick operation.

            marko Marko Mäkelä added a comment - During the development of MDEV-29445 , I experimented with the deferred initialization of the buf_pool.free list, hoping to save some time at startup with large innodb_buffer_pool_size . Some failures in the additional logic was caught during stress tests. Based on my testing, the additional overhead of initializing all of buf_pool.free is not that big. To shorten up the startup time, one could specify a larger innodb_buffer_pool_size_max and a small initial innodb_buffer_pool_size , and issue SET GLOBAL innodb_buffer_pool_size to increase the size later. With the MDEV-29445 changes, that should be a quick operation.

            People

              marko Marko Mäkelä
              danblack Daniel Black
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.