  1. MariaDB Server
  2. MDEV-37152

buf_page_get_gen -> buf_pool->stat.n_page_gets++ high cpu utilisation

Details

    • Related to performance
    • The performance of ANALYZE FORMAT=JSON as well as the counter innodb_buffer_pool_read_requests was improved.
    • Q4/2025 Server Maintenance

    Description

      This issue is related to the previously closed MDEV-21212. Under a HammerDB TPROC-C workload, updating the buffer pages accessed metric consumes over 40% of the cycles of buf_page_get_gen, as measured on an Intel Xeon Gold 5412U.

      Using perf top under the workload we can observe the following:

      Samples: 11M of event 'cycles:P', 4000 Hz, Event count (approx.): 2108496459712 lost: 0/0 drop: 0/0
      Overhead  Shared Object             Symbol
         5.09%  mariadbd                  [.] buf_page_get_gen(page_id_t, unsigned long, rw_lock_type_t, buf_block_t*, unsigned long, mtr_t
         4.69%  mariadbd                  [.] ssux_lock_impl<true>::rd_wait()
         1.90%  mariadbd                  [.] int page_cur_dtuple_cmp<false>(dtuple_t const&, unsigned char const*, dict_index_t const&, un
         1.58%  mariadbd                  [.] sp_lex_keeper::reset_lex_and_exec_core(THD*, unsigned int*, bool, sp_instr*, bool)
         1.51%  mariadbd                  [.] cmp_dtuple_rec_bytes(unsigned char const*, dict_index_t const&, dtuple_t const&, int*, unsign
         1.40%  mariadbd                  [.] btr_cur_t::search_leaf(dtuple_t const*, page_cur_mode_t, btr_latch_mode, mtr_t*)
         1.36%  mariadbd                  [.] l_find(LF_SLIST**, charset_info_st const*, unsigned int, unsigned char const*, unsigned long,
         1.13%  mariadbd                  [.] MYSQLparse(THD*)
      

      Looking at the annotated view of buf_page_get_gen, 44.01% of its cycles are in addq $0x1,0x80(%rcx,%rax,1):

      Samples: 44M of event 'cycles:P', 4000 Hz, Event count (approx.): 2194043919140
      buf_page_get_gen(page_id_t, unsigned long, rw_lock_type_t, buf_block_t*, unsigned long, mtr_t*, dberr_t*)  /opt/mariadb-11.8.2-linux-
      Percent│                                                                                                                            
             │     /* The following functions increment different engine status */                                                        
             │     inline void mariadb_increment_pages_accessed(ha_handler_stats *stats)                                                  
             │     {                                                                                                                      
             │     if (stats)                                                                                                             
             │       test    %rsi,%rsi                                                                                                    
        0.14 │     ↓ je      54                                                                                                           
             │     stats->pages_accessed++;                                                                                               
             │       addq    $0x1,(%rsi)                                                                                                  
             │     /* rdtsc */                                                                                                            
             │     extern __inline unsigned long long                                                                                     
             │     __attribute__((__gnu_inline__, __always_inline__, __artificial__))                                                     
             │     __rdtsc (void)                                                                                                         
             │     {                                                                                                                      
             │     return __builtin_ia32_rdtsc ();                                                                                        
        1.15 │ 54:   rdtsc                                                                                                                
             │     TPOOL_SUPPRESS_TSAN void add(size_t index, Type n) {                                                                   
             │     index = index % N;                                                                                                     
             │     ut_ad(index < UT_ARR_SIZE(m_counter));                                                                                 
             │     m_counter[index].value += n;                                                                                           
        0.00 │       lea     buf_pool,%rcx                                                                                                
             │     }                                                                                                                      
             │     private:                                                                                                               
             │     /** @return the index of an array element */                                                                           
             │     static ulint calc_hash(ulint fold, ulint n_cells) noexcept                                                             
             │     {                                                                                                                      
             │     return pad(fold % n_cells);                                                                                            
        0.00 │       xor     %edx,%edx                                                                                                    
             │     index = index % N;                                                                                                     
        0.03 │       and     $0x7f,%eax                                                                                                   
             │     }                                                                                                                      
             │     /** Retrieve the tablespace id.                                                                                        
             │     @return tablespace id */                                                                                               
             │     constexpr uint32_t space() const noexcept                                                                              
             │     { return static_cast<uint32_t>(m_id >> 32); }                                                                          
        0.12 │       mov     %r14,%rdi                                                                                                    
             │     m_counter[index].value += n;                                                                                           
        0.02 │       shl     $0x6,%rax                                                                                                    
        0.00 │       shr     $0x20,%rdi                                                                                                   
       44.01 │       addq    $0x1,0x80(%rcx,%rax,1)                                                                                       
             │     { return static_cast<uint32_t>(m_id); }                                                                                
             │     /** Retrieve the fold value.                  
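
      The address arithmetic of the hot instruction can be read off the listing: and $0x7f reduces the index to the range 0..127, shl $0x6 scales it by 64 bytes, and 0x80 is the displacement from buf_pool. The sketch below shows a sharded counter with that layout. The names sharded_counter, slot and slot_offset are hypothetical, and the index is passed in as an argument (the listing derives it from rdtsc); this is an illustration under those assumptions, not the MariaDB implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the counter seen in the listing:
// N slots, each padded to its own 64-byte cache line.
template <typename Type, std::size_t N = 128>
class sharded_counter {
  struct alignas(64) slot { Type value{}; };
  slot m_counter[N];

public:
  // Mirrors "index = index % N; m_counter[index].value += n;".
  // Not atomic, like the plain addq in the listing.
  void add(std::size_t index, Type n) {
    index %= N;                   // and $0x7f,%eax (N = 128)
    assert(index < N);
    m_counter[index].value += n;  // shl $0x6,%rax ; addq $0x1,0x80(...)
  }

  // Reading the counter means summing every slot.
  Type sum() const {
    Type total{};
    for (const slot &s : m_counter) total += s.value;
    return total;
  }

  // Byte offset of a slot from the start of the array: index * 64.
  static constexpr std::size_t slot_offset(std::size_t index) {
    return (index % N) * sizeof(slot);
  }
};
```

      With this layout, two threads contend for the same cache line only when their indexes agree modulo 128, while a reader has to visit all 128 slots.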
      

      If we comment out ++buf_pool.stat.n_page_gets in buf_inc_get:

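      static void buf_inc_get(ha_handler_stats *stats)
      {
        /* per-handler statistics: stats->pages_accessed++ when stats is set */
        mariadb_increment_pages_accessed(stats);
        /* global buffer pool counter; the line commented out for this test */
        ++buf_pool.stat.n_page_gets;
      }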
      

      CPU utilisation in buf_page_get_gen is lower (4.14% overhead, down from 5.09%):

      Samples: 1M of event 'cycles:P', 4000 Hz, Event count (approx.): 1038205088122 lost: 0/0 drop: 0/0
       
      Overhead  Shared Object         Symbol
         4.14%  mariadbd              [.] buf_page_get_gen(page_id_t, unsigned long, rw_lock_type_t, buf_block_t*, unsigned long, mtr_t*, d
         4.00%  mariadbd              [.] ssux_lock_impl<true>::rd_wait()
         2.04%  mariadbd              [.] int page_cur_dtuple_cmp<false>(dtuple_t const&, unsigned char const*, dict_index_t const&, unsign
         1.60%  mariadbd              [.] sp_lex_keeper::reset_lex_and_exec_core(THD*, unsigned int*, bool, sp_instr*, bool)
         1.58%  mariadbd              [.] cmp_dtuple_rec_bytes(unsigned char const*, dict_index_t const&, dtuple_t const&, int*, unsigned l
         1.53%  mariadbd              [.] btr_cur_t::search_leaf(dtuple_t const*, page_cur_mode_t, btr_latch_mode, mtr_t*)
         1.38%  mariadbd              [.] l_find(LF_SLIST**, charset_info_st const*, unsigned int, unsigned char const*, unsigned long, CUR
         1.18%  mariadbd              [.] MYSQLparse(THD*)
      

      and the hotspot is no longer present in the annotated view.

      Samples: 14M of event 'cycles:P', 4000 Hz, Event count (approx.): 3579600867736
      buf_page_get_gen(page_id_t, unsigned long, rw_lock_type_t, buf_block_t*, unsigned long, mtr_t*, dberr_t*)  /opt/mariadb-11.8.2-linux-
      Percent│     /* The following functions increment different engine status */                                                        
             │     inline void mariadb_increment_pages_accessed(ha_handler_stats *stats)                                                  
             │     {                                                                                                                      
             │     if (stats)                                                                                                             
             │       test    %rsi,%rsi
        0.17 │     ↓ je      54                                                                                                           
             │     stats->pages_accessed++;                                                                                               
             │       addq    $0x1,(%rsi)                                                                                                  
             │     }                                                                                                                      
             │     /** Retrieve the tablespace id.                                                                                        
             │     @return tablespace id */                                                                                               
             │     constexpr uint32_t space() const noexcept                                                                              
             │     { return static_cast<uint32_t>(m_id >> 32); }                                                                          
        0.00 │ 54:   mov     %r14,%rdi                                                                                                    
             │     (addr & ~(ELEMENTS_PER_LATCH * sizeof chain));                                                                         
             │     }                                                                                                                      
             │     /** Get a hash table slot. */                                                                                          
             │     hash_chain &cell_get(ulint fold) const                                                                                 
             │     { return array[calc_hash(fold, n_cells)]; }                                                                            
        0.29 │       mov     buf_pool+0x4208,%rcx                                                                                         
             │     return pad(fold % n_cells);                                                                                            
        0.00 │       xor     %edx,%edx                                                                                                    
        0.15 │       shr     $0x20,%rdi                                                                                                   
             │     { return static_cast<uint32_t>(m_id); }                                                                                
             │     /** Retrieve the fold value.      
      

      With the counter commented out, SHOW ENGINE INNODB STATUS reports: No buffer pool page gets since the last printout
      This is a similar finding to MDEV-21212; however, the CPU utilisation is now higher.
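
      The "No buffer pool page gets" message is what one would expect if the status printout compares the counter against its value at the previous printout: once the increment is commented out, the difference is always zero. A rough sketch of that logic follows. The names page_get_report and n_page_gets_old, and the wording of the non-zero message, are hypothetical and not taken from the InnoDB source.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical sketch: report page gets since the previous printout.
struct page_get_report {
  std::uint64_t n_page_gets_old = 0;  // counter value at the last printout

  std::string print(std::uint64_t n_page_gets) {
    assert(n_page_gets >= n_page_gets_old);  // the counter only grows
    const std::uint64_t delta = n_page_gets - n_page_gets_old;
    n_page_gets_old = n_page_gets;
    if (delta == 0)
      return "No buffer pool page gets since the last printout";
    // Illustrative wording only; the real output differs.
    return "Page gets since the last printout: " + std::to_string(delta);
  }
};
```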

      Commenting out ++buf_pool.stat.n_page_gets did not result in a conclusive performance gain (1.003); however, it does reduce CPU utilisation in buf_page_get_gen and enables further analysis.

            People

              Assignee: Marko Mäkelä
              Reporter: Steve Shaw