MDEV-36226: Stall and crash when page cleaner fails to generate free pages during async flush

Details

    Description

      The issue is not reliably repeatable, but here is a snapshot of the stack traces from the user's case when it crashed.

      1. srv_monitor_task: Raises a fatal error because the dict sys latch wait was found to exceed 600 seconds (10 minutes)

      #5  0x000056327cfbad2f in ib::fatal::~fatal (this=<optimized out>, __in_chrg=<optimized out>) 
      #6  0x000056327cfb222f in srv_monitor_task.cold () 
      #7  0x000056327d865268 in tpool::thread_pool_generic::timer_generic::run (this=0x56327f13be50) 
      

      2. DML threads waiting on the dict sys latch while trying to open a table

      #5  dict_sys_t::freeze (line=1039, file="storage/innobase/dict/dict0dict.cc" at /usr/src/debug/MariaDB-/src_0/storage/innobase/dict/dict0dict.cc:1006
      #6  0x000056327d7e9f88 in dict_table_open_on_name (table_name="zabbix/applications", DICT_ERR_IGNORE_FK_NOKEY) at /usr/src/debug/MariaDB-/src_0/storage/innobase/dict/dict0dict.cc:1039
      #7  0x000056327d699f65 in ha_innobase::open(char const*, int, unsigned int) () at /usr/src/debug/MariaDB-/src_0/storage/innobase/handler/ha_innodb.cc:6187
      #9  0x000056327d20fa82 in open_table_from_share
      

      3. Threads stuck waiting for a free page in the buffer pool.

      #1 0x000056327d7dcd9f in buf_LRU_get_free_block (have_mutex=have_mutex@entry=false) at storage/innobase/buf/buf0lru.cc:467
      

      3A. One of the above threads is fetching dict stats information while holding the dictionary X latch, and hence blocks all dictionary operations.

      #1 0x000056327d7dcd9f in buf_LRU_get_free_block (have_mutex=have_mutex@entry=false) at storage/innobase/buf/buf0lru.cc:467
      #2 0x000056327d02425f in buf_page_init_for_read (mode=132, page_id={m_id = 37005438222352}, at storage/innobase/buf/buf0rea.cc:121
      #3 0x000056327d7dd6f7 in buf_read_page_low (unzip=false, zip_size=0, page_id={m_id = 37005438222352}, mode=132, sync=true, space=0x56327efbbdb8) at storage/innobase/buf/buf0rea.cc:295
      #10 0x000056327d76188b in row_sel_step (thr=thr@entry=0x7f9418606dd0) at /usr/src/debug/MariaDB-/src_0/storage/innobase/row/row0sel.cc:2433
      #14 0x000056327d71dd1b in que_eval_sql (info=info@entry=0x7f941805d890,
      #15 0x000056327d8060a2 in dict_stats_fetch_from_ps (table=0x7f9418600e40) at /usr/src/debug/MariaDB-/src_0/storage/innobase/dict/dict0stats.cc:3910
      

      4. The page cleaner thread, which should generate free pages, is busy flushing pages.
      #0  0x00007fa721b3248c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x000056327d7d14bc in buf_dblwr_t::flush_buffered_writes (this=0x56327eaa3520 <buf_dblwr>, size=64) at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0dblwr.cc:588
      #2  0x000056327d7d1b5b in buf_dblwr_t::add_to_batch (this=<buf_dblwr>, request= {bpage = 0x7f9e6800faa0, slot=0x0, type = IORequest::WRITE_ASYNC}, size=16384) at storage/innobase/buf/buf0dblwr.cc:647
      #3  0x000056327d7d7655 in buf_page_t::flush (this=0x7f9e6800faa0, space=0x7f92e0072e78) at /usr/src/debug/MariaDB-/src_0/storage/innobase/include/os0file.h:209
      #4  0x000056327d7d9397 in buf_do_flush_list_batch (lsn=<optimized out>, max_n=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0flu.cc:1524
      #5  buf_flush_list_holding_mutex (lsn=<optimized out>, max_n=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0flu.cc:1598
      #6  buf_flush_list_holding_mutex (max_n=<optimized out>, lsn=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0flu.cc:1579
      #7  0x000056327d7d9dc9 in buf_flush_page_cleaner () at /usr/src/debug/MariaDB-/src_0/storage/innobase/buf/buf0flu.cc:2503
      

      The crash happened because the page cleaner could not free up any page for 10 minutes. In a regular iteration, the page cleaner flushes from the flush list, up to a maximum of innodb_io_capacity_max pages, and then goes on to generate free pages from the LRU tail.

      In this specific case, innodb_io_capacity_max = 4000, so flushing 4000 x 16k ≈ 64 MB of data pages (128 MB of write I/O with the doublewrite buffer) should not take that long. From detailed code analysis I found one catch: in a specific case we may skip freeing any page from the LRU tail and go straight to the next iteration of flush-list flushing. The process could repeat for a long time if the buffer pool size (in GBs) and the total number of dirty pages are large.
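
      As a rough sanity check of that I/O estimate, here is a minimal back-of-the-envelope sketch in C++, assuming the default 16 KiB page size and the reported innodb_io_capacity_max = 4000 (the program is purely illustrative, not part of the server code):

      #include <cstdint>
      #include <cstdio>

      int main()
      {
        // Assumptions for this estimate: default 16 KiB InnoDB page size and
        // the reported setting innodb_io_capacity_max = 4000.
        const uint64_t page_size       = 16 * 1024; // bytes
        const uint64_t io_capacity_max = 4000;      // pages per batch

        const uint64_t batch_bytes    = io_capacity_max * page_size;
        // Each page is written twice when the doublewrite buffer is enabled.
        const uint64_t write_io_bytes = batch_bytes * 2;

        std::printf("data flushed per batch: %.1f MiB\n",
                    batch_bytes / (1024.0 * 1024.0));
        std::printf("write I/O per batch   : %.1f MiB\n",
                    write_io_bytes / (1024.0 * 1024.0));
        // Prints ~62.5 MiB of data pages and ~125 MiB of write I/O, i.e. far
        // less than 10 minutes of work for any reasonable storage device.
        return 0;
      }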

      I could reproduce, in a debugger, the case where the LRU flush/eviction step fails to free any page, with the following scenario.

      1. Dirty pages that were modified but not used for a long time (set-A) fill up the tail of the LRU list.

      2. At the same time, other dirty pages (set-B) were modified earlier than the LRU tail pages but are used and modified repeatedly. These pages sit at the head of the flush list and would be flushed prior to set-A when flushing from the flush list to reduce the checkpoint age.

      3. When asynchronous flush is triggered, i.e. when 7/8th of the LSN margin in the redo log is filled, we try to flush innodb_io_capacity_max pages from the flush list in the iteration. After flushing innodb_io_capacity_max pages from the flush list, we try to do flush and eviction from the LRU (buf_flush_LRU) with max_n = 0, i.e. the number of pages allowed to be flushed during the LRU scan is zero, because we have already flushed innodb_io_capacity_max pages from the flush list in this iteration (see the sketch after this list).

      4. All the pages in the LRU tail are dirty (set-A), so we return without having been able to evict any page at all.
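
      A minimal sketch of the budget arithmetic behind steps 3 and 4, under a simplified model of the page-cleaner iteration (the type and member names below are illustrative, not the actual identifiers in buf0flu.cc):

      #include <cstdio>

      // Simplified model of one page-cleaner iteration during an async flush.
      // The real logic lives around buf_flush_page_cleaner()/buf_flush_LRU();
      // the names and structure here are illustrative only.
      struct iteration_model
      {
        unsigned io_capacity_max;         // innodb_io_capacity_max
        unsigned flushed_from_flush_list; // pages already flushed this iteration

        // Budget (max_n) left for the LRU flush/eviction stage.
        unsigned lru_budget() const
        {
          return io_capacity_max > flushed_from_flush_list
                 ? io_capacity_max - flushed_from_flush_list
                 : 0;
        }
      };

      int main()
      {
        // Async flush: the flush-list stage already consumed the whole quota.
        iteration_model it{4000, 4000};
        std::printf("LRU stage budget (max_n) = %u\n", it.lru_budget()); // 0
        // With max_n == 0 and the LRU tail full of dirty (set-A) pages, the
        // LRU stage can neither flush nor evict anything, so no free page is
        // produced and the next iteration starts another flush-list batch.
        return 0;
      }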

      This scenario can happen with a large buffer pool. It should be fixed by ensuring at least some flush quota for the LRU flush and eviction stage in every iteration, as sketched below.
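
      One possible shape of such a minimum quota, sketched under the same simplified model as above (the helper name and the reserve parameter are assumptions for illustration, not the actual patch):

      #include <algorithm>

      // Illustrative only: reserve a minimum LRU quota so that the LRU
      // flush/eviction stage always gets to do some work, even when the
      // flush-list stage has already consumed innodb_io_capacity_max pages
      // in the current iteration.
      unsigned lru_budget_with_reserve(unsigned io_capacity_max,
                                       unsigned flushed_from_flush_list,
                                       unsigned min_lru_quota /* assumed knob */)
      {
        const unsigned remaining = io_capacity_max > flushed_from_flush_list
                                   ? io_capacity_max - flushed_from_flush_list
                                   : 0;
        // Never hand the LRU stage a zero budget.
        return std::max(remaining, min_lru_quota);
      }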

      Attachments

        Issue Links

          Activity

            debarun Debarun Banerjee added a comment -

            Steve Shaw Possibly some other change has made the behaviour different in 10.6.21, but as you mentioned, that doesn't guarantee the issue is not there. Yes, I think if the issue is repeatable in 10.6.18, it makes good sense to run it with the pull-request changes. BTW, the stack trace (and other details) would still be useful for me even without the patch on 10.6.18 (in case it doesn't repeat with the pull-request patch).

            Steve Shaw Steve Shaw added a comment -

            I also ran against 10.6.19 and did not see the issue or the message. I re-ran against 10.6.18 and this gave the message "2025-03-27 13:24:22 0 [Warning] InnoDB: Could not free any blocks in the buffer pool! 4737408 blocks are in use and 0 free. Consider increasing innodb_buffer_pool_size.", yet it did not hang in the subsequent hour as it had previously; however, stopping and restarting the test with the sessions reconnecting produced a hang immediately. So far, "Could not free any blocks in the buffer pool!" and the hang state have only been reproduced, each time after receiving that message, on 10.6.18 and no other version.


            pandi.gurusamy Pandikrishnan Gurusamy added a comment -

            MDEV-34265 must be included in ES 10.6.19-15, as it is fixed in CS 10.6.19.
            Thanks
            debarun Debarun Banerjee added a comment -

            pandi.gurusamy Right, the customer encountered the current bug on 10.6.19, and that stack trace had also confirmed earlier that it is not related to MDEV-34265. My analysis in this bug is mainly based on the customer's stack trace, and this bug is expected to fix the exact symptom the customer is facing. However, we could not repeat this issue in black-box testing, and Steve Shaw is kindly helping with it.

            Steve Shaw I see. In 10.6.18, you could be hitting MDEV-34265. For the current customer case (CS-211883), however, we are sure that he is hitting a different issue, which can be explained by the current bug MDEV-36226.

            To know for sure whether you are hitting MDEV-34265, I would need to check the complete stack trace in your case when it hangs. Could you please attach gdb to the hanging mariadbd and get the stack traces for all threads?
            Also, I recommend moving to 10.6.19 for future testing of this bug, to ensure that we are not blocked by MDEV-34265.

            marko Marko Mäkelä added a comment -

            axel reported in MDEV-35155 observing a performance improvement. This is most likely thanks to debarun’s fix.

