Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34265

Possible hang during IO burst with innodb_flush_sync enabled

    XMLWordPrintable

Details

    Description

      The issue was found while analyzing the intermittent CI failures.The tests would fail with timeout.
      CI failure: innodb.insert_into_empty
      CI failure: encryption.innochecksum
      [ fail ] timeout after 900 seconds

      This is the relevant stack traces from 2 threads showing the issue. User session thread and purge thread gets blocked. They wait free page cleaner to free pages in BP.

      Page cleaner thread has entered "furious flush" stage and is busy flushing pages to pull forward the checkpoint LSN. It fails to progress as some dirty pages are already latched by mtr from session thread(s). This creates a deadlock.

      Thread 9 (Thread 0xffff85552fc0 (LWP 29458)):
      #0  0x0000ffff9d2b28cc in __futex_abstimed_wait_cancelable64 () from /lib64/libc.so.6
      No symbol table info available.
      #1  0x0000ffff9d2b5c60 in pthread_cond_timedwait@@GLIBC_2.17 () from /lib64/libc.so.6
      No symbol table info available.
      #2  0x0000aaaab58536c8 in buf_LRU_get_free_block (have_mutex=have_mutex@entry=false) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0lru.cc:452
      #3  0x0000aaaab5853bf8 in buf_page_init_for_read (mode=mode@entry=132, page_id=page_id@entry={m_id = 800}, zip_size=zip_size@entry=0, unzip=<optimized out>) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0rea.cc:121
      #4  0x0000aaaab58543bc in buf_read_page_low (space=0xaaaadee50008, sync=sync@entry=true, mode=mode@entry=132, page_id={m_id = 800}, zip_size=zip_size@entry=0, unzip=unzip@entry=false) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0rea.cc:295
      #5  0x0000aaaab5855124 in buf_read_page (page_id=page_id@entry={m_id = 800}, zip_size=zip_size@entry=0) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0rea.cc:464
      #6  0x0000aaaab5843320 in buf_page_get_low (page_id={m_id = 800}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=0x0, mode=16, mtr=0xffff855520d8, err=0x0, allow_ibuf_merge=false) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0buf.cc:2831
      #7  0x0000aaaab5843f2c in buf_page_get_gen (page_id=<optimized out>, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=16, mtr=mtr@entry=0xffff855520d8, err=err@entry=0x0, allow_ibuf_merge=allow_ibuf_merge@entry=false) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0buf.cc:3246
      #8  0x0000aaaab57fd7c8 in purge_sys_t::get_page (this=this@entry=0xaaaab6bfdf80 <purge_sys>, id=id@entry={m_id = 800}) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/trx/trx0purge.cc:787
      #9  0x0000aaaab57fdba0 in purge_sys_t::choose_next_log (this=this@entry=0xaaaab6bfdf80 <purge_sys>) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/trx/trx0purge.cc:909
      #10 0x0000aaaab57fe6ec in purge_sys_t::fetch_next_rec (this=0xaaaab6bfdf80 <purge_sys>) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/trx/trx0purge.cc:1022
      #11 trx_purge_attach_undo_recs (n_purge_threads=4, thd=0xaaaadf1c51d8) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/trx/trx0purge.cc:1281
      #12 trx_purge (n_tasks=n_tasks@entry=4, history_size=<optimized out>) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/trx/trx0purge.cc:1406
       
      Thread 2 (Thread 0xffff95516fc0 (LWP 29374)):
      #3  0x0000aaaab5697a50 in pfs_end_mutex_wait_v1 (locker=<optimized out>, rc=0) at /home/buildbot/aarch64-fedora-40/build/storage/perfschema/pfs.cc:4006
      #4  0x0000aaaab5194cb8 in psi_mutex_lock (that=<optimized out>, file=<optimized out>, line=<optimized out>) at /home/buildbot/aarch64-fedora-40/build/mysys/my_thr_init.c:494
      #5  0x0000aaaab58476c4 in inline_mysql_mutex_lock (that=0xaaaab6c06760 <buf_dblwr+16>, src_file=0xaaaab5a9cba0 "/home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0dblwr.cc", src_line=727) at /home/buildbot/aarch64-fedora-40/build/include/mysql/psi/mysql_thread.h:746
      #6  buf_dblwr_t::flush_buffered_writes (this=0xaaaab6c06750 <buf_dblwr>) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0dblwr.cc:727
      #7  0x0000aaaab5190d24 in buf_flush_list (max_n=<optimized out>, lsn=lsn@entry=9419484) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0flu.cc:1576
      #8  0x0000aaaab584f574 in buf_flush_sync_for_checkpoint (lsn=9419484) at /home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0flu.cc:2187
      #9  buf_flush_page_cleaner () at /home/buildbot/aarch64-fedora-40/build/storage/innobase/buf/buf0flu.cc:2458
      

      Attachments

        Issue Links

          Activity

            People

              debarun Debarun Banerjee
              debarun Debarun Banerjee
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.