MariaDB Server / MDEV-16989

InnoDB hang on crash recovery: Waited for 10 seconds for 256 pending reads

    Description

      wlad made me aware of a PXB-1467 fix for Percona XtraBackup. I believe that the hang scenario described there is also possible in InnoDB and XtraDB crash recovery.

      Quoting sergei-gl’s commit message:

      Here is an example deadlock scenario:

      Thread 1 in `recv_apply_hashed_log_recs' is waiting when
      `buf_pool->n_pend_reads' become not too high to make a progress. It is
      `apply_batch_on=TRUE' and will change it to be `FALSE' once apply batch
      completed. Note that `buf_pool->n_pend_reads' is already high.

      Now, one of the pending reads completes and `buf_page_io_complete'
      invoked. It should decrement `buf_pool->n_pend_reads' and let current
      apply batch to make progress.

      But before decrementing `buf_pool->n_pend_reads', `buf_page_io_complete'
      invoked `ibuf_merge_or_delete_for_page' which in turn triggered one more
      `recv_apply_hashed_log_recs'. This new `recv_apply_hashed_log_recs'
      cannot make progress because `apply_batch_on' is `TRUE', it is waiting
      for thread 1. We are in the deadlock now.

      Lets imagine that all IO handlers (`buf_page_io_complete') stuck in the
      `recv_apply_hashed_log_recs', here is what we see in this case.

      Proposed fix is to decrement `buf_pool->n_pend_reads' before invoking
      `ibuf_merge_or_delete_for_page'.

      This hang should only be possible if there were buffered changes to secondary index leaf pages that were read during redo log processing, possibly by read-ahead.
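
The ordering problem described in the commit message can be illustrated with a small single-threaded model. This is only a sketch and not InnoDB code: `batch_can_finish`, `Order`, and the `limit` parameter are hypothetical names, and the model assumes the worst case from the description, where every pending read completes into an I/O handler that needs a change buffer merge while the apply batch is still open.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the two orderings inside the I/O completion handler.
enum class Order {
    MergeFirst,     // merge (and nested log apply) before decrementing the counter
    DecrementFirst  // proposed fix: decrement the counter before the merge
};

// Returns true if the thread running the apply batch can stop waiting,
// assuming every pending read completes and its handler then needs a merge
// that blocks until the batch is over.
// `pending` stands in for buf_pool->n_pend_reads, and `limit` for the
// threshold that the batch thread waits to get under (an assumed parameter).
bool batch_can_finish(std::size_t pending, std::size_t limit, Order order) {
    assert(limit > 0);
    std::size_t n_pend_reads = pending;
    for (std::size_t i = 0; i < pending; ++i) {
        if (order == Order::DecrementFirst) {
            // The handler releases its slot before it can block in the merge.
            --n_pend_reads;
        }
        // With MergeFirst the handler blocks in the nested log apply here,
        // waiting for the batch to end, so it never reaches its decrement.
    }
    // The batch thread proceeds only once the counter is below the limit.
    return n_pend_reads < limit;
}
```

With the counter already at the limit, for example `batch_can_finish(256, 256, Order::MergeFirst)`, the model reports that the batch thread never gets to proceed, while the same call with `Order::DecrementFirst` lets it continue. The model deliberately ignores real threading and locking; it only shows why moving the decrement ahead of the merge breaks the circular wait.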

            People

              marko Marko Mäkelä
              marko Marko Mäkelä