  1. MariaDB Server
  2. MDEV-12496

mtflush thread hang causes mysqld crash

Details

    Description

      Problem:
      There is a deadlock involving three threads: an SQL thread, the page cleaner coordinator thread, and a page cleaner worker thread.

      thread 8 (page cleaner coordinator thread) waits for -> thread 5 (mtflush_io_thread) waits for -> thread 181 (handle_rpl_parallel_thread) [waiting for a free block]

      For detailed stack information, please refer to the attachment [gdb.txt].

      Analysis:
      There is a race between the page cleaner coordinator thread and the worker threads (mtflush_io_thread): the coordinator may miss an os_event_set from mtflush_io_thread, which causes mysqld to crash. Take the following situation as an example:

      1) The coordinator produces work items for mtflush_io_thread:
      for (i = 0; i < buf_pool_inst; i++) {
          work_item[i].tsk = MT_WRK_WRITE;
          work_item[i].wr.buf_pool = buf_pool_from_array(i);
          work_item[i].wr.flush_type = flush_type;
          work_item[i].wr.min = min_n;
          work_item[i].wr.lsn_limit = lsn_limit;
          work_item[i].wi_status = WRK_ITEM_UNSET;
          work_item[i].wheap = work_heap;
          work_item[i].rheap = reply_heap;
          work_item[i].n_flushed = 0;
          work_item[i].n_evicted = 0;
          work_item[i].id_usr = 0;

          ib_wqueue_add(mtflush_ctx->wq,
                        (void *)(work_item + i),
                        work_heap);
      }

      2) A consumer thread (mtflush_io_thread) consumes a work item and sends an event to the coordinator thread, which has not yet called os_event_wait;

      3) The coordinator thread then calls os_event_wait to collect the status produced by mtflush_io_thread, but the events sent by mtflush_io_thread were signaled before it started waiting, so it misses them;

      4) The coordinator thread blocks in ib_wqueue_wait(mtflush_ctx->wr_cq) and produces no further work items for flush jobs; as a result, user threads from clients use up the free blocks in buf->free_list;

      5) For the reasons above, mysqld eventually crashes.
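
      The ordering in steps 2) and 3) is a classic lost wakeup. The following is a minimal sketch, not the InnoDB code and not the fix applied for this issue: it assumes a hypothetical `CompletionCounter` class that records each completion under a mutex, so a signal sent before the coordinator starts waiting is still observed. The names `CompletionCounter`, `signal_done`, `wait_for_all` and `workers_finish_before_wait` are made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical completion tracker (an assumption for illustration, not InnoDB API).
class CompletionCounter {
public:
    // Called by a worker after it has finished one work item.
    void signal_done() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++done_;  // the completion is stored, so it cannot be missed later
        }
        cv_.notify_all();
    }

    // Called by the coordinator. Returns true once `expected` items are done,
    // false if the timeout expires first.
    bool wait_for_all(int expected, std::chrono::milliseconds timeout) {
        assert(expected >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate is checked before sleeping, so completions that
        // happened before this call are seen immediately.
        return cv_.wait_for(lock, timeout, [&] { return done_ >= expected; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int done_ = 0;
};

// Reproduces the ordering from steps 2) and 3): every worker signals
// BEFORE the coordinator starts to wait.
bool workers_finish_before_wait(int n_workers) {
    CompletionCounter counter;
    std::vector<std::thread> workers;
    for (int i = 0; i < n_workers; ++i) {
        workers.emplace_back([&counter] { counter.signal_done(); });
    }
    for (auto& t : workers) {
        t.join();  // all signals have been sent at this point
    }
    return counter.wait_for_all(n_workers, std::chrono::milliseconds(1000));
}
```

      With this pattern the coordinator waits on recorded state rather than on the signal itself, so the order in which "set" and "wait" happen no longer matters. The timeout is only there so that a missed completion shows up as a `false` return instead of a hang.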

      The attached gdb.txt includes information for all threads.

            People

              jplindst Jan Lindström (Inactive)
              qinglin musazhang
