  1. MariaDB Server
  2. MDEV-33699

Write statements get stuck during IO-bound insert benchmark with InnoDB

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.11.7
    • None
    • None
    • Environment: Ubuntu 22.04, server has 32 cores (cores, not HW threads), 128G of RAM and XFS with SW RAID 0 over 2 NVMe devices

    Description

      I was able to complete CPU-bound insert benchmark tests for all LTS releases from 10.2 through 11.4. But when I try an IO-bound run with MariaDB 10.11.7, sessions doing writes to InnoDB get stuck. By stuck I mean:

      • Per SHOW PROCESSLIST, write statements (insert and delete) have been running for 42000+ seconds. These should finish in less than one second.
      • SHOW ENGINE INNODB STATUS currently hangs (42000+ seconds into the problem).
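The "stuck" criterion above can be sketched as a small filter over SHOW PROCESSLIST rows. This is only an illustration: the `ProcessRow` type, the one-second threshold default, and the sample rows (including the table name `t1`) are hypothetical and are not taken from the attached show_process_list files.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProcessRow:
    """One row of SHOW PROCESSLIST, reduced to the columns used here."""
    id: int
    command: str          # e.g. "Query" or "Sleep"
    time_s: int           # the Time column, in seconds
    info: Optional[str]   # the statement text, or None when idle


WRITE_VERBS = ("insert", "delete")


def stuck_writes(rows: List[ProcessRow], threshold_s: int = 1) -> List[ProcessRow]:
    """Return insert/delete statements that have run longer than threshold_s."""
    result = []
    for row in rows:
        statement = (row.info or "").lstrip().lower()
        is_write = statement.startswith(WRITE_VERBS)
        if row.command == "Query" and is_write and row.time_s > threshold_s:
            result.append(row)
    return result


# Hypothetical rows, made up for illustration only.
sample = [
    ProcessRow(11, "Query", 42017, "INSERT INTO t1 VALUES (...)"),
    ProcessRow(12, "Query", 42017, "DELETE FROM t1 WHERE ..."),
    ProcessRow(13, "Query", 0, "SELECT ... FROM t1"),
    ProcessRow(14, "Sleep", 5, None),
]

stuck_ids = [row.id for row in stuck_writes(sample)]
print(stuck_ids)
```

With the sample rows, only the long-running insert and delete are reported; the select and the idle connection are ignored.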

      By CPU-bound I mean the database fits in the DBMS block cache. By IO-bound I mean the database is larger than memory.

      The benchmark is run with 24 clients and there are 3 connections per client: one for insert, one for delete, one for select. At the point where it gets stuck, the SHOW PROCESSLIST output shows 72 connections; the 48 insert and delete connections are stuck while the selects are proceeding.

      Also, while stuck, from "top" I see that the mariadbd process is busy with %CPU >= 400.

      I will upload the full my.cnf. For this test I set innodb_change_buffering=all. The previous test run used innodb_change_buffering=off and it was too slow (it would have taken ~10 days, while other runs take less than 1 day). So I killed the run and restarted with the change buffer enabled.

      For this benchmark I have been using a big server (32 cores) with high concurrency (24 clients) and a small server (8 cores) with low concurrency (1 client). For IO-bound runs I have only seen this problem on the big server.

      The error log has these two messages, which I don't recall occurring with upstream MySQL+InnoDB. They occur during the create index step of the benchmark, which is a few hours prior to the hang:

      2024-03-16  8:45:50 0 [Warning] InnoDB: Could not free any blocks in the buffer pool! 6490112 blocks are in use and 0 free. Consider increasing innodb_buffer_pool_size.
      2024-03-16 13:03:05 0 [Warning] InnoDB: Could not free any blocks in the buffer pool! 6490112 blocks are in use and 0 free. Consider increasing innodb_buffer_pool_size.
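To put the block count in these warnings into perspective, the following sketch parses one warning line and converts blocks to GiB. It assumes the default 16 KiB innodb_page_size; the actual page size is set in the attached my.cnf, which is not reproduced here, so treat the result as an estimate under that assumption. The helper names are made up for this example.

```python
import re

# Assumption: default innodb_page_size of 16 KiB (not confirmed by this report).
PAGE_SIZE_BYTES = 16 * 1024

WARNING_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}\s+\d{1,2}:\d{2}:\d{2}) \d+ \[Warning\] InnoDB: "
    r"Could not free any blocks in the buffer pool! "
    r"(?P<in_use>\d+) blocks are in use and (?P<free>\d+) free\."
)


def parse_warning(line):
    """Return (timestamp, blocks_in_use, blocks_free), or None if no match."""
    match = WARNING_RE.match(line.strip())
    if match is None:
        return None
    return match.group("ts"), int(match.group("in_use")), int(match.group("free"))


def blocks_to_gib(blocks, page_size=PAGE_SIZE_BYTES):
    """Convert a number of buffer pool blocks to GiB for a given page size."""
    return blocks * page_size / 2**30


line = (
    "2024-03-16  8:45:50 0 [Warning] InnoDB: Could not free any blocks in the "
    "buffer pool! 6490112 blocks are in use and 0 free. "
    "Consider increasing innodb_buffer_pool_size."
)

ts, in_use, free = parse_warning(line)
print(ts, in_use, free, blocks_to_gib(in_use))
```

If the 16 KiB assumption holds, 6490112 blocks correspond to about 99 GiB of pages in use on a server with 128G of RAM.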
      

      For an overview of the insert benchmark see here

      I attached 2 sets of PMP output. For each set there are two files – one grouped, one not grouped. From a quick browse of the stacks I assume the hang is related to the change buffer.
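For readers unfamiliar with PMP (poor man's profiler) output, the difference between "not grouped" and "grouped" can be sketched as follows: the ungrouped form is one backtrace per thread (as printed by gdb's `thread apply all bt`), and the grouped form collapses identical stacks and counts them. This is a minimal sketch of that idea, not the script used to produce the attachments. The sample input is hypothetical, and its frame names are invented and are not InnoDB functions.

```python
from collections import Counter


def frame_name(line):
    """Extract the function name from one gdb backtrace frame line."""
    parts = line.split()
    if len(parts) < 2:
        return "??"
    # "#0  0x00007f... in func () ..." has the name in the 4th field;
    # "#1  func (args) at file.c:10" has it in the 2nd field.
    if parts[1].startswith("0x") and len(parts) >= 4:
        return parts[3]
    return parts[1]


def group_stacks(gdb_output):
    """Collapse per-thread backtraces into (stack, count) pairs, most common first."""
    stacks = []
    current = None
    for raw in gdb_output.splitlines():
        line = raw.strip()
        if line.startswith("Thread "):
            if current:
                stacks.append(",".join(current))
            current = []
        elif line.startswith("#") and current is not None:
            current.append(frame_name(line))
    if current:
        stacks.append(",".join(current))
    return Counter(stacks).most_common()


# Hypothetical input with invented frame names, for illustration only.
sample = """
Thread 3 (Thread 0x7f01 (LWP 103)):
#0  0x00007f0000000001 in wait_for_latch () from /lib/libexample.so
#1  do_write (stmt=0x1) at example.c:10
#2  handle_connection (arg=0x2) at example.c:20

Thread 2 (Thread 0x7f02 (LWP 102)):
#0  0x00007f0000000001 in wait_for_latch () from /lib/libexample.so
#1  do_write (stmt=0x3) at example.c:10
#2  handle_connection (arg=0x4) at example.c:20

Thread 1 (Thread 0x7f03 (LWP 101)):
#0  0x00007f0000000002 in poll () from /lib/libexample.so
#1  main_loop () at example.c:30
"""

grouped = group_stacks(sample)
for stack, count in grouped:
    print(count, stack)
```

In a real hang, the stacks with the highest counts are usually the first place to look, since many threads blocked in the same call chain point at the contended resource.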

      Attachments

        1. 8core.ma1011.11d.flat (49 kB)
        2. 8core.ma1011.11d.hier (8 kB)
        3. dell32.err (3 kB)
        4. my.cnf (2 kB)
        5. pmp.grouped.240317_152037.txt (14 kB)
        6. pmp.grouped.240317_152426.txt (12 kB)
        7. pmp.not-grouped.240317_152037.txt (194 kB)
        8. pmp.not-grouped.240317_152426.txt (197 kB)
        9. show_process_list.txt (10 kB)
        10. show_process_list-1.txt (10 kB)

            People

              Assignee: Unassigned
              Reporter: mdcallag (Mark Callaghan)