MariaDB Server / MDEV-26873

Partial server hang when using many threads

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Do
    • Affects Version/s: 10.2, 10.3, 10.4, 10.5, 10.6, 10.7
    • Fix Version/s: N/A
    • Component/s: Locking
    • Labels:

      Description

      Split from MDEV-26381. This ticket logs a simplified, easily reproducible overview; there are likely more aspects to these hang(s), some of which are described in that ticket.

      Execute the attached hang.sql (identical to [MDEV-26381_OTHER_1.sql] from MDEV-26381), using 10k threads, with all threads replaying in random order (against test db).
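
      A minimal sketch of such a replay harness follows; it is not the tooling actually used for this report. It assumes hang.sql holds one statement per line, and the function names and the client command are hypothetical placeholders for illustration.

      ```python
      import random
      import subprocess
      import threading


      def load_statements(sql_text):
          """Split SQL text into statements (assumption: one statement per line)."""
          return [line.strip() for line in sql_text.splitlines() if line.strip()]


      def shuffled_copy(statements, seed):
          """Return the statements in a random order that is reproducible per seed."""
          rng = random.Random(seed)
          shuffled = list(statements)
          rng.shuffle(shuffled)
          return shuffled


      def replay(statements, seed, client_cmd):
          """Feed one shuffled copy of the statements to a client process via stdin."""
          script = "\n".join(shuffled_copy(statements, seed)) + "\n"
          # check=False: statement errors such as ERROR 1146 are expected here.
          return subprocess.run(client_cmd, input=script, text=True, check=False)


      def run_threads(statements, n_threads, client_cmd):
          """Start n_threads concurrent replays, each with its own random order."""
          threads = [
              threading.Thread(target=replay, args=(statements, seed, client_cmd))
              for seed in range(n_threads)
          ]
          for t in threads:
              t.start()
          for t in threads:
              t.join()
      ```

      A run could then look like `run_threads(load_statements(open("hang.sql").read()), 10_000, ["mysql", "--force", "--user=root", "test"])`, where the client command line is an assumption about the local setup. Client errors go to the inherited stderr, so they appear on the screen.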

      After a few minutes, even on optimized builds, partial hangs start to show. SHOW FULL PROCESSLIST output from a 10.7 occurrence is attached as show_full_processlist.txt. The issue is very easy to reproduce.

      When logging errors (like ERROR 1146 (42S02) at line 1: Table 'test.t2' doesn't exist) to the screen, it is easy to see when the server starts locking up: after 1-5 minutes, the error rate either stops abruptly or slows down significantly. The server then stays in that semi-hung state for 30+ minutes, sometimes unlocking partially, with some threads continuing to process transactions whilst others remain hung.
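
      The "error rate stops or slows down" observation can be made mechanical. The sketch below is an illustration only: it assumes the arrival time of each logged error was recorded in seconds, and the function name, window size, and drop threshold are hypothetical choices, not values from this report.

      ```python
      def first_stall(error_times, end, window=10.0, drop_ratio=0.1):
          """Return the start time of the first full window whose error count
          falls below drop_ratio times the first window's count, else None.

          error_times: timestamps (seconds) at which an error line was logged.
          end: timestamp at which observation stopped; only full windows count.
          """
          if not error_times:
              return None
          start = min(error_times)
          n_windows = int((end - start) // window)
          if n_windows < 2:
              return None  # need a baseline window plus at least one more
          counts = [0] * n_windows
          for t in error_times:
              idx = int((t - start) // window)
              if idx < n_windows:  # ignore the trailing partial window
                  counts[idx] += 1
          baseline = counts[0]
          for i in range(1, n_windows):
              if counts[i] < drop_ratio * baseline:
                  return start + i * window
          return None
      ```

      The first window serves as the baseline, so the sketch only makes sense if the server was still healthy during that window.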

      The machine is neither OOM nor OOS, nor is it busy (nothing else is running) or challenged by the 10k threads (low load average in htop). IOW, this is not related to server hardware or capacity in any way, afaict.

      The tested version/revision was 10.7.1 b4911f5a34f8dcfb642c6f14535bc9d5d97ade44 (Optimized).

              People

              Assignee:
              wlad Vladislav Vaintroub
              Reporter:
              Roel Roel Van de Paar
              Votes:
              0
              Watchers:
              5
