MDEV-31759 (MariaDB Server)

Large grain of dict_sys lock by table creation affects performance

Details

    Description

      Our testers created tables from 500 concurrent sessions (200 tables per session), which caused slow queries.
      I used pstack to inspect the stacks and found that most threads were waiting for the mutex of the global variable dict_sys.

      I looked at the related code in the method
      ha_innobase::create
      and think the locked range is too large.

      row_mysql_lock_data_dictionary / row_mysql_unlock_data_dictionary are called around
      error = info.create_table(own_trx)

      In create_table_info_t::create_table, many steps allocate objects or simply set members of those objects before they are attached to the cache of the global variable dict_sys. IMO those steps don't require the mutex of the global dictionary.

      Would it be better to make the dict_sys mutex finer-grained instead of holding it for the whole of create_table_info_t::create_table?
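
      To illustrate the idea, here is a minimal sketch, not InnoDB code. Table, DictCache and the create_table_* functions are hypothetical stand-ins, and the sketch assumes that object setup touches no shared state. The real create_table_info_t::create_table may need the dictionary lock for other reasons that this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; these are NOT InnoDB's real types.
struct Table {
    std::string name;
    std::vector<std::string> columns;
};

struct DictCache {
    std::mutex mutex;  // plays the role of the global dict_sys mutex
    std::unordered_map<std::string, std::unique_ptr<Table>> tables;
};

// Coarse-grained: allocation and member setup all happen under the mutex.
bool create_table_coarse(DictCache& dict, const std::string& name) {
    std::lock_guard<std::mutex> guard(dict.mutex);
    auto table = std::make_unique<Table>();
    table->name = name;
    table->columns = {"id", "payload"};
    return dict.tables.emplace(name, std::move(table)).second;
}

// Finer-grained: build the object privately, lock only to attach it to the cache.
bool create_table_fine(DictCache& dict, const std::string& name) {
    auto table = std::make_unique<Table>();
    table->name = name;
    table->columns = {"id", "payload"};
    std::lock_guard<std::mutex> guard(dict.mutex);
    return dict.tables.emplace(name, std::move(table)).second;
}

// Runs `sessions` threads, each creating `per_session` uniquely named tables,
// and returns the number of tables that ended up in the cache.
std::size_t run_sessions(bool fine, int sessions, int per_session) {
    DictCache dict;
    std::vector<std::thread> workers;
    for (int s = 0; s < sessions; ++s) {
        workers.emplace_back([&dict, fine, s, per_session] {
            for (int t = 0; t < per_session; ++t) {
                std::string name =
                    "t_" + std::to_string(s) + "_" + std::to_string(t);
                if (fine) {
                    create_table_fine(dict, name);
                } else {
                    create_table_coarse(dict, name);
                }
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return dict.tables.size();
}
```

      Both variants leave the cache in the same state. The difference is only how long each thread holds the mutex, which is what matters when hundreds of sessions contend for it.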

          Activity

            marko Marko Mäkelä added a comment -

            Which MariaDB Server version is this about? I do not think that dict_sys.latch should be a bottleneck in MariaDB Server 10.6 or later. The dict_sys.mutex was removed in MDEV-24258.
            lyufangabriel Fan Lyu added a comment -

            Hello Marko, I am indeed using 10.5.
            In fact, I had a discussion with Monty last weekend regarding the Linux mutex "jump in the queue" behaviour.

            marko Marko Mäkelä added a comment -

            Thank you, lyufangabriel. I have not analyzed such "jumping the queue" behaviour myself, but I believe that it could be dependent on the Linux kernel version, possibly some scheduling parameters, and on the hardware architecture (SMP vs. NUMA). In MariaDB Server 10.6, MDEV-21452, MDEV-27058 and many other changes should improve multi-threaded performance (while also making the code easier to debug).

            Can you test in a staging environment whether 10.6 would perform better for you?

            marko Marko Mäkelä added a comment -

            MDEV-31095 should have addressed some "jumping the queue", in MariaDB Server 10.6.16 and later major versions.

            People

              marko Marko Mäkelä
              lyufangabriel Fan Lyu