MariaDB Server / MDEV-5533

Backport Threadpool improvements from Percona Server

Details

    Description

      The thread pool implementation in Percona Server is based on the MariaDB one, but some improvements have been made, such as low-priority queue throttling (http://www.percona.com/doc/percona-server/5.5/performance/threadpool.html#low-priority-queue-throttling):

      One case that can limit thread pool performance and even lead to deadlocks under high concurrency is a situation when thread groups are oversubscribed due to active threads reaching the oversubscribe limit, but all/most worker threads are actually waiting on locks currently held by a transaction from another connection that is not currently in the thread pool.

      What happens in this case is that those threads in the pool that have marked themselves inactive are not accounted to the oversubscribe limit. As a result, the number of threads (both active and waiting) in the pool grows until it hits thread_pool_max_threads value. If the connection executing the transaction which is holding the lock has managed to enter the thread pool by then, we get a large (depending on the thread_pool_max_threads value) number of concurrently running threads, and thus, suboptimal performance as a result. Otherwise, we get a deadlock as no more threads can be created to process those transaction(s) and release the lock(s).

      Such situations are prevented by throttling the low priority queue when the total number of worker threads (both active and waiting ones) reaches the oversubscribe limit. That is, if there are too many worker threads, do not start new transactions and create new threads until queued events from the already started transactions are processed.
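
      The throttling rule quoted above can be sketched roughly as follows. This is only an illustrative model, not Percona Server's actual code: the ThreadGroup class, its field names, and the next_event method are hypothetical, and it assumes that events from already started transactions sit in a separate high-priority queue while new transactions wait in the low-priority one.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ThreadGroup:
    """Hypothetical model of one thread group; not the real server structure."""
    oversubscribe_limit: int
    active_threads: int = 0
    waiting_threads: int = 0
    # Assumption: events of already started transactions go here.
    high_prio: deque = field(default_factory=deque)
    # Assumption: events that would start a new transaction go here.
    low_prio: deque = field(default_factory=deque)

    def too_many_threads(self) -> bool:
        # Count both active and waiting workers against the limit.
        return self.active_threads + self.waiting_threads >= self.oversubscribe_limit

    def next_event(self):
        # Already started transactions are always served first.
        if self.high_prio:
            return self.high_prio.popleft()
        # New transactions are throttled while the group is oversubscribed.
        if self.low_prio and not self.too_many_threads():
            return self.low_prio.popleft()
        return None
```

      With an oversubscribe limit of 3, one active and two waiting threads, such a group would keep serving the high-priority queue but leave the low-priority queue untouched until the thread count drops.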

      Another change is that the default value of the thread_pool_max_threads variable has been raised:

      Default value for thread_pool_max_threads was changed from 500 to 100 000. This change was introduced because limiting the total number of threads in the Thread Pool can result in deadlocks and uneven distribution of worker threads between thread groups in case of stalled connections.

      The default value of 500 is quite restrictive and can be limiting on some setups. A higher default might be better suited, to avoid hitting the thread limit long before hitting the max_connections limit.
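
      As a rough illustration of why the limit matters, the sketch below counts how many worker threads remain once a number of connections are blocked on a lock. It is a deliberately simplified worst case that assumes every blocked connection occupies exactly one worker thread; the function name and the figure of 1000 blocked connections are made up for the example.

```python
def threads_left_for_lock_holder(blocked_connections: int,
                                 thread_pool_max_threads: int) -> int:
    """Worker threads still available if each blocked connection holds one thread."""
    return max(thread_pool_max_threads - blocked_connections, 0)


# Hypothetical load: 1000 connections all waiting on the same lock.
for limit in (500, 1500, 100_000):
    left = threads_left_for_lock_holder(1000, limit)
    print(f"thread_pool_max_threads={limit}: {left} threads left")
```

      Under this assumption, a limit of 500 leaves no thread for the connection that holds the lock, while 1500 or 100 000 still does.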

        Activity

          colin Colin Charles added a comment -

          We talked about this for 10.1, but maybe we should do this in 10.0 too. Svoj, please take a look.


          serg Sergei Golubchik added a comment -

          There are different opinions about this thread_pool_max_threads increase. It's a questionable change and more analysis is needed...

          jb-boin Jean Weisbuch added a comment -

          I personally had issues with the default value on some servers and ended up with many deadlocks in some cases where I didn't hit any problem with the default thread implementation. I think that a higher default limit, with a more conservative value such as 1000 or 1500, could be enough for most setups.

          serg Sergei Golubchik added a comment - - edited

          jb-boin, what kind of deadlocks did you have? The only possible "deadlock" that comes to my mind is when you lock all connections out, for example, with FLUSH TABLES WITH READ LOCK — so everybody has to wait — and all other 499 threads are taken by waiting connections, so you cannot connect anymore. Was that it?

          jb-boin Jean Weisbuch added a comment -

          I don't remember whether it was specifically deadlocks that were happening or queries being stalled or hitting timeouts, but I am certain that I hit the thread limit and that queries or connections were failing, while they worked just fine before switching to the threadpool, with no other changes whatsoever.
          Raising the value on these servers to a still rather conservative 1000-1500 fixed the problem, and the overall resources used by the server were significantly lower than before enabling the threadpool.

          In any case, I think that a default value high enough not to be blocking on most setups would be safer. The threadpool is beneficial on servers with many concurrent connections and is not enabled by default, which means that it will probably be enabled mainly on servers with high concurrency, and on systems that aren't so limited in resources that creating more than 500 threads would hurt (which would probably still use fewer resources overall than not using the threadpool at all).


          People

            serg Sergei Golubchik
            jb-boin Jean Weisbuch
            Votes: 3
            Watchers: 8
