[MXS-721] Improve scalability at multi threaded core Created: 2016-05-13  Updated: 2017-12-01  Resolved: 2016-10-08

Status: Closed
Project: MariaDB MaxScale
Component/s: Core
Affects Version/s: 1.4.1
Fix Version/s: 2.0.0

Type: New Feature Priority: Major
Reporter: VAROQUI Stephane Assignee: martin brampton (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MXS-677 Improve efficiency of MaxScale Closed
relates to MXS-731 Optimise the use of spinlock while dr... Closed

 Description   

Vadim points out excessive spinlock usage as concurrency increases:
https://www.percona.com/blog/2016/05/12/proxysql-versus-maxscale-for-oltp-ro-workloads/

An internal implementation discussion surfaced multiple tasks that could yield improvements:

  • have N epoll fds (one per thread) to remove a shared lock
  • reduce the multiple non-blocking polls to a single one
  • implement a thread pool
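The first bullet can be sketched as follows. This is a minimal illustration (not MaxScale code), assuming Linux epoll via Python's select module: each worker thread owns its own epoll instance and file descriptors are partitioned across the instances, so no lock guards a shared poll set.

```python
# Sketch: one epoll instance per worker thread, so threads never
# contend on a single shared poll set. Linux-only (select.epoll).
import os
import select
import threading

N_THREADS = 2

def worker(ep, ready_fds):
    """Each worker polls only its own epoll fd -- no shared lock needed."""
    for fd, events in ep.poll(timeout=1.0):
        if events & select.EPOLLIN:
            ready_fds.append(fd)

# One epoll fd per thread; descriptors are assigned round-robin.
epolls = [select.epoll() for _ in range(N_THREADS)]
pipes = [os.pipe() for _ in range(N_THREADS)]   # stand-ins for client sockets
results = [[] for _ in range(N_THREADS)]

for i, (r, w) in enumerate(pipes):
    epolls[i % N_THREADS].register(r, select.EPOLLIN)

for _, w in pipes:
    os.write(w, b"x")                           # make each read end readable

threads = [threading.Thread(target=worker, args=(epolls[i], results[i]))
           for i in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread saw exactly the descriptors assigned to it.
print(sum(len(r) for r in results))
```

The trade-off Martin raises below still applies to this model: once a descriptor is pinned to one thread's epoll set, a busy thread can delay its clients while another thread sits idle.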


 Comments   
Comment by martin brampton (Inactive) [ 2016-05-15 ]

It seems to be an assumption that the performance of MaxScale in the benchmark is reduced by virtue of some issue in relation to spinlock. So far as I can see, the word "spinlock" does not occur in Vadim's article.

Assuming ProxySQL is being used in a similar fashion to its use in the benchmark published by René Cannaò recently, then the difference in performance between ProxySQL and MaxScale is likely to be primarily the result of caching in ProxySQL. For various reasons MaxScale does not currently provide caching.

As time permits, we will run tests of multiple epoll fds, but past profiling runs of MaxScale do not indicate a problem in this area, and restricting a DCB to being processed by a particular processor runs the risk of skewed loading of processors, so that one client might be waiting for the thread that supports their connection, even though there is another thread that is doing nothing.

The question of the optimal number of non-blocking calls to epoll will be reviewed, along with the original design principles.

I'm assuming the last item really means implementing a connection pool, since MaxScale currently has a thread pool. Adopting a connection pool in the way done by ProxySQL would be a sizeable project, and the first step is to better understand what it would involve. It is not clear at present how many client use cases would be helped by this facility.

Comment by René Cannaò [ 2016-05-16 ]

Hi Martin.

To clarify, ProxySQL caches only resultsets for statements explicitly set to be cached. That means that, unless configured otherwise, it doesn't perform any caching.

Comment by Dipti Joshi (Inactive) [ 2016-05-18 ]

rcannao, With ProxySQL, does the application need to explicitly indicate which statements are to be cached, or is it configured on ProxySQL?

Comment by René Cannaò [ 2016-05-18 ]

Dipti,

ProxySQL is designed to be completely transparent to the application; that means the application should not be aware that a proxy exists. With regard to caching, that means the DBA should configure ProxySQL, describing which queries need to be cached.
As an additional feature, the application can also be explicit and instruct ProxySQL to cache the resultsets for certain statements.

That said, I am not really sure why this is relevant in this context.

Comment by Dipti Joshi (Inactive) [ 2016-05-18 ]

rcannao Thanks for the explanation - just wanted to understand whether the setting was on the ProxySQL side or the application side. And it is a static setting, not a dynamic one.

Comment by René Cannaò [ 2016-05-18 ]

in ProxySQL almost everything is dynamic. There are only 2 exceptions:

  • number of threads;
  • binding IPs/ports: this will become dynamic at some point in the future.

Comment by Dipti Joshi (Inactive) [ 2016-05-18 ]

rcannao If the DBA has to configure ProxySQL to describe the queries that need to be cached - that is administrative configuration, correct? Meaning ProxySQL does not itself determine which queries to cache at run time.

Comment by René Cannaò [ 2016-05-18 ]

That's correct.
The DBA needs to describe the queries that need to be cached, and it is an administrative configuration.
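For illustration, such an administrative rule might look like this in ProxySQL's admin interface. This is a hedged sketch: the table `items` and the digest pattern are hypothetical, though `mysql_query_rules` and `cache_ttl` (in milliseconds) are part of ProxySQL's documented configuration.

```sql
-- Hypothetical rule: cache resultsets of a hot lookup for 5 seconds.
INSERT INTO mysql_query_rules (rule_id, active, match_digest, cache_ttl)
VALUES (10, 1, '^SELECT .* FROM items WHERE id = \?', 5000);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
```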

Comment by VAROQUI Stephane [ 2016-07-15 ]

OK, when you see this:

%Cpu(s): 6.7 us, 14.0 sy, 0.0 ni, 71.0 id, 0.0 wa, 0.0 hi, 5.7 si, 2.6 st

when the only process running is MaxScale (1 thread, readconn router to master) -
14% sys time against 6.7% user takes a sysadmin five minutes to decide to uninstall the product.
Please pay more attention to this task.

And report about the progress!

innotop report :
When Load QPS Slow QCacheHit KCacheHit BpsIn BpsOut
Now 0.02 6.73k 0 0.00% 100.00% 2.55M 32.27M

Comment by martin brampton (Inactive) [ 2016-07-19 ]

Thanks for your comments. This isn't being ignored! It is a painstaking process finding performance issues and dealing with them. The initial suggestions were carefully reviewed, but proved of limited value, being either incompatible with MaxScale or unlikely to yield significant gains.

There is not a general problem with the spinlock mechanism, although it has been known for some time that there was a contention problem with the DCB write queue between the mechanism draining the queue and the placing of new items in the queue. It needed a team effort to come up with a solution, and also the availability of time against competing requirements. However, a significant gain in this area was achieved in task MXS-731, where profiling indicated a big improvement.
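One common way to reduce that kind of drainer/enqueuer contention is to swap the pending queue out under the lock and then process the batch lock-free. This is a sketch of the general pattern only, not necessarily the change made in MXS-731; the WriteQueue class and names here are illustrative.

```python
# Sketch: writers and the drainer share the lock only for pointer
# swaps, never while the drained buffers are being written out.
import threading

class WriteQueue:
    def __init__(self):
        self._lock = threading.Lock()   # stands in for the DCB spinlock
        self._pending = []              # buffers waiting to be written
        self.drained = []               # stands in for "written to socket"

    def append(self, buf):
        # Writers hold the lock only long enough to enqueue one item.
        with self._lock:
            self._pending.append(buf)

    def drain(self):
        # Swap the whole pending list out under the lock, then process
        # the batch without blocking concurrent writers.
        with self._lock:
            batch, self._pending = self._pending, []
        for buf in batch:
            self.drained.append(buf)

q = WriteQueue()
writers = [threading.Thread(target=q.append, args=(b"buf%d" % i,))
           for i in range(4)]
for t in writers:
    t.start()
for t in writers:
    t.join()
q.drain()
print(len(q.drained))
```

The key property is that the lock hold time no longer depends on how much data is queued, so a slow socket write cannot stall new writers.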

A number of optimised mechanisms have also been deployed for allocating DCBs and sessions, described in task MXS-677. This gives gains in scenarios where DCBs and sessions are frequently created and destroyed; it will have little impact where there are few, long running sessions.

The read-write split router is suspected of holding spinlocks for too long, and I am about to start refactoring this code (also for reasons other than performance), although that would obviously not affect your example of a system running the read connection router.

I am also looking to set up a dedicated performance test bed to obtain further analysis of MaxScale performance, and to pin down areas where gains may be achievable.

Comment by VAROQUI Stephane [ 2016-07-19 ]

Good to hear the team is making progress here.

Generated at Thu Feb 08 04:01:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.