[MXS-4204] Maxscale Schema Router is slow to respond for normal querying. Created: 2022-07-14 Updated: 2022-09-16 Resolved: 2022-09-16 |
|
| Status: | Closed |
| Project: | MariaDB MaxScale |
| Component/s: | N/A |
| Affects Version/s: | 6.4.0 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Edward Stoever | Assignee: | markus makela |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Tested on Maxscale 6.4.0 Customer has requested that maxscale schema router be configured so that they can shard schemas by moving them to separate Mariadb instances. Current customer configuration: single instance, 300 schemas, each with approximately 75 tables. Approximately 1,000 user accounts. Each account with varied granted privileges. To test this scenario, I created the following:
Select_priv = grant select on *.* to I created scripts that would connect repeatedly and select, insert, update, delete. Each query a new connection from a separate account. Example:
Running a typical script with 605 lines connecting directly to the database completes in 10 to 15 seconds. |
| Comments |
| Comment by markus makela [ 2022-07-15 ] | |||
|
10 minutes is definitely too slow if it happens after the initial building of the map of all shards and their locations. One thing that would help evaluate some of the latency issues is to run the following query and see how long it takes:
This is the query that the schemarouter executes to determine which tables are visible for which users. | |||
| Comment by markus makela [ 2022-07-15 ] | |||
|
One thing I did notice is that if you have log_info=true, the performance will be much slower as the majority of the time is spent writing debug log data into the log files. Checking whether it's on via maxctrl show maxscale would be a good sanity check to make. The given test case is pretty much the worst-case performance you can get from schemarouter: the shard map cache won't be of any use as all users are unique and each of them builds the shard map when it starts. The default cache validity of 300 seconds isn't helping here that much either so increasing that to something like refresh_interval=24h would also be sensible (assuming the set of tables and databases insn't constantly changing). | |||
| Comment by markus makela [ 2022-07-15 ] | |||
|
I did find some odd slowness in the schemarouter even in the case where the shard map is found in the cache: compared to a readwritesplit service which took 4 seconds, the schemarouter took around 7 seconds to iterate over 1000 users and do a SELECT 1. This was with the command line client so one of the queries it does might cause this slowness; further testing is requried. | |||
| Comment by markus makela [ 2022-09-16 ] | |||
|
Closing as Not a Bug since this is very much expected behavior with the way the schemarouter was designed to work. Sequentially executing SQL statements by using a separate user for each statement is expected to take longer as it forces the schemarouter to build the shard location map every time the user logs in. This effectively bypasses the whole caching that is in place for it and is comparable to using InnoDB with no buffer pool. |