[MXS-4151] Schemarouter duplicate checks are excessively slow Created: 2022-06-02  Updated: 2022-06-27  Resolved: 2022-06-03

Status: Closed
Project: MariaDB MaxScale
Component/s: schemarouter
Affects Version/s: 2.5.20, 6.3.1
Fix Version/s: 2.5.21, 6.4.0

Type: Bug Priority: Major
Reporter: markus makela Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None


 Description   

If duplicate checks are enabled, their cost grows very quickly as the number of tables increases. With around 50000 tables and the default duplicate checks, the checks take 25 seconds on average. With ignore_tables_regex=.* the time drops to around 500 milliseconds, a large part of which is network latency.

The check is slow because, for each visible table, a lookup into the table location data is done while the result is being iterated. As the location lookup processes all tables (a somewhat dumb approach), it ends up iterating over all tables once per row, resulting in roughly quadratic complexity. By first inserting all the elements into the resulting container, the duplicate check can be done later in a single pass over the whole container. This results in linear complexity, which works out far better.
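
The two-phase approach described above can be illustrated with a minimal sketch. This is not the actual schemarouter code: the Row and TableMap types and the collect_tables and find_duplicates functions are hypothetical names chosen for illustration, and the sketch assumes each backend reports its tables as (table, server) pairs. A table counts as a duplicate when more than one server reports it.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical row type: (fully qualified table name, server name).
using Row = std::pair<std::string, std::string>;
// Hypothetical container: table name -> set of servers that have the table.
using TableMap = std::unordered_map<std::string, std::unordered_set<std::string>>;

// Phase 1: insert every row into the container without checking anything.
// Each insertion is O(1) on average, so this phase is linear in the row count.
TableMap collect_tables(const std::vector<Row>& rows)
{
    TableMap tables;

    for (const auto& row : rows)
    {
        tables[row.first].insert(row.second);
    }

    return tables;
}

// Phase 2: one pass over the finished container. A table that is present
// on more than one server is reported as a duplicate.
std::vector<std::string> find_duplicates(const TableMap& tables)
{
    std::vector<std::string> duplicates;

    for (const auto& entry : tables)
    {
        if (entry.second.size() > 1)
        {
            duplicates.push_back(entry.first);
        }
    }

    return duplicates;
}
```

With this layout no lookup over all known tables is needed while the rows are being read, which is what caused the roughly quadratic behavior. The order of the returned duplicates is unspecified because the sketch uses an unordered container.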

