[MXS-3531] Regex matching for SQL has no hard limits Created: 2021-05-04  Updated: 2023-04-28  Resolved: 2023-04-28

Status: Closed
Project: MariaDB MaxScale
Component/s: namedserverfilter
Affects Version/s: None
Fix Version/s: 23.08.0

Type: New Feature Priority: Major
Reporter: markus makela Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None


 Description   

If one of the filters that uses a regular expression against an SQL statement is configured with a pattern that backtracks catastrophically and an excessively long SQL statement is used, the matching can take several minutes. To protect against user errors, filters that use regular expressions could limit the length of the SQL statement they match against to a default value. This would limit the impact of expensive regular expressions.
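
To make the failure mode concrete, here is a minimal sketch in Python, whose `re` module is also a backtracking engine. The pattern `(a+)+$` and the helper `time_failed_match` are illustrative assumptions and are not taken from MaxScale or from any real filter configuration. The point is only that the time spent on a failing match grows very quickly with the length of the input.

```python
import re
import time

# Illustrative pattern only: nested quantifiers such as (a+)+ are a classic
# source of catastrophic backtracking in backtracking regex engines.
PATTERN = re.compile(r"(a+)+$")


def time_failed_match(length: int) -> float:
    """Return the seconds spent failing to match `length` 'a's followed by 'b'."""
    subject = "a" * length + "b"  # the trailing 'b' forces the match to fail
    start = time.perf_counter()
    result = PATTERN.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None  # the pattern can never match this subject
    return elapsed


if __name__ == "__main__":
    # Keep the lengths small: every extra character makes the failure
    # noticeably more expensive.
    for length in (12, 16, 20):
        print(f"{length:>3} chars: {time_failed_match(length):.4f} s")
```

The lengths are deliberately tiny compared to a real SQL statement. Exact timings depend on the machine and the engine, so none are quoted here.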



 Comments   
Comment by Johan Wikman [ 2021-05-04 ]

But what should be done if the limit is reached? Is there any option other than closing the connection? That would mean trading a catastrophically long execution time for broken connections. I'm not sure that improves the situation.

Would it be better to simply measure the matching time and, if it exceeds some threshold, log a very stern warning that the regular expression is potentially very expensive?
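
A rough sketch of the measure-and-warn idea, written in Python for brevity rather than in MaxScale's own C++. The function name `timed_search`, the logger name, and the 0.5 second threshold are all hypothetical, since the issue does not propose concrete values. Note that the warning can only be logged once the match has returned.

```python
import logging
import re
import time

log = logging.getLogger("regex_timing")  # hypothetical logger name

# Hypothetical threshold; the issue does not propose a concrete value.
SLOW_MATCH_THRESHOLD_S = 0.5


def timed_search(pattern: re.Pattern, sql: str,
                 threshold_s: float = SLOW_MATCH_THRESHOLD_S):
    """Run pattern.search(sql) and warn if it took longer than threshold_s.

    Returns a (match, elapsed_seconds) tuple.
    """
    start = time.perf_counter()
    match = pattern.search(sql)
    elapsed = time.perf_counter() - start
    if elapsed > threshold_s:
        log.warning(
            "Pattern %r took %.3f s on a %d-character statement; "
            "it is potentially very expensive",
            pattern.pattern, elapsed, len(sql),
        )
    return match, elapsed
```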

Comment by markus makela [ 2021-05-04 ]

The problem is that with the right regex, MaxScale is killed by the systemd watchdog well before we know how long the matching took. It's not very hard to write a regular expression that works fine with a small enough string but takes impossibly long to match even a moderately large one.

In cases where the SQL length exceeds the matching limit, the behavior could be made configurable (close the session, ignore the statement, reject the query, etc.). For things like the dbfwfilter, the right action to take would be to reject the query. For namedserverfilter, the default could be to just ignore it and assume there's a less complex catch-all pattern.
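
The configurable behavior described above could look roughly like the following Python sketch. Everything here is hypothetical: the `OnLimit` enum, the exception types, and `guarded_search` are names invented for illustration and do not correspond to actual MaxScale parameters or APIs.

```python
import re
from enum import Enum
from typing import Optional


class OnLimit(Enum):
    """Hypothetical actions for statements longer than the matching limit."""
    IGNORE = "ignore"                # skip the regex and report "no match"
    REJECT = "reject"                # refuse the query
    CLOSE_SESSION = "close_session"  # terminate the client session


class QueryRejected(Exception):
    """Raised when an over-long statement is rejected."""


class SessionClosed(Exception):
    """Raised when an over-long statement closes the session."""


def guarded_search(pattern: re.Pattern, sql: str, max_len: int,
                   on_limit: OnLimit) -> Optional[re.Match]:
    """Match only statements of at most max_len characters.

    Longer statements never reach the regex engine; they are handled
    according to on_limit instead.
    """
    if len(sql) > max_len:
        if on_limit is OnLimit.REJECT:
            raise QueryRejected(f"statement exceeds {max_len} characters")
        if on_limit is OnLimit.CLOSE_SESSION:
            raise SessionClosed(f"statement exceeds {max_len} characters")
        return None  # IGNORE: behave as if the pattern did not match
    return pattern.search(sql)
```

In this sketch a firewall-style filter would pass `OnLimit.REJECT`, while a routing filter would pass `OnLimit.IGNORE` and fall through to whatever it does when no pattern matches.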

Another thing to look into would be to set an upper limit on the amount of backtracking PCRE2 does. I believe it is configurable with pcre2_set_heap_limit, which by default doesn't seem to have a limit. This might end up being a better solution than limiting the length of the SQL.

Comment by Johan Wikman [ 2021-05-11 ]

Fascinating, so some mysterious watchdog kills may be explained by this.

Generated at Thu Feb 08 04:22:05 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.