[MDEV-24565] Please add a string function to count how often some character is in a string Created: 2021-01-11  Updated: 2022-05-21

Status: Open
Project: MariaDB Server
Component/s: OTHER
Fix Version/s: None

Type: Task Priority: Minor
Reporter: Wolfgang Formann Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None


 Description   

In Wikipedia we often have article names with parenthesis like https://en.wikipedia.org/wiki/Walker_(surname) and it would be nice to have some function which enables me to find those, where the number of opening parenthesis matches the numer of closing ones.

Current solution is a query like
( pl_title LIKE "%)%)%" AND pl_title NOT LIKE "%(%(%" )
OR
( pl_title LIKE "%)%" AND pl_title NOT LIKE "%(%" )
...

It would be a lot easier to write a query like
CHAR_COUNT(pl_title, '(') = CHAR_COUNT(pl_title, ')')



 Comments   
Comment by Sergei Golubchik [ 2022-05-20 ]

you can use regular expressions for that, a regex to match recursive parentheses is easy to google:

MariaDB [test]> select regexp_substr('foo ( bar ( lll ) ss.. ( lslkdls ) ) jsjksjdk )', '\\((?:[^()]|((?R)))+\\)');
+---------------------------------------------------------------------------------------------+
| regexp_substr('foo ( bar ( lll ) ss.. ( lslkdls ) ) jsjksjdk )', '\\((?:[^()]|((?R)))+\\)') |
+---------------------------------------------------------------------------------------------+
| ( bar ( lll ) ss.. ( lslkdls ) )                                                            |
+---------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Comment by Wolfgang Formann [ 2022-05-20 ]

Sergei, I want to find all those strings having a different number of opening parenthesis compared to the number of closing ones. Although your idea with matching parenthesis might be more strict, it does not seem to work as I expect and oes not really match my requirement.

Comment by Daniel Black [ 2022-05-21 ]

Using regexp_replace of every other character and compare lengths:

set @str='foo ( bar ( lll ) ss.. ( lslkdls ) ) jsjksjdk )';
select length(regexp_replace(@str, '[^(]','')) = length(regexp_replace(@str, '[^)]',''));

Generated at Thu Feb 08 09:30:56 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.