[MDEV-6965] non-captured group \2 in regexp_replace
Created: 2014-10-28  Updated: 2014-11-10  Resolved: 2014-11-10

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Affects Version/s: 10.0.11
Fix Version/s: 10.0.15

Type: Bug Priority: Major
Reporter: Julian Ladisch Assignee: Alexander Barkov
Resolution: Fixed Votes: 0
Labels: regexp


 Description   

select regexp_replace('1 foo and bar', '(\\d+) foo and (\\d+ )?bar', '\\1 this and \\2that');

expected result: 1 this and that
actual result: 1 this and 2that



 Comments   
Comment by Elena Stepanova [ 2014-10-28 ]

Thinking about it, it sort of makes sense.
The second group does not match an empty value; it is not defined at all, so the ambiguous sequence backslash-backslash-2 defaults to '2'.
It's counter-intuitive, though.

My quick search through the PCRE documentation hasn't found anything saying whether this is intentional, so I'm assigning it to bar to confirm (or not).
Or maybe there is a flag that controls it; I haven't found one either.

Comment by Julian Ladisch [ 2014-10-29 ]

Perl and JavaScript also produce the expected result "1 this and that":

perl -e '$_="1 foo and bar\n"; s/(\d+) foo and (\d+ )?bar/\1 this and \2that/; print;'

document.write("1 foo and bar".replace(/(\d+) foo and (\d+ )?bar/, "$1 this and $2that"));
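
A slightly expanded sketch of the JavaScript check, written for Node.js (an assumption; the original one-liner targets a browser via document.write). The variable names are illustrative only. It separates the two facts under discussion: the optional second group is unset rather than empty, and replace() nevertheless substitutes an empty string for it.

```javascript
// Pattern and subject taken from the report. A regex literal needs
// single backslashes, unlike the SQL string literal in the description.
const pattern = /(\d+) foo and (\d+ )?bar/;
const subject = "1 foo and bar";

// The optional second group does not participate in the match,
// so JavaScript reports it as undefined rather than as "".
const match = subject.match(pattern);
console.log(match[1]);
console.log(match[2]);

// In the replacement string, the unset group expands to an empty string.
const replaced = subject.replace(pattern, "$1 this and $2that");
console.log(replaced);
```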

http://www.pcre.org/pcre.txt
" PCRE_JAVASCRIPT_COMPAT
If this option is set […] a back reference to an unset subpattern group matches an empty string (by default this causes the current matching alternative to fail). A pattern such as (\1)(a) succeeds when this option is set (assuming it can find an "a" in the subject), whereas it fails by default, for Perl compatibility."
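
For comparison, a minimal sketch of the native JavaScript behavior that this option is documented to emulate, using the (\1)(a) pattern from the quoted passage. It does not exercise PCRE or MariaDB, and the variable name is illustrative only.

```javascript
// In JavaScript, a back reference to a group that has not captured
// anything yet matches the empty string, so the pattern succeeds.
const compatMatch = /(\1)(a)/.exec("a");

console.log(compatMatch !== null); // the match succeeds
console.log(compatMatch[1]);       // group 1 captured the empty string
console.log(compatMatch[2]);       // group 2 captured "a"
```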
