MariaDB Server / MDEV-6965

non-captured group \2 in regexp_replace

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.11
    • Fix Version/s: 10.0.15
    • Component/s: OTHER

    Description

      select regexp_replace('1 foo and bar', '(\\d+) foo and (\\d+ )?bar', '\\1 this and \\2that')

      expected result: 1 this and that
      actual result: 1 this and 2that

        Activity

          elenst Elena Stepanova added a comment (edited)

          Thinking about it, it sort of makes sense.
          The second group does not match an empty value; it is not set at all, so the ambiguous sequence backslash-backslash-2 defaults to '2'.
          It's counter-intuitive, though.

          A quick search through the PCRE documentation didn't turn up anything saying whether this is intentional, so I'm assigning it to bar to confirm (or not).
          Maybe there is a flag that controls it, but I haven't found one either.
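
To make the explanation above concrete, here is a minimal JavaScript sketch of a replacement-template expander with two policies for an unset group. It is not MariaDB's or PCRE's actual code: the function `expandTemplate`, its `unsetPolicy` parameter, and the `groups` array are hypothetical names chosen for illustration. The "literal" policy merely reproduces the reported output; the "empty" policy reproduces the expected one.

```javascript
// Hypothetical sketch: expand "\N" group references in a replacement template.
// `groups` mimics a match result: index 0 is the whole match, and an optional
// group that did not take part in the match is `undefined`.
function expandTemplate(template, groups, unsetPolicy) {
  return template.replace(/\\(\d)/g, (whole, digit) => {
    const index = Number(digit);
    if (index < groups.length && groups[index] !== undefined) {
      return groups[index]; // the group captured something: substitute it
    }
    // The group is unset. "empty" substitutes nothing; any other policy drops
    // the backslash and keeps the digit, which reproduces the reported output.
    return unsetPolicy === "empty" ? "" : digit;
  });
}

// Match data for '1 foo and bar': group 1 is "1", group 2 never matched.
const groups = ["1 foo and bar", "1", undefined];
const template = "\\1 this and \\2that"; // JS string literal for: \1 this and \2that

console.log(expandTemplate(template, groups, "empty"));   // 1 this and that
console.log(expandTemplate(template, groups, "literal")); // 1 this and 2that
```

The only difference between the two outputs is the branch taken for the unset group, which is the distinction the comment above draws between "matched an empty value" and "not set at all".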


          julian.ladisch Julian Ladisch added a comment

          Perl and JavaScript also produce the expected result "1 this and that":

          perl -e '$_="1 foo and bar\n"; s/(\d+) foo and (\d+ )?bar/\1 this and \2that/; print;'

          document.write("1 foo and bar".replace(/(\d+) foo and (\d+ )?bar/, "$1 this and $2that"));

          http://www.pcre.org/pcre.txt
          "PCRE_JAVASCRIPT_COMPAT
          If this option is set […] a back reference to an unset subpattern group matches an empty string (by default this causes the current matching alternative to fail). A pattern such as (\1)(a) succeeds when this option is set (assuming it can find an "a" in the subject), whereas it fails by default, for Perl compatibility."
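
The JavaScript behaviour described in the comment above can be checked with a short script. This is only a sketch of how the standard JavaScript regex engine treats unset groups, both in a replacement string and in a back reference inside the pattern; it says nothing about how MariaDB calls PCRE. The variable names are illustrative.

```javascript
const subject = "1 foo and bar";
// Single backslashes: inside a regex literal, \d is the digit class.
const pattern = /(\d+) foo and (\d+ )?bar/;

// The optional second group takes no part in the match, so it is undefined.
const match = subject.match(pattern);
console.log(match[2]); // undefined

// In the replacement string, $2 for an unset group expands to nothing.
const replaced = subject.replace(pattern, "$1 this and $2that");
console.log(replaced); // 1 this and that

// Inside a pattern, a back reference to a group that is not yet set matches
// the empty string in JavaScript, so the PCRE documentation's example matches.
const backref = /(\1)(a)/.exec("a");
console.log(backref !== null); // true
```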

          People

            Assignee: bar Alexander Barkov
            Reporter: julian.ladisch Julian Ladisch
            Votes: 0
            Watchers: 3
