  1. MariaDB Server
  2. MDEV-7127

POSIX collating elements are not supported

Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Won't Fix
    • 10.0.14
    • N/A
    • OTHER
    • Ubuntu 14.04

    Description

      Running the following queries produces an error, which in my case breaks replication because the master is currently MySQL 5.5.40.

      SELECT ' ' REGEXP '[[.space.]]';
       
      SELECT '.' REGEXP '[[.period.]]';

      Error:

      ERROR 1139 (42000): Got error 'POSIX collating elements are not supported at offset 1' from regexp

        Activity

          elenst Elena Stepanova added a comment -

          Thanks for the report.

          bar,
          I suppose it's a PCRE limitation, but the replication failure is very unfortunate. Is there anything we can do about it? Maybe a non-default mode or a new version which allows the syntax?

          bar Alexander Barkov added a comment - - edited

          The old Henry Spencer regex library supported a number of character names:
          https://mariadb.com/kb/en/mariadb/documentation/functions-and-operators/string-functions/regular-expressions-functions/regular-expressions-overview/#character-names
          This was a non-standard, non-POSIX extension in the old library.

          In POSIX regex, the syntax '[[.xxx.]]' is reserved for collating elements.
          For some reason, Henry Spencer reused the same syntax for his library's character names extension.

          PCRE does not support collating elements yet (but I guess it will in the future).
          Currently PCRE recognizes this syntax but does nothing with it except return the error you see.

          There are a number of possible workarounds:

          For space:

          SELECT ' ' REGEXP ' ';
          SELECT ' ' REGEXP '[ ]';
          SELECT ' ' REGEXP '[[:space:]]';
          SELECT ' ' REGEXP '\\{20}';
          SELECT ' ' REGEXP '\\x{20}';

          For dot:

          SELECT '.' REGEXP '[.]';
          SELECT '.' REGEXP '\\.';
          SELECT '.' REGEXP '\\x2E';
          SELECT '.' REGEXP '\\x{2E}';

          How difficult would it be to change your application to use these workarounds?

          These two are POSIX compliant and are supported by both libraries:

          SELECT ' ' REGEXP ' ';
          SELECT '.' REGEXP '[.]';
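
If the affected patterns are built in application code, the rewrite can be automated. The following is a minimal sketch in Python, not part of MariaDB or PCRE. The names `rewrite_character_names` and `CHARACTER_NAMES` are hypothetical, and the table deliberately covers only the two character names from this report; other names from the old library would have to be added by hand. It assumes the mapped characters need no special placement inside a bracket expression, which holds for space and period.

```python
import re

# Partial, illustrative table (assumption): only the two names from this report.
CHARACTER_NAMES = {"space": " ", "period": "."}

# Matches the inner "[.name.]" part of a bracket expression such as "[[.space.]]".
_NAME = re.compile(r"\[\.([A-Za-z-]+)\.\]")


def rewrite_character_names(pattern: str) -> str:
    """Replace known [.name.] elements with the literal character they name."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Unknown names are left untouched so they can be reviewed manually.
        return CHARACTER_NAMES.get(name, match.group(0))

    return _NAME.sub(substitute, pattern)


if __name__ == "__main__":
    for original in ("[[.space.]]", "[[.period.]]"):
        print(original, "->", rewrite_character_names(original))
```

Rewriting into the bracket forms '[ ]' and '[.]' keeps the result within the set of workarounds listed above.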

          TheReaper Johann added a comment -

          Fortunately it will not be too difficult to change in the instance where it is causing problems.


          bar Alexander Barkov added a comment -

          Thanks. I reported the issue to the PCRE team. Changing priority to minor for now.
          We'll escalate the bug if we have more related problems reported.

          CmdrSharp Marcus Frolander added a comment -

          Same issue with a module for a shopping cart. Currently looking into whether or not this can be easily swapped out in the code, although since the module isn't developed internally, that's likely not going to be an all too easy task.

          Would appreciate a fix for this issue.
          rjasdf Rick James (Inactive) added a comment - - edited

          SELECT ' ' REGEXP '\\{20}'

          --> 0 in MariaDB 10.2.2 and Percona 5.6.22-71.0; so perhaps a poor constant?

          SELECT ' ' REGEXP '\\x{20}';

          --> 1 for 10.2.2, but 0 for 5.6.22 – Suggest you take note of this incompatibility.


          serg Sergei Golubchik added a comment -

          Yes. MariaDB uses PCRE and what you observed are the consequences of this fact.

          PCRE supports \ddd, \xhh, and \x{hhh} for specifying characters by their codes. That's why you've got a mismatch in the first query and a match in the second. MySQL and Percona use Henry Spencer's library, which does not support the \x{hhh} syntax.

          serg Sergei Golubchik added a comment -

          It's not something we can fix. PCRE is no longer developed. PCRE2 is the currently supported version, but it still returns the same error, unfortunately.

          People

            bar Alexander Barkov
            TheReaper Johann
            Votes: 1
            Watchers: 7