MariaDB Server / MDEV-27042

UCA: Resetting contractions to ignorable does not work well

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL)
    • Fix Version/s: 10.8.0
    • Component/s: Character Sets
    • Labels: None

    Description

      I patch the character set configuration file Index.xml as follows:

      diff --git a/Index.xml.orig b/Index.xml
      index cec3bfc..c69047e 100644
      --- a/Index.xml.orig
      +++ b/Index.xml
      @@ -540,6 +540,19 @@ To make maintaining easier please:
           <flag>binary</flag>
           <flag>compiled</flag>
         </collation>
      +  <collation name="utf8mb3_phone_ci" id="352">
      +    <rules>
      +      <reset>\u0000</reset>
      +        <i>\u0020</i> <!-- space -->
      +        <i>\u0028</i> <!-- left parenthesis -->
      +        <i>\u0029</i> <!-- right parenthesis -->
      +        <i>\u002B</i> <!-- plus -->
      +        <i>\u002D</i> <!-- hyphen -->
      +        <i>tel.</i>
      +    </rules>
      +  </collation>
       </charset>
       
       <charset name="ucs2">
      

      That is, I want to make the following ignorable:

      • some punctuation characters
      • the string "tel."
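
      The intended effect of these rules can be sketched outside the server. The following Python model is only an illustration of what "ignorable" is expected to mean here, not MariaDB code: the names IGNORABLE, strip_ignorable and phone_equal are hypothetical, and the model assumes that an ignorable sequence simply contributes nothing to a comparison.

      ```python
      # Hypothetical model of the intended behaviour; this is not MariaDB code.
      # The sequences below are the ones listed in the <rules> of the patch.
      IGNORABLE = ["tel.", " ", "(", ")", "+", "-"]


      def strip_ignorable(s: str) -> str:
          """Drop every ignorable sequence, trying longer sequences first."""
          candidates = sorted(IGNORABLE, key=len, reverse=True)
          out = []
          i = 0
          while i < len(s):
              for seq in candidates:
                  if s.startswith(seq, i):
                      i += len(seq)  # ignorable: contributes nothing
                      break
              else:
                  out.append(s[i])  # ordinary character: kept as-is
                  i += 1
          return "".join(out)


      def phone_equal(a: str, b: str) -> bool:
          """Compare two values after removing ignorable sequences."""
          return strip_ignorable(a) == strip_ignorable(b)


      print(phone_equal("tel.123", "123"))   # True
      print(phone_equal("+1 (2)-3", "123"))  # True
      print(phone_equal("tel123", "123"))    # False: "tel" without the dot is not ignorable
      ```

      Under this model "tel." is ignorable only as a whole sequence, while the single letters t, e and l keep their normal meaning.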

      Now I run this script:

      CREATE OR REPLACE TABLE t1
      (
        phone VARCHAR(64) CHARACTER SET utf8mb3 COLLATE utf8mb3_phone_ci
      );
      INSERT INTO t1 VALUES ('123'),('tel.123');
      SELECT * FROM t1 WHERE phone='123';
      

      +-------+
      | phone |
      +-------+
      | 123   |
      +-------+
      

      This looks wrong: the query should return both rows.

      Now I run:

      SELECT phone, HEX(WEIGHT_STRING(phone)) FROM t1;
      

      +---------+---------------------------+
      | phone   | HEX(WEIGHT_STRING(phone)) |
      +---------+---------------------------+
      | 123     | 0E2A0E2B0E2C              |
      | tel.123 |                           |
      +---------+---------------------------+
      

      This also looks wrong: the weight string in the second row should be equal to the weight string in the first row. Instead it is empty, i.e. the whole value, digits included, was treated as ignorable.
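
      For comparison, the expected weight strings can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than the server's algorithm: the digit weights are copied from the HEX(WEIGHT_STRING()) output for '123' above, the ignorable sequences come from the patch, and the names DIGIT_WEIGHTS and expected_weight_hex are hypothetical.

      ```python
      # Hypothetical model of the expected result; this is not MariaDB code.
      # Digit weights copied from the WEIGHT_STRING output for '123' above.
      DIGIT_WEIGHTS = {"1": 0x0E2A, "2": 0x0E2B, "3": 0x0E2C}
      # Ignorable sequences from the <rules> of the patch.
      IGNORABLE = ["tel.", " ", "(", ")", "+", "-"]


      def expected_weight_hex(s: str) -> str:
          """Return the hex weight string, skipping ignorable sequences."""
          weights = []
          i = 0
          while i < len(s):
              seq = next((x for x in IGNORABLE if s.startswith(x, i)), None)
              if seq is not None:
                  i += len(seq)  # ignorable: emits no weight
                  continue
              # Only the digits 1, 2 and 3 are known from the report;
              # any other character raises KeyError in this sketch.
              weights.append(DIGIT_WEIGHTS[s[i]])
              i += 1
          return "".join(f"{w:04X}" for w in weights)


      print(expected_weight_hex("123"))      # 0E2A0E2B0E2C
      print(expected_weight_hex("tel.123"))  # 0E2A0E2B0E2C
      ```

      Both values yield the same weight string in this model, which is the result the report expects from the server.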

          Activity

            Transition: Open to Closed (made by Sergei Golubchik)
            Time in source status: 10d 6h 13m
            Execution times: 1

            People

              Assignee: Alexander Barkov (bar)
              Reporter: Alexander Barkov (bar)
              Votes: 0
              Watchers: 2
