MariaDB Server / MDEV-27042

UCA: Resetting contractions to ignorable does not work well

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8
    • Fix Version/s: 10.8.0
    • Component/s: Character Sets
    • Labels: None

      Description

      I patch the character set configuration file Index.xml as follows, adding a user-defined collation utf8mb3_phone_ci:

      diff --git a/Index.xml.orig b/Index.xml
      index cec3bfc..c69047e 100644
      --- a/Index.xml.orig
      +++ b/Index.xml
      @@ -540,6 +540,19 @@ To make maintaining easier please:
           <flag>binary</flag>
           <flag>compiled</flag>
         </collation>
      +  <collation name="utf8mb3_phone_ci" id="352">
      +    <rules>
      +      <reset>\u0000</reset>
      +        <i>\u0020</i> <!-- space -->
      +        <i>\u0028</i> <!-- left parenthesis -->
      +        <i>\u0029</i> <!-- right parenthesis -->
      +        <i>\u002B</i> <!-- plus -->
      +        <i>\u002D</i> <!-- hyphen -->
      +        <i>tel.</i>
      +    </rules>
      +  </collation>
       </charset>
       
       <charset name="ucs2">
      

      That is, I want the collation to treat the following as ignorable:

      • the punctuation characters space, "(", ")", "+" and "-"
      • the string "tel." (a four-character contraction)

      Now I run this script:

      CREATE OR REPLACE TABLE t1
      (
        phone VARCHAR(64) CHARACTER SET utf8mb3 COLLATE utf8mb3_phone_ci
      );
      INSERT INTO t1 VALUES ('123'),('tel.123');
      SELECT * FROM t1 WHERE phone='123';
      

      +-------+
      | phone |
      +-------+
      | 123   |
      +-------+
      

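      This looks wrong. The query should return both rows, because "tel." is declared ignorable and 'tel.123' should therefore compare equal to '123'.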

      Now I check the weight strings:

      SELECT phone, HEX(WEIGHT_STRING(phone)) FROM t1;
      

      +---------+---------------------------+
      | phone   | HEX(WEIGHT_STRING(phone)) |
      +---------+---------------------------+
      | 123     | 0E2A0E2B0E2C              |
      | tel.123 |                           |
      +---------+---------------------------+
      

      This also looks wrong: the weight string for 'tel.123' is empty, but it should be equal to the weight string for '123' (0E2A0E2B0E2C). Only the "tel." prefix should be ignored, not the digits that follow it.
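
      The expected result can be sketched with a small toy model. This is only an illustration of the intended semantics, not MariaDB's actual UCA implementation: the function name expected_weight_hex and both tables are hypothetical, the weights for '1', '2' and '3' are copied from the WEIGHT_STRING output above, and case-insensitivity is not modelled.

```python
# Toy model of the expected behaviour; NOT MariaDB's actual UCA code.
# Weights for '1', '2', '3' are taken from the WEIGHT_STRING output in this report.
BASE_WEIGHTS = {"1": "0E2A", "2": "0E2B", "3": "0E2C"}

# Sequences that the patched collation declares ignorable (empty weight).
IGNORABLE = [" ", "(", ")", "+", "-", "tel."]


def expected_weight_hex(text: str) -> str:
    """Return the hex weight string one would expect under utf8mb3_phone_ci."""
    out = []
    pos = 0
    # Try longer sequences first so the contraction "tel." is matched as a unit.
    candidates = sorted(IGNORABLE, key=len, reverse=True)
    while pos < len(text):
        for seq in candidates:
            if text.startswith(seq, pos):
                pos += len(seq)  # ignorable: contributes no weight
                break
        else:
            # Characters outside this toy table raise KeyError on purpose.
            out.append(BASE_WEIGHTS[text[pos]])
            pos += 1
    return "".join(out)


for phone in ("123", "tel.123", "+1 (2)-3"):
    print(phone, expected_weight_hex(phone))
```

      Under this model all three inputs yield the same weight string, which is what the comparison phone='123' relies on.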

              People

              Assignee:
              bar Alexander Barkov
              Reporter:
              bar Alexander Barkov