  1. MariaDB Server
  2. MDEV-8420

UCA: compare broken bytes as "greater than any non-broken character"

Details

    Description

      UCA collations compare:

      • all broken mbminlen units as having weight 0xFFFF
      • all non-BMP characters that have no weight in the weight table for the current collation as having weight 0xFFFD

      This differs from the other collations, which take byte values into account when comparing broken byte sequences. For example, strnncollsp(0xFE, 0xFF) for utf8_general_ci returns -1, because the broken byte value (0xFE) in the left operand is smaller than the broken byte value (0xFF) in the right operand. Under a UCA collation both bytes get the same weight 0xFFFF, so two different broken bytes compare as equal.
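
      The contrast described above can be sketched with a toy model. This is not MariaDB's implementation: the function names are hypothetical, every byte below 0x80 is treated as a well-formed character whose weight is its byte value, every other byte is treated as broken, space padding is ignored, and the 0xFF0000 offset is an illustrative assumption chosen only so that broken bytes sort after all well-formed weights.

```python
# Toy model only, not MariaDB code. Bytes < 0x80 are "well formed"
# (weight = byte value); every other byte is treated as broken.
BROKEN_WEIGHT = 0xFFFF        # weight the issue reports for broken units in UCA
ILLUSTRATIVE_BASE = 0xFF0000  # hypothetical offset: keeps broken bytes last


def uca_style_weights(data: bytes) -> list:
    """Every broken byte collapses to the same weight."""
    return [b if b < 0x80 else BROKEN_WEIGHT for b in data]


def byte_aware_weights(data: bytes) -> list:
    """Broken bytes sort last but keep their relative byte order."""
    return [b if b < 0x80 else ILLUSTRATIVE_BASE + b for b in data]


def compare(weights, left: bytes, right: bytes) -> int:
    """Return -1, 0 or 1 by comparing the two weight sequences."""
    a, b = weights(left), weights(right)
    return (a > b) - (a < b)


print(compare(uca_style_weights, b"\xfe", b"\xff"))   # 0: both weigh 0xFFFF
print(compare(byte_aware_weights, b"\xfe", b"\xff"))  # -1: 0xFE < 0xFF
```

      In both models a broken byte still compares greater than any well-formed character; the models differ only when two different broken bytes meet.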

      For consistency, UCA collations should perhaps be changed to compare different broken bytes as non-equal, as the other collations do.

      This task was originally created as a subtask of MDEV-8036, covering all UCA-based collations in all Unicode character sets, alongside the other subtasks of MDEV-8036, which is needed for MDEV-8433. However, the UCA collations already seem to suit the needs of MDEV-8433, which should probably work without any changes to them. For search purposes, only one operand (the string literal) can contain a broken string, while the other operand (the field) contains well-formed strings, so the comparison function should normally never compare two broken strings. MDEV-8420 has therefore been removed from the MDEV-8036 dependencies.
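
      The search argument can be illustrated with a small self-contained sketch. It rests on the same simplifying assumptions as any toy model of this issue: bytes below 0x80 stand in for well-formed characters, any other byte counts as broken, and the helper names and the 0xFF0000 offset are hypothetical rather than taken from MariaDB. It checks that the two comparison styles agree whenever only the literal is broken, and disagree only when both operands are broken.

```python
# Toy model only, not MariaDB code; names and the offset are hypothetical.
def compare(left: bytes, right: bytes, broken_weight) -> int:
    """Compare two byte strings; broken_weight maps a broken byte to a weight."""
    a = [b if b < 0x80 else broken_weight(b) for b in left]
    b_ = [b if b < 0x80 else broken_weight(b) for b in right]
    return (a > b_) - (a < b_)


def uca_cmp(left: bytes, right: bytes) -> int:
    # UCA style: every broken byte weighs 0xFFFF
    return compare(left, right, lambda byte: 0xFFFF)


def byte_cmp(left: bytes, right: bytes) -> int:
    # Byte-aware style: broken bytes sort last, ordered by byte value
    return compare(left, right, lambda byte: 0xFF0000 + byte)


fields = [b"abc", b"abd", b"ab", b"b"]   # well-formed column values
literals = [b"ab\xfe", b"ab\xff"]        # broken search literals

# With a broken string on one side only, both styles give the same result.
agree = all(uca_cmp(f, lit) == byte_cmp(f, lit)
            for f in fields for lit in literals)
print(agree)

# They diverge only when both operands are broken.
print(uca_cmp(b"\xfe", b"\xff"), byte_cmp(b"\xfe", b"\xff"))
```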

            People

              Unassigned
              bar Alexander Barkov