Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-24584

Selecting INT column with COLLATE utf8mb4_general_ci throws an error

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.5.4, 10.5.6, 5.5(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6
    • 10.5.14, 10.6.6, 10.7.2
    • Character Sets
    • None
    • MariaDB Community 10.5 Docker Container

    Description

      Review below walk-through for how to reproduce the issue and the error messages involved.

      Note it is not necessary for this to succeed- there should not be a reason to issue a COLLATE against an integer column like this. However, this behavior does not seem to be documented, so at a minimum if this is intended behavior, this should be more apparent.

      If this can be made to work rather than error, this will ease how much code refactoring some users must perform when switching from MySQL/Oracle to MariaDB.

      MariaDB [(none)]> SELECT VARIABLE_NAME, SESSION_VALUE
          -> FROM INFORMATION_SCHEMA.SYSTEM_VARIABLES WHERE
          -> VARIABLE_NAME LIKE 'character_set_c%' OR
          -> VARIABLE_NAME LIKE 'character_set_re%' OR
          -> VARIABLE_NAME LIKE 'collation_c%';
      +--------------------------+-------------------+
      | VARIABLE_NAME            | SESSION_VALUE     |
      +--------------------------+-------------------+
      | CHARACTER_SET_RESULTS    | latin1            |
      | CHARACTER_SET_CONNECTION | latin1            |
      | CHARACTER_SET_CLIENT     | latin1            |
      | COLLATION_CONNECTION     | latin1_swedish_ci |
      +--------------------------+-------------------+
      4 rows in set (0.001 sec)
       
      MariaDB [(none)]> select 1 collate utf8mb4_general_ci;
      ERROR 1253 (42000): COLLATION 'utf8mb4_general_ci' is not valid for CHARACTER SET 'latin1'
      MariaDB [(none)]> set names utf8mb4;
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [(none)]> SELECT VARIABLE_NAME, SESSION_VALUE FROM INFORMATION_SCHEMA.SYSTEM_VARIABLES WHERE VARIABLE_NAME LIKE 'character_set_c%' OR VARIABLE_NAME LIKE 'character_set_re%' OR VARIABLE_NAME LIKE 'collation_c%';
      +--------------------------+--------------------+
      | VARIABLE_NAME            | SESSION_VALUE      |
      +--------------------------+--------------------+
      | CHARACTER_SET_RESULTS    | utf8mb4            |
      | CHARACTER_SET_CONNECTION | utf8mb4            |
      | CHARACTER_SET_CLIENT     | utf8mb4            |
      | COLLATION_CONNECTION     | utf8mb4_general_ci |
      +--------------------------+--------------------+
      4 rows in set (0.001 sec)
       
      MariaDB [(none)]> select 1 collate utf8mb4_general_ci;
      ERROR 1253 (42000): COLLATION 'utf8mb4_general_ci' is not valid for CHARACTER SET 'latin1'
      MariaDB [(none)]> select 'TEST' collate utf8mb4_general_ci;
      +-----------------------------------+
      | 'TEST' collate utf8mb4_general_ci |
      +-----------------------------------+
      | TEST                              |
      +-----------------------------------+
      1 row in set (0.000 sec)

      Attachments

        Issue Links

          Activity

            Reproducible on all of 5.5-10.6.

            I suppose it is not intended behavior, as it's been fixed in MySQL 5.7+, while reproducible with MySQL 5.6 as of 5.6.43 at least. bar, could you please take a look and confirm (or not)?

            SET NAMES utf8mb4;
            select 'TEST' collate utf8mb4_general_ci;
            select 1 collate utf8mb4_general_ci;
            

            10.5 e4205fba

            SET NAMES utf8mb4;
            select 'TEST' collate utf8mb4_general_ci;
            'TEST' collate utf8mb4_general_ci
            TEST
            select 1 collate utf8mb4_general_ci;
            bug.t                                    [ fail ]
                    Test ended at 2021-01-14 14:40:58
             
            CURRENT_TEST: bug.t
            mysqltest: At line 3: query 'select 1 collate utf8mb4_general_ci' failed: 1253: COLLATION 'utf8mb4_general_ci' is not valid for CHARACTER SET 'latin1'
            

            MySQL 5.7.25

            SET NAMES utf8mb4;
            select 'TEST' collate utf8mb4_general_ci;
            'TEST' collate utf8mb4_general_ci
            TEST
            select 1 collate utf8mb4_general_ci;
            1 collate utf8mb4_general_ci
            1
            

            elenst Elena Stepanova added a comment - Reproducible on all of 5.5-10.6. I suppose it is not intended behavior, as it's been fixed in MySQL 5.7+, while reproducible with MySQL 5.6 as of 5.6.43 at least. bar , could you please take a look and confirm (or not)? SET NAMES utf8mb4; select 'TEST' collate utf8mb4_general_ci; select 1 collate utf8mb4_general_ci; 10.5 e4205fba SET NAMES utf8mb4; select 'TEST' collate utf8mb4_general_ci; 'TEST' collate utf8mb4_general_ci TEST select 1 collate utf8mb4_general_ci; bug.t [ fail ] Test ended at 2021-01-14 14:40:58   CURRENT_TEST: bug.t mysqltest: At line 3: query 'select 1 collate utf8mb4_general_ci' failed: 1253: COLLATION 'utf8mb4_general_ci' is not valid for CHARACTER SET 'latin1' MySQL 5.7.25 SET NAMES utf8mb4; select 'TEST' collate utf8mb4_general_ci; 'TEST' collate utf8mb4_general_ci TEST select 1 collate utf8mb4_general_ci; 1 collate utf8mb4_general_ci 1

            According to the SQL standard, COLLATE is applicable to string data types only.
            But MySQL/MariaDB style is to do automatic data type conversion.
            So perhaps it's good that MySQL fixed it.

            However, the fix is probably incomplete:

            MySQL [test]> select hex(1 collate ucs2_general_ci);
            +--------------------------------+
            | hex(1 collate ucs2_general_ci) |
            +--------------------------------+
            | 31                             |
            +--------------------------------+
            

            Notice, 0x31 is not a valid UCS2 string.

            bar Alexander Barkov added a comment - According to the SQL standard, COLLATE is applicable to string data types only. But MySQL/MariaDB style is to do automatic data type conversion. So perhaps it's good that MySQL fixed it. However, the fix is probably incomplete: MySQL [test]> select hex(1 collate ucs2_general_ci); +--------------------------------+ | hex(1 collate ucs2_general_ci) | +--------------------------------+ | 31 | +--------------------------------+ Notice, 0x31 is not a valid UCS2 string.
            bar Alexander Barkov added a comment - - edited sanja , can you please review: https://github.com/MariaDB/server/tree/bb-10.5-bar-MDEV-24584 ? Thanks.

            The past variant is OK to push

            sanja Oleksandr Byelkin added a comment - The past variant is OK to push

            People

              bar Alexander Barkov
              rob.schwyzer@mariadb.com Rob Schwyzer (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.