Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27782

Wrong columns when using table level `CHARACTER SET utf8mb4 COLLATE DEFAULT`

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL)
    • 10.9.2
    • Character Sets
    • None

    Description

      CREATE OR REPLACE TABLE t1 (a CHAR(10)) CHARACTER SET utf8mb4 COLLATE DEFAULT;
      SHOW CREATE TABLE t1;
      

      +-------+----------------------------------------------------------------------------------------+
      | Table | Create Table                                                                           |
      +-------+----------------------------------------------------------------------------------------+
      | t1    | CREATE TABLE `t1` (
        `a` char(10) DEFAULT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
      +-------+----------------------------------------------------------------------------------------+
      

      The column was created with a wrong character set. The expected character set is utf8mb4.

      Attachments

        Issue Links

          Activity

            peterdd peterdd added a comment -

            I tested with a Debian Testing VM: Linux testing 5.15.0-2-amd64 #1 SMP Debian 5.15.5-2 (2021-12-18) x86_64 GNU/Linux

            MariaDB [test]> CREATE OR REPLACE TABLE t1 (a CHAR(10)) CHARACTER SET utf8mb4 COLLATE DEFAULT;
            Query OK, 0 rows affected (0.005 sec)
             
            MariaDB [test]> 
            MariaDB [test]> SHOW CREATE TABLE t1;
            +-------+-----------------------------------------------------------------------------------------+
            | Table | Create Table                                                                            |
            +-------+-----------------------------------------------------------------------------------------+
            | t1    | CREATE TABLE `t1` (
              `a` char(10) DEFAULT NULL
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |
            +-------+-----------------------------------------------------------------------------------------+
            1 row in set (0.001 sec)
            

            The result seems ok to me.
            Maybe you have different default settings (database default character set/defautl collation)?

            What is the result of

            select * from information_schema.schemata;
            

            MariaDB [test]> status
            --------------
            mysql  Ver 15.1 Distrib 10.6.7-MariaDB, for debian-linux-gnu (x86_64) using  EditLine wrapper
             
            Connection id:		31
            Current database:	test
            Current user:		root@localhost
            SSL:			Not in use
            Current pager:		stdout
            Using outfile:		''
            Using delimiter:	;
            Server:			MariaDB
            Server version:		10.6.7-MariaDB-3 Debian buildd-unstable
            Protocol version:	10
            Connection:		Localhost via UNIX socket
            Server characterset:	utf8mb4
            Db     characterset:	utf8mb4
            Client characterset:	utf8mb3
            Conn.  characterset:	utf8mb3
            UNIX socket:		/run/mysqld/mysqld.sock
            Uptime:			1 min 11 sec
            

            Update: I am able to reproduce using a database test2 with other charset and collation:

            select * from information_schema.schemata;
            +--------------+--------------------+----------------------------+------------------------+----------+----------------+
            | CATALOG_NAME | SCHEMA_NAME        | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | SQL_PATH | SCHEMA_COMMENT |
            +--------------+--------------------+----------------------------+------------------------+----------+----------------+
            | def          | information_schema | utf8mb3                    | utf8mb3_general_ci     | NULL     |                |
            | def          | test2              | latin1                     | latin1_swedish_ci      | NULL     |                |
            | def          | sys                | utf8mb3                    | utf8mb3_general_ci     | NULL     |                |
            | def          | performance_schema | utf8mb3                    | utf8mb3_general_ci     | NULL     |                |
            | def          | mysql              | utf8mb4                    | utf8mb4_general_ci     | NULL     |                |
            | def          | test               | utf8mb4                    | utf8mb4_general_ci     | NULL     |                |
            +--------------+--------------------+----------------------------+------------------------+----------+----------------+
            6 rows in set (0.001 sec)
             
            MariaDB [test]> use test2
            Database changed
            MariaDB [test2]> CREATE OR REPLACE TABLE t1 (a CHAR(10)) CHARACTER SET utf8mb4 COLLATE DEFAULT;
            Query OK, 0 rows affected (0.006 sec)
             
            MariaDB [test2]> show create table t1;
            +-------+----------------------------------------------------------------------------------------+
            | Table | Create Table                                                                           |
            +-------+----------------------------------------------------------------------------------------+
            | t1    | CREATE TABLE `t1` (
              `a` char(10) DEFAULT NULL
            ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
            +-------+----------------------------------------------------------------------------------------+
            1 row in set (0.001 sec)
            

            So it depends on the database settings and the chosen charset/collation for the table creation.

            So maybe if charset utf8mb4 and the default collation is incompatible to utf8mb4 there is no fallback to fix this?
            Or should the query be denied with an error message?

            How behaves Mysql in this cases? (I wish these behave the same on both.)

            peterdd peterdd added a comment - I tested with a Debian Testing VM: Linux testing 5.15.0-2-amd64 #1 SMP Debian 5.15.5-2 (2021-12-18) x86_64 GNU/Linux MariaDB [test]> CREATE OR REPLACE TABLE t1 (a CHAR (10)) CHARACTER SET utf8mb4 COLLATE DEFAULT ; Query OK, 0 rows affected (0.005 sec)   MariaDB [test]> MariaDB [test]> SHOW CREATE TABLE t1; + -------+-----------------------------------------------------------------------------------------+ | Table | Create Table | + -------+-----------------------------------------------------------------------------------------+ | t1 | CREATE TABLE `t1` ( `a` char (10) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 | + -------+-----------------------------------------------------------------------------------------+ 1 row in set (0.001 sec) The result seems ok to me. Maybe you have different default settings (database default character set/defautl collation)? What is the result of select * from information_schema.schemata; MariaDB [test]> status -------------- mysql Ver 15.1 Distrib 10.6.7-MariaDB, for debian-linux-gnu (x86_64) using EditLine wrapper   Connection id: 31 Current database: test Current user: root@localhost SSL: Not in use Current pager: stdout Using outfile: '' Using delimiter: ; Server: MariaDB Server version: 10.6.7-MariaDB-3 Debian buildd-unstable Protocol version: 10 Connection: Localhost via UNIX socket Server characterset: utf8mb4 Db characterset: utf8mb4 Client characterset: utf8mb3 Conn. characterset: utf8mb3 UNIX socket: /run/mysqld/mysqld.sock Uptime: 1 min 11 sec Update: I am able to reproduce using a database test2 with other charset and collation: select * from information_schema.schemata; +--------------+--------------------+----------------------------+------------------------+----------+----------------+ | CATALOG_NAME | SCHEMA_NAME | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | SQL_PATH | SCHEMA_COMMENT | +--------------+--------------------+----------------------------+------------------------+----------+----------------+ | def | information_schema | utf8mb3 | utf8mb3_general_ci | NULL | | | def | test2 | latin1 | latin1_swedish_ci | NULL | | | def | sys | utf8mb3 | utf8mb3_general_ci | NULL | | | def | performance_schema | utf8mb3 | utf8mb3_general_ci | NULL | | | def | mysql | utf8mb4 | utf8mb4_general_ci | NULL | | | def | test | utf8mb4 | utf8mb4_general_ci | NULL | | +--------------+--------------------+----------------------------+------------------------+----------+----------------+ 6 rows in set (0.001 sec)   MariaDB [test]> use test2 Database changed MariaDB [test2]> CREATE OR REPLACE TABLE t1 (a CHAR(10)) CHARACTER SET utf8mb4 COLLATE DEFAULT; Query OK, 0 rows affected (0.006 sec)   MariaDB [test2]> show create table t1; +-------+----------------------------------------------------------------------------------------+ | Table | Create Table | +-------+----------------------------------------------------------------------------------------+ | t1 | CREATE TABLE `t1` ( `a` char(10) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1 | +-------+----------------------------------------------------------------------------------------+ 1 row in set (0.001 sec) So it depends on the database settings and the chosen charset/collation for the table creation. So maybe if charset utf8mb4 and the default collation is incompatible to utf8mb4 there is no fallback to fix this? Or should the query be denied with an error message? How behaves Mysql in this cases? (I wish these behave the same on both.)
            peterdd peterdd added a comment -

            CREATE OR REPLACE TABLE t2 (a CHAR(10)) CHARACTER SET utf8mb4 COLLATE latin1_swedish_ci;
            ERROR 1253 (42000): COLLATION 'latin1_swedish_ci' is not valid for CHARACTER SET 'utf8mb4'
            

            Explicitely setting an incompatible collation for the charset gives error. So either the 'DEFAULT COLLATE' fallbacks automatically to the 'default collation' of the wished CHARACTER SET or simply spits an error, right?

            What says the SQL Standard? How does MySQL and other database types bahave?

            peterdd peterdd added a comment - CREATE OR REPLACE TABLE t2 (a CHAR (10)) CHARACTER SET utf8mb4 COLLATE latin1_swedish_ci; ERROR 1253 (42000): COLLATION 'latin1_swedish_ci' is not valid for CHARACTER SET 'utf8mb4' Explicitely setting an incompatible collation for the charset gives error. So either the 'DEFAULT COLLATE' fallbacks automatically to the 'default collation' of the wished CHARACTER SET or simply spits an error, right? What says the SQL Standard? How does MySQL and other database types bahave?
            bar Alexander Barkov added a comment - - edited

            Correct, the result depends on the database settings. In my case, when I reported this MDEV, the database was created with latin1 as a default character set.

            Before the fix "CHARACTER SET utf8mb4 COLLATE DEFAULT" forgot the "CHARACTER SET" part and used the database level defaults (both character set and collation). MySQL has the same problem.

            After the fix it uses the default collation of the specified character set.

            SQL Standard supports only explicit collation names in COLLATE clauses and does not have "COLLATE DEFAULT". This is a MySQL/MariaDB extension.

            bar Alexander Barkov added a comment - - edited Correct, the result depends on the database settings. In my case, when I reported this MDEV, the database was created with latin1 as a default character set. Before the fix "CHARACTER SET utf8mb4 COLLATE DEFAULT" forgot the "CHARACTER SET" part and used the database level defaults (both character set and collation). MySQL has the same problem. After the fix it uses the default collation of the specified character set. SQL Standard supports only explicit collation names in COLLATE clauses and does not have "COLLATE DEFAULT". This is a MySQL/MariaDB extension.

            Pushed in a single patch together with MDEV-27896

            bar Alexander Barkov added a comment - Pushed in a single patch together with MDEV-27896

            People

              bar Alexander Barkov
              bar Alexander Barkov
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.