Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33772

Bad SEPARATOR value in GROUP_CONCAT on character set conversion

Details

    Description

      The following script produces a wrong output - a non-ASCII SEPARATOR in GROUP_CONCAT gets replaced to the question mark.

      SET NAMES utf8, @@collation_connection=latin1_swedish_ci;
      CREATE OR REPLACE TABLE t1 (c VARCHAR(10)) CHARACTER SET latin1;
      INSERT INTO t1 VALUES ('a'),('A');
      CREATE OR REPLACE VIEW v1 AS
        SELECT LEFT(GROUP_CONCAT(c SEPARATOR 'ß'), 20) AS example FROM t1 GROUP BY c;
      SELECT * FROM v1;
      

      +---------+
      | example |
      +---------+
      | a?A     |
      +---------+
      

      The output looks wrong. The expected result is 'aßA'.

      SHOW CREATE VIEW output is also wrong.

      SHOW CREATE VIEW v1;
      

      +------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+----------------------+
      | View | Create View                                                                                                                                                                          | character_set_client | collation_connection |
      +------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+----------------------+
      | v1   | CREATE ALGORITHM=UNDEFINED DEFINER=`bar`@`localhost` SQL SECURITY DEFINER VIEW `v1` AS select left(group_concat(`t1`.`c` separator '?'),20) AS `example` from `t1` group by `t1`.`c` | utf8mb3              | latin1_swedish_ci    |
      +------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+----------------------+
      

      Attachments

        Issue Links

          Activity

            bar Alexander Barkov added a comment - - edited

            Another example:

            Now with the default separator (comma), but with a tricky connection character set ucs2:

            SET @@collation_connection=ucs2_general_ci, @@character_set_results=utf8mb3;
            CREATE OR REPLACE TABLE t1 (c VARCHAR(10)) CHARACTER SET ucs2;
            INSERT INTO t1 VALUES ('a'),('A');
            CREATE OR REPLACE VIEW v1 AS
              SELECT COUNT(*) AS cnt, LEFT(GROUP_CONCAT(c), 20) AS example FROM t1 GROUP BY c;
            SELECT * FROM v1;
            SHOW CREATE VIEW v1;
            

            +-----+---------+
            | cnt | example |
            +-----+---------+
            |   2 | a ,A    |
            +-----+---------+
            1 row in set (0.001 sec)
             
            MariaDB [test]> SHOW CREATE VIEW v1;
            +------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+----------------------+
            | View | Create View                                                                                                                                                                                                  | character_set_client | collation_connection |
            +------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+----------------------+
            | v1   | CREATE ALGORITHM=UNDEFINED DEFINER=`bar`@`localhost` SQL SECURITY DEFINER VIEW `v1` AS select count(0) AS `cnt`,left(group_concat(`t1`.`c` separator '\0\0\0,'),20) AS `example` from `t1` group by `t1`.`c` | utf8mb3              | ucs2_general_ci      |
            +------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+----------------------+
            

            Notice, there is a redundant zero byte between the 'a' and the comma, and redundant '\0' bytes in the SHOW output.

            The expected SELECT result should be:

            +-----+---------+
            | cnt | example |
            +-----+---------+
            |   2 | a,A     |
            +-----+---------+
            

            bar Alexander Barkov added a comment - - edited Another example: Now with the default separator (comma), but with a tricky connection character set ucs2: SET @@collation_connection=ucs2_general_ci, @@character_set_results=utf8mb3; CREATE OR REPLACE TABLE t1 (c VARCHAR (10)) CHARACTER SET ucs2; INSERT INTO t1 VALUES ( 'a' ),( 'A' ); CREATE OR REPLACE VIEW v1 AS SELECT COUNT (*) AS cnt, LEFT (GROUP_CONCAT(c), 20) AS example FROM t1 GROUP BY c; SELECT * FROM v1; SHOW CREATE VIEW v1; +-----+---------+ | cnt | example | +-----+---------+ | 2 | a ,A | +-----+---------+ 1 row in set (0.001 sec)   MariaDB [test]> SHOW CREATE VIEW v1; +------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+----------------------+ | View | Create View | character_set_client | collation_connection | +------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+----------------------+ | v1 | CREATE ALGORITHM=UNDEFINED DEFINER=`bar`@`localhost` SQL SECURITY DEFINER VIEW `v1` AS select count(0) AS `cnt`,left(group_concat(`t1`.`c` separator '\0\0\0,'),20) AS `example` from `t1` group by `t1`.`c` | utf8mb3 | ucs2_general_ci | +------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+----------------------+ Notice, there is a redundant zero byte between the 'a' and the comma, and redundant '\0' bytes in the SHOW output. The expected SELECT result should be: +-----+---------+ | cnt | example | +-----+---------+ | 2 | a,A | +-----+---------+
            bar Alexander Barkov added a comment - - edited

            Simpler script versions (without the function LEFT()) also repeat the problem:

            SET NAMES utf8mb3, @@collation_connection=latin1_swedish_ci;
            CREATE OR REPLACE TABLE t1 (c VARCHAR(10)) CHARACTER SET latin1;
            INSERT INTO t1 VALUES ('a'),('A');
            CREATE OR REPLACE VIEW v1 AS
              SELECT GROUP_CONCAT(c SEPARATOR 'ß') AS c1 FROM t1 GROUP BY c;
            SELECT * FROM v1;
            SHOW CREATE VIEW v1;
            

            SET NAMES utf8mb3, @@collation_connection=ucs2_general_ci;
            CREATE OR REPLACE TABLE t1 (c VARCHAR(10)) CHARACTER SET ucs2;
            INSERT INTO t1 VALUES ('a'),('A');
            CREATE OR REPLACE VIEW v1 AS
              SELECT COUNT(*) AS cnt, GROUP_CONCAT(c) AS c1 FROM t1 GROUP BY c;
            SELECT * FROM v1;
            SHOW CREATE VIEW v1;
            

            bar Alexander Barkov added a comment - - edited Simpler script versions (without the function LEFT()) also repeat the problem: SET NAMES utf8mb3, @@collation_connection=latin1_swedish_ci; CREATE OR REPLACE TABLE t1 (c VARCHAR (10)) CHARACTER SET latin1; INSERT INTO t1 VALUES ( 'a' ),( 'A' ); CREATE OR REPLACE VIEW v1 AS SELECT GROUP_CONCAT(c SEPARATOR 'ß' ) AS c1 FROM t1 GROUP BY c; SELECT * FROM v1; SHOW CREATE VIEW v1; SET NAMES utf8mb3, @@collation_connection=ucs2_general_ci; CREATE OR REPLACE TABLE t1 (c VARCHAR (10)) CHARACTER SET ucs2; INSERT INTO t1 VALUES ( 'a' ),( 'A' ); CREATE OR REPLACE VIEW v1 AS SELECT COUNT (*) AS cnt, GROUP_CONCAT(c) AS c1 FROM t1 GROUP BY c; SELECT * FROM v1; SHOW CREATE VIEW v1;

            People

              bar Alexander Barkov
              bar Alexander Barkov
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.