Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-10306

Wrong results with combination of CONCAT, SUBSTR and CONVERT in subquery

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 5.3.12, 5.5.48, 10.0.24, 10.1.11
    • 10.0.32
    • Optimizer
    • None
    • Linux
    • 10.0.30

    Description

      There seems to be a problem with the combination of CONCAT, SUBSTR and CONVERT in subquery. Simplest test case so far that reproduces this is:

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
       
      INSERT INTO t1 VALUES('1234567');
       
      SELECT CONCAT(SUBSTR(t2, 1, 3), SUBSTR(t2, 5)) c1
                  , CONCAT(SUBSTR(t2,1,3),'---',SUBSTR(t2,5)) c2
         FROM (SELECT CONVERT(t USING latin1) t2 FROM t1) sub;
      

      Expected result, produced by MariaDB 5.1.x and 5.2.x, and all tested versions of MySQL:

      +--------+-----------+
      | c1     | c2        |
      +--------+-----------+
      | 123567 | 123---567 |
      +--------+-----------+
      

      MariaDB 5.3.x result:

      +--------+-----------+
      | c1     | c2        |
      +--------+-----------+
      | 123567 | 123456567 |
      +--------+-----------+
      

      MariaDB 5.5.x - 10.1.x:

      +--------+-----------+
      | c1     | c2        |
      +--------+-----------+
      | 12356  | 123456567 |
      +--------+-----------+
      

      Attachments

        Issue Links

          Activity

            hholzgra Hartmut Holzgraefe added a comment - - edited

            Only happens with optimizer_switch="derived_merge=on"

            hholzgra Hartmut Holzgraefe added a comment - - edited Only happens with optimizer_switch="derived_merge=on"

            10.2 is also affected.

            elenst Elena Stepanova added a comment - 10.2 is also affected.
            bar Alexander Barkov added a comment - - edited

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
            INSERT INTO t1 VALUES('1234567');
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT CONVERT(t USING latin1) t2 FROM t1) sub;
            

            +----------------+
            | c2             |
            +----------------+
            | 12345671234567 |
            +----------------+
            

            The expected result is 1234567-1234567.

            More examples:

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
            INSERT INTO t1 VALUES('abcdefghi');
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT REVERSE(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT SOUNDEX(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT TO_BASE64(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT WEIGHT_STRING(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT HEX(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT QUOTE(t) t2 FROM t1) sub;
            

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (t VARCHAR(32) CHARSET latin1);
            INSERT INTO t1 VALUES(TO_BASE64('abcdefghi'));
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT FROM_BASE64(t) t2 FROM t1) sub;
            

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (t VARCHAR(32) CHARSET latin1);
            INSERT INTO t1 VALUES(HEX('abcdefghi'));
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT UNHEX(t) t2 FROM t1) sub;
            

            All above queries return results without the '-' character, which means some memory overrun happens.

            The same problem is repeatable with Item_str_ascii_func descendants, when conversion takes place:

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
            INSERT INTO t1 VALUES('abcdefghi');
            SET @save_character_set_connection=@@character_set_connection;
            SET character_set_connection=ucs2;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT HEX(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT TO_BASE64(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT OLD_PASSWORD(t) t2 FROM t1) sub;
            SET character_set_connection=@save_character_set_connection;
            

            The same problem is repeatable with AES_ENCRYPT:

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (t VARCHAR(30) CHARSET latin1);
            INSERT INTO t1 VALUES('test');
            SELECT LENGTH(CONCAT(t2)) c2 FROM (SELECT AES_ENCRYPT(t,'x') t2 FROM t1) sub;
            SELECT LENGTH(CONCAT(t2,'-',t2)) c2 FROM (SELECT AES_ENCRYPT(t,'x') t2 FROM t1) sub;
            SELECT LENGTH(CONCAT(t2,'--',t2)) c2 FROM (SELECT AES_ENCRYPT(t,'x') t2 FROM t1) sub;
            

            Notice, the first SELECT query returns 16. The second query is expected to return 33, and the third query is expected to return 34. In fact the second and the third query erroneously return 32.

            The same problem is repeatable with AES_DECRYPT:

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (t VARCHAR(30) CHARSET latin1);
            INSERT INTO t1 VALUES('test');
            SELECT LENGTH(CONCAT(t2)) c2 FROM (SELECT AES_DECRYPT(AES_ENCRYPT(t,'x'),'x') t2 FROM t1) sub;
            SELECT LENGTH(CONCAT(t2,'-',t2)) c2 FROM (SELECT AES_DECRYPT(AES_ENCRYPT(t,'x'),'x') t2 FROM t1) sub;
            SELECT LENGTH(CONCAT(t2,'--',t2)) c2 FROM (SELECT AES_DECRYPT(AES_ENCRYPT(t,'x'),'x') t2 FROM t1) sub;
            

            The first SELECT query returns 4, the second and the third query are expected to return 9 and 10 respectively, but in fact the second and the third queries both return 8.

            Also repeatable with ST_BUFFER:

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (x INT, y INT);
            INSERT INTO t1 VALUES(0,0);
            SELECT LENGTH(t2) c2 FROM (SELECT ST_BUFFER(POINT(x,y), 0) t2 FROM t1) sub;
            SELECT LENGTH(CONCAT(t2,'-',t2)) c2 FROM (SELECT ST_BUFFER(POINT(x,y), 0) t2 FROM t1) sub;
            SELECT LENGTH(CONCAT(t2,'--',t2)) c2 FROM (SELECT ST_BUFFER(POINT(x,y), 0) t2 FROM t1) sub;
            

            The first SELECT query returns 25, therefore the second and the third query should return 51 and 52 respectively. In fact both the second and the third query erroneously return 50.

            bar Alexander Barkov added a comment - - edited DROP TABLE IF EXISTS t1; CREATE TABLE t1 (t VARCHAR (10) CHARSET latin1); INSERT INTO t1 VALUES ( '1234567' ); SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT CONVERT (t USING latin1) t2 FROM t1) sub; +----------------+ | c2 | +----------------+ | 12345671234567 | +----------------+ The expected result is 1234567-1234567 . More examples: DROP TABLE IF EXISTS t1; CREATE TABLE t1 (t VARCHAR (10) CHARSET latin1); INSERT INTO t1 VALUES ( 'abcdefghi' ); SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT REVERSE(t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT SOUNDEX(t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT TO_BASE64(t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT WEIGHT_STRING(t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT HEX(t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT QUOTE(t) t2 FROM t1) sub; DROP TABLE IF EXISTS t1; CREATE TABLE t1 (t VARCHAR (32) CHARSET latin1); INSERT INTO t1 VALUES (TO_BASE64( 'abcdefghi' )); SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT FROM_BASE64(t) t2 FROM t1) sub; DROP TABLE IF EXISTS t1; CREATE TABLE t1 (t VARCHAR (32) CHARSET latin1); INSERT INTO t1 VALUES (HEX( 'abcdefghi' )); SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT UNHEX(t) t2 FROM t1) sub; All above queries return results without the '-' character, which means some memory overrun happens. The same problem is repeatable with Item_str_ascii_func descendants, when conversion takes place: DROP TABLE IF EXISTS t1; CREATE TABLE t1 (t VARCHAR (10) CHARSET latin1); INSERT INTO t1 VALUES ( 'abcdefghi' ); SET @save_character_set_connection=@@character_set_connection; SET character_set_connection=ucs2; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT HEX(t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT TO_BASE64(t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT OLD_PASSWORD(t) t2 FROM t1) sub; SET character_set_connection=@save_character_set_connection; The same problem is repeatable with AES_ENCRYPT : DROP TABLE IF EXISTS t1; CREATE TABLE t1 (t VARCHAR (30) CHARSET latin1); INSERT INTO t1 VALUES ( 'test' ); SELECT LENGTH(CONCAT(t2)) c2 FROM ( SELECT AES_ENCRYPT(t, 'x' ) t2 FROM t1) sub; SELECT LENGTH(CONCAT(t2, '-' ,t2)) c2 FROM ( SELECT AES_ENCRYPT(t, 'x' ) t2 FROM t1) sub; SELECT LENGTH(CONCAT(t2, '--' ,t2)) c2 FROM ( SELECT AES_ENCRYPT(t, 'x' ) t2 FROM t1) sub; Notice, the first SELECT query returns 16 . The second query is expected to return 33 , and the third query is expected to return 34 . In fact the second and the third query erroneously return 32 . The same problem is repeatable with AES_DECRYPT : DROP TABLE IF EXISTS t1; CREATE TABLE t1 (t VARCHAR (30) CHARSET latin1); INSERT INTO t1 VALUES ( 'test' ); SELECT LENGTH(CONCAT(t2)) c2 FROM ( SELECT AES_DECRYPT(AES_ENCRYPT(t, 'x' ), 'x' ) t2 FROM t1) sub; SELECT LENGTH(CONCAT(t2, '-' ,t2)) c2 FROM ( SELECT AES_DECRYPT(AES_ENCRYPT(t, 'x' ), 'x' ) t2 FROM t1) sub; SELECT LENGTH(CONCAT(t2, '--' ,t2)) c2 FROM ( SELECT AES_DECRYPT(AES_ENCRYPT(t, 'x' ), 'x' ) t2 FROM t1) sub; The first SELECT query returns 4 , the second and the third query are expected to return 9 and 10 respectively, but in fact the second and the third queries both return 8 . Also repeatable with ST_BUFFER : DROP TABLE IF EXISTS t1; CREATE TABLE t1 (x INT , y INT ); INSERT INTO t1 VALUES (0,0); SELECT LENGTH(t2) c2 FROM ( SELECT ST_BUFFER(POINT(x,y), 0) t2 FROM t1) sub; SELECT LENGTH(CONCAT(t2, '-' ,t2)) c2 FROM ( SELECT ST_BUFFER(POINT(x,y), 0) t2 FROM t1) sub; SELECT LENGTH(CONCAT(t2, '--' ,t2)) c2 FROM ( SELECT ST_BUFFER(POINT(x,y), 0) t2 FROM t1) sub; The first SELECT query returns 25 , therefore the second and the third query should return 51 and 52 respectively. In fact both the second and the third query erroneously return 50 .

            These functions are not affected:

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
            INSERT INTO t1 VALUES('abcdefghi');
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT LEFT(t,10) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT RIGHT(t,10) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT SUBSTR(t,1,10) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT LTRIM(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT RTRIM(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT TRIM(t) t2 FROM t1) sub;
            

            bar Alexander Barkov added a comment - These functions are not affected: DROP TABLE IF EXISTS t1; CREATE TABLE t1 (t VARCHAR (10) CHARSET latin1); INSERT INTO t1 VALUES ( 'abcdefghi' ); SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT LEFT (t,10) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT RIGHT (t,10) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT SUBSTR(t,1,10) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT LTRIM(t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT RTRIM(t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT TRIM(t) t2 FROM t1) sub;

            The following functions are also affected:

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
            INSERT INTO t1 VALUES('abcdefghi');
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT LOWER(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT UPPER(t) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT CAST(t AS CHAR CHARACTER SET utf8) t2 FROM t1) sub;
            SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT MAKE_SET(3,t,t) t2 FROM t1) sub;
            

            but they will be fixed in under terms of separate MDEVs. See the "relates to" links.

            bar Alexander Barkov added a comment - The following functions are also affected: DROP TABLE IF EXISTS t1; CREATE TABLE t1 (t VARCHAR (10) CHARSET latin1); INSERT INTO t1 VALUES ( 'abcdefghi' ); SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT LOWER (t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT UPPER (t) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT CAST (t AS CHAR CHARACTER SET utf8) t2 FROM t1) sub; SELECT CONCAT(t2, '-' ,t2) c2 FROM ( SELECT MAKE_SET(3,t,t) t2 FROM t1) sub; but they will be fixed in under terms of separate MDEVs. See the "relates to" links.

            Pushed into 10.0.

            bar Alexander Barkov added a comment - Pushed into 10.0.

            People

              bar Alexander Barkov
              hholzgra Hartmut Holzgraefe
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.