[MDEV-10306] Wrong results with combination of CONCAT, SUBSTR and CONVERT in subquery Created: 2016-06-29  Updated: 2020-08-25  Resolved: 2017-06-19

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.3.12, 5.5.48, 10.0.24, 10.1.11
Fix Version/s: 10.0.32

Type: Bug Priority: Critical
Reporter: Hartmut Holzgraefe Assignee: Alexander Barkov
Resolution: Fixed Votes: 1
Labels: None
Environment:

Linux


Issue Links:
Relates
relates to MDEV-13118 Wrong results with LOWER and UPPER an... Closed
relates to MDEV-13119 Wrong results with CAST(AS CHAR) and ... Closed
relates to MDEV-13120 Wrong results with MAKE_SET() and sub... Closed
relates to MDEV-13793 Wrong result with combination of CONC... Closed
Sprint: 10.0.30

 Description   

There seems to be a problem with the combination of CONCAT, SUBSTR and CONVERT in subquery. Simplest test case so far that reproduces this is:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
 
INSERT INTO t1 VALUES('1234567');
 
SELECT CONCAT(SUBSTR(t2, 1, 3), SUBSTR(t2, 5)) c1
            , CONCAT(SUBSTR(t2,1,3),'---',SUBSTR(t2,5)) c2
   FROM (SELECT CONVERT(t USING latin1) t2 FROM t1) sub;

Expected result, produced by MariaDB 5.1.x and 5.2.x, and all tested versions of MySQL:

+--------+-----------+
| c1     | c2        |
+--------+-----------+
| 123567 | 123---567 |
+--------+-----------+

MariaDB 5.3.x result:

+--------+-----------+
| c1     | c2        |
+--------+-----------+
| 123567 | 123456567 |
+--------+-----------+

MariaDB 5.5.x - 10.1.x:

+--------+-----------+
| c1     | c2        |
+--------+-----------+
| 12356  | 123456567 |
+--------+-----------+



 Comments   
Comment by Hartmut Holzgraefe [ 2016-06-29 ]

Only happens with optimizer_switch="derived_merge=on"

Comment by Elena Stepanova [ 2016-06-29 ]

10.2 is also affected.

Comment by Alexander Barkov [ 2017-03-01 ]

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
INSERT INTO t1 VALUES('1234567');
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT CONVERT(t USING latin1) t2 FROM t1) sub;

+----------------+
| c2             |
+----------------+
| 12345671234567 |
+----------------+

The expected result is 1234567-1234567.

More examples:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
INSERT INTO t1 VALUES('abcdefghi');
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT REVERSE(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT SOUNDEX(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT TO_BASE64(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT WEIGHT_STRING(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT HEX(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT QUOTE(t) t2 FROM t1) sub;

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (t VARCHAR(32) CHARSET latin1);
INSERT INTO t1 VALUES(TO_BASE64('abcdefghi'));
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT FROM_BASE64(t) t2 FROM t1) sub;

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (t VARCHAR(32) CHARSET latin1);
INSERT INTO t1 VALUES(HEX('abcdefghi'));
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT UNHEX(t) t2 FROM t1) sub;

All above queries return results without the '-' character, which means some memory overrun happens.

The same problem is repeatable with Item_str_ascii_func descendants, when conversion takes place:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
INSERT INTO t1 VALUES('abcdefghi');
SET @save_character_set_connection=@@character_set_connection;
SET character_set_connection=ucs2;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT HEX(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT TO_BASE64(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT OLD_PASSWORD(t) t2 FROM t1) sub;
SET character_set_connection=@save_character_set_connection;

The same problem is repeatable with AES_ENCRYPT:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (t VARCHAR(30) CHARSET latin1);
INSERT INTO t1 VALUES('test');
SELECT LENGTH(CONCAT(t2)) c2 FROM (SELECT AES_ENCRYPT(t,'x') t2 FROM t1) sub;
SELECT LENGTH(CONCAT(t2,'-',t2)) c2 FROM (SELECT AES_ENCRYPT(t,'x') t2 FROM t1) sub;
SELECT LENGTH(CONCAT(t2,'--',t2)) c2 FROM (SELECT AES_ENCRYPT(t,'x') t2 FROM t1) sub;

Notice, the first SELECT query returns 16. The second query is expected to return 33, and the third query is expected to return 34. In fact the second and the third query erroneously return 32.

The same problem is repeatable with AES_DECRYPT:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (t VARCHAR(30) CHARSET latin1);
INSERT INTO t1 VALUES('test');
SELECT LENGTH(CONCAT(t2)) c2 FROM (SELECT AES_DECRYPT(AES_ENCRYPT(t,'x'),'x') t2 FROM t1) sub;
SELECT LENGTH(CONCAT(t2,'-',t2)) c2 FROM (SELECT AES_DECRYPT(AES_ENCRYPT(t,'x'),'x') t2 FROM t1) sub;
SELECT LENGTH(CONCAT(t2,'--',t2)) c2 FROM (SELECT AES_DECRYPT(AES_ENCRYPT(t,'x'),'x') t2 FROM t1) sub;

The first SELECT query returns 4, the second and the third query are expected to return 9 and 10 respectively, but in fact the second and the third queries both return 8.

Also repeatable with ST_BUFFER:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (x INT, y INT);
INSERT INTO t1 VALUES(0,0);
SELECT LENGTH(t2) c2 FROM (SELECT ST_BUFFER(POINT(x,y), 0) t2 FROM t1) sub;
SELECT LENGTH(CONCAT(t2,'-',t2)) c2 FROM (SELECT ST_BUFFER(POINT(x,y), 0) t2 FROM t1) sub;
SELECT LENGTH(CONCAT(t2,'--',t2)) c2 FROM (SELECT ST_BUFFER(POINT(x,y), 0) t2 FROM t1) sub;

The first SELECT query returns 25, therefore the second and the third query should return 51 and 52 respectively. In fact both the second and the third query erroneously return 50.

Comment by Alexander Barkov [ 2017-06-14 ]

These functions are not affected:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
INSERT INTO t1 VALUES('abcdefghi');
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT LEFT(t,10) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT RIGHT(t,10) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT SUBSTR(t,1,10) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT LTRIM(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT RTRIM(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT TRIM(t) t2 FROM t1) sub;

Comment by Alexander Barkov [ 2017-06-19 ]

The following functions are also affected:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (t VARCHAR(10) CHARSET latin1);
INSERT INTO t1 VALUES('abcdefghi');
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT LOWER(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT UPPER(t) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT CAST(t AS CHAR CHARACTER SET utf8) t2 FROM t1) sub;
SELECT CONCAT(t2,'-',t2) c2 FROM (SELECT MAKE_SET(3,t,t) t2 FROM t1) sub;

but they will be fixed in under terms of separate MDEVs. See the "relates to" links.

Comment by Alexander Barkov [ 2017-06-19 ]

Pushed into 10.0.

Generated at Thu Feb 08 07:41:12 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.