[MDEV-8790] EXPLAIN SELECT after equal fields propagation can be confusing Created: 2015-09-11  Updated: 2023-04-27

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.0, 10.1, 10.2, 10.3
Fix Version/s: 10.4

Type: Bug Priority: Minor
Reporter: Alexander Barkov Assignee: Alexander Barkov
Resolution: Unresolved Votes: 0
Labels: propagation, upstream

Issue Links:
Relates
relates to MDEV-8728 Fix a number of problems in equal fie... Closed

 Description   

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(20) COLLATE latin1_bin);
INSERT INTO t1 VALUES ('a'),('b');
EXPLAIN EXTENDED SELECT * FROM t1 WHERE a=_latin1'a' AND a<=CONCAT(_latin1'AAAA',RAND());
SHOW WARNINGS;

returns:

+-------+------+----------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | select `test`.`t1`.`a` AS `a` from `test`.`t1` where ((`test`.`t1`.`a` = _latin1'a') and (_latin1'a' <= concat(_latin1'AAAA',rand()))) |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------+

Notice, the column t1.a in the right equality was correctly replaced to the equal constant _latin1'a'.
But this output makes a wrong impression that the right part is now compared using the default collation for latin1, which is lat1in_swedish_ci. This is not true:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(20) COLLATE latin1_bin);
INSERT INTO t1 VALUES ('a'),('b');
SELECT * FROM t1 WHERE a=_latin1'a' AND a<=CONCAT(_latin1'AAAA',RAND()); 

returns empty set, while the rewritten query from the EXPLAIN output:

select `test`.`t1`.`a` AS `a` from `test`.`t1` where ((`test`.`t1`.`a` = _latin1'a') and (_latin1'a' <= concat(_latin1'AAAA',rand())));

returns one row:

+------+
| a    |
+------+
| a    |
+------+

Therefore, equal field propagation works correctly, but the EXPLAIN output does not match to what happens in reality.
It should probably return

select `test`.`t1`.`a` AS `a` from `test`.`t1` where ((`test`.`t1`.`a` = _latin1'a') and (_latin1'a' COLLATE latin1_bin <= concat(_latin1'AAAA',rand())));

but COLLATE would set derivation to EXPLICIT, while we need IMPLICIT, so the displayed optimized query will be not what exactly happens and can return different results again.

Do we need a new syntax:

COLLATE {EXPLICIT|IMPLICIT|COERCIBLE|NONE} <collation name>

to set both collation and collation derivation of the expression at once?

Another option is to use NAME_CONST, whose collation derivation (coercibility) is IMPLICTIT, similar to a column, so the rewritten query could be:

SELECT * FROM t1 WHERE a=_latin1'a' AND NAME_CONST('',_latin1'a' COLLATE latin1_bin)<=CONCAT(_latin1'AAAA',RAND());


Generated at Thu Feb 08 07:29:49 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.