Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.0.13
-
None
-
None
Description
This bug is related to MDEV-6679, but for the cases when
the column character set is different from the connection character set.
This script:
SET NAMES utf8, collation_connection=utf8_swedish_ci; |
CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET latin1, b INT NOT NULL DEFAULT 0, key(a)); |
INSERT INTO t1 VALUES ('a'),('b'),('c'),('d'),('¢'); |
SET @arg='¢'; |
PREPARE stmt FROM "EXPLAIN SELECT * FROM t1 WHERE a BETWEEN _utf8'¢' AND ?"; |
EXECUTE stmt USING @arg; |
PREPARE stmt FROM "EXPLAIN SELECT * FROM t1 WHERE a between ? AND _utf8'¢'"; |
EXECUTE stmt USING @arg; |
DEALLOCATE PREPARE stmt; |
returns
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|
| 1 | SIMPLE | t1 | range | a | a | 13 | NULL | 8 | Using where; Using index |
|
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|
for both EXPLAIN queries. Notice RANGE access type, which is wrong.
If I change the queries to:
EXPLAIN SELECT * FROM t1 WHERE a BETWEEN _utf8'¢' AND '¢'; |
EXPLAIN SELECT * FROM t1 WHERE a between '¢' AND _utf8'¢'; |
it returns:
+------+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
|
| 1 | SIMPLE | t1 | ref | a | a | 13 | const | 8 | Using where; Using index |
|
+------+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
|
Notice, REF access type, which means the two constants were
recognized as the same and BETWEEN was changed to equality.
The prepared statements should be fixed to use REF access,
like the non-prepared do.
Note, there is a subtle difference with MDEV-6679:
- in
MDEV-6679, the character set of the field is the same
with the character set of the connection (utf8). The string literal
with the introdcuder and the parameter values also have character
set utf8, but they have different collations:
utf8_general_ci vs utf8_swedish_ci respectively.
So RANGE access is probably acceptable.
- in this report, the field character set and the connection character set are different,
so both literal with the introducer with the collation utf8_general_ci
and the PS parameter with the collation utf8_swedish_ci are converted
to latin1 with its default collation latin1_swedish_ci and thus loose the
original collation difference and become equal. Thus REF access
looks more suitable.