Details
-
Task
-
Status: In Review
-
Major
-
Resolution: Unresolved
-
None
-
10.1.7-2, 10.1.8-1
Description
Under MDEV-7649 we disabled the use of indexes when a field is compared to a string containing a broken (ill-formed) character in the non-equality operations <, >, <=, >=, <>. For example:
SET NAMES 'utf8';
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  a varchar(60) COLLATE utf8_general_ci NOT NULL DEFAULT '',
  KEY(a)
) ENGINE=MyISAM CHARSET=utf8 COLLATE=utf8_general_ci;
INSERT INTO t1 (a) VALUES ('admin');
INSERT INTO t1 (a) VALUES ('admin;');
INSERT INTO t1 (a) VALUES ('adminx');
EXPLAIN SELECT * FROM t1 WHERE a < 'admin��';
returns
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | t1    | ALL  | a             | NULL | NULL    | NULL |    3 | Using where |
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
Note that '��' is a broken utf8 character (although it is a valid utf8mb4 character).
Index use was disabled so that the result set is the same with and without an index.
Now that MDEV-8036 is done, we can enable index use again.
get_mm_leaf() calls Field::store() to put the search value into the Field buffer.
This task will need changes in Fields: they should be able to store broken byte sequences in some form for index search purposes, because the normal behavior of Field::store(), which replaces bad bytes with question marks, does not work for search optimization.
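To illustrate why the '?' replacement cannot be reused for search, here is a minimal toy model in C++. It is not MariaDB code: characters are single bytes, any byte >= 0x80 stands in for an ill-formed sequence, and the comparison is case-sensitive rather than utf8_general_ci. The rule that bad bytes sort after every valid byte is an assumption made for the sketch; this ticket does not state how the non-indexed comparison orders them. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy rule: any byte >= 0x80 is treated as part of an ill-formed sequence.
static bool is_bad_byte(unsigned char c) { return c >= 0x80; }

// ASSUMPTION: bad bytes get a weight above every valid byte.
static int weight(unsigned char c) { return is_bad_byte(c) ? 0xFF00 + c : c; }

// Three-way comparison by weight; a proper prefix sorts before the longer string.
static int compare(const std::string &a, const std::string &b) {
  for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
    int wa = weight(static_cast<unsigned char>(a[i]));
    int wb = weight(static_cast<unsigned char>(b[i]));
    if (wa != wb) return wa < wb ? -1 : 1;
  }
  if (a.size() == b.size()) return 0;
  return a.size() < b.size() ? -1 : 1;
}

// Models a store() that replaces each bad byte with '?'.
static std::string replace_bad_bytes(std::string s) {
  for (char &c : s)
    if (is_bad_byte(static_cast<unsigned char>(c))) c = '?';
  return s;
}

// Number of rows that satisfy "row < value".
static int count_less(const std::vector<std::string> &rows,
                      const std::string &value) {
  int n = 0;
  for (const std::string &r : rows)
    if (compare(r, value) < 0) ++n;
  return n;
}

// Usage: the same three rows as in the ticket's example table.
static void demo() {
  const std::vector<std::string> rows = {"admin", "admin;", "adminx"};
  const std::string value = "admin\xF0\x9D\x8C\x86";  // 4 bad bytes in the toy model
  // Comparing against the original literal matches all three rows.
  assert(count_less(rows, value) == 3);
  // Comparing against the sanitised copy loses 'adminx', since 'x' > '?'.
  assert(count_less(rows, replace_bad_bytes(value)) == 2);
}
```

Under these assumptions the sanitised value yields a different result set than the original literal, which is the inconsistency that index use would otherwise reintroduce.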
Possible solutions:
- Add a new parameter to Field::store()
- Add a new method, say Field::store_for_search()
- Add a new flag into Field::flags and make Field::store() check this flag to decide whether to replace bad bytes with question marks (on INSERT/UPDATE), or to handle bad bytes differently (on search).
- Add new classes Field_string_key, Field_varstring_key, Field_blob_key and override their store() methods
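As a rough sketch of the second option (a separate store method for search), the toy class below keeps the write path and the search path apart. ToyField, StoreMode, and the "keep bad bytes verbatim" behavior are all assumptions made for illustration; the real Field classes have a different interface, and how bad bytes should actually be represented in the buffer is exactly what this task needs to decide.

```cpp
#include <cassert>
#include <string>

// Toy rule: any byte >= 0x80 is treated as ill-formed (not real utf8 validation).
static bool is_ill_formed(unsigned char c) { return c >= 0x80; }

// Hypothetical stand-in for a string Field; not the real MariaDB class.
class ToyField {
 public:
  // INSERT/UPDATE path: bad bytes are replaced with '?'.
  void store(const std::string &value) {
    store_with_mode(value, StoreMode::kWrite);
  }

  // Search path (hypothetical): bad bytes are kept as they are, so a key
  // built from the buffer still reflects the original literal.
  void store_for_search(const std::string &value) {
    store_with_mode(value, StoreMode::kSearch);
  }

  const std::string &buffer() const { return buffer_; }

 private:
  enum class StoreMode { kWrite, kSearch };

  void store_with_mode(const std::string &value, StoreMode mode) {
    buffer_.clear();
    for (char c : value) {
      bool bad = is_ill_formed(static_cast<unsigned char>(c));
      buffer_.push_back(bad && mode == StoreMode::kWrite ? '?' : c);
    }
  }

  std::string buffer_;
};

// Usage: the same input ends up differently depending on the path taken.
static void toy_field_demo() {
  ToyField f;
  f.store("admin\xF0");
  assert(f.buffer() == "admin?");
  f.store_for_search("admin\xF0");
  assert(f.buffer() == "admin\xF0");
}
```

The flag-based option would look similar internally, with the mode read from the field's flags rather than chosen by the caller through a separate method.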
As soon as we're able to store bad values into the Field buffer in a way suitable for search (instead of just replacing them with '?'), the collation should be able to do an index scan that behaves exactly the same as a non-indexed search.