[MDEV-27443] Wrong result with prefix key on CHAR column and nopad collation Created: 2022-01-07  Updated: 2022-12-14  Resolved: 2022-12-14

Status: Closed
Project: MariaDB Server
Component/s: Character Sets, Optimizer
Affects Version/s: 10.5, 10.6, 10.7
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Alexander Barkov
Resolution: Cannot Reproduce Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-27572 Wrong result with DISTINCT and indexes Confirmed

 Description   

--source include/have_innodb.inc
 
CREATE TABLE t1 (f char(8), KEY (f(4))) ENGINE=InnoDB COLLATE=utf8mb3_nopad_bin;
INSERT INTO t1 VALUES ('y'),('x'),('x'),('x'),('w'),('w'),('v'),('u'),('t'),('t');
 
EXPLAIN FORMAT=JSON
SELECT * FROM t1 WHERE f >= 'w';
SELECT * FROM t1 WHERE f >= 'w';
 
# Cleanup
DROP TABLE t1;

Actual result on 10.5 4c3ad244

SELECT * FROM t1 WHERE f >= 'w';
f
x
x
x
y

This is wrong, 'w' values should also be there.

The following query will also fail with an empty result:

SELECT * FROM t1 WHERE f = 'w';

Changing the engine to MyISAM will return the correct result.

The failure appeared in 10.5 branch after this commit in 10.5.3:

commit eb483c5181ab430877c135c16224284cfc517b3d
Author: Monty
Date:   Fri Feb 28 12:59:30 2020 +0200
 
    Updated optimizer costs in multi_range_read_info_const() and sql_select.cc

Plan on the current branch (wrong result)

{
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "t1",
      "access_type": "range",
      "possible_keys": ["f"],
      "key": "f",
      "key_length": "13",
      "used_key_parts": ["f"],
      "rows": 4,
      "filtered": 100,
      "attached_condition": "t1.f >= 'w'"
    }
  }
}

Plan before the change (correct result)

{
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "t1",
      "access_type": "ALL",
      "possible_keys": ["f"],
      "rows": 10,
      "filtered": 40,
      "attached_condition": "t1.f >= 'w'"
    }
  }
}



 Comments   
Comment by Michael Widenius [ 2022-02-01 ]

I analyze the case of the query:
SELECT * FROM t1 WHERE f = 'w';

InnoDB is called with ha_innobase::index_read() .. with a key value of:
0x7ffe9001d630: 0x00 0x77 0x20 0x20 0x20 0x20 0x20 0x20
0x7ffe9001d638: 0x20 0x20 0x20 0x20 0x20
Call has key length of 13 and find_flag HA_READ_KEY_EXACT. This looks correct so far.

Some innodb variables in row_search_mvcc()
(gdb) print *search_tuple
$24 =

{info_bits = 0, n_fields = 1, n_fields_cmp = 1, fields = 0x7ffe9006eb10, n_v_fields = 0, v_fields = 0x0, magic_n = 65478679}

(gdb) print *search_tuple->fields
$25 = {data = 0x7ffe9001d631, ext = 0, spatial_status = 0, len = 8, type = {prtype = 72548606, mtype = 13, len = 24, mbminlen = 1, mbmaxlen = 3}}.
(gdb) x/13bx search_tuple->fields->data
0x7ffe9001d631: 0x77 0x20 0x20 0x20 0x20 0x20 0x20 0x20
0x7ffe9001d639: 0x20 0x20 0x20 0x20 0x00

The bug seams to be that InnoDB is not able to compare the above key to what is in the secondary key index.
Probably some utf8mb3 comparison that goes wrong.

Comment by Alexander Barkov [ 2022-02-03 ]

MDEV-25904 + MDEV-25440 seem to have fixed the problem. It does not repeat in 10.8-merge. Let's wait for the current "merge to 10.8" cycle to finish and verify again.

Comment by Marko Mäkelä [ 2022-02-03 ]

Because the wrong result is also repeatable with ROW_FORMAT=REDUNDANT, I do not think that this is directly related to MDEV-25440. (But perhaps MDEV-25904.)

I think that bar is a better assignee of this, because InnoDB is merely invoking some collation code that he knows much better than me.

Comment by Alice Sherepa [ 2022-12-14 ]

Currently correct results on 10.3-10.11 (b81b1943932a383a92b)

Generated at Thu Feb 08 09:52:57 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.