[MDEV-14433] RocksDB may show empty or incorrect output with rocksdb_strict_collation_check=off Created: 2017-11-17  Updated: 2017-12-06  Resolved: 2017-12-06

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - RocksDB
Affects Version/s: 10.2
Fix Version/s: 10.2.12

Type: Bug Priority: Major
Reporter: Andrii Nikitin (Inactive) Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
PartOf
includes MDEV-14563 Wrong query plan for query with no PK Closed
Relates
relates to MDEV-14293 MyRocks lacks basic functionality Closed

 Description   

It is hard to find exact conditions which reliably trigger the problem, but following steps did demonstrate it (see outputs later):

set global rocksdb_strict_collation_check=off;
drop table t;
CREATE TABLE t(
  a varchar(10) NOT NULL,
  b char(1) DEFAULT 'X',
  c char(2) NOT NULL DEFAULT '??',
  d varchar(10) NOT NULL,
  e int(11) DEFAULT 0,
  PRIMARY KEY (a,d),
  KEY (e)
) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
 
insert into t select 1,1,1,1,0;
insert into t select 2,1,1,1,0;
insert into t select 3,1,1,1,0;
 
select a from t;
explain select * from t where not e;
select a from t;

MariaDB [test]> drop table t;
Query OK, 0 rows affected (0.04 sec)
 
MariaDB [test]> CREATE TABLE t(
    ->   a varchar(10) NOT NULL,
    ->   b char(1) DEFAULT 'X',
    ->   c char(2) NOT NULL DEFAULT '??',
    ->   d varchar(10) NOT NULL,
    ->   e int(11) DEFAULT 0,
    ->   PRIMARY KEY (a,d),
    ->   KEY (e)
    -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.22 sec)
 
MariaDB [test]> 
MariaDB [test]> insert into t select 1,1,1,1,0;
Query OK, 1 row affected (0.07 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
MariaDB [test]> insert into t select 2,1,1,1,0;
Query OK, 1 row affected (0.03 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
MariaDB [test]> insert into t select 3,1,1,1,0;
Query OK, 1 row affected (0.03 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
MariaDB [test]> 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select * from t where not e;
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
|    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    2 |       |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> \r
Connection id:    11
Current database: test
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> flush tables;
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> select a from t;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    9
Current database: test
 
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> optimize table t;
+--------+----------+----------+----------+
| Table  | Op       | Msg_type | Msg_text |
+--------+----------+----------+----------+
| test.t | optimize | status   | OK       |
+--------+----------+----------+----------+
1 row in set (0.15 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> select a,b from t;
+---+------+
| a | b    |
+---+------+
| 1 | 1    |
| 2 | 1    |
| 3 | 1    |
+---+------+
3 rows in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> alter table t drop primary key;
Query OK, 3 rows affected (0.27 sec)               
Records: 3  Duplicates: 0  Warnings: 0
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select * from t where not e;
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
|    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    1 |       |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> alter table t add primary key(a,d);
Query OK, 3 rows affected (0.27 sec)               
Records: 3  Duplicates: 0  Warnings: 0
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> drop table t;
Query OK, 0 rows affected (0.03 sec)
 
MariaDB [test]> CREATE TABLE t(
    ->   a varchar(10) NOT NULL,
    ->   b char(1) DEFAULT 'X',
    ->   c char(2) NOT NULL DEFAULT '??',
    ->   d varchar(10) NOT NULL,
    ->   e int(11) DEFAULT 0,
    ->   PRIMARY KEY (a,d),
    ->   KEY (e)
    -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
ERROR 4063 (HY000): Unsupported collation on string indexed column test.t.a Use binary collation (latin1_bin, binary, utf8_bin).
MariaDB [test]> 
MariaDB [test]> insert into t select 1,1,1,1,0;
ERROR 1146 (42S02): Table 'test.t' doesn't exist
MariaDB [test]> insert into t select 2,1,1,1,0;
ERROR 1146 (42S02): Table 'test.t' doesn't exist
MariaDB [test]> insert into t select 3,1,1,1,0;
ERROR 1146 (42S02): Table 'test.t' doesn't exist
MariaDB [test]> 
MariaDB [test]> select a from t;
ERROR 1146 (42S02): Table 'test.t' doesn't exist
MariaDB [test]> set global rocksdb_strict_collation_check=off;
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> drop table t;
ERROR 1051 (42S02): Unknown table 'test.t'
MariaDB [test]> CREATE TABLE t(
    ->   a varchar(10) NOT NULL,
    ->   b char(1) DEFAULT 'X',
    ->   c char(2) NOT NULL DEFAULT '??',
    ->   d varchar(10) NOT NULL,
    ->   e int(11) DEFAULT 0,
    ->   PRIMARY KEY (a,d),
    ->   KEY (e)
    -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.14 sec)
 
MariaDB [test]> 
MariaDB [test]> insert into t select 1,1,1,1,0;
Query OK, 1 row affected (0.03 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
MariaDB [test]> insert into t select 2,1,1,1,0;
Query OK, 1 row affected (0.03 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
MariaDB [test]> insert into t select 3,1,1,1,0;
Query OK, 1 row affected (0.04 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
MariaDB [test]> 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> select a,b from t;
+---+------+
| a | b    |
+---+------+
| 1 | 1    |
| 2 | 1    |
| 3 | 1    |
+---+------+
3 rows in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select a from t;
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|    1 | SIMPLE      | t     | index | NULL          | PRIMARY | 64      | NULL |    0 | Using index |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select a from t where not 1;
+------+-------------+-------+------+---------------+------+---------+------+------+------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra            |
+------+-------------+-------+------+---------------+------+---------+------+------+------------------+
|    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Impossible WHERE |
+------+-------------+-------+------+---------------+------+---------+------+------+------------------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select a from t where not a;
+------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
|    1 | SIMPLE      | t     | index | PRIMARY       | PRIMARY | 64      | NULL |    0 | Using where; Using index |
+------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select a from t where not e;
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
|    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    1 | Using index |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select * from t where not e;
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
|    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    2 |       |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> alter table t engine=rocksdb;
Query OK, 3 rows affected (0.27 sec)               
Records: 3  Duplicates: 0  Warnings: 0
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> set global rocksdb_strict_collation_check=off;
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> drop table t;
Query OK, 0 rows affected (0.03 sec)
 
MariaDB [test]> CREATE TABLE t(
    ->   a varchar(10) NOT NULL,
    ->   b char(1) DEFAULT 'X',
    ->   c char(2) NOT NULL DEFAULT '??',
    ->   d varchar(10) NOT NULL,
    ->   e int(11) DEFAULT 0,
    ->   PRIMARY KEY (a,d),
    ->   KEY (e)
    -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.20 sec)
 
MariaDB [test]> 
MariaDB [test]> insert into t select 1,1,1,1,0;
Query OK, 1 row affected (0.04 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
MariaDB [test]> insert into t select 2,1,1,1,0;
Query OK, 1 row affected (0.03 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
MariaDB [test]> insert into t select 3,1,1,1,0;
Query OK, 1 row affected (0.03 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
MariaDB [test]> 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select a from t where not e;
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
|    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    1 | Using index |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select * from t where not a;
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | t     | ALL  | PRIMARY       | NULL | NULL    | NULL |    0 | Using where |
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select * from t where not e;
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
|    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    1 |       |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> explain select * from t where not e;
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
|    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    5 |       |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------+
1 row in set (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)
 
MariaDB [test]> flush query cache;
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> select a from t;
+---+
| a |
+---+
|   |
|   |
|   |
+---+
3 rows in set (0.00 sec)



 Comments   
Comment by Alice Sherepa [ 2017-11-17 ]

I can reproduce on 10.2 ( commit 0c4d11e8199f4c7a) with mtr test:

--source include/have_rocksdb.inc
--disable_warnings
DROP TABLE IF EXISTS t;
--enable_warnings
set global rocksdb_strict_collation_check=off;
CREATE TABLE t(
  a varchar(10) NOT NULL,
  b char(1) DEFAULT 'X',
  c char(2) NOT NULL DEFAULT '??',
  d varchar(10) NOT NULL,
  e int(11) DEFAULT 0,
  PRIMARY KEY (a,d),
  KEY (e)
) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
 
insert into t select 1,1,1,1,0;
insert into t select 2,1,1,1,0;
insert into t select 3,1,1,1,0;
 
select a from t;
select * from t;
DROP TABLE t;

here are results ---weird empty output of select a from t;

DROP TABLE IF EXISTS t;
set global rocksdb_strict_collation_check=off;
CREATE TABLE t(
a varchar(10) NOT NULL,
b char(1) DEFAULT 'X',
c char(2) NOT NULL DEFAULT '??',
d varchar(10) NOT NULL,
e int(11) DEFAULT 0,
PRIMARY KEY (a,d),
KEY (e)
) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
insert into t select 1,1,1,1,0;
insert into t select 2,1,1,1,0;
insert into t select 3,1,1,1,0;
select a from t;
a
 
 
 
select * from t;
a	b	c	d	e
1	1	1	1	0
2	1	1	1	0
3	1	1	1	0
DROP TABLE t;
rocksdb.1rmy                             [ pass ]    240

Comment by Andrii Nikitin (Inactive) [ 2017-11-17 ]

Below is another variabtion for the same bug on the same table/data:

MariaDB [test]> select * from t;
+---+------+---+---+------+
| a | b    | c | d | e    |
+---+------+---+---+------+
| 1 | 1    | 1 | 1 |    0 |
| 2 | 1    | 1 | 1 |    0 |
| 3 | 1    | 1 | 1 |    0 |
+---+------+---+---+------+
3 rows in set (0.00 sec)
 
MariaDB [test]> insert into t select a,2,2,2,2 from t;
ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY'

That error happens because of instead of (a,2) primary key gets (NULL,2):

MariaDB [test]> select * from t;
+---+------+---+---+------+
| a | b    | c | d | e    |
+---+------+---+---+------+
| 1 | 1    | 1 | 1 |    0 |
| 2 | 1    | 1 | 1 |    0 |
| 3 | 1    | 1 | 1 |    0 |
+---+------+---+---+------+
3 rows in set (0.00 sec)
 
MariaDB [test]> insert into t select a,2,2,2,2 from t;
ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY'
MariaDB [test]> insert into t select a,2,2,2,2 from t limit 1;
Query OK, 1 row affected (0.04 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
MariaDB [test]> select * from t;
+---+------+---+---+------+
| a | b    | c | d | e    |
+---+------+---+---+------+
|   | 2    | 2 | 2 |    2 |
| 1 | 1    | 1 | 1 |    0 |
| 2 | 1    | 1 | 1 |    0 |
| 3 | 1    | 1 | 1 |    0 |
+---+------+---+---+------+
4 rows in set (0.00 sec)

Comment by Andrii Nikitin (Inactive) [ 2017-11-17 ]

Another case of the same issue:

MariaDB [test]> CREATE TABLE t(
    ->   a varchar(10),
    ->   b varchar(10) NOT NULL,
    ->   c int(11),
    ->   PRIMARY KEY (a,b),
    ->   KEY (c)
    -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.25 sec)
 
MariaDB [test]> 
MariaDB [test]> insert into t select floor(seq%3),floor(seq/3),(seq>=3) from seq_1_to_10;
Query OK, 10 rows affected (0.04 sec)
Records: 10  Duplicates: 0  Warnings: 0
 
MariaDB [test]> select * from t;
+---+---+------+
| a | b | c    |
+---+---+------+
|   |   |    0 |
|   |   |    0 |
|   |   |    1 |
|   |   |    1 |
|   |   |    1 |
|   |   |    1 |
|   |   |    1 |
|   |   |    1 |
|   |   |    1 |
|   |   |    1 |
+---+---+------+
10 rows in set (0.00 sec)

Comment by Andrii Nikitin (Inactive) [ 2017-11-17 ]

This is incorrect result with and without index. I believe it is the same problem, but let me know if new case should be created for it:

set global rocksdb_strict_collation_check=off;
drop table t;
CREATE TABLE t(
  a varchar(10),
  b varchar(10),
  c int(11),
  PRIMARY KEY (a,b),
  KEY (c)
) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
 
insert into t select floor(seq%3),floor(seq/3),(seq>=3) from seq_1_to_10;
select c from t;
select a, b from t;
# incorrect result - must be one row
select b from t where b='1' and c=0;
# incorrect result - must be one row
select b from t ignore index(c) where b='1' and c=0;
# incorrect result - must be one row
select c from t where b='1';
# incorrect result, incorrect warnings
select * from t where b=1;
show warnings limit 10;

Current output:

MariaDB [test]> set global rocksdb_strict_collation_check=off;
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> drop table t;
Query OK, 0 rows affected (0.04 sec)
 
MariaDB [test]> CREATE TABLE t(
    ->   a varchar(10),
    ->   b varchar(10),
    ->   c int(11),
    ->   PRIMARY KEY (a,b),
    ->   KEY (c)
    -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.21 sec)
 
MariaDB [test]> 
MariaDB [test]> insert into t select floor(seq%3),floor(seq/3),(seq>=3) from seq_1_to_10;
Query OK, 10 rows affected (0.07 sec)
Records: 10  Duplicates: 0  Warnings: 0
 
MariaDB [test]> select c from t;
+------+
| c    |
+------+
|    0 |
|    0 |
|    1 |
|    1 |
|    1 |
|    1 |
|    1 |
|    1 |
|    1 |
|    1 |
+------+
10 rows in set (0.00 sec)
 
MariaDB [test]> select a, b from t;
+---+---+
| a | b |
+---+---+
| 0 | 1 |
| 0 | 2 |
| 0 | 3 |
| 1 | 0 |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 0 |
| 2 | 1 |
| 2 | 2 |
+---+---+
10 rows in set (0.00 sec)
 
MariaDB [test]> # incorrect result - must be one row
MariaDB [test]> select b from t where b='1' and c=0;
+---+
| b |
+---+
| 1 |
| 1 |
+---+
2 rows in set (0.00 sec)
 
MariaDB [test]> # incorrect result - must be one row
MariaDB [test]> select b from t ignore index(c) where b='1' and c=0;
Empty set (0.00 sec)
 
MariaDB [test]> # incorrect result - must be 3 rows
MariaDB [test]> select c from t where b='1';
Empty set (0.00 sec)
 
MariaDB [test]> # incorrect result, incorrect warnings
MariaDB [test]> select * from t where b=1;
Empty set, 10 warnings (0.00 sec)
 
MariaDB [test]> show warnings limit 10;
+---------+------+--------------------------------------+
| Level   | Code | Message                              |
+---------+------+--------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: '' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '' |
+---------+------+--------------------------------------+
10 rows in set (0.00 sec)

Comment by Sergei Petrunia [ 2017-11-29 ]

Ok, debugging Alice's MTR test:
The query:

select a from t;

is resolved with an index scan using index KEY(e). Due to "extended keys", the index has PK columns suffix:

  KEY (e)
  PRIMARY KEY (a,d)

We do an index read and unpack index tuple.
In Rdb_key_def::unpack_record

(gdb) p field->field_name
  $49 = 0x7fffb8062a81 "a"
(gdb) print covered_column
  $51 = true
(gdb) p fpi->m_unpack_func
  $53 = NULL

That is, there is no way to unpack column a. That's why it gets a NULL value.

Comment by Sergei Petrunia [ 2017-11-29 ]

Upstream doesn't fail the testcase because it does a scan on PK, not on key
e.

If I force KEY(e) to be used, I get:

MariaDB:

explain
select a from t force index(e) where e<10000;
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  t       range   e       e       5       NULL    3       Using where; Using index

Upstream:

explain
select a from t force index(e) where e<10000;
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  t       range   e       e       5       NULL    2       Using index condition

That is, MariaDB allows index-only while the upstream does not.

Comment by Sergei Petrunia [ 2017-11-29 ]

Found an interesting mismatch between query plans that depends on the index
number:

Upstream, MySQL 5.6

CREATE TABLE t3(
  pk int primary key,
  a varchar(10) NOT NULL,   
  e int(11) DEFAULT 0,   
  KEY (a) 
) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;

insert into t3 values (1,1,1),(2,2,2);
explain select a from t3 where a <'zzz';
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | t3    | range | a             | a    | 32      | NULL |    2 | Using where |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+

CREATE TABLE t4(
  pk int,
  a varchar(10) NOT NULL,
  e int(11) DEFAULT 0,   
  KEY (a)
) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
insert into t4 values (1,1,1),(2,2,2);

explain select a from t4 where a <'zzz';
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | t4    | range | a             | a    | 32      | NULL |    1 | Using where |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+

MariaDB:

CREATE TABLE t3(
  pk int primary key,
  a varchar(10) NOT NULL,   
  e int(11) DEFAULT 0,   
  KEY (a) 
) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
insert into t3 values (1,1,1),(2,2,2);

explain select a from t3 where a <'zzz';
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | t3    | range | a             | a    | 32      | NULL |    1 | Using where |
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+

CREATE TABLE t4(
  pk int,
  a varchar(10) NOT NULL,
  e int(11) DEFAULT 0,   
  KEY (a)
) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
insert into t4 values (1,1,1),(2,2,2);

explain select a from t4 where a <'zzz';
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|    1 | SIMPLE      | t4    | index | a             | a    | 32      | NULL |    0 | Using where; Using index |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+

Comment by Sergei Petrunia [ 2017-11-30 ]

... Alice testcase has an explicitly defined PK, though, so it should
not suffer from the [possible] issue described in the last comment.

For Alice's testcase:

insert into t select 1,1,1,1,0;

Upstream:

(gdb) p table->s->keys_for_keyread
  $78 = {map = 3}
(gdb) p table->field[0]->field_name
  $79 = 0x7fffd801c541 "a"
(gdb) p table->field[0]->part_of_key
  $80 = {map = 1}

MariaDB:

(gdb) p table->s->keys_for_keyread
  $335 = {map = 3}
(gdb) p table->field[0]->field_name
  $336 = 0x7fffac0215f1 "a"
(gdb) p table->field[0]->part_of_key
  $337 = {map = 3}

Comment by Sergei Petrunia [ 2017-11-30 ]

The difference is caused by this piece of code in
TABLE_SHARE::init_from_binary_frm_image:

        field->flags|= PART_KEY_FLAG;
        if (key == primary_key)
        {
          field->flags|= PRI_KEY_FLAG;
          /*
            If this field is part of the primary key and all keys contains
            the primary key, then we can use any key to find this column
          */
          if (ha_option & HA_PRIMARY_KEY_IN_READ_INDEX)
          {
            if (field->key_length() == key_part->length &&
                !(field->flags & BLOB_FLAG))
              field->part_of_key= share->keys_in_use;

Comment by Sergei Petrunia [ 2017-12-01 ]

Ok, "Alice's testcase" cab be fixed by rewriting TABLE_SHARE::init_from_binary_frm_image to not set field->part_of_key for secondary indexes for fields that cannot be decoded from their key image. (As for the primary key, field->part_of_key still includes it, because one
does get field values when reading the PK).

Comment by Sergei Petrunia [ 2017-12-01 ]

There are some difficult cases:

innodb_ext_key failure:

"Extended keys" feature the way it works on the SQL layer does not always fully extend the index. One example: when the length of the key reaches MAX_KEY_LENGTH, it will not append any more PK key parts to the secondary key (even if the storage engine actually does).

So, SQL layer thinks that not all of PK key parts are added to the secondary index. Well, the only way to know if a PK column can be decoded from the secondary index is to make a call like this:

  handler->index(secondary_key, extended_pk_key_part, 0) & HA_KEYREAD_ONLY

if there is no extended_pk_key_part, there is no way to check...

(A testcase in innodb_ext_key.test assumes the following: if a table has ha_table_flags() & HA_PRIMARY_KEY_IN_READ_INDEX, then all PK columns can be obtained by doing a secondary index read).

Comment by Sergei Petrunia [ 2017-12-01 ]

Another failure is partition_innodb.test. The partitioning storage engine has share->use_ext_keys=FALSE, so keys are not extended (That is, one cannot use them for doing index lookups, for example). But it does have table_flags() & HA_PRIMARY_KEY_IN_READ_INDEX, which means one gets PK column values when doing secondary index reads.

Comment by Sergei Petrunia [ 2017-12-02 ]

Note that the upstream has:

  ulonglong table_flags() const override {
    ...
    DBUG_RETURN(HA_BINLOG_ROW_CAPABLE | HA_BINLOG_STMT_CAPABLE |
                HA_REC_NOT_IN_SEQ | HA_CAN_INDEX_BLOBS |
                (m_pk_can_be_decoded ? HA_PRIMARY_KEY_IN_READ_INDEX : 0) |

(added there by myself at some point)

while MariaDB has

  ulonglong table_flags() const override {
...
    /*
...
      HA_PRIMARY_KEY_IN_READ_INDEX
        This flag is always set, even for tables that:
        - have no PK
        - have some (or all) of PK that can't be decoded from the secondary
          index.
    */
    DBUG_RETURN(HA_BINLOG_ROW_CAPABLE | HA_BINLOG_STMT_CAPABLE |
                HA_REC_NOT_IN_SEQ | HA_CAN_INDEX_BLOBS |
                HA_PRIMARY_KEY_IN_READ_INDEX |
...

The difference comes from this change where init_with_fields call was removed in MariaDB:
https://github.com/MariaDB/server/commit/7fb3b348d71ef29ed10ed7f41b332074c8ac0a3d

Comment by Sergei Petrunia [ 2017-12-02 ]

To sum up: existing storage engines (InnoDB, Partition+InnoDB) require that the following is done:

  pk_field->part_of_key= share->keys_in_use

whenever the table has HA_PRIMARY_KEY_IN_READ_INDEX flag.
This needs to be done regardless of whether the "extended keys" feature puts all/some PK columns at the ends of secondary indexes or not.

Enter MyRocks:

pk_field->part_of_key should only include secondary indexes if the pk_field can be decoded from its key image.

The only way for SQL layer to know if the PK field is decodable is to check

h->index_flags(secondary_key, pk_key_part, 0) & HA_KEYREAD_ONLY

This can only be done when Extended Keys has added the pk_field into the secondary key.
When it hasn't, SQL layer assumes the column is decodable (see ), which has a potential of causing wrong query result (**).

Can one solve (**) by having MyRocks not return HA_PRIMARY_KEY_IN_READ_INDEX when at least one of PK columns is not decodable? This is what the upstream does, and no, we can't do it because we are not allowed to have handler->init_with_fields call that the upstream does have.

(The issue is that table_flags() is called before the storage engine has a chance to know what the table DDL is)

Comment by Sergei Petrunia [ 2017-12-04 ]
  • Committed a patch for this issue, testing it in buildbot.
  • Filed MDEV-14563 for the "another testcase", made a patch for that too. Testing it also.
Generated at Thu Feb 08 08:13:31 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.