Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14433

RocksDB may show empty or incorrect output with rocksdb_strict_collation_check=off

Details

    Description

      It is hard to find exact conditions which reliably trigger the problem, but following steps did demonstrate it (see outputs later):

      set global rocksdb_strict_collation_check=off;
      drop table t;
      CREATE TABLE t(
        a varchar(10) NOT NULL,
        b char(1) DEFAULT 'X',
        c char(2) NOT NULL DEFAULT '??',
        d varchar(10) NOT NULL,
        e int(11) DEFAULT 0,
        PRIMARY KEY (a,d),
        KEY (e)
      ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
       
      insert into t select 1,1,1,1,0;
      insert into t select 2,1,1,1,0;
      insert into t select 3,1,1,1,0;
       
      select a from t;
      explain select * from t where not e;
      select a from t;
      

      MariaDB [test]> drop table t;
      Query OK, 0 rows affected (0.04 sec)
       
      MariaDB [test]> CREATE TABLE t(
          ->   a varchar(10) NOT NULL,
          ->   b char(1) DEFAULT 'X',
          ->   c char(2) NOT NULL DEFAULT '??',
          ->   d varchar(10) NOT NULL,
          ->   e int(11) DEFAULT 0,
          ->   PRIMARY KEY (a,d),
          ->   KEY (e)
          -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
      Query OK, 0 rows affected (0.22 sec)
       
      MariaDB [test]> 
      MariaDB [test]> insert into t select 1,1,1,1,0;
      Query OK, 1 row affected (0.07 sec)
      Records: 1  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> insert into t select 2,1,1,1,0;
      Query OK, 1 row affected (0.03 sec)
      Records: 1  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> insert into t select 3,1,1,1,0;
      Query OK, 1 row affected (0.03 sec)
      Records: 1  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> 
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select * from t where not e;
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      |    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    2 |       |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> \r
      Connection id:    11
      Current database: test
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> flush tables;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      ERROR 2006 (HY000): MySQL server has gone away
      No connection. Trying to reconnect...
      Connection id:    9
      Current database: test
       
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> optimize table t;
      +--------+----------+----------+----------+
      | Table  | Op       | Msg_type | Msg_text |
      +--------+----------+----------+----------+
      | test.t | optimize | status   | OK       |
      +--------+----------+----------+----------+
      1 row in set (0.15 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> select a,b from t;
      +---+------+
      | a | b    |
      +---+------+
      | 1 | 1    |
      | 2 | 1    |
      | 3 | 1    |
      +---+------+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> alter table t drop primary key;
      Query OK, 3 rows affected (0.27 sec)               
      Records: 3  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select * from t where not e;
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      |    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    1 |       |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> alter table t add primary key(a,d);
      Query OK, 3 rows affected (0.27 sec)               
      Records: 3  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> drop table t;
      Query OK, 0 rows affected (0.03 sec)
       
      MariaDB [test]> CREATE TABLE t(
          ->   a varchar(10) NOT NULL,
          ->   b char(1) DEFAULT 'X',
          ->   c char(2) NOT NULL DEFAULT '??',
          ->   d varchar(10) NOT NULL,
          ->   e int(11) DEFAULT 0,
          ->   PRIMARY KEY (a,d),
          ->   KEY (e)
          -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
      ERROR 4063 (HY000): Unsupported collation on string indexed column test.t.a Use binary collation (latin1_bin, binary, utf8_bin).
      MariaDB [test]> 
      MariaDB [test]> insert into t select 1,1,1,1,0;
      ERROR 1146 (42S02): Table 'test.t' doesn't exist
      MariaDB [test]> insert into t select 2,1,1,1,0;
      ERROR 1146 (42S02): Table 'test.t' doesn't exist
      MariaDB [test]> insert into t select 3,1,1,1,0;
      ERROR 1146 (42S02): Table 'test.t' doesn't exist
      MariaDB [test]> 
      MariaDB [test]> select a from t;
      ERROR 1146 (42S02): Table 'test.t' doesn't exist
      MariaDB [test]> set global rocksdb_strict_collation_check=off;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [test]> drop table t;
      ERROR 1051 (42S02): Unknown table 'test.t'
      MariaDB [test]> CREATE TABLE t(
          ->   a varchar(10) NOT NULL,
          ->   b char(1) DEFAULT 'X',
          ->   c char(2) NOT NULL DEFAULT '??',
          ->   d varchar(10) NOT NULL,
          ->   e int(11) DEFAULT 0,
          ->   PRIMARY KEY (a,d),
          ->   KEY (e)
          -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
      Query OK, 0 rows affected (0.14 sec)
       
      MariaDB [test]> 
      MariaDB [test]> insert into t select 1,1,1,1,0;
      Query OK, 1 row affected (0.03 sec)
      Records: 1  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> insert into t select 2,1,1,1,0;
      Query OK, 1 row affected (0.03 sec)
      Records: 1  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> insert into t select 3,1,1,1,0;
      Query OK, 1 row affected (0.04 sec)
      Records: 1  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> 
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> select a,b from t;
      +---+------+
      | a | b    |
      +---+------+
      | 1 | 1    |
      | 2 | 1    |
      | 3 | 1    |
      +---+------+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select a from t;
      +------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
      |    1 | SIMPLE      | t     | index | NULL          | PRIMARY | 64      | NULL |    0 | Using index |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select a from t where not 1;
      +------+-------------+-------+------+---------------+------+---------+------+------+------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra            |
      +------+-------------+-------+------+---------------+------+---------+------+------+------------------+
      |    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Impossible WHERE |
      +------+-------------+-------+------+---------------+------+---------+------+------+------------------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select a from t where not a;
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
      |    1 | SIMPLE      | t     | index | PRIMARY       | PRIMARY | 64      | NULL |    0 | Using where; Using index |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select a from t where not e;
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
      |    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    1 | Using index |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select * from t where not e;
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      |    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    2 |       |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> alter table t engine=rocksdb;
      Query OK, 3 rows affected (0.27 sec)               
      Records: 3  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> set global rocksdb_strict_collation_check=off;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [test]> drop table t;
      Query OK, 0 rows affected (0.03 sec)
       
      MariaDB [test]> CREATE TABLE t(
          ->   a varchar(10) NOT NULL,
          ->   b char(1) DEFAULT 'X',
          ->   c char(2) NOT NULL DEFAULT '??',
          ->   d varchar(10) NOT NULL,
          ->   e int(11) DEFAULT 0,
          ->   PRIMARY KEY (a,d),
          ->   KEY (e)
          -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
      Query OK, 0 rows affected (0.20 sec)
       
      MariaDB [test]> 
      MariaDB [test]> insert into t select 1,1,1,1,0;
      Query OK, 1 row affected (0.04 sec)
      Records: 1  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> insert into t select 2,1,1,1,0;
      Query OK, 1 row affected (0.03 sec)
      Records: 1  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> insert into t select 3,1,1,1,0;
      Query OK, 1 row affected (0.03 sec)
      Records: 1  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> 
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select a from t where not e;
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
      |    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    1 | Using index |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select * from t where not a;
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      |    1 | SIMPLE      | t     | ALL  | PRIMARY       | NULL | NULL    | NULL |    0 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select * from t where not e;
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      |    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    1 |       |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      | 1 |
      | 2 |
      | 3 |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> explain select * from t where not e;
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      |    1 | SIMPLE      | t     | ref  | e             | e    | 5       | const |    5 |       |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
       
      MariaDB [test]> flush query cache;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [test]> select a from t;
      +---+
      | a |
      +---+
      |   |
      |   |
      |   |
      +---+
      3 rows in set (0.00 sec)
      

      Attachments

        Issue Links

          Activity

            anikitin Andrii Nikitin (Inactive) created issue -
            anikitin Andrii Nikitin (Inactive) made changes -
            Field Original Value New Value
            Assignee Sergei Petrunia [ psergey ] Andrii Nikitin [ anikitin ]
            anikitin Andrii Nikitin (Inactive) made changes -
            Summary (draft) sometimes RocksDB incorrect output (draft) sometimes RocksDB incorrect (empty) output
            anikitin Andrii Nikitin (Inactive) made changes -
            alice Alice Sherepa added a comment -

            I can reproduce on 10.2 ( commit 0c4d11e8199f4c7a) with mtr test:

            --source include/have_rocksdb.inc
            --disable_warnings
            DROP TABLE IF EXISTS t;
            --enable_warnings
            set global rocksdb_strict_collation_check=off;
            CREATE TABLE t(
              a varchar(10) NOT NULL,
              b char(1) DEFAULT 'X',
              c char(2) NOT NULL DEFAULT '??',
              d varchar(10) NOT NULL,
              e int(11) DEFAULT 0,
              PRIMARY KEY (a,d),
              KEY (e)
            ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
             
            insert into t select 1,1,1,1,0;
            insert into t select 2,1,1,1,0;
            insert into t select 3,1,1,1,0;
             
            select a from t;
            select * from t;
            DROP TABLE t;
            

            here are results ---weird empty output of select a from t;

            DROP TABLE IF EXISTS t;
            set global rocksdb_strict_collation_check=off;
            CREATE TABLE t(
            a varchar(10) NOT NULL,
            b char(1) DEFAULT 'X',
            c char(2) NOT NULL DEFAULT '??',
            d varchar(10) NOT NULL,
            e int(11) DEFAULT 0,
            PRIMARY KEY (a,d),
            KEY (e)
            ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
            insert into t select 1,1,1,1,0;
            insert into t select 2,1,1,1,0;
            insert into t select 3,1,1,1,0;
            select a from t;
            a
             
             
             
            select * from t;
            a	b	c	d	e
            1	1	1	1	0
            2	1	1	1	0
            3	1	1	1	0
            DROP TABLE t;
            rocksdb.1rmy                             [ pass ]    240
            

            alice Alice Sherepa added a comment - I can reproduce on 10.2 ( commit 0c4d11e8199f4c7a) with mtr test: --source include/have_rocksdb.inc --disable_warnings DROP TABLE IF EXISTS t; --enable_warnings set global rocksdb_strict_collation_check= off ; CREATE TABLE t( a varchar (10) NOT NULL , b char (1) DEFAULT 'X' , c char (2) NOT NULL DEFAULT '??' , d varchar (10) NOT NULL , e int (11) DEFAULT 0, PRIMARY KEY (a,d), KEY (e) ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8; insert into t select 1,1,1,1,0; insert into t select 2,1,1,1,0; insert into t select 3,1,1,1,0; select a from t; select * from t; DROP TABLE t; here are results ---weird empty output of select a from t; DROP TABLE IF EXISTS t; set global rocksdb_strict_collation_check=off; CREATE TABLE t( a varchar(10) NOT NULL, b char(1) DEFAULT 'X', c char(2) NOT NULL DEFAULT '??', d varchar(10) NOT NULL, e int(11) DEFAULT 0, PRIMARY KEY (a,d), KEY (e) ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8; insert into t select 1,1,1,1,0; insert into t select 2,1,1,1,0; insert into t select 3,1,1,1,0; select a from t; a       select * from t; a b c d e 1 1 1 1 0 2 1 1 1 0 3 1 1 1 0 DROP TABLE t; rocksdb.1rmy [ pass ] 240

            Below is another variabtion for the same bug on the same table/data:

            MariaDB [test]> select * from t;
            +---+------+---+---+------+
            | a | b    | c | d | e    |
            +---+------+---+---+------+
            | 1 | 1    | 1 | 1 |    0 |
            | 2 | 1    | 1 | 1 |    0 |
            | 3 | 1    | 1 | 1 |    0 |
            +---+------+---+---+------+
            3 rows in set (0.00 sec)
             
            MariaDB [test]> insert into t select a,2,2,2,2 from t;
            ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY'
            

            That error happens because of instead of (a,2) primary key gets (NULL,2):

            MariaDB [test]> select * from t;
            +---+------+---+---+------+
            | a | b    | c | d | e    |
            +---+------+---+---+------+
            | 1 | 1    | 1 | 1 |    0 |
            | 2 | 1    | 1 | 1 |    0 |
            | 3 | 1    | 1 | 1 |    0 |
            +---+------+---+---+------+
            3 rows in set (0.00 sec)
             
            MariaDB [test]> insert into t select a,2,2,2,2 from t;
            ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY'
            MariaDB [test]> insert into t select a,2,2,2,2 from t limit 1;
            Query OK, 1 row affected (0.04 sec)
            Records: 1  Duplicates: 0  Warnings: 0
             
            MariaDB [test]> select * from t;
            +---+------+---+---+------+
            | a | b    | c | d | e    |
            +---+------+---+---+------+
            |   | 2    | 2 | 2 |    2 |
            | 1 | 1    | 1 | 1 |    0 |
            | 2 | 1    | 1 | 1 |    0 |
            | 3 | 1    | 1 | 1 |    0 |
            +---+------+---+---+------+
            4 rows in set (0.00 sec)
            

            anikitin Andrii Nikitin (Inactive) added a comment - Below is another variabtion for the same bug on the same table/data: MariaDB [test]> select * from t; +---+------+---+---+------+ | a | b | c | d | e | +---+------+---+---+------+ | 1 | 1 | 1 | 1 | 0 | | 2 | 1 | 1 | 1 | 0 | | 3 | 1 | 1 | 1 | 0 | +---+------+---+---+------+ 3 rows in set (0.00 sec)   MariaDB [test]> insert into t select a,2,2,2,2 from t; ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY' That error happens because of instead of (a,2) primary key gets (NULL,2): MariaDB [test]> select * from t; +---+------+---+---+------+ | a | b | c | d | e | +---+------+---+---+------+ | 1 | 1 | 1 | 1 | 0 | | 2 | 1 | 1 | 1 | 0 | | 3 | 1 | 1 | 1 | 0 | +---+------+---+---+------+ 3 rows in set (0.00 sec)   MariaDB [test]> insert into t select a,2,2,2,2 from t; ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY' MariaDB [test]> insert into t select a,2,2,2,2 from t limit 1; Query OK, 1 row affected (0.04 sec) Records: 1 Duplicates: 0 Warnings: 0   MariaDB [test]> select * from t; +---+------+---+---+------+ | a | b | c | d | e | +---+------+---+---+------+ | | 2 | 2 | 2 | 2 | | 1 | 1 | 1 | 1 | 0 | | 2 | 1 | 1 | 1 | 0 | | 3 | 1 | 1 | 1 | 0 | +---+------+---+---+------+ 4 rows in set (0.00 sec)
            anikitin Andrii Nikitin (Inactive) made changes -
            Summary (draft) sometimes RocksDB incorrect (empty) output RocksDB may show empty output (for columns which are part of PK(?) )
            anikitin Andrii Nikitin (Inactive) made changes -
            Assignee Andrii Nikitin [ anikitin ] Sergei Petrunia [ psergey ]
            anikitin Andrii Nikitin (Inactive) made changes -
            Fix Version/s 10.2 [ 14601 ]

            Another case of the same issue:

            MariaDB [test]> CREATE TABLE t(
                ->   a varchar(10),
                ->   b varchar(10) NOT NULL,
                ->   c int(11),
                ->   PRIMARY KEY (a,b),
                ->   KEY (c)
                -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
            Query OK, 0 rows affected (0.25 sec)
             
            MariaDB [test]> 
            MariaDB [test]> insert into t select floor(seq%3),floor(seq/3),(seq>=3) from seq_1_to_10;
            Query OK, 10 rows affected (0.04 sec)
            Records: 10  Duplicates: 0  Warnings: 0
             
            MariaDB [test]> select * from t;
            +---+---+------+
            | a | b | c    |
            +---+---+------+
            |   |   |    0 |
            |   |   |    0 |
            |   |   |    1 |
            |   |   |    1 |
            |   |   |    1 |
            |   |   |    1 |
            |   |   |    1 |
            |   |   |    1 |
            |   |   |    1 |
            |   |   |    1 |
            +---+---+------+
            10 rows in set (0.00 sec)
            
            

            anikitin Andrii Nikitin (Inactive) added a comment - Another case of the same issue: MariaDB [test]> CREATE TABLE t( -> a varchar (10), -> b varchar (10) NOT NULL , -> c int (11), -> PRIMARY KEY (a,b), -> KEY (c) -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8; Query OK, 0 rows affected (0.25 sec)   MariaDB [test]> MariaDB [test]> insert into t select floor(seq%3),floor(seq/3),(seq>=3) from seq_1_to_10; Query OK, 10 rows affected (0.04 sec) Records: 10 Duplicates: 0 Warnings: 0   MariaDB [test]> select * from t; + ---+---+------+ | a | b | c | + ---+---+------+ | | | 0 | | | | 0 | | | | 1 | | | | 1 | | | | 1 | | | | 1 | | | | 1 | | | | 1 | | | | 1 | | | | 1 | + ---+---+------+ 10 rows in set (0.00 sec)
            anikitin Andrii Nikitin (Inactive) added a comment - - edited

            This is incorrect result with and without index. I believe it is the same problem, but let me know if new case should be created for it:

            set global rocksdb_strict_collation_check=off;
            drop table t;
            CREATE TABLE t(
              a varchar(10),
              b varchar(10),
              c int(11),
              PRIMARY KEY (a,b),
              KEY (c)
            ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
             
            insert into t select floor(seq%3),floor(seq/3),(seq>=3) from seq_1_to_10;
            select c from t;
            select a, b from t;
            # incorrect result - must be one row
            select b from t where b='1' and c=0;
            # incorrect result - must be one row
            select b from t ignore index(c) where b='1' and c=0;
            # incorrect result - must be one row
            select c from t where b='1';
            # incorrect result, incorrect warnings
            select * from t where b=1;
            show warnings limit 10;
            

            Current output:

            MariaDB [test]> set global rocksdb_strict_collation_check=off;
            Query OK, 0 rows affected (0.00 sec)
             
            MariaDB [test]> drop table t;
            Query OK, 0 rows affected (0.04 sec)
             
            MariaDB [test]> CREATE TABLE t(
                ->   a varchar(10),
                ->   b varchar(10),
                ->   c int(11),
                ->   PRIMARY KEY (a,b),
                ->   KEY (c)
                -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
            Query OK, 0 rows affected (0.21 sec)
             
            MariaDB [test]> 
            MariaDB [test]> insert into t select floor(seq%3),floor(seq/3),(seq>=3) from seq_1_to_10;
            Query OK, 10 rows affected (0.07 sec)
            Records: 10  Duplicates: 0  Warnings: 0
             
            MariaDB [test]> select c from t;
            +------+
            | c    |
            +------+
            |    0 |
            |    0 |
            |    1 |
            |    1 |
            |    1 |
            |    1 |
            |    1 |
            |    1 |
            |    1 |
            |    1 |
            +------+
            10 rows in set (0.00 sec)
             
            MariaDB [test]> select a, b from t;
            +---+---+
            | a | b |
            +---+---+
            | 0 | 1 |
            | 0 | 2 |
            | 0 | 3 |
            | 1 | 0 |
            | 1 | 1 |
            | 1 | 2 |
            | 1 | 3 |
            | 2 | 0 |
            | 2 | 1 |
            | 2 | 2 |
            +---+---+
            10 rows in set (0.00 sec)
             
            MariaDB [test]> # incorrect result - must be one row
            MariaDB [test]> select b from t where b='1' and c=0;
            +---+
            | b |
            +---+
            | 1 |
            | 1 |
            +---+
            2 rows in set (0.00 sec)
             
            MariaDB [test]> # incorrect result - must be one row
            MariaDB [test]> select b from t ignore index(c) where b='1' and c=0;
            Empty set (0.00 sec)
             
            MariaDB [test]> # incorrect result - must be 3 rows
            MariaDB [test]> select c from t where b='1';
            Empty set (0.00 sec)
             
            MariaDB [test]> # incorrect result, incorrect warnings
            MariaDB [test]> select * from t where b=1;
            Empty set, 10 warnings (0.00 sec)
             
            MariaDB [test]> show warnings limit 10;
            +---------+------+--------------------------------------+
            | Level   | Code | Message                              |
            +---------+------+--------------------------------------+
            | Warning | 1292 | Truncated incorrect DOUBLE value: '' |
            | Warning | 1292 | Truncated incorrect DOUBLE value: '' |
            | Warning | 1292 | Truncated incorrect DOUBLE value: '' |
            | Warning | 1292 | Truncated incorrect DOUBLE value: '' |
            | Warning | 1292 | Truncated incorrect DOUBLE value: '' |
            | Warning | 1292 | Truncated incorrect DOUBLE value: '' |
            | Warning | 1292 | Truncated incorrect DOUBLE value: '' |
            | Warning | 1292 | Truncated incorrect DOUBLE value: '' |
            | Warning | 1292 | Truncated incorrect DOUBLE value: '' |
            | Warning | 1292 | Truncated incorrect DOUBLE value: '' |
            +---------+------+--------------------------------------+
            10 rows in set (0.00 sec)
            

            anikitin Andrii Nikitin (Inactive) added a comment - - edited This is incorrect result with and without index. I believe it is the same problem, but let me know if new case should be created for it: set global rocksdb_strict_collation_check= off ; drop table t; CREATE TABLE t( a varchar (10), b varchar (10), c int (11), PRIMARY KEY (a,b), KEY (c) ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;   insert into t select floor(seq%3),floor(seq/3),(seq>=3) from seq_1_to_10; select c from t; select a, b from t; # incorrect result - must be one row select b from t where b= '1' and c=0; # incorrect result - must be one row select b from t ignore index (c) where b= '1' and c=0; # incorrect result - must be one row select c from t where b= '1' ; # incorrect result, incorrect warnings select * from t where b=1; show warnings limit 10; Current output: MariaDB [test]> set global rocksdb_strict_collation_check=off; Query OK, 0 rows affected (0.00 sec)   MariaDB [test]> drop table t; Query OK, 0 rows affected (0.04 sec)   MariaDB [test]> CREATE TABLE t( -> a varchar(10), -> b varchar(10), -> c int(11), -> PRIMARY KEY (a,b), -> KEY (c) -> ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8; Query OK, 0 rows affected (0.21 sec)   MariaDB [test]> MariaDB [test]> insert into t select floor(seq%3),floor(seq/3),(seq>=3) from seq_1_to_10; Query OK, 10 rows affected (0.07 sec) Records: 10 Duplicates: 0 Warnings: 0   MariaDB [test]> select c from t; +------+ | c | +------+ | 0 | | 0 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | +------+ 10 rows in set (0.00 sec)   MariaDB [test]> select a, b from t; +---+---+ | a | b | +---+---+ | 0 | 1 | | 0 | 2 | | 0 | 3 | | 1 | 0 | | 1 | 1 | | 1 | 2 | | 1 | 3 | | 2 | 0 | | 2 | 1 | | 2 | 2 | +---+---+ 10 rows in set (0.00 sec)   MariaDB [test]> # incorrect result - must be one row MariaDB [test]> select b from t where b='1' and c=0; +---+ | b | +---+ | 1 | | 1 | +---+ 2 rows in set (0.00 sec)   MariaDB [test]> # incorrect result - must be one row MariaDB [test]> select b from t ignore index(c) where b='1' and c=0; Empty set (0.00 sec)   MariaDB [test]> # incorrect result - must be 3 rows MariaDB [test]> select c from t where b='1'; Empty set (0.00 sec)   MariaDB [test]> # incorrect result, incorrect warnings MariaDB [test]> select * from t where b=1; Empty set, 10 warnings (0.00 sec)   MariaDB [test]> show warnings limit 10; +---------+------+--------------------------------------+ | Level | Code | Message | +---------+------+--------------------------------------+ | Warning | 1292 | Truncated incorrect DOUBLE value: '' | | Warning | 1292 | Truncated incorrect DOUBLE value: '' | | Warning | 1292 | Truncated incorrect DOUBLE value: '' | | Warning | 1292 | Truncated incorrect DOUBLE value: '' | | Warning | 1292 | Truncated incorrect DOUBLE value: '' | | Warning | 1292 | Truncated incorrect DOUBLE value: '' | | Warning | 1292 | Truncated incorrect DOUBLE value: '' | | Warning | 1292 | Truncated incorrect DOUBLE value: '' | | Warning | 1292 | Truncated incorrect DOUBLE value: '' | | Warning | 1292 | Truncated incorrect DOUBLE value: '' | +---------+------+--------------------------------------+ 10 rows in set (0.00 sec)
            anikitin Andrii Nikitin (Inactive) made changes -
            Summary RocksDB may show empty output (for columns which are part of PK(?) ) RocksDB may show empty or incorrect output with
            anikitin Andrii Nikitin (Inactive) made changes -
            Summary RocksDB may show empty or incorrect output with RocksDB may show empty or incorrect output with rocksdb_strict_collation_check
            anikitin Andrii Nikitin (Inactive) made changes -
            Summary RocksDB may show empty or incorrect output with rocksdb_strict_collation_check RocksDB may show empty or incorrect output with rocksdb_strict_collation_check=off

            Ok, debugging Alice's MTR test:
            The query:

            select a from t;
            

            is resolved with an index scan using index KEY(e). Due to "extended keys", the index has PK columns suffix:

              KEY (e)
              PRIMARY KEY (a,d)
            

            We do an index read and unpack index tuple.
            In Rdb_key_def::unpack_record

            (gdb) p field->field_name
              $49 = 0x7fffb8062a81 "a"
            (gdb) print covered_column
              $51 = true
            (gdb) p fpi->m_unpack_func
              $53 = NULL
            

            That is, there is no way to unpack column a. That's why it gets a NULL value.

            psergei Sergei Petrunia added a comment - Ok, debugging Alice's MTR test: The query: select a from t; is resolved with an index scan using index KEY(e) . Due to "extended keys", the index has PK columns suffix: KEY (e) PRIMARY KEY (a,d) We do an index read and unpack index tuple. In Rdb_key_def::unpack_record (gdb) p field->field_name $49 = 0x7fffb8062a81 "a" (gdb) print covered_column $51 = true (gdb) p fpi->m_unpack_func $53 = NULL That is, there is no way to unpack column a . That's why it gets a NULL value.

            Upstream doesn't fail the testcase because it does a scan on PK, not on key
            e.

            If I force KEY(e) to be used, I get:

            MariaDB:

            explain
            select a from t force index(e) where e<10000;
            id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
            1       SIMPLE  t       range   e       e       5       NULL    3       Using where; Using index
            

            Upstream:

            explain
            select a from t force index(e) where e<10000;
            id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
            1       SIMPLE  t       range   e       e       5       NULL    2       Using index condition
            

            That is, MariaDB allows index-only while the upstream does not.

            psergei Sergei Petrunia added a comment - Upstream doesn't fail the testcase because it does a scan on PK, not on key e . If I force KEY(e) to be used, I get: MariaDB: explain select a from t force index(e) where e<10000; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE t range e e 5 NULL 3 Using where; Using index Upstream: explain select a from t force index(e) where e<10000; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE t range e e 5 NULL 2 Using index condition That is, MariaDB allows index-only while the upstream does not.

            Found an interesting mismatch between query plans that depends on the index
            number:

            Upstream, MySQL 5.6

            CREATE TABLE t3(
              pk int primary key,
              a varchar(10) NOT NULL,   
              e int(11) DEFAULT 0,   
              KEY (a) 
            ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
            

            insert into t3 values (1,1,1),(2,2,2);
            explain select a from t3 where a <'zzz';
            +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
            | id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
            +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
            |  1 | SIMPLE      | t3    | range | a             | a    | 32      | NULL |    2 | Using where |
            +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
            

            CREATE TABLE t4(
              pk int,
              a varchar(10) NOT NULL,
              e int(11) DEFAULT 0,   
              KEY (a)
            ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
            insert into t4 values (1,1,1),(2,2,2);
            

            explain select a from t4 where a <'zzz';
            +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
            | id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
            +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
            |  1 | SIMPLE      | t4    | range | a             | a    | 32      | NULL |    1 | Using where |
            +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
            

            MariaDB:

            CREATE TABLE t3(
              pk int primary key,
              a varchar(10) NOT NULL,   
              e int(11) DEFAULT 0,   
              KEY (a) 
            ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
            insert into t3 values (1,1,1),(2,2,2);
            

            explain select a from t3 where a <'zzz';
            +------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
            | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
            +------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
            |    1 | SIMPLE      | t3    | range | a             | a    | 32      | NULL |    1 | Using where |
            +------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
            

            CREATE TABLE t4(
              pk int,
              a varchar(10) NOT NULL,
              e int(11) DEFAULT 0,   
              KEY (a)
            ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8;
            insert into t4 values (1,1,1),(2,2,2);
            

            explain select a from t4 where a <'zzz';
            +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
            | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
            +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
            |    1 | SIMPLE      | t4    | index | a             | a    | 32      | NULL |    0 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
            

            psergei Sergei Petrunia added a comment - Found an interesting mismatch between query plans that depends on the index number: Upstream, MySQL 5.6 CREATE TABLE t3( pk int primary key, a varchar(10) NOT NULL, e int(11) DEFAULT 0, KEY (a) ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8; insert into t3 values (1,1,1),(2,2,2); explain select a from t3 where a <'zzz'; +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | t3 | range | a | a | 32 | NULL | 2 | Using where | +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+ CREATE TABLE t4( pk int, a varchar(10) NOT NULL, e int(11) DEFAULT 0, KEY (a) ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8; insert into t4 values (1,1,1),(2,2,2); explain select a from t4 where a <'zzz'; +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | t4 | range | a | a | 32 | NULL | 1 | Using where | +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+ MariaDB: CREATE TABLE t3( pk int primary key, a varchar(10) NOT NULL, e int(11) DEFAULT 0, KEY (a) ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8; insert into t3 values (1,1,1),(2,2,2); explain select a from t3 where a <'zzz'; +------+-------------+-------+-------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | t3 | range | a | a | 32 | NULL | 1 | Using where | +------+-------------+-------+-------+---------------+------+---------+------+------+-------------+ CREATE TABLE t4( pk int, a varchar(10) NOT NULL, e int(11) DEFAULT 0, KEY (a) ) ENGINE=ROCKSDB DEFAULT CHARSET=utf8; insert into t4 values (1,1,1),(2,2,2); explain select a from t4 where a <'zzz'; +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+ | 1 | SIMPLE | t4 | index | a | a | 32 | NULL | 0 | Using where; Using index | +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
            psergei Sergei Petrunia added a comment - - edited

            ... Alice testcase has an explicitly defined PK, though, so it should
            not suffer from the [possible] issue described in the last comment.

            For Alice's testcase:

            insert into t select 1,1,1,1,0;

            Upstream:

            (gdb) p table->s->keys_for_keyread
              $78 = {map = 3}
            (gdb) p table->field[0]->field_name
              $79 = 0x7fffd801c541 "a"
            (gdb) p table->field[0]->part_of_key
              $80 = {map = 1}
            

            MariaDB:

            (gdb) p table->s->keys_for_keyread
              $335 = {map = 3}
            (gdb) p table->field[0]->field_name
              $336 = 0x7fffac0215f1 "a"
            (gdb) p table->field[0]->part_of_key
              $337 = {map = 3}
            

            psergei Sergei Petrunia added a comment - - edited ... Alice testcase has an explicitly defined PK, though, so it should not suffer from the [possible] issue described in the last comment. For Alice's testcase: insert into t select 1,1,1,1,0; Upstream: (gdb) p table->s->keys_for_keyread $78 = {map = 3} (gdb) p table->field[0]->field_name $79 = 0x7fffd801c541 "a" (gdb) p table->field[0]->part_of_key $80 = {map = 1} MariaDB: (gdb) p table->s->keys_for_keyread $335 = {map = 3} (gdb) p table->field[0]->field_name $336 = 0x7fffac0215f1 "a" (gdb) p table->field[0]->part_of_key $337 = {map = 3}

            The difference is caused by this piece of code in
            TABLE_SHARE::init_from_binary_frm_image:

                    field->flags|= PART_KEY_FLAG;
                    if (key == primary_key)
                    {
                      field->flags|= PRI_KEY_FLAG;
                      /*
                        If this field is part of the primary key and all keys contains
                        the primary key, then we can use any key to find this column
                      */
                      if (ha_option & HA_PRIMARY_KEY_IN_READ_INDEX)
                      {
                        if (field->key_length() == key_part->length &&
                            !(field->flags & BLOB_FLAG))
                          field->part_of_key= share->keys_in_use;
            

            psergei Sergei Petrunia added a comment - The difference is caused by this piece of code in TABLE_SHARE::init_from_binary_frm_image : field->flags|= PART_KEY_FLAG; if (key == primary_key) { field->flags|= PRI_KEY_FLAG; /* If this field is part of the primary key and all keys contains the primary key, then we can use any key to find this column */ if (ha_option & HA_PRIMARY_KEY_IN_READ_INDEX) { if (field->key_length() == key_part->length && !(field->flags & BLOB_FLAG)) field->part_of_key= share->keys_in_use;

            Ok, "Alice's testcase" cab be fixed by rewriting TABLE_SHARE::init_from_binary_frm_image to not set field->part_of_key for secondary indexes for fields that cannot be decoded from their key image. (As for the primary key, field->part_of_key still includes it, because one
            does get field values when reading the PK).

            psergei Sergei Petrunia added a comment - Ok, "Alice's testcase" cab be fixed by rewriting TABLE_SHARE::init_from_binary_frm_image to not set field->part_of_key for secondary indexes for fields that cannot be decoded from their key image. (As for the primary key, field->part_of_key still includes it, because one does get field values when reading the PK).

            There are some difficult cases:

            innodb_ext_key failure:

            "Extended keys" feature the way it works on the SQL layer does not always fully extend the index. One example: when the length of the key reaches MAX_KEY_LENGTH, it will not append any more PK key parts to the secondary key (even if the storage engine actually does).

            So, SQL layer thinks that not all of PK key parts are added to the secondary index. Well, the only way to know if a PK column can be decoded from the secondary index is to make a call like this:

              handler->index(secondary_key, extended_pk_key_part, 0) & HA_KEYREAD_ONLY
            

            if there is no extended_pk_key_part, there is no way to check...

            (A testcase in innodb_ext_key.test assumes the following: if a table has ha_table_flags() & HA_PRIMARY_KEY_IN_READ_INDEX, then all PK columns can be obtained by doing a secondary index read).

            psergei Sergei Petrunia added a comment - There are some difficult cases: innodb_ext_key failure: "Extended keys" feature the way it works on the SQL layer does not always fully extend the index. One example: when the length of the key reaches MAX_KEY_LENGTH, it will not append any more PK key parts to the secondary key (even if the storage engine actually does). So, SQL layer thinks that not all of PK key parts are added to the secondary index. Well, the only way to know if a PK column can be decoded from the secondary index is to make a call like this: handler->index(secondary_key, extended_pk_key_part, 0) & HA_KEYREAD_ONLY if there is no extended_pk_key_part, there is no way to check... (A testcase in innodb_ext_key.test assumes the following: if a table has ha_table_flags() & HA_PRIMARY_KEY_IN_READ_INDEX , then all PK columns can be obtained by doing a secondary index read).

            Another failure is partition_innodb.test. The partitioning storage engine has share->use_ext_keys=FALSE, so keys are not extended (That is, one cannot use them for doing index lookups, for example). But it does have table_flags() & HA_PRIMARY_KEY_IN_READ_INDEX, which means one gets PK column values when doing secondary index reads.

            psergei Sergei Petrunia added a comment - Another failure is partition_innodb.test . The partitioning storage engine has share->use_ext_keys=FALSE , so keys are not extended (That is, one cannot use them for doing index lookups, for example). But it does have table_flags() & HA_PRIMARY_KEY_IN_READ_INDEX , which means one gets PK column values when doing secondary index reads.

            Note that the upstream has:

              ulonglong table_flags() const override {
                ...
                DBUG_RETURN(HA_BINLOG_ROW_CAPABLE | HA_BINLOG_STMT_CAPABLE |
                            HA_REC_NOT_IN_SEQ | HA_CAN_INDEX_BLOBS |
                            (m_pk_can_be_decoded ? HA_PRIMARY_KEY_IN_READ_INDEX : 0) |
            

            (added there by myself at some point)

            while MariaDB has

              ulonglong table_flags() const override {
            ...
                /*
            ...
                  HA_PRIMARY_KEY_IN_READ_INDEX
                    This flag is always set, even for tables that:
                    - have no PK
                    - have some (or all) of PK that can't be decoded from the secondary
                      index.
                */
                DBUG_RETURN(HA_BINLOG_ROW_CAPABLE | HA_BINLOG_STMT_CAPABLE |
                            HA_REC_NOT_IN_SEQ | HA_CAN_INDEX_BLOBS |
                            HA_PRIMARY_KEY_IN_READ_INDEX |
            ...
            

            The difference comes from this change where init_with_fields call was removed in MariaDB:
            https://github.com/MariaDB/server/commit/7fb3b348d71ef29ed10ed7f41b332074c8ac0a3d

            psergei Sergei Petrunia added a comment - Note that the upstream has: ulonglong table_flags() const override { ... DBUG_RETURN(HA_BINLOG_ROW_CAPABLE | HA_BINLOG_STMT_CAPABLE | HA_REC_NOT_IN_SEQ | HA_CAN_INDEX_BLOBS | (m_pk_can_be_decoded ? HA_PRIMARY_KEY_IN_READ_INDEX : 0) | (added there by myself at some point) while MariaDB has ulonglong table_flags() const override { ... /* ... HA_PRIMARY_KEY_IN_READ_INDEX This flag is always set, even for tables that: - have no PK - have some (or all) of PK that can't be decoded from the secondary index. */ DBUG_RETURN(HA_BINLOG_ROW_CAPABLE | HA_BINLOG_STMT_CAPABLE | HA_REC_NOT_IN_SEQ | HA_CAN_INDEX_BLOBS | HA_PRIMARY_KEY_IN_READ_INDEX | ... The difference comes from this change where init_with_fields call was removed in MariaDB: https://github.com/MariaDB/server/commit/7fb3b348d71ef29ed10ed7f41b332074c8ac0a3d

            To sum up: existing storage engines (InnoDB, Partition+InnoDB) require that the following is done:

              pk_field->part_of_key= share->keys_in_use
            

            whenever the table has HA_PRIMARY_KEY_IN_READ_INDEX flag.
            This needs to be done regardless of whether the "extended keys" feature puts all/some PK columns at the ends of secondary indexes or not.

            Enter MyRocks:

            pk_field->part_of_key should only include secondary indexes if the pk_field can be decoded from its key image.

            The only way for SQL layer to know if the PK field is decodable is to check

            h->index_flags(secondary_key, pk_key_part, 0) & HA_KEYREAD_ONLY

            This can only be done when Extended Keys has added the pk_field into the secondary key.
            When it hasn't, SQL layer assumes the column is decodable (see ), which has a potential of causing wrong query result (**).

            Can one solve (**) by having MyRocks not return HA_PRIMARY_KEY_IN_READ_INDEX when at least one of PK columns is not decodable? This is what the upstream does, and no, we can't do it because we are not allowed to have handler->init_with_fields call that the upstream does have.

            (The issue is that table_flags() is called before the storage engine has a chance to know what the table DDL is)

            psergei Sergei Petrunia added a comment - To sum up: existing storage engines (InnoDB, Partition+InnoDB) require that the following is done: pk_field->part_of_key= share->keys_in_use whenever the table has HA_PRIMARY_KEY_IN_READ_INDEX flag. This needs to be done regardless of whether the "extended keys" feature puts all/some PK columns at the ends of secondary indexes or not. Enter MyRocks: pk_field->part_of_key should only include secondary indexes if the pk_field can be decoded from its key image. The only way for SQL layer to know if the PK field is decodable is to check h->index_flags(secondary_key, pk_key_part, 0) & HA_KEYREAD_ONLY This can only be done when Extended Keys has added the pk_field into the secondary key. When it hasn't, SQL layer assumes the column is decodable (see ), which has a potential of causing wrong query result (**). Can one solve (**) by having MyRocks not return HA_PRIMARY_KEY_IN_READ_INDEX when at least one of PK columns is not decodable? This is what the upstream does, and no, we can't do it because we are not allowed to have handler->init_with_fields call that the upstream does have. (The issue is that table_flags() is called before the storage engine has a chance to know what the table DDL is)
            psergei Sergei Petrunia made changes -
            • Committed a patch for this issue, testing it in buildbot.
            • Filed MDEV-14563 for the "another testcase", made a patch for that too. Testing it also.
            psergei Sergei Petrunia added a comment - Committed a patch for this issue, testing it in buildbot. Filed MDEV-14563 for the "another testcase", made a patch for that too. Testing it also.
            psergei Sergei Petrunia made changes -
            Fix Version/s 10.2.12 [ 22810 ]
            Fix Version/s 10.2 [ 14601 ]
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 83872 ] MariaDB v4 [ 153208 ]

            People

              psergei Sergei Petrunia
              anikitin Andrii Nikitin (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.