[MDEV-13151] Indeterministic results of multi-table update on RocksDB (or other) tables Created: 2017-06-22  Updated: 2017-07-31

Status: Open
Project: MariaDB Server
Component/s: Data Manipulation - Update
Affects Version/s: 10.2
Fix Version/s: 10.2

Type: Bug Priority: Minor
Reporter: Elena Stepanova Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: None
Environment:

10.2 0992be927e1c686c39c39fe53fc2a7869d55143d



 Description   

DROP TABLE IF EXISTS t1,t2;
CREATE TABLE t1 (b CHAR(8)) ENGINE=RocksDB;
INSERT INTO t1 VALUES ('u'),('c'),('d'),('e');
CREATE TABLE t2 (c CHAR(8)) ENGINE=RocksDB;
INSERT INTO t2 VALUES ('u'),('c'),('d'),('e');
UPDATE t1, t2 SET b = 'upd2a', c = 'upd2b' WHERE c < b;
SELECT * FROM t1;
SELECT * FROM t2;
DROP TABLE t1, t2;

By simply repeating this set of queries over and over I am getting various results, typically in table t1.

This is more common (it is also a correct result):

MariaDB [test]> SELECT * FROM t1;
+-------+
| b     |
+-------+
| upd2a |
| c     |
| upd2a |
| upd2a |
+-------+
4 rows in set (0.00 sec)

This is less common (happens for me every 5-6 tries):

MariaDB [test]> SELECT * FROM t1;
+-------+
| b     |
+-------+
| upd2a |
| c     |
| d     |
| e     |
+-------+
4 rows in set (0.00 sec)



 Comments   
Comment by Sergei Petrunia [ 2017-07-28 ]

Looking at the explain in the slow query log, one can see that

  • join order alternates between "t1,t2" and "t2,t1" (normal as the tables are identical and both orders have the same cost)
  • join order of t1,t2 produces expected result
  • join order of t2,t1 produces the wrong query result (see Rows_affected):

# Query_time: 0.073521  Lock_time: 0.000311  Rows_sent: 0  Rows_examined: 20
# Rows_affected: 6
# Full_scan: Yes  Full_join: Yes  Tmp_table: Yes  Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: No
#
# explain: id   select_type     table   type    possible_keys   key     key_len ref     rows    r_rows  filtered        r_filteredExtra
# explain: 1    SIMPLE  t1      ALL     NULL    NULL    NULL    NULL    4       4.00    100.00  100.00
# explain: 1    SIMPLE  t2      ALL     NULL    NULL    NULL    NULL    4       4.00    100.00  56.25   Using where
#
SET timestamp=1501257310;
UPDATE t1, t2 SET b = 'upd2a', c = 'upd2b' WHERE c < b;

# Query_time: 0.064360  Lock_time: 0.000392  Rows_sent: 0  Rows_examined: 20
# Rows_affected: 4
# Full_scan: Yes  Full_join: Yes  Tmp_table: Yes  Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: No
#
# explain: id   select_type     table   type    possible_keys   key     key_len ref     rows    r_rows  filtered        r_filteredExtra
# explain: 1    SIMPLE  t2      ALL     NULL    NULL    NULL    NULL    5       4.00    100.00  100.00
# explain: 1    SIMPLE  t1      ALL     NULL    NULL    NULL    NULL    7       4.00    100.00  18.75   Using where
#
SET timestamp=1501257312;
UPDATE t1, t2 SET b = 'upd2a', c = 'upd2b' WHERE c < b;

Comment by Sergei Petrunia [ 2017-07-28 ]

When one uses InnoDB, the join order is t1,t2:

# Thread_id: 9  Schema: test  QC_hit: No
# Query_time: 0.203752  Lock_time: 0.000334  Rows_sent: 0  Rows_examined: 20
# Rows_affected: 6
# Full_scan: Yes  Full_join: Yes  Tmp_table: Yes  Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: No
#
# explain: id   select_type     table   type    possible_keys   key     key_len ref     rows    r_rows  filtered        r_filteredExtra
# explain: 1    SIMPLE  t1      ALL     NULL    NULL    NULL    NULL    4       4.00    100.00  100.00
# explain: 1    SIMPLE  t2      ALL     NULL    NULL    NULL    NULL    4       4.00    100.00  56.25   Using where
SET timestamp=1501261030;
UPDATE t1, t2 SET b = 'upd2a', c = 'upd2b' WHERE c < b;

What if I use gdb to make t1,t2 query plan to be more expensive? t2, t1 is picked instead:

# Query_time: 74.967641  Lock_time: 0.000178  Rows_sent: 0  Rows_examined: 20
# Rows_affected: 1
# Full_scan: Yes  Full_join: Yes  Tmp_table: Yes  Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: No
#
# explain: id   select_type     table   type    possible_keys   key     key_len ref     rows    r_rows  filtered        r_filteredExtra
# explain: 1    SIMPLE  t2      ALL     NULL    NULL    NULL    NULL    4       4.00    100.00  100.00
# explain: 1    SIMPLE  t1      ALL     NULL    NULL    NULL    NULL    4       4.00    100.00  6.25    Using where
#
SET timestamp=1501261224;
UPDATE t1, t2 SET b = 'upd2a', c = 'upd2b' WHERE c < b;

The results are what I would assume "incorrect" (putting that into quotes because I'm not certain if we promised to handle such cases) :

MariaDB [test]> select * from t1;
+-------+
| b     |
+-------+
| upd2a |
| c     |
| upd2a |
| upd2a |
+-------+
4 rows in set (3.51 sec)
 
MariaDB [test]> select * from t2;
+-------+
| c     |
+-------+
| upd2b |
| upd2b |
| upd2b |
| upd2b |
+-------+
4 rows in set (0.01 sec)

Comment by Sergei Petrunia [ 2017-07-28 ]

Removing RocksDB label as the issue is not specific to RocksDB, I also do not think it is practically significant.

Generated at Thu Feb 08 08:03:18 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.