[MDEV-24961] Multi-table UPDATE does not consider ORDER BY clause Created: 2021-02-24  Updated: 2023-11-28

Status: Stalled
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.3.27, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1, 11.2
Fix Version/s: 10.4, 10.5, 10.6, 10.11

Type: Bug Priority: Major
Reporter: Oskar Schneider Assignee: Rex Johnston
Resolution: Unresolved Votes: 2
Labels: None
Environment:

Debian


Issue Links:
Relates
relates to MDEV-13911 Support ORDER BY and LIMIT in multi-t... Closed
relates to MDEV-31828 Multi-table update with `order by` gi... Open

 Description   

In this script, the ORDER BY AP.prio is not considered:

CREATE TABLE AP (
 prid int not null,
 prio int not null
);
 
CREATE TABLE APD (
 prid int not null,
 info varchar(64)
);
 
INSERT INTO AP VALUES (1,3), (2,3), (3,2), (4,2), (5,1), (6,1);
INSERT INTO APD (prid) VALUES (1), (2), (3), (4), (5), (6);
 
SELECT * FROM AP;
SELECT * FROM APD;
 
# The SELECT returns the correct values for prid (prid 3,5,6) 
SELECT APD.prid, AP.prio FROM AP INNER JOIN APD ON AP.prid = APD.prid
 ORDER BY AP.prio, AP.prid
 LIMIT 3;
 
# The UPDATE with identical ORDER updates the columns with prid 1,2,3
UPDATE AP INNER JOIN APD ON AP.prid = APD.prid
 SET info = 'UPDATED'
 ORDER BY AP.prio, AP.prid
 LIMIT 3;
 
SELECT * FROM APD;

The script returns this output:

+------+------+
| prid | prio |
+------+------+
|    1 |    3 |
|    2 |    3 |
|    3 |    2 |
|    4 |    2 |
|    5 |    1 |
|    6 |    1 |
+------+------+
6 rows in set (0.001 sec)
 
+------+------+
| prid | info |
+------+------+
|    1 | NULL |
|    2 | NULL |
|    3 | NULL |
|    4 | NULL |
|    5 | NULL |
|    6 | NULL |
+------+------+
6 rows in set (0.000 sec)
 
+------+------+
| prid | prio |
+------+------+
|    5 |    1 |
|    6 |    1 |
|    3 |    2 |
+------+------+
3 rows in set (0.000 sec)
 
Query OK, 3 rows affected (0.001 sec)
Rows matched: 3  Changed: 3  Warnings: 0
 
+------+---------+
| prid | info    |
+------+---------+
|    1 | UPDATED |
|    2 | UPDATED |
|    3 | UPDATED |
|    4 | NULL    |
|    5 | NULL    |
|    6 | NULL    |
+------+---------+



 Comments   
Comment by Alice Sherepa [ 2021-02-24 ]

Thanks for the report!
as a temporary workaround:

MariaDB [test]> UPDATE AP INNER JOIN APD ON AP.prid = APD.prid
    ->  SET info = 'UPDATED', AP.prio=AP.prio
    ->  ORDER BY AP.prio, AP.prid
    ->  LIMIT 3;
Query OK, 3 rows affected (0.003 sec)
Rows matched: 6  Changed: 3  Warnings: 0
 
MariaDB [test]> SELECT * FROM APD;
+------+---------+
| prid | info    |
+------+---------+
|    1 | NULL    |
|    2 | NULL    |
|    3 | UPDATED |
|    4 | NULL    |
|    5 | UPDATED |
|    6 | UPDATED |
+------+---------+
6 rows in set (0.001 sec)

Comment by Oskar Schneider [ 2021-03-16 ]

The workaround unfortunately does not work reliably. Sometimes it updates the data as expected, sometimes in the wrong order.
When the UPDATE didn't work it seems that the workaround changed the value of AP.prio to a different value although the assignment is AP.prio = AP.prio!

Comment by David Townes [ 2023-03-19 ]

Encountered this bug today in MariaDB 10.6.12, and it appears it might be related to internal use of temporary tables.

In all the instances I have tested where "Using temporary" shows up in the query explanation and ordering is not only requested by also required on any table other than the first in a join sequence (the above example outputs correctly if the values in AP.prio are instead ascending and sequential), the ORDER BY clause seems to result in updates being applied to randomly sorted rows on subsequent tables.

See MDEV-20081, where it looks like this issue was observed in Daniel Black's comments but dismissed as irrelevant to that particular case. It also looks like related problems were addressed in MDEV-14551 and MDEV-20515, with commits related to multi_update::prepare2, but this issue somehow remains, and my C++ debugging skills are not sophisticated enough to localize to problem in sql_update.cc (I tried).

The workaround above likely works sometimes but not others because internal temporary table use depends on a wide variety of factors, and convincing the server to avoid use of internal temporary tables is highly context dependent. In my testing, moving the table referenced by columns in the ORDER BY clause to first position and then swapping STRAIGHT_JOIN for INNER JOIN stopped the server from using internal temporary tables for larger queries and cause the update to perform correctly, but the same tactic does not appear to help with the simple text example above, and while the workaround above does work with the simple text example it does not with larger queries.

For the queries in the original report, compare:

EXPLAIN UPDATE AP INNER JOIN APD ON AP.prid = APD.prid SET info = 'UPDATED' ORDER BY AP.prio, AP.prid LIMIT 3;
 
+------+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+------+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
|    1 | SIMPLE      | AP    | ALL  | NULL          | NULL | NULL    | NULL | 6    | Using temporary; Using filesort |
|    1 | SIMPLE      | APD   | ALL  | NULL          | NULL | NULL    | NULL | 6    | Using where                     |
+------+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
 
EXPLAIN UPDATE AP INNER JOIN APD ON AP.prid = APD.prid SET info = 'UPDATED', AP.prio=AP.prio ORDER BY AP.prio, AP.prid LIMIT 3;
 
+------+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+------+-------------+-------+------+---------------+------+---------+------+------+----------------+
|    1 | SIMPLE      | AP    | ALL  | NULL          | NULL | NULL    | NULL | 6    | Using filesort |
|    1 | SIMPLE      | APD   | ALL  | NULL          | NULL | NULL    | NULL | 6    | Using where    |
+------+-------------+-------+------+---------------+------+---------+------+------+----------------+

Comment by Rex Johnston [ 2023-08-01 ]

A slightly clearer test.

CREATE TABLE t (
 incr int,
 decr int
);
 
CREATE TABLE target (
 incr int,
 label varchar(10)
);
 
INSERT INTO t VALUES (1,6), (2,5), (3,4), (4,3), (5,2), (6,1);
INSERT INTO target (incr, label) VALUES (1, 'one'), (2, 'two'), (3, 'three'), (4, 'four'), (5, 'five'), (6, 'six');
 
SELECT * FROM t;
SELECT * FROM target;
 
# The SELECT returns the correct values for incr {4,5,6} 
SELECT target.incr, t.decr, target.label FROM t INNER JOIN target ON t.incr = target.incr
 ORDER BY t.decr
 LIMIT 3;
 
# The UPDATE with identical ORDER we expect the labels shown above {4,5,6} to be replaced with "SELECTED", {1,2,3} left unchanged
UPDATE t INNER JOIN target ON t.incr = target.incr
 SET label = 'SELECTED'
 ORDER BY t.decr
 LIMIT 3;
 
SELECT * FROM target;
 
DROP table t, target;

Just tested on 11.2, which fails in a slightly different way to 10.6.

In testing, the temporary table created for iterating over (representing the equivalent select), has "rowid" pushed onto the front of the order by list, but the fix requires more than just correcting this.

Generated at Thu Feb 08 09:34:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.