[MDEV-18431] Select max + row_number giving incorrect result Created: 2019-01-31  Updated: 2019-03-12  Resolved: 2019-03-12

Status: Closed
Project: MariaDB Server
Component/s: Optimizer - Window functions
Affects Version/s: 10.3.11, 10.3.12, 10.2, 10.3, 10.4
Fix Version/s: 10.2.23

Type: Bug Priority: Major
Reporter: Jarno Antikainen Assignee: Varun Gupta (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

Server version: 10.3.12-MariaDB-1:10.3.12+maria~bionic mariadb.org binary distribution
Running docker image mariadb:10.3.12 in Docker version 18.09.1, build 4c52b90, on macOS 10.14.2, MacBook Pro (Retina, 15-inch, Mid 2015).



 Description   

I have a table like this:

MariaDB [test]> CREATE TABLE `event` (
  `entity_id` int(10) NOT NULL,
  `entity_version` int(10) NOT NULL,
  `something` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`entity_id`,`entity_version`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

With data:

MariaDB [test]> insert into event values (1, 1, 'foo'), (1,2,'bar'), (1,3,'bar'), (2, 1, 'baz'), (2, 2, 'grunt');

It is a simplified version of a table used for storing events of entities, identified by an entity id and an increasing entity version number, used for event sourcing.

Some background for the use case: I wanted to insert some new events for the entities using insert...select, with possibly more than one event to add for each entity. So, in the select part I need to get the next available entity version number for each new event to insert. I figured I would get max(entity_version) and row_number() and add them together to get the entity_version to insert for each event. I found an issue in the select part of the insert...select, and the SQL statements below are simplified to concentrate on issue. Now to the actual issue:

If I select the max entity_version and row_number in separate columns, the results are what I expected:

MariaDB [test]> select e.entity_id, (select max(e2.entity_version) from event e2 where e2.entity_id=e.entity_id) as maxver, row_number() over (partition by e.entity_id order by e.entity_id, e.entity_version) as rownum from event e order by e.entity_id, e.entity_version;
+-----------+--------+--------+
| entity_id | maxver | rownum |
+-----------+--------+--------+
|         1 |      3 |      1 |
|         1 |      3 |      2 |
|         1 |      3 |      3 |
|         2 |      2 |      1 |
|         2 |      2 |      2 |
+-----------+--------+--------+

But for insert...select I need to add them together, and then I get something I didn't expect:

MariaDB [test]> select e.entity_id, (select max(e2.entity_version) from event e2 where e2.entity_id=e.entity_id) + (row_number() over (partition by e.entity_id order by e.entity_id, e.entity_version)) as maxverplusrownum from event e order by e.entity_id, e.entity_version;
+-----------+------------------+
| entity_id | maxverplusrownum |
+-----------+------------------+
|         1 |                3 |
|         1 |                4 |
|         1 |                5 |
|         2 |                3 |
|         2 |                4 |
+-----------+------------------+

I tried the same with MySQL 8.0.14, and that gives the expected results:

mysql> select e.entity_id, (select max(e2.entity_version) from event e2 where e2.entity_id=e.entity_id) + (row_number() over (partition by e.entity_id order by e.entity_id, e.entity_version)) as maxverplusrownum from event e order by e.entity_id, e.entity_version;
+-----------+------------------+
| entity_id | maxverplusrownum |
+-----------+------------------+
|         1 |                4 |
|         1 |                5 |
|         1 |                6 |
|         2 |                3 |
|         2 |                4 |
+-----------+------------------+

I might miss something as this is the first time I need to use row_number, and I think I can work around this with a temporary table, but at this point I suspect this might be a bug. I have only tried the versions mentioned above.



 Comments   
Comment by Alice Sherepa [ 2019-01-31 ]

Thanks a lot for the report and the test case! Reproducible on 10.2-10.4, with MyIsam and Innodb

create table t1 (id int, v int);
insert into t1 values (1, 1), (1,2), (1,3), (2, 1), (2, 2);
 
select e.id, 
 (select max(t1.v) from t1 where t1.id=e.id) as a,
 row_number() over (partition by e.id order by e.v) as b,
 (select max(t1.v) from t1 where t1.id=e.id) + (row_number() over (partition by e.id order by e.v)) as sum_a_b
from t1 e;

MariaDB [test]> select e.id, 
    ->  (select max(t1.v) from t1 where t1.id=e.id) as a,
    ->  row_number() over (partition by e.id ORDER BY e.v) as b,
    ->  (select max(t1.v) from t1 where t1.id=e.id) + (row_number() over (partition by e.id ORDER BY e.v)) as sum_a_b
    -> FROM t1 e;
+------+------+---+---------+
| id   | a    | b | sum_a_b |
+------+------+---+---------+
|    1 |    3 | 1 |       3 |
|    1 |    3 | 2 |       4 |
|    1 |    3 | 3 |       5 |
|    2 |    2 | 1 |       3 |
|    2 |    2 | 2 |       4 |
+------+------+---+---------+
5 rows in set (0.01 sec)

expected result:

mysql> select e.id, 
    ->  (select max(t1.v) from t1 where t1.id=e.id) as a,
    ->  row_number() over (partition by e.id ORDER BY e.v) as b,
    ->  (select max(t1.v) from t1 where t1.id=e.id) + (row_number() over (partition by e.id ORDER BY e.v)) as sum_a_b
    -> FROM t1 e;
+------+------+---+---------+
| id   | a    | b | sum_a_b |
+------+------+---+---------+
|    1 |    3 | 1 |       4 |
|    1 |    3 | 2 |       5 |
|    1 |    3 | 3 |       6 |
|    2 |    2 | 1 |       3 |
|    2 |    2 | 2 |       4 |
+------+------+---+---------+
5 rows in set (0.00 sec)

Comment by Varun Gupta (Inactive) [ 2019-03-09 ]

After discussion with psergey, there are 2 approaches to solve this issue:

Approach 1:

  • compute the value of subquery before the window function computation step, then save into the temp. table

Approach 2:

  • have a field in the temp. table which saves the value of e.id
  • then, compute the subquery after the window function computation step, but make sure the subquery uses the saved value of e.id (from the temp.table)
Comment by Oleksandr Byelkin [ 2019-03-12 ]

OK, to push

Generated at Thu Feb 08 08:44:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.