Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18431

Select max + row_number giving incorrect result

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.3.11, 10.3.12, 10.2(EOL), 10.3(EOL), 10.4(EOL)
    • 10.2.23
    • None
    • Server version: 10.3.12-MariaDB-1:10.3.12+maria~bionic mariadb.org binary distribution
      Running docker image mariadb:10.3.12 in Docker version 18.09.1, build 4c52b90, on macOS 10.14.2, MacBook Pro (Retina, 15-inch, Mid 2015).

    Description

      I have a table like this:

      MariaDB [test]> CREATE TABLE `event` (
        `entity_id` int(10) NOT NULL,
        `entity_version` int(10) NOT NULL,
        `something` varchar(10) DEFAULT NULL,
        PRIMARY KEY (`entity_id`,`entity_version`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1
      

      With data:

      MariaDB [test]> insert into event values (1, 1, 'foo'), (1,2,'bar'), (1,3,'bar'), (2, 1, 'baz'), (2, 2, 'grunt');
      

      It is a simplified version of a table used for storing events of entities, identified by an entity id and an increasing entity version number, used for event sourcing.

      Some background for the use case: I wanted to insert some new events for the entities using insert...select, with possibly more than one event to add for each entity. So, in the select part I need to get the next available entity version number for each new event to insert. I figured I would get max(entity_version) and row_number() and add them together to get the entity_version to insert for each event. I found an issue in the select part of the insert...select, and the SQL statements below are simplified to concentrate on issue. Now to the actual issue:

      If I select the max entity_version and row_number in separate columns, the results are what I expected:

      MariaDB [test]> select e.entity_id, (select max(e2.entity_version) from event e2 where e2.entity_id=e.entity_id) as maxver, row_number() over (partition by e.entity_id order by e.entity_id, e.entity_version) as rownum from event e order by e.entity_id, e.entity_version;
      +-----------+--------+--------+
      | entity_id | maxver | rownum |
      +-----------+--------+--------+
      |         1 |      3 |      1 |
      |         1 |      3 |      2 |
      |         1 |      3 |      3 |
      |         2 |      2 |      1 |
      |         2 |      2 |      2 |
      +-----------+--------+--------+
      

      But for insert...select I need to add them together, and then I get something I didn't expect:

      MariaDB [test]> select e.entity_id, (select max(e2.entity_version) from event e2 where e2.entity_id=e.entity_id) + (row_number() over (partition by e.entity_id order by e.entity_id, e.entity_version)) as maxverplusrownum from event e order by e.entity_id, e.entity_version;
      +-----------+------------------+
      | entity_id | maxverplusrownum |
      +-----------+------------------+
      |         1 |                3 |
      |         1 |                4 |
      |         1 |                5 |
      |         2 |                3 |
      |         2 |                4 |
      +-----------+------------------+
      

      I tried the same with MySQL 8.0.14, and that gives the expected results:

      mysql> select e.entity_id, (select max(e2.entity_version) from event e2 where e2.entity_id=e.entity_id) + (row_number() over (partition by e.entity_id order by e.entity_id, e.entity_version)) as maxverplusrownum from event e order by e.entity_id, e.entity_version;
      +-----------+------------------+
      | entity_id | maxverplusrownum |
      +-----------+------------------+
      |         1 |                4 |
      |         1 |                5 |
      |         1 |                6 |
      |         2 |                3 |
      |         2 |                4 |
      +-----------+------------------+
      

      I might miss something as this is the first time I need to use row_number, and I think I can work around this with a temporary table, but at this point I suspect this might be a bug. I have only tried the versions mentioned above.

      Attachments

        Activity

          alice Alice Sherepa added a comment -

          Thanks a lot for the report and the test case! Reproducible on 10.2-10.4, with MyIsam and Innodb

          create table t1 (id int, v int);
          insert into t1 values (1, 1), (1,2), (1,3), (2, 1), (2, 2);
           
          select e.id, 
           (select max(t1.v) from t1 where t1.id=e.id) as a,
           row_number() over (partition by e.id order by e.v) as b,
           (select max(t1.v) from t1 where t1.id=e.id) + (row_number() over (partition by e.id order by e.v)) as sum_a_b
          from t1 e;
          

          MariaDB [test]> select e.id, 
              ->  (select max(t1.v) from t1 where t1.id=e.id) as a,
              ->  row_number() over (partition by e.id ORDER BY e.v) as b,
              ->  (select max(t1.v) from t1 where t1.id=e.id) + (row_number() over (partition by e.id ORDER BY e.v)) as sum_a_b
              -> FROM t1 e;
          +------+------+---+---------+
          | id   | a    | b | sum_a_b |
          +------+------+---+---------+
          |    1 |    3 | 1 |       3 |
          |    1 |    3 | 2 |       4 |
          |    1 |    3 | 3 |       5 |
          |    2 |    2 | 1 |       3 |
          |    2 |    2 | 2 |       4 |
          +------+------+---+---------+
          5 rows in set (0.01 sec)
          

          expected result:

          mysql> select e.id, 
              ->  (select max(t1.v) from t1 where t1.id=e.id) as a,
              ->  row_number() over (partition by e.id ORDER BY e.v) as b,
              ->  (select max(t1.v) from t1 where t1.id=e.id) + (row_number() over (partition by e.id ORDER BY e.v)) as sum_a_b
              -> FROM t1 e;
          +------+------+---+---------+
          | id   | a    | b | sum_a_b |
          +------+------+---+---------+
          |    1 |    3 | 1 |       4 |
          |    1 |    3 | 2 |       5 |
          |    1 |    3 | 3 |       6 |
          |    2 |    2 | 1 |       3 |
          |    2 |    2 | 2 |       4 |
          +------+------+---+---------+
          5 rows in set (0.00 sec)
          

          alice Alice Sherepa added a comment - Thanks a lot for the report and the test case! Reproducible on 10.2-10.4, with MyIsam and Innodb create table t1 (id int , v int ); insert into t1 values (1, 1), (1,2), (1,3), (2, 1), (2, 2);   select e.id, ( select max (t1.v) from t1 where t1.id=e.id) as a, row_number() over (partition by e.id order by e.v) as b, ( select max (t1.v) from t1 where t1.id=e.id) + (row_number() over (partition by e.id order by e.v)) as sum_a_b from t1 e; MariaDB [test]> select e.id, -> (select max(t1.v) from t1 where t1.id=e.id) as a, -> row_number() over (partition by e.id ORDER BY e.v) as b, -> (select max(t1.v) from t1 where t1.id=e.id) + (row_number() over (partition by e.id ORDER BY e.v)) as sum_a_b -> FROM t1 e; +------+------+---+---------+ | id | a | b | sum_a_b | +------+------+---+---------+ | 1 | 3 | 1 | 3 | | 1 | 3 | 2 | 4 | | 1 | 3 | 3 | 5 | | 2 | 2 | 1 | 3 | | 2 | 2 | 2 | 4 | +------+------+---+---------+ 5 rows in set (0.01 sec) expected result: mysql> select e.id, -> (select max(t1.v) from t1 where t1.id=e.id) as a, -> row_number() over (partition by e.id ORDER BY e.v) as b, -> (select max(t1.v) from t1 where t1.id=e.id) + (row_number() over (partition by e.id ORDER BY e.v)) as sum_a_b -> FROM t1 e; +------+------+---+---------+ | id | a | b | sum_a_b | +------+------+---+---------+ | 1 | 3 | 1 | 4 | | 1 | 3 | 2 | 5 | | 1 | 3 | 3 | 6 | | 2 | 2 | 1 | 3 | | 2 | 2 | 2 | 4 | +------+------+---+---------+ 5 rows in set (0.00 sec)

          After discussion with psergey, there are 2 approaches to solve this issue:

          Approach 1:

          • compute the value of subquery before the window function computation step, then save into the temp. table

          Approach 2:

          • have a field in the temp. table which saves the value of e.id
          • then, compute the subquery after the window function computation step, but make sure the subquery uses the saved value of e.id (from the temp.table)
          varun Varun Gupta (Inactive) added a comment - After discussion with psergey , there are 2 approaches to solve this issue: Approach 1: compute the value of subquery before the window function computation step, then save into the temp. table Approach 2: have a field in the temp. table which saves the value of e.id then, compute the subquery after the window function computation step, but make sure the subquery uses the saved value of e.id (from the temp.table)

          OK, to push

          sanja Oleksandr Byelkin added a comment - OK, to push

          People

            varun Varun Gupta (Inactive)
            Antikainen Jarno Antikainen
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.