[MDEV-30244] ROW_NUMBER () OVER () with empty OVER () clause seems to perform a sort Created: 2022-12-16  Updated: 2022-12-16

Status: Open
Project: MariaDB Server
Component/s: Optimizer - Window functions
Affects Version/s: 10.10.2
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Lukas Eder Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None


 Description   

Try this query:

SELECT a, ROW_NUMBER() OVER () 
FROM (
  SELECT 'a' a UNION ALL 
  SELECT 'c' a UNION ALL 
  SELECT 'b' a UNION ALL 
  SELECT 'b' a UNION ALL 
  SELECT 'c' a UNION ALL 
  SELECT 'b' a UNION ALL 
  SELECT 'b' a
) t;

It seems to perform a sort, resulting in:

a ROW_NUMBER() OVER ()
a 1
b 2
b 3
b 4
b 5
c 6
c 7

On the other hand, when adding a dummy ORDER BY clause, then the lexical order is preserved:

SELECT a, ROW_NUMBER() OVER (ORDER BY NULL) 
FROM (
  SELECT 'a' a UNION ALL 
  SELECT 'c' a UNION ALL 
  SELECT 'b' a UNION ALL 
  SELECT 'b' a UNION ALL 
  SELECT 'c' a UNION ALL 
  SELECT 'b' a UNION ALL 
  SELECT 'b' a
) t;

Resulting in:

a ROW_NUMBER() OVER (ORDER BY NULL)
a 1
c 2
b 3
b 4
c 5
b 6
b 7

I would expect both executions to not touch the ordering of the derived table contents, from a performance perspective. I'm aware that both results are technically correct. I just wanted to make sure there's no performance penalty.



 Comments   
Comment by Lukas Eder [ 2022-12-16 ]

I've made a benchmark here: https://github.com/jOOQ/jOOQ/issues/14413#issuecomment-1354623806
It seems that the version without ORDER BY is still slightly faster for this small data set, which I find surprising. They should both be equally fast and produce the same result, in my opinion.

Generated at Thu Feb 08 10:14:46 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.