[MCOL-606] select * without order by return non deterministics (random) results Created: 2017-03-04  Updated: 2017-03-05  Resolved: 2017-03-05

Status: Closed
Project: MariaDB ColumnStore
Component/s: MariaDB Server
Affects Version/s: 1.0.7
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Alexander Rubin Assignee: Unassigned
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

Ubuntu 16.04.1 LTS, single node install



 Description   

{{select * from table give random results

MariaDB [wikistat]> select * from wikistat limit 1\G
date: 2008-01-18
time: 2008-01-18 06:00:00
project: en
subproject: NULL
path: Doctor_Who:_Original_Television_Soundtrack
hits: 2
size: 2
1 row in set (1.63 sec)

MariaDB [wikistat]> select * from wikistat limit 1\G
date: 2008-01-31
time: 2008-01-31 10:00:00
project: de
subproject: NULL
path: Haramaki
hits: 1
size: 1
1 row in set (1.58 sec)

MariaDB [wikistat]> select * from wikistat limit 1\G
date: 2008-01-31
time: 2008-01-31 10:00:00
project: de
subproject: NULL
path: Haramaki
hits: 1
size: 1
1 row in set (1.70 sec)

MariaDB [wikistat]> select * from wikistat limit 1\G
date: 2008-01-18
time: 2008-01-18 22:00:00
project: en
subproject: NULL
path: Doctor_of_Professional_Studies
hits: 1
size: 1
1 row in set (1.69 sec)

MariaDB [wikistat]> select * from wikistat limit 1\G
date: 2008-01-11
time: 2008-01-11 05:00:00
project: en
subproject: NULL
path: Doesn\'t_Really_Matter
hits: 2
size: 2
1 row in set (1.71 sec)}}



 Comments   
Comment by Dipti Joshi (Inactive) [ 2017-03-04 ]

"Select * from <table-name>" results are not ordered in any SQL database. When you add LIMIT 1, the first row returned from the database is displayed as result since order of result is not guaranteed on SELECT * from <table-name>, LIMIT 1 will show different but valid row .

What you are seeing is expected behavior.

Comment by David Thompson (Inactive) [ 2017-03-05 ]

Correct, while columnstore still generally returns unordered results in order of processing, even on single server this can vary due to multi core processing. On a multi node install it will vary more. If order is important then you must have an order by clause where the final results are sorted in the mariadb server process before returning.

Comment by Alexander Rubin [ 2017-03-05 ]

What I think I forgot to mention is: the data was not changing at all, the system is completely idle, there are no other open sessions. I agree that the order is not guaranteed, I was surprised that it is changing when the data is not changed and not accessed by any other connections.
My guess will be:
If the network was involved, then the "answer" from each node can be random due to non deterministic network latency.
On 1 node it can be the same thou - the requests are sent in parallel to all the listeners and response is served as it come back. Then when we got N rows back it will just abort the connection. Does it make sense?
If it is documented - could you point out to the documentation?

Comment by David Thompson (Inactive) [ 2017-03-05 ]

On a single node deployment, it can still be variable due to the fact that the processing is distributed amongst cores on a multi core server and results are returned by each thread when ready. Many factors could influence one core's thread block being returned before anothers. If you use a vm and limit it to 1 core you'll see that the results are returned in the same order due to this.

https://mariadb.com/kb/en/mariadb/mariadb-columnstore-performance-concepts/ covers some details on query execution, it doesn't explicitly state this but in general even on innodb you should never make assumptions about an implicit order on queries without order by - i've seen this be the cause of nasty bugs in applications in my past.

Generated at Thu Feb 08 02:22:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.