[MCOL-5492] RAND() uses static seed of 0 Created: 2023-05-15  Updated: 2023-07-02

Status: Open
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 6.2.3
Fix Version/s: 23.10

Type: Bug Priority: Major
Reporter: Claudio Nanni Assignee: Roman
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Unless there's a specific reason for it, maybe RAND() in Columnstore should behave like the Server's implementation by using a random seed as default parameter.

Columnstore Test Case

CREATE TABLE `test_rand` (
  `c1` int(11) DEFAULT NULL
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
 
insert into test_rand values (1),(2),(3),(4),(5),(6);
 
select * from (select c1,count(*) cnt from test_rand group by c1 order by cnt) x order by rand() limit 3;
select * from (select c1,count(*) cnt from test_rand group by c1 order by cnt) x order by rand() limit 3;

MariaDB Server

CREATE TABLE `test_rand2` (
  `c1` int(11) DEFAULT NULL
) ENGINE=Innodb DEFAULT CHARSET=utf8;
 
insert into test_rand2 values (1),(2),(3),(4),(5),(6);
 
select * from (select c1,count(*) cnt from test_rand2 group by c1 order by cnt) x order by rand() limit 3;
select * from (select c1,count(*) cnt from test_rand2 group by c1 order by cnt) x order by rand() limit 3;



 Comments   
Comment by Claudio Nanni [ 2023-05-15 ]

Sorry forgot to add a test case.
Columnstore engine must be summoned in some way, I guess the function has to be overridden by Columnstore.

CREATE TABLE `test_rand` (
  `c1` int(11) DEFAULT NULL
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
 
insert into test_rand values (1),(2),(3),(4),(5),(6);
 
select * from (select c1,count(*) cnt from test_rand group by c1 order by cnt) x order by rand() limit 3;
 
 
MariaDB [db01]> select * from (select c1,count(*) cnt from test_rand group by c1 order by cnt) x order by rand() limit 3;
+------+-----+
| c1   | cnt |
+------+-----+
|    4 |   1 |
|    2 |   1 |
|    3 |   1 |
+------+-----+
3 rows in set (0,122 sec)
 
MariaDB [db01]> select * from (select c1,count(*) cnt from test_rand group by c1 order by cnt) x order by rand() limit 3;
+------+-----+
| c1   | cnt |
+------+-----+
|    4 |   1 |
|    2 |   1 |
|    3 |   1 |
+------+-----+
3 rows in set (0,119 sec)
 
 

And:

MariaDB [db01]> select rand() from test_rand;
+----------------------------+
| rand()                     |
+----------------------------+
| 0.000000021420419236105326 |
|  0.00000011641532193535503 |
|   0.0000005485489969593929 |
|    0.000002424232663981833 |
|    0.000010506249974021921 |
|     0.00004528928552315504 |
+----------------------------+
6 rows in set (0,117 sec)
 
MariaDB [db01]> select rand() from test_rand;
+----------------------------+
| rand()                     |
+----------------------------+
| 0.000000021420419236105326 |
|  0.00000011641532193535503 |
|   0.0000005485489969593929 |
|    0.000002424232663981833 |
|    0.000010506249974021921 |
|     0.00004528928552315504 |
+----------------------------+
6 rows in set (0,117 sec)

Comment by Todd Stoffel (Inactive) [ 2023-05-16 ]

drrtuy Can you provide an example of the work around that you suggested in Slack?

I assume something like this:

MariaDB [test]> SET @random = (SELECT CEILING(RAND()*100000)); SELECT RAND(@random) FROM test_rand;
Query OK, 0 rows affected (0.001 sec)
 
+--------------------+
| RAND(@random)      |
+--------------------+
| 0.5546934870581082 |
| 0.7188192454341978 |
| 0.9300157967303095 |
| 0.4936212780825992 |
| 0.6780590309557123 |
| 0.9094313512644091 |
+--------------------+
6 rows in set (0.026 sec)
 
MariaDB [test]> SET @random = (SELECT CEILING(RAND()*100000)); SELECT RAND(@random) FROM test_rand;
Query OK, 0 rows affected (0.001 sec)
 
+---------------------+
| RAND(@random)       |
+---------------------+
|  0.5861628377681252 |
| 0.09470570934443336 |
|  0.7150400641514362 |
|  0.2910832234575257 |
|   0.310295955566965 |
|   0.678230138195893 |
+---------------------+
6 rows in set (0.021 sec)
 
MariaDB [test]> SET @random = (SELECT CEILING(RAND()*100000)); SELECT RAND(@random) FROM test_rand;
Query OK, 0 rows affected (0.001 sec)
 
+---------------------+
| RAND(@random)       |
+---------------------+
|   0.550410765735815 |
| 0.20168579481699112 |
|  0.3571967076111554 |
| 0.18092618433844873 |
|  0.8330408295924224 |
|  0.6224256256804109 |
+---------------------+
6 rows in set (0.021 sec)

Comment by Todd Stoffel (Inactive) [ 2023-05-16 ]

claudio.nanni What is the use case for this?

Comment by Claudio Nanni [ 2023-05-16 ]

toddstoffel
Apart the example it's reported with, the use case should be whatever is the use case for MariaDB server RAND() function and I guess the general expectations for random function in most languages, they tend to use a random seed without any parameter.
IMHO it's more a consistency question than a real problem, you can easily provide a random seed to it if needed, e.g:

SELECT *
FROM   (SELECT c1,
               Count(*) cnt
        FROM   test_rand
        GROUP  BY c1
        ORDER  BY cnt) x
ORDER  BY Rand(Microsecond(CURRENT_TIMESTAMP(6)))
LIMIT  3; 

Generated at Thu Feb 08 02:58:17 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.