Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23066

RAND() evaluated and filtered twice with subquery

Details

    • Bug
    • Status: Confirmed (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.4.8, 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
    • 10.5
    • None
    • None
    • Windows 10

    Description

      When calling RAND() in a subquery and filtering on it in the outer query, it appears that a random number is being generated twice, with both values checked against the WHERE clause. This issue sounds similar to MDEV-5414, but that issue is marked as fixed in 10.0.8. MySQL (or at least recent versions of it) does not appear to exhibit this problem.

      Example:

      WITH RECURSIVE one_hundred (N) AS (
        SELECT 1
        UNION ALL
        SELECT N+1 FROM one_hundred WHERE N < 100
      ),
      randomized (RANDOM_VALUE) AS (
        SELECT RAND()
        FROM one_hundred
      )
      SELECT COUNT(*)
      FROM randomized
      WHERE RANDOM_VALUE <= 0.1;
      

      This query should return 10 on average, but instead returns 1 on average.

      One workaround is to add a LIMIT clause to the inner query. The following code returns an average of 10 as expected:

      WITH RECURSIVE one_hundred (N) AS (
        SELECT 1
        UNION ALL
        SELECT N+1 FROM one_hundred WHERE N < 100
      ),
      randomized (RANDOM_VALUE) AS (
        SELECT RAND()
        FROM one_hundred
        LIMIT 999999999999
      )
      SELECT COUNT(*)
      FROM randomized
      WHERE RANDOM_VALUE <= 0.1;
      

      Attachments

        Issue Links

          Activity

            alice Alice Sherepa added a comment -

            It is repeatable on 10.2-10.5 - when condition_pushdown_for_derived=on.

            MariaDB [test]> SELECT count(a)
                -> FROM (SELECT RAND()*100 a FROM seq_1_to_1000) tbl
                -> WHERE a <= 10;
            +----------+
            | count(a) |
            +----------+
            |        9 |
            +----------+
            1 row in set (0.006 sec)
             
            MariaDB [test]> explain extended
                -> SELECT count(a)
                -> FROM (SELECT RAND()*100 a FROM seq_1_to_1000) tbl
                -> WHERE a <= 10;
             
            +------+-------------+---------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
            | id   | select_type | table         | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
            +------+-------------+---------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
            |    1 | PRIMARY     | <derived2>    | ALL   | NULL          | NULL    | NULL    | NULL | 1000 |   100.00 | Using where              |
            |    2 | DERIVED     | seq_1_to_1000 | index | NULL          | PRIMARY | 8       | NULL | 1000 |   100.00 | Using where; Using index |
            +------+-------------+---------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
            2 rows in set, 1 warning (0.004 sec)
             
            Note (Code 1003): /* select#1 */ select count(`tbl`.`a`) AS `count(a)` from (/* select#2 */ select rand() * 100 AS `a` from `test`.`seq_1_to_1000` where rand() * 100 <= 10) `tbl` where `tbl`.`a` <= 10
             
            MariaDB [test]> set optimizer_switch='condition_pushdown_for_derived=off'; 
            Query OK, 0 rows affected (0.001 sec)
             
            MariaDB [test]> SELECT count(a) FROM (SELECT RAND()*100 a FROM seq_1_to_1000) tbl WHERE a <= 10;
            +----------+
            | count(a) |
            +----------+
            |      103 |
            +----------+
            1 row in set (0.012 sec)
            
            

            alice Alice Sherepa added a comment - It is repeatable on 10.2-10.5 - when condition_pushdown_for_derived=on. MariaDB [test]> SELECT count(a) -> FROM (SELECT RAND()*100 a FROM seq_1_to_1000) tbl -> WHERE a <= 10; +----------+ | count(a) | +----------+ | 9 | +----------+ 1 row in set (0.006 sec)   MariaDB [test]> explain extended -> SELECT count(a) -> FROM (SELECT RAND()*100 a FROM seq_1_to_1000) tbl -> WHERE a <= 10;   +------+-------------+---------------+-------+---------------+---------+---------+------+------+----------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+---------------+-------+---------------+---------+---------+------+------+----------+--------------------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using where | | 2 | DERIVED | seq_1_to_1000 | index | NULL | PRIMARY | 8 | NULL | 1000 | 100.00 | Using where; Using index | +------+-------------+---------------+-------+---------------+---------+---------+------+------+----------+--------------------------+ 2 rows in set, 1 warning (0.004 sec)   Note (Code 1003): /* select#1 */ select count(`tbl`.`a`) AS `count(a)` from (/* select#2 */ select rand() * 100 AS `a` from `test`.`seq_1_to_1000` where rand() * 100 <= 10) `tbl` where `tbl`.`a` <= 10   MariaDB [test]> set optimizer_switch='condition_pushdown_for_derived=off'; Query OK, 0 rows affected (0.001 sec)   MariaDB [test]> SELECT count(a) FROM (SELECT RAND()*100 a FROM seq_1_to_1000) tbl WHERE a <= 10; +----------+ | count(a) | +----------+ | 103 | +----------+ 1 row in set (0.012 sec)

            People

              Unassigned Unassigned
              Christopher Granahan Christopher Granahan
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.