Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-12556

Possible performance problem with recursive CTE and UNION involving TEXT/BLOB columns

Details

    Description

      The CTE below is intentionally endless. It's controlled by max_recursive_iterations, which works just fine. But the execution time is radically different, depending on whether we use a VARCHAR column, or a TEXT column of the same length.

      DROP TABLE IF EXISTS t;
      CREATE TABLE t (c1 VARCHAR(255), c2 TINYTEXT);
      INSERT INTO t VALUES ('a','a'),('b','b'),('c','c'),('d','d');
       
      SET max_recursive_iterations = 8;
       
      WITH RECURSIVE cte(f) AS (
        SELECT c2 FROM t
        UNION
        SELECT c2 FROM t, cte
      ) SELECT COUNT(*) FROM cte;
       
      WITH RECURSIVE cte(f) AS (
        SELECT c1 FROM t
        UNION
        SELECT c1 FROM t, cte
      ) SELECT COUNT(*) FROM cte;
      

      Result with TEXT column (15.23 sec)

      MariaDB [test]> WITH RECURSIVE cte(f) AS (
          ->   SELECT c2 FROM t
          ->   UNION
          ->   SELECT c2 FROM t, cte
          -> ) SELECT COUNT(*) FROM cte;
      +----------+
      | COUNT(*) |
      +----------+
      |        4 |
      +----------+
      1 row in set (15.23 sec)
      

      Result with VARCHAR column (0.01 sec)

      MariaDB [test]> WITH RECURSIVE cte(f) AS (
          ->   SELECT c1 FROM t
          ->   UNION
          ->   SELECT c1 FROM t, cte
          -> ) SELECT COUNT(*) FROM cte;
      +----------+
      | COUNT(*) |
      +----------+
      |        4 |
      +----------+
      1 row in set (0.01 sec)
      

      If I understand the idea, with UNION the temporary table should never grow above 4 rows, so there is no reason for such long execution time.

      If I replace UNION with UNION ALL, execution time naturally grows, and it becomes approximately the same for VARCHAR and TEXT column. The result (final count) is the same, which shows that the number of iterations is the same too.

      MariaDB [test]> WITH RECURSIVE cte(f) AS (
          ->   SELECT c2 FROM t
          ->   UNION ALL
          ->   SELECT c2 FROM t, cte
          -> ) SELECT COUNT(*) FROM cte;
      +----------+
      | COUNT(*) |
      +----------+
      |   349524 |
      +----------+
      1 row in set (21.04 sec)
      

      MariaDB [test]> WITH RECURSIVE cte(f) AS (
          ->   SELECT c1 FROM t
          ->   UNION ALL
          ->   SELECT c1 FROM t, cte
          -> ) SELECT COUNT(*) FROM cte;
      +----------+
      | COUNT(*) |
      +----------+
      |   349524 |
      +----------+
      1 row in set (19.43 sec)
      

      Attachments

        Issue Links

          Activity

            ANALYZE run for the above two queries shows why their performance is so different:

             MariaDB [test]> ANALYZE
                -> WITH RECURSIVE cte(f) AS (
                ->   SELECT c1 FROM t
                ->   UNION
                ->   SELECT c1 FROM t, cte
                -> ) SELECT COUNT(*) FROM cte;
            +------+-----------------+------------+------+---------------+------+---------+------+------+--------+----------+------------+------------------------------------+
            | id   | select_type     | table      | type | possible_keys | key  | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra                              |
            +------+-----------------+------------+------+---------------+------+---------+------+------+--------+----------+------------+------------------------------------+
            |    1 | PRIMARY         | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |    4 |   4.00 |   100.00 |     100.00 |                                    |
            |    2 | SUBQUERY        | t          | ALL  | NULL          | NULL | NULL    | NULL |    4 |   4.00 |   100.00 |     100.00 |                                    |
            |    3 | RECURSIVE UNION | t          | ALL  | NULL          | NULL | NULL    | NULL |    4 |   4.00 |   100.00 |     100.00 |                                    |
            |    3 | RECURSIVE UNION | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |    4 |   4.00 |   100.00 |     100.00 | Using join buffer (flat, BNL join) |
            | NULL | UNION RESULT    | <union2,3> | ALL  | NULL          | NULL | NULL    | NULL | NULL |   0.00 |     NULL |       NULL |                                    |
            +------+-----------------+------------+------+---------------+------+---------+------+------+--------+----------+------------+------------------------------------+
            MariaDB [test]> ANALYZE
                -> WITH RECURSIVE cte(f) AS (
                ->   SELECT c2 FROM t
                ->   UNION
                ->   SELECT c2 FROM t, cte
                -> ) SELECT COUNT(*) FROM cte;
            +------+-----------------+------------+------+---------------+------+---------+------+------+----------+----------+------------+------------------------------------+
            | id   | select_type     | table      | type | possible_keys | key  | key_len | ref  | rows | r_rows   | filtered | r_filtered | Extra                              |
            +------+-----------------+------------+------+---------------+------+---------+------+------+----------+----------+------------+------------------------------------+
            |    1 | PRIMARY         | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |    4 |     4.00 |   100.00 |     100.00 |                                    |
            |    2 | SUBQUERY        | t          | ALL  | NULL          | NULL | NULL    | NULL |    4 |     4.00 |   100.00 |     100.00 |                                    |
            |    3 | RECURSIVE UNION | t          | ALL  | NULL          | NULL | NULL    | NULL |    4 |     4.00 |   100.00 |     100.00 |                                    |
            |    3 | RECURSIVE UNION | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |    4 | 10922.50 |   100.00 |     100.00 | Using join buffer (flat, BNL join) |
            | NULL | UNION RESULT    | <union2,3> | ALL  | NULL          | NULL | NULL    | NULL | NULL |     0.00 |     NULL |       NULL |                                    |
            +------+-----------------+------------+------+---------------+------+---------+------+------+----------+----------+------------+------------------------------------+
            

            We see that when the second query is executed the temporary table <derived2> used to store the new rows for the next iteration contains far more than 4 rows that it should contain.

            igor Igor Babaev (Inactive) added a comment - ANALYZE run for the above two queries shows why their performance is so different: MariaDB [test]> ANALYZE -> WITH RECURSIVE cte(f) AS ( -> SELECT c1 FROM t -> UNION -> SELECT c1 FROM t, cte -> ) SELECT COUNT(*) FROM cte; +------+-----------------+------------+------+---------------+------+---------+------+------+--------+----------+------------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+-----------------+------------+------+---------------+------+---------+------+------+--------+----------+------------+------------------------------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | 4.00 | 100.00 | 100.00 | | | 2 | SUBQUERY | t | ALL | NULL | NULL | NULL | NULL | 4 | 4.00 | 100.00 | 100.00 | | | 3 | RECURSIVE UNION | t | ALL | NULL | NULL | NULL | NULL | 4 | 4.00 | 100.00 | 100.00 | | | 3 | RECURSIVE UNION | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | 4.00 | 100.00 | 100.00 | Using join buffer (flat, BNL join) | | NULL | UNION RESULT | <union2,3> | ALL | NULL | NULL | NULL | NULL | NULL | 0.00 | NULL | NULL | | +------+-----------------+------------+------+---------------+------+---------+------+------+--------+----------+------------+------------------------------------+ MariaDB [test]> ANALYZE -> WITH RECURSIVE cte(f) AS ( -> SELECT c2 FROM t -> UNION -> SELECT c2 FROM t, cte -> ) SELECT COUNT(*) FROM cte; +------+-----------------+------------+------+---------------+------+---------+------+------+----------+----------+------------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+-----------------+------------+------+---------------+------+---------+------+------+----------+----------+------------+------------------------------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | 4.00 | 100.00 | 100.00 | | | 2 | SUBQUERY | t | ALL | NULL | NULL | NULL | NULL | 4 | 4.00 | 100.00 | 100.00 | | | 3 | RECURSIVE UNION | t | ALL | NULL | NULL | NULL | NULL | 4 | 4.00 | 100.00 | 100.00 | | | 3 | RECURSIVE UNION | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | 10922.50 | 100.00 | 100.00 | Using join buffer (flat, BNL join) | | NULL | UNION RESULT | <union2,3> | ALL | NULL | NULL | NULL | NULL | NULL | 0.00 | NULL | NULL | | +------+-----------------+------------+------+---------------+------+---------+------+------+----------+----------+------------+------------------------------------+ We see that when the second query is executed the temporary table <derived2> used to store the new rows for the next iteration contains far more than 4 rows that it should contain.
            igor Igor Babaev (Inactive) added a comment - - edited

            Actually the CTE tables in the queries from the reported test case should have stabilized after the first iteration.
            It's really so for the query that uses the c1 column, and not so for the other query.

            igor Igor Babaev (Inactive) added a comment - - edited Actually the CTE tables in the queries from the reported test case should have stabilized after the first iteration. It's really so for the query that uses the c1 column, and not so for the other query.
            igor Igor Babaev (Inactive) added a comment - - edited

            Here's what happens:
            When the rows produced on the current iteration are sent to the temporary table T of the UNION type created for CTE the rows that were not there simultaneously are sent to the temporary table D that contains rows for the next iteration. The test whether a row was in T checks the return code of writing into T. If just a HEAP table is used for T then the return code is HA_ERR_FOUND_DUPP_KEY, but if an ARIA table is used for T then the return code is HA_ERR_FOUND_DUPP_UNIQUE.
            The current implementation erroneously checks only for the first return code. So if an Aria
            table is used for T then all rows produced by the current iteration go to D. For the first iteration
            it's 4*4, for the second 4*4*4 and so on. Whether T has reached stabilization is detected by
            checking whether D is empty, but it only grows. So as a result, we never stop the iterations
            unless a limit for them is set.

            igor Igor Babaev (Inactive) added a comment - - edited Here's what happens: When the rows produced on the current iteration are sent to the temporary table T of the UNION type created for CTE the rows that were not there simultaneously are sent to the temporary table D that contains rows for the next iteration. The test whether a row was in T checks the return code of writing into T. If just a HEAP table is used for T then the return code is HA_ERR_FOUND_DUPP_KEY, but if an ARIA table is used for T then the return code is HA_ERR_FOUND_DUPP_UNIQUE. The current implementation erroneously checks only for the first return code. So if an Aria table is used for T then all rows produced by the current iteration go to D. For the first iteration it's 4*4, for the second 4*4*4 and so on. Whether T has reached stabilization is detected by checking whether D is empty, but it only grows. So as a result, we never stop the iterations unless a limit for them is set.

            The fix for this bug was pushed into the 10.2 tree.

            igor Igor Babaev (Inactive) added a comment - The fix for this bug was pushed into the 10.2 tree.

            People

              igor Igor Babaev (Inactive)
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.