[MDEV-13297] RECURSIVE and NULL returning empty string fields in results Created: 2017-07-11  Updated: 2017-07-17  Resolved: 2017-07-17

Status: Closed
Project: MariaDB Server
Component/s: Documentation, Optimizer - CTE
Affects Version/s: 10.2.6, 10.2.7
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Vincent Milum Jr Assignee: Ian Gilfillan
Resolution: Fixed Votes: 0
Labels: None


 Description   

RECURSIVE results with NULL or empty string values in the anchor row will break the results in all other result rows (values for these columns will ALWAYS return an empty string regardless of what their values actually are)

This will work properly, the `tbl` recursive table is NOT referenced

WITH RECURSIVE tbl AS (
	SELECT NULL AS col
	UNION
	SELECT "THIS WORKS" AS col
)
SELECT * FROM tbl

Addition of recursive `tbl` reference. The string in the second result never shows up.

WITH RECURSIVE tbl AS (
	SELECT NULL AS col
	UNION
	SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
SELECT * FROM tbl

Using any value other than NULL in the anchor results in only a single row being returned

WITH RECURSIVE tbl AS (
	SELECT 0 AS col
	UNION
	SELECT "NOW ONLY 1 ROW IS RETURNED" AS col FROM tbl
)
SELECT * FROM tbl

The intended purpose was to have a custom anchor row separate from the data in the table, such as "SELECT 5 AS `id`, NULL AS `data`" and have the recursive functionality pull the data afterwards, this way the 10+ joined tables needed in my application would only need to be specified in the second half of the union, and only an otherwise empty dummy row would be specified in the anchor part of the union to simplify the query.



 Comments   
Comment by Alice Sherepa [ 2017-07-14 ]

In your example it needs to be cast to one type.

WITH RECURSIVE tbl AS (
	SELECT NULL AS col
	UNION
	SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
SELECT * FROM tbl
--------------
+------+
| col  |
+------+
| NULL |
|      |
+------+
 
WITH RECURSIVE tbl AS (
	SELECT CAST(NULL as CHAR(50)) AS col
	UNION
	SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
SELECT * FROM tbl
+---------------------+
| col                 |
+---------------------+
| NULL                |
| THIS NEVER SHOWS UP |
+---------------------+
 
WITH RECURSIVE tbl AS (
	SELECT CAST(0 as CHAR(50)) AS col
	UNION
	SELECT "THIS NEVER SHOWS UP" FROM tbl
)
SELECT * FROM tbl
+---------------------+
| col                 |
+---------------------+
| 0                   |
| THIS NEVER SHOWS UP |
+---------------------+
 
WITH RECURSIVE tbl AS (
	SELECT "aaa" AS col
	UNION
	SELECT "THIS NEVER SHOWS UP" FROM tbl
)
SELECT * FROM tbl
+-----+
| col |
+-----+
| aaa |
| THI |
+-----+

Comment by Vincent Milum Jr [ 2017-07-14 ]

That would then be an inconsistent behavior between using recursive and non-recursive unions. Non-recursive unions don't require column data types to be specifically cast. As quoted from the MySQL docs on Unions:

"If the data types of corresponding SELECT columns do not match, the types and lengths of the columns in the UNION result take into account the values retrieved by all of the SELECT statements." - https://dev.mysql.com/doc/refman/5.7/en/union.html

So while casting the values to force them to be the same across all the SELECT statements does indeed act as a workaround, it is going against what the MySQL docs say is the defined and designed standard of how unions should work.

SELECT 0 AS col UNION SELECT "THIS WORKS";
+------------+
| col        |
+------------+
| 0          |
| THIS WORKS |
+------------+
 
SELECT NULL AS col UNION SELECT "THIS WORKS";
+------------+
| col        |
+------------+
| NULL       |
| THIS WORKS |
+------------+
 
SELECT "A STRING" AS col UNION SELECT "THIS WORKS";
+------------+
| col        |
+------------+
| A STRING   |
| THIS WORKS |
+------------+

Comment by Vincent Milum Jr [ 2017-07-14 ]

Without explicitly defining the column size, data is also bring truncated (which explains why in my previous examples, I was getting a zero-length string)

 WITH RECURSIVE tbl AS (SELECT "test" AS col UNION SELECT "BROKEN" AS col FROM tbl) SELECT * FROM tbl;
+------+
| col  |
+------+
| test |
| BROK |
+------+

However, none of this explains why using an integer returns a single result instead of two results

WITH RECURSIVE tbl AS (SELECT 0 AS col UNION SELECT "BROKEN" AS col FROM tbl) SELECT * FROM tbl;
+-----+
| col |
+-----+
|   0 |
+-----+
 
EXPLAIN WITH RECURSIVE tbl AS (SELECT 0 AS col UNION SELECT "BROKEN" AS col FROM tbl) SELECT * FROM tbl;
+------+-----------------+------------+------+---------------+------+---------+------+------+----------------+
| id   | select_type     | table      | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+------+-----------------+------------+------+---------------+------+---------+------+------+----------------+
|    1 | PRIMARY         | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |    2 |                |
|    2 | SUBQUERY        | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL | No tables used |
|    3 | RECURSIVE UNION | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |    2 |                |
| NULL | UNION RESULT    | <union2,3> | ALL  | NULL          | NULL | NULL    | NULL | NULL |                |
+------+-----------------+------------+------+---------------+------+---------+------+------+----------------+

Comment by Alice Sherepa [ 2017-07-14 ]

I agree that it is confusing, that union silently truncates data. Maybe it needs to be nailed in documentation. The Standart says that the type of a recursive CTE is determined exclusively by non-recursive selects of the specification and it is implemented as it. But there is a discussion about it her https://jira.mariadb.org/browse/MDEV-12325.

Please see also documentation about recursive CTE use in mysql, they also suggest use CAST to avoid truncating data https://dev.mysql.com/doc/refman/8.0/en/with.html#common-table-expressions-recursive

In your example with int : string is casted to integer type and the result is 0, union of 0 and 0 is just 0 as it exclude the same rows.

select cast("AAAA" as int);
+---------------------+
| cast("AAAA" as int) |
+---------------------+
|                   0 |
+---------------------+
1 row in set, 1 warning (0.00 sec)
 
 
WITH RECURSIVE tbl AS (
SELECT 5 AS col
UNION
SELECT "THIS NEVER SHOWS UP" FROM tbl)
SELECT * FROM tbl
--------------
+-----+
| col |
+-----+
|   5 |
|   0 |
+-----+
2 rows in set (0.00 sec)

Comment by Elena Stepanova [ 2017-07-16 ]

As alice explained above, the current behavior is intentional and, however confusing, correlates with SQL standard. The possibility to change it is discussed in MDEV-12325, please use it for further tracking and commenting.

I'm reassigning this issue to greenman for covering explicitly in documentation, because it's very non-obvious specifics which is going to raise many question in future, so the explanation needs to be easily searchable.

Comment by Ian Gilfillan [ 2017-07-17 ]

Added to the docs

Generated at Thu Feb 08 08:04:29 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.