Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.2(EOL), 10.3(EOL)
-
None
Description
I have a table with a company employee hierarchy:
DROP TABLE IF EXISTS t1; |
CREATE TABLE t1 |
(
|
id INT, -- employee ID |
mid INT, -- manager ID, or NULL if top level employee |
name TEXT -- employee name |
);
|
INSERT INTO t1 VALUES (0,NULL, 'Name'); |
INSERT INTO t1 VALUES (1,0, 'Name1'); |
INSERT INTO t1 VALUES (2,0, 'Name2'); |
INSERT INTO t1 VALUES (11,1, 'Name11'); |
INSERT INTO t1 VALUES (12,1, 'Name12'); |
Now I want to print the company hierarchy as a table that additionally includes employee hierarchy levels and manager names. I use a recursive CTE for that:
DROP TABLE IF EXISTS t2; |
CREATE TABLE t2 AS |
WITH RECURSIVE |
cteReports (level, id, mid, name) AS |
(
|
SELECT 1, id, mid, name FROM t1 WHERE mid IS NULL |
UNION ALL |
SELECT r.level + 1, e.id, e.mid, e.name FROM t1 e |
INNER JOIN cteReports r ON e.mid = r.id |
)
|
SELECT
|
level, id, mid, name, |
(SELECT name FROM t1 WHERE id= cteReports.mid) AS mname |
FROM cteReports |
ORDER BY level, mid; |
SHOW CREATE TABLE t2; |
SELECT * FROM t2; |
It creates a table with the expected structure:
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
| Table | Create Table |
|
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
| t2 | CREATE TABLE `t2` (
|
`level` bigint(1) NOT NULL DEFAULT 0,
|
`id` int(11) DEFAULT NULL,
|
`mid` int(11) DEFAULT NULL,
|
`name` text DEFAULT NULL,
|
`mname` text DEFAULT NULL
|
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
|
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
and with correct data:
+-------+------+------+--------+-------+
|
| level | id | mid | name | mname |
|
+-------+------+------+--------+-------+
|
| 1 | 0 | NULL | Name | NULL |
|
| 2 | 1 | 0 | Name1 | Name |
|
| 2 | 2 | 0 | Name2 | Name |
|
| 3 | 11 | 1 | Name11 | Name1 |
|
| 3 | 12 | 1 | Name12 | Name1 |
|
+-------+------+------+--------+-------+
|
Now I want for some reasons to add some big number to manager IDs (mid):
DROP TABLE IF EXISTS t2; |
CREATE TABLE t2 AS |
WITH RECURSIVE |
cteReports (level, id, mid, name) AS |
(
|
SELECT 1, id, mid, name FROM t1 WHERE mid IS NULL |
UNION ALL |
SELECT r.level + 1, e.id, e.mid + 1000000000000, e.name FROM t1 e |
INNER JOIN cteReports r ON e.mid = r.id |
)
|
SELECT
|
level, id, mid, name, |
(SELECT name FROM t1 WHERE id= cteReports.mid) AS mname |
FROM cteReports |
ORDER BY level, mid; |
SHOW CREATE TABLE t2; |
SELECT * FROM t2; |
It creates exactly the same table to the one in the previous script and truncates data silently:
+-------+------+------------+--------+-------+
|
| level | id | mid | name | mname |
|
+-------+------+------------+--------+-------+
|
| 1 | 0 | NULL | Name | NULL |
|
| 2 | 1 | 2147483647 | Name1 | NULL |
|
| 2 | 2 | 2147483647 | Name2 | NULL |
|
| 3 | 11 | 2147483647 | Name11 | NULL |
|
| 3 | 12 | 2147483647 | Name12 | NULL |
|
+-------+------+------------+--------+-------+
|
This looks wrong. Instead if INT, the expected column type for mid should be BIGINT or DECIMAL, and no data truncation should happen.
Attachments
Issue Links
- relates to
-
MDEV-32202 Document MDEV-12325 Unexpected data type and truncation when using CTE
-
- Open
-
I will try to here summary the things learned from this MDEV:
For recursive CTE:s, according to the SQL standard, the type of the created columns are taken from the non recursive part of a CTE.
This means that for the query:
WITH RECURSIVE
cteReports (level, id, mid, name) AS
(
SELECT 1, id, mid, name FROM t1 WHERE mid IS NULL
UNION ALL
SELECT r.level + 1, e.id, e.mid + 1000000000000, e.name FROM t1 e
INNER JOIN cteReports r ON e.mid = r.id
)
SELECT
level, id, mid, name,
(SELECT name FROM t1 WHERE id= cteReports.mid) AS mname
FROM cteReports
ORDER BY level, mid;
The type of the result from the CTE is taken from:
SELECT 1, id, mid, name FROM t1;
This means that the type of mid should be int (as it it is defined in t1).
If we follow this logic, the bug is that we don't get a warning or error (depending on strict mode) when trying to store 2e.mid + 1000000000000" into an int column. Adding this error should be trivial and I will look into creating a patch for this. This is, as far as I understand, what the SQL standard would require.
An extension to the SQL standard that could be considered in case of recursive CTE's:
When there are expression for a column, we could extend the type to the 'biggest possible' of that type. This means that we would do the following type conversations for the result:
Int -> bigint
char/varchar -> blob
decimal(X,Y) -> decimal(65,Y)
For the moment we are considering to follow the SQL standard and give an error if the result of an expression causes an overflow.
For the cases that one gets an error, one can always fix it by adding a cast for the 'non recursive part' of the CTE:
WITH RECURSIVE
cteReports (level, id, mid, name) AS
(
SELECT 1, id, cast(mid as double), name FROM t1 WHERE mid IS NULL
UNION ALL
SELECT r.level + 1, e.id, e.mid + 1000000000000, e.name FROM t1 e
INNER JOIN cteReports r ON e.mid = r.id
) .....