[MDEV-30623] JSON_TABLE in subquery not working Created: 2023-02-09  Updated: 2023-11-28

Status: Confirmed
Project: MariaDB Server
Component/s: JSON, Optimizer
Affects Version/s: 10.6.11, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11
Fix Version/s: 10.6, 10.11

Type: Bug Priority: Major
Reporter: Thomas G. Jensen Assignee: Rex Johnston
Resolution: Unresolved Votes: 2
Labels: JSON_TABLE, optimizer-easy


 Description   

CREATE TEMPORARY TABLE t1 (t1_id int, t1_json text);
INSERT t1 VALUES (1, '[[10,20]]'), (2, '[[100,200]]'), (3, '[[1000,2000]]');
 
SELECT
	t1_id,
	t2_json,
	(SELECT SUM(x2) 
	FROM 
	  JSON_TABLE(t2_json, '$[*]' COLUMNS ( x2 int PATH '$' )) t3
	) t2_json_sum
FROM 
  t1
  CROSS JOIN 
  JSON_TABLE(t1_json, '$[*]' COLUMNS (t2_json json PATH '$' )) t2;

Gives the result:

+-------+-------------+-------------+
| t1_id | t2_json     | t2_json_sum |
+-------+-------------+-------------+
|     1 | [10,20]     |          30 |
|     2 | [100,200]   |          30 |
|     3 | [1000,2000] |          30 |
+-------+-------------+-------------+

In the subquery the t2_json column from first row is used on every row.



 Comments   
Comment by Alice Sherepa [ 2023-02-14 ]

Thank you for the report!
I repeated on 10.6-10.11:

MariaDB [test]> create table t1(x json);
Query OK, 0 rows affected (0,039 sec)
 
MariaDB [test]> insert into t1 values ('[{"x":"1"},{"x":"2"}]'),('[{"x":"10"},{"x":"20"}]'),('[{"x":"100"},{"x":"200"}]');
Query OK, 3 rows affected (0,003 sec)
Records: 3  Duplicates: 0  Warnings: 0
 
MariaDB [test]> select *, (SELECT sum(x) FROM json_table(x, '$[*]' columns( x int path '$.x')) AS jt) from t1;
+---------------------------+-----------------------------------------------------------------------------+
| x                         | (SELECT sum(x) FROM json_table(x, '$[*]' columns( x int path '$.x')) AS jt) |
+---------------------------+-----------------------------------------------------------------------------+
| [{"x":"1"},{"x":"2"}]     |                                                                           3 |
| [{"x":"10"},{"x":"20"}]   |                                                                           3 |
| [{"x":"100"},{"x":"200"}] |                                                                           3 |
+---------------------------+-----------------------------------------------------------------------------+
3 rows in set (0,000 sec)
MariaDB [test]> explain extended select *, (SELECT sum(x) FROM json_table(x, '$[*]' columns( x int path '$.x')) AS jt) from t1;
+------+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                      |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------+
|    1 | PRIMARY     | t1    | ALL  | NULL          | NULL | NULL    | NULL | 3    |   100.00 |                            |
|    2 | SUBQUERY    | jt    | ALL  | NULL          | NULL | NULL    | NULL | 40   |   100.00 | Table function: json_table |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------+
2 rows in set, 2 warnings (0,000 sec)
 
Note (Code 1276): Field or reference 'test.t1.x' of SELECT #2 was resolved in SELECT #1
Note (Code 1003): /* select#1 */ select `test`.`t1`.`x` AS `x`,(/* select#2 */ select sum(`jt`.`x`) from JSON_TABLE(`test`.`t1`.`x`, '$[*]' COLUMNS (`x` int(11) PATH '$.x')) `jt`) AS `(SELECT sum(x) FROM json_table(x, '$[*]' columns( x int path '$.x')) AS jt)` from `test`.`t1`

Comment by clint chenery [ 2023-10-04 ]

I have the same issue as this with MariaDb 10.6, i have worked around it by creating a custom function containing the json_table command that is used in place of the subquery and then passing the JSON column to the function which returns some value

Comment by Sergei Petrunia [ 2023-10-04 ]

The subquery uses JSON_TABLE and its first parameter JSON_TABLE(t2_json is a reference to outside the subquery:

SELECT ...
	t2_json,
	(SELECT SUM(x2) 
	FROM 
	  JSON_TABLE(t2_json, '$[*]' COLUMNS ( x2 int PATH '$' )) t3
	) t2_json_sum

But EXPLAIN output in the first comment shows type=SUBQUERY, which means the optimizer thinks the subquery is not correlated.

Could the cause be that the code that collects subquery's outside references doesn't check table function's parameters? And the fix would be to check them?

Comment by Daniel Black [ 2023-10-26 ]

From https://stackoverflow.com/questions/77360787/mariadb-query-exists-and-between

create table events (event_id int, event_title varchar(30), event_distances json);
insert into events values (1, 'event1', '[{"order":0,"value":21}]'),
   (2, 'event2', '[{"order":0,"value":8}]'),
   (3, 'event3', '[{"order":0,"value":10}]'),
   (4, 'event4', '[{"order":0,"value":5},{"order":1,"value":10},{"order":2,"value":21}]');
 
SELECT events.event_title, event_distances 
FROM events 
WHERE 
EXISTS ( 
    SELECT 1 
    FROM JSON_TABLE(event_distances, '$[*]' COLUMNS (distance_value FLOAT PATH '$.value')) AS distances 
    WHERE distances.distance_value BETWEEN 9 AND 22
) 
ORDER BY events.event_id ASC;
+-------------+-----------------------------------------------------------------------+
| event_title | event_distances                                                       |
+-------------+-----------------------------------------------------------------------+
| event1      | [{"order":0,"value":21}]                                              |
| event2      | [{"order":0,"value":8}]                                               |
| event3      | [{"order":0,"value":10}]                                              |
| event4      | [{"order":0,"value":5},{"order":1,"value":10},{"order":2,"value":21}] |
+-------------+-----------------------------------------------------------------------+
4 rows in set (0.001 sec)
 
MariaDB [test]> SELECT events.event_title, event_distances  FROM events  WHERE  EXISTS (      SELECT 1      FROM JSON_TABLE(event_distances, '$[*]' COLUMNS (distance_value FLOAT PATH '$.value')) AS distances      WHERE distances.distance_value BETWEEN 20 AND 22 )  ORDER BY events.event_id ASC;
+-------------+-----------------------------------------------------------------------+
| event_title | event_distances                                                       |
+-------------+-----------------------------------------------------------------------+
| event1      | [{"order":0,"value":21}]                                              |
| event2      | [{"order":0,"value":8}]                                               |
| event3      | [{"order":0,"value":10}]                                              |
| event4      | [{"order":0,"value":5},{"order":1,"value":10},{"order":2,"value":21}] |
+-------------+-----------------------------------------------------------------------+

Generated at Thu Feb 08 10:17:39 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.