[MDEV-25149] JSON_TABLE: Inconsistency in implicit data type conversion Created: 2021-03-15  Updated: 2021-04-21  Resolved: 2021-04-17

Status: Closed
Project: MariaDB Server
Component/s: JSON
Affects Version/s: N/A
Fix Version/s: 10.6.0

Type: Bug Priority: Blocker
Reporter: Elena Stepanova Assignee: Alexey Botchkov
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-17399 Add support for JSON_TABLE Closed
relates to MDEV-25377 JSON_TABLE: Wrong value with implicit... Closed

 Description   

This was a part of code review comments / changes. The example is taken from the review comments as is:

select * from
json_table(
  '[{"a":"asd"}, 
    {"a":123},
    {"a":[]},
    {"a":{}}
   ]',
  '$[*]' 
  columns (
    id for ordinality,
    intcol int path '$.a' default '1234' on empty default '5678' on error
  )
) as tt;

At that point, intcol value in the first row was 0, indicating that the value was retrieved, and an implicit string => number conversion was performed. The question was raised on what grounds this conversion occurs, and why there is no warning about it; eventually the result was changed to the default-on-error, which it is now:

bb-10.6-mdev17399-psergey2 8b533cc1d5

+------+--------+
| id   | intcol |
+------+--------+
|    1 |   5678 |
|    2 |    123 |
|    3 |   5678 |
|    4 |   5678 |
+------+--------+
4 rows in set (0.001 sec)

Personally, I am fine with this decision; but to my understanding, the implicit conversion is a general MariaDB approach applied in a variety of cases, so this change may contradict the "official line". Here is the basic example:

MariaDB [test]> select "asd" = 0;
+-----------+
| "asd" = 0 |
+-----------+
|         1 |
+-----------+
1 row in set, 1 warning (0.000 sec)
 
MariaDB [test]> show warnings;
+---------+------+-----------------------------------------+
| Level   | Code | Message                                 |
+---------+------+-----------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: 'asd' |
+---------+------+-----------------------------------------+
1 row in set (0.000 sec)

I suggest to make sure that serg is okay with the chosen JSON_TABLE behavior, otherwise there is a big chance that later the result will be re-considered and changed for the sake of consistency. In this case it's better to do it right from the beginning than change the behavior later.

Of course the review note about a missing warning was also valid and should be taken into account if the logic changes back to returning the converted value.



 Comments   
Comment by Elena Stepanova [ 2021-03-15 ]

Besides, here is still the same problem with a slightly different example, so the behavior is already inconsistent even within the current JSON_TABLE implementation:

bb-10.6-mdev17399-psergey2 8b533cc1d5

MariaDB [test]> select * from json_table('{"a":"foo"}', '$.a' columns (v char(1) path '$')) t;
+------+
| v    |
+------+
| f    |
+------+
1 row in set (0.000 sec)

MySQL returns NULL which indicates an error:

8.0.23

MySQL [test]> select * from json_table('{"a":"foo"}', '$.a' columns (v char(1) path '$')) t;
+------+
| v    |
+------+
| NULL |
+------+
1 row in set (0.001 sec)

Comment by Elena Stepanova [ 2021-03-23 ]

Raising to a technical blocker. While actual fixing can very well be delayed (even indefinitely), the general approach must be defined, agreed upon, documented and later complied with; otherwise individual fixes upon assorted complaints will only create more discrepancies.

What needs to be decided is:

  • should usual implicit type conversion apply to JSON_TABLE, specifically
    • while casting retrieved values to the defined datatypes
    • while casting defined "default/on error/empty" values to the defined datatypes
  • if implicit type conversion applies, should warnings be issued
  • if implicit type conversion applies, should it be raised to an error when ERROR ON ERROR is set;
  • should implicit type conversion for JSON_TABLE be affected by SQL_MODE STRICT_*_TABLES modes, and if it should, when and by which.
Comment by Sergei Golubchik [ 2021-03-23 ]

The standard specifies is:

  • if the value is an array or object — it's an error "scalar required"
  • if it's a scalar — same rules as CAST(value AS type)

So, in the example from the bug description, the result should be 0, 123, 5678, 5678. In the example of the first comment the correct result is "f".

Perhaps there should be warnings, as with CAST.

Better to actually reuse the CAST code, so that if we'll ever get some strict mode variant where CAST("asd" AS INT) is an error then JSON_TABLE would automatically use the new behavior.

Comment by Sergei Petrunia [ 2021-04-09 ]

Agree with serg's comment. I think there should be a warning.

The problem with reusing the code from CAST function is that there's no single piece of code to reuse: CAST is implemented by many Items:

Item_func_signed
Item_func_unsigned
Item_decimal_typecast
Item_char_typecast
Item_decimal_typecast
Item_real_typecast

also search for create_typecast_item

Comment by Alexey Botchkov [ 2021-04-14 ]

That's the change i propose to begin with

--- a/sql/json_table.cc
+++ b/sql/json_table.cc
@@ -399,7 +399,7 @@ int ha_json_table::rnd_next(uchar *buf)
     they are in).
   */
   cf_orig= table->in_use->count_cuted_fields;
-  table->in_use->count_cuted_fields= CHECK_FIELD_EXPRESSION;
+  table->in_use->count_cuted_fields= CHECK_FIELD_ERROR_FOR_NULL;
   res= fill_column_values(buf, NULL);

Comment by Sergei Petrunia [ 2021-04-14 ]

If we take this patch and run the example from the bug description, we get

+------+--------+
| id   | intcol |
+------+--------+
|    1 |   5678 |
|    2 |    123 |
|    3 |   5678 |
|    4 |   5678 |
+------+--------+
4 rows in set, 1 warning (0.000 sec)

| Warning | 1366 | Incorrect integer value: 'asd' for column ``.`(temporary)`.`intcol` at row 1 |

That is, converting the first "a":"asd" into integer now produces a warning, but the default '5678' on error rule is still fired.

This is not what was described in Serg's comment above.

Comment by Sergei Petrunia [ 2021-04-14 ]

Candidate patch: https://gist.github.com/spetrunia/5f4f13ef5fdf91ca1f343bcb34f3c815

Comment by Alexey Botchkov [ 2021-04-15 ]

Here is my version of the fix.
https://github.com/MariaDB/server/commit/cbe2d62b55b1654af35f702861297dc97d713301
so we can compare.

Comment by Sergei Petrunia [ 2021-04-15 ]

Review input provided on Slack

Comment by Sergei Petrunia [ 2021-04-16 ]

Ok to push

Generated at Thu Feb 08 09:35:31 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.