[MDEV-28947] JSON_TYPE result is truncated, charset max length should be considered Created: 2022-06-25  Updated: 2023-01-31  Resolved: 2022-06-30

Status: Closed
Project: MariaDB Server
Component/s: JSON
Affects Version/s: 10.8.2, 10.8.3, 10.9.1, 10.10.1
Fix Version/s: 10.8.4

Type: Bug Priority: Major
Reporter: ziyitan Assignee: Rucha Deodhar
Resolution: Fixed Votes: 0
Labels: gsoc22
Environment:

Arch Linux x86_64


Attachments: File gdb.log    
Issue Links:
Blocks
blocks MCOL-785 Implement DISTRIBUTED JSON functions Closed
Problem/Incident
causes MCOL-5408 JSON_TYPE() returned incorrect result Open

 Description   

Hi, I'm a GSoC 2022 contributor working on MCOL-785.
I have run into a problem: my JSON_TYPE implementation returns only the first 4 characters:

MariaDB root@(none):test> create table jt_test(i longtext) engine=columnstore;
Query OK, 0 rows affected
Time: 0.288s
MariaDB root@(none):test> insert into jt_test values('{}');
Query OK, 1 row affected
Time: 0.179s
MariaDB root@(none):test> insert into jt_test values("[]");
Query OK, 1 row affected
Time: 0.134s
MariaDB root@(none):test> insert into jt_test values(42);
Query OK, 1 row affected
Time: 0.116s
MariaDB root@(none):test> select json_type(i) from jt_test;
+--------------+
| json_type(i) |
+--------------+
| OBJE         |
| ARRA         |
| INTE         |
+--------------+

The Field that holds the json_type() result uses the utf8mb3 charset. When the value is copied into the Field buffer, the copy length is computed by dividing the declared length of 12 bytes (strange, but this is what it is) by 3, the maximum number of bytes per character for utf8mb3. As a result, only 4 characters are returned.
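The arithmetic above can be illustrated with a minimal, self-contained sketch. It is not the server's copy routine: chars_that_fit and store_ascii are hypothetical helpers that only model "byte length divided by mbmaxlen gives the number of characters kept".

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not MariaDB code): number of characters that fit in a
// buffer declared as max_length bytes when every character may need up to
// mbmaxlen bytes.
constexpr std::size_t chars_that_fit(std::size_t max_length, std::size_t mbmaxlen)
{
  return max_length / mbmaxlen;
}

// Hypothetical helper: keep only as many characters as the declared length
// allows. JSON_TYPE names are plain ASCII, so one char equals one character.
std::string store_ascii(const std::string &value,
                        std::size_t max_length,
                        std::size_t mbmaxlen)
{
  return value.substr(0, chars_that_fit(max_length, mbmaxlen));
}
```

With max_length = 12 and mbmaxlen = 3, chars_that_fit yields 4, so store_ascii("OBJECT", 12, 3) keeps only "OBJE", which matches the output shown in the report.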

I found here that the max length and collation are set:

bool Item_func_json_type::fix_length_and_dec()
{
  collation.set(&my_charset_utf8mb3_general_ci);
  max_length= 12;
  set_maybe_null();
  return FALSE;
}

So I think max_length should be set to 12 * collation.collation->mbmaxlen so that json_type behaves correctly in ColumnStore.
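A rough sketch of the proposed sizing follows, under the assumption that the 12 is meant as a character count. CharsetSketch, proposed_max_length and chars_available are hypothetical stand-ins, not server types; only the mbmaxlen idea is taken from the snippet above.

```cpp
#include <cstddef>

// Hypothetical stand-in for the server's charset descriptor; only the
// maximum number of bytes per character is modelled.
struct CharsetSketch
{
  std::size_t mbmaxlen;
};

// Character budget taken from the original max_length= 12 (assumed here to
// be intended as a number of characters).
constexpr std::size_t kJsonTypeMaxChars = 12;

// Proposed sizing: declare the length in bytes, scaled by mbmaxlen.
constexpr std::size_t proposed_max_length(const CharsetSketch &cs)
{
  return kJsonTypeMaxChars * cs.mbmaxlen;
}

// Characters left after the later division by mbmaxlen.
constexpr std::size_t chars_available(std::size_t max_length, const CharsetSketch &cs)
{
  return max_length / cs.mbmaxlen;
}
```

For a 3-byte charset this declares 36 bytes, and the later division by mbmaxlen still leaves room for 12 characters, whereas the unscaled value of 12 leaves room for only 4.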

  • My JSON_TYPE implementation: func_json_type.cpp
  • Pull Request: submitted, and CI has passed
  • A call stack that might be useful (see frames #0, #1, #2, #3: from_length=6, nchars=4).


 Comments   
Comment by Nayuta Yanagisawa (Inactive) [ 2022-06-25 ]

https://github.com/MariaDB/server/pull/2172

Comment by Rucha Deodhar [ 2022-06-30 ]

Merged to 10.8: https://github.com/MariaDB/server/commit/ba5b2e7b291a9b4bfb97dcdf3c53ca49fc91a4e7
