[MCOL-785] Implement DISTRIBUTED JSON functions Created: 2017-06-23  Updated: 2022-11-14  Resolved: 2022-09-04

Status: Closed
Project: MariaDB ColumnStore
Component/s: ExeMgr
Affects Version/s: 1.1.0
Fix Version/s: 22.08.1

Type: New Feature Priority: Major
Reporter: Daniel Lee (Inactive) Assignee: ziyitan
Resolution: Fixed Votes: 1
Labels: gsoc22

Issue Links:
Blocks
is blocked by MDEV-28947 JSON_TYPE result is turncated, charse... Closed
Issue split
split to MCOL-5228 Implement JSON_OBJECTAGG Open
split to MCOL-5229 Implement JSON_TABLE Open
PartOf
is part of MCOL-5227 Implement JSON_ARRAYAGG Closed
Relates
relates to MCOL-784 JSON_REPLACE() returned a json_update... Closed
relates to MCOL-599 JSON Functions Closed
relates to MCOL-1358 Json Datatype not support Closed
relates to MCOL-1548 Add feature to load the json/xml file... Closed
relates to MCOL-2099 Document supported JSON functions in KB Closed
relates to MCOL-3479 JSON functions return "can only be us... Closed
Epic Link: Columnstore JSON Compatibility
Assigned for Review: Denis Khalikov Denis Khalikov

 Description   

As of 6.2.2 Columnstore doesn't support [1] JSON_* functions doing Select Handler execution.

We want to add support for the mentioned JSON_* functions into the existing functions and expressions runtime framework(FuncExp).
Here are the phases for the project:

  1. - get familiar with the code base, compilation and bootstrap process [2]
  2. - split the list of JSON_* functions into those suitable for distributed Select Handler execution and those aren't
  3. - get familiar with FuncExp details [3]
  4. - get familiar with MTR testing framework MCS [4] leverages
  5. - implement/borrow tests from MDB MTR test suite
  6. - implement suitable(see phase 2) JSON_* functions

There will be an additional simple task to add JSON data type as an alias for LONGTEXT(flex/bison parser tiny project).

Plz get in touch either here or in MariaDB Zulip chat

1. There are two query execution modes in MCS: fast distributed Select Handler execution and relatively slow but MDB-compatable Table execution.
2. Here is the automated bootstrap script.
3. FuncExp is a number of classes to describe an arbitrary tree of expressions doing projection, e.g. SELECT COS(c1 + c2) or doing filtering, e.g. SELECT c1 from t1 WHERE LENGTH(SUBSTRING(c1,2,2)) > 5. In most cases the implementation of a function looks as simple as this. An abstract part for FuncExp can be found in the MCS repo here.
4. Plz see this for the short info.



 Comments   
Comment by ziyitan [ 2022-04-02 ]

Hello! @Todd Stoffel I am Tan Ziyi from Huazhong University of Science and Technology (HUST, China),
a 1st computer graduate student
I want to work on this issue
Please guide me how to start working, looking forward to your reply!
(I am trying to build this project)

Comment by Roman [ 2022-04-02 ]

Hello lunar_land and nice to meet you.
It is better to move to zulip chat to have an interactive session regarding onboarding and various aspects. My nick there is Roman Nozdrin so plz contact me there.

Comment by Roman [ 2022-08-31 ]

Great job lunar_land, We appreciate your contribution of this major feature a lot!

Comment by Daniel Lee (Inactive) [ 2022-09-02 ]

Build tested: 22.08.1 (#5508)

I verified the following MTR autopilot test cases using MariaDB server with innoDB tables.

The following two cases used to match and continuet to do so.

columnstore/autopilot.mcs6721_json_JSON_ARRAY
columnstore/autopilot.mcs6736_json_JSON_OBJECT

The following 22 cases are known to have issues. With the new implementation of the
JSON functions, they are now matching with results of MariaDB server.
Both test cases and reference results have been udpated. Both files have been renamed
to omit the "_KnownIssue" suffix.

Updated files are in branch dlee_jsonUpdate

columnstore/autopilot.mcs6722_json_JSON_ARRAY_APPEND_KnownIssue
columnstore/autopilot.mcs6723_json_JSON_ARRAY_INSERT_KnownIssue
columnstore/autopilot.mcs6724_json_JSON_COMPACT_KnownIssue
columnstore/autopilot.mcs6725_json_JSON_CONTAINS_KnownIssue
columnstore/autopilot.mcs6726_json_JSON_CONTAINS_PATH_KnownIssue
columnstore/autopilot.mcs6727_json_JSON_DEPTH_KnownIssue
columnstore/autopilot.mcs6729_json_JSON_EXISTS_KnownIssue
columnstore/autopilot.mcs6730_json_JSON_EXTRACT_KnownIssue
columnstore/autopilot.mcs6731_json_JSON_INSERT_KnownIssue
columnstore/autopilot.mcs6732_json_JSON_KEYS_KnownIssue
columnstore/autopilot.mcs6733_json_JSON_LENGTH_KnownIssue
columnstore/autopilot.mcs6734_json_JSON_LOOSE_KnownIssue
columnstore/autopilot.mcs6735_json_JSON_MERGE_KnownIssue
columnstore/autopilot.mcs6737_json_JSON_QUERY_KnownIssue
columnstore/autopilot.mcs6739_json_JSON_REMOVE_KnownIssue
columnstore/autopilot.mcs6740_json_JSON_REPLACE
columnstore/autopilot.mcs6741_json_JSON_SEARCH_KnownIssue
columnstore/autopilot.mcs6742_json_JSON_SET_KnownIssue
columnstore/autopilot.mcs6743_json_JSON_TYPE_KnownIssue
columnstore/autopilot.mcs6744_json_JSON_UNQUOTE_KnownIssue
columnstore/autopilot.mcs6745_json_JSON_VALID_KnownIssue
columnstore/autopilot.mcs6746_json_JSON_VALUE_KnownIssue

The following two cases continue to have issues

columnstore/autopilot.mcs6728.json_JSON_DETAILED_KnownIssue
   error returned
 
 
columnstore/autopilot.mcs6743_json_JSON_TYPE_KnownIssue
   mismatch results returned
 
mdb returns
---
@json
{"A": 1, "B": 2, "C": 3}
JSON_TYPE(@json)
OBJECT
cInt  cVarchar  cText
1 {"A": 1, "B": 2, "C": 3}  {"A": 1, "B": 2, "C": 3}
cVarchar  JSON_TYPE(cVarchar) cText JSON_TYPE(cText)
{"A": 1, "B": 2, "C": 3}  OBJECT  {"A": 1, "B": 2, "C": 3}  OBJECT
---
 
mcs returns
---
@json
{"A": 1, "B": 2, "C": 3}
JSON_TYPE(@json)
OBJECT
cInt  cVarchar  cText
1 {"A": 1, "B": 2, "C": 3}  {"A": 1, "B": 2, "C": 3}
cVarchar  JSON_TYPE(cVarchar) cText JSON_TYPE(cText)
{"A": 1, "B": 2, "C": 3}  OBJE  {"A": 1, "B": 2, "C": 3}  OBJE
---
 
[root@s0 /]# diff mcs.txt mdb.txt 
8c8
< {"A": 1, "B": 2, "C": 3}  OBJE  {"A": 1, "B": 2, "C": 3}  OBJE
---
> {"A": 1, "B": 2, "C": 3}  OBJECT  {"A": 1, "B": 2, "C": 3}  OBJECT

Comment by Roman [ 2022-09-04 ]

I don't see a reason to move this issue into staled. The only issue found by QA is caused by the server code and had been fixed already in 10.11.

Generated at Thu Feb 08 02:23:48 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.