Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17399

Add support for JSON_TABLE

    XMLWordPrintable

    Details

    • Type: Task
    • Status: Stalled (View Workflow)
    • Priority: Critical
    • Resolution: Unresolved
    • Fix Version/s: 10.6
    • Component/s: JSON
    • Labels:
      None

      Description

      MySQL 8.0 has JSON_TABLE. These is also available in Oracle, DB2. (PostgreSQL has a patch under development but it seems it is not in the 12th or 13th release)

      1. Syntax
      2. JSON_TABLE as table function
      3. Implementing a table function
      3.1 DDL considerations
      3.2 Producing the rows
      3.3 Interfacing with the optimizer
      3.3.1 Estimates 
      3.3.2 JSON_TABLE must be after its dependencies
      3.3.3 Can't use Join buffering 
      5. Links
      

      1. Syntax

      SQL Standard defines a lot of features for JSON_TABLE. This task will implement a practically-important subset (roughly the same what MySQL-8 supports)

      The syntax to be implemented is:

        JSON_TABLE(
          json_text,
          path COLUMNS(columns_list) 
        ) [AS] alias
      

      columns_list:  
        column [, column[, ... ]]
      

      column:
          name FOR ORDINALITY 
        | name type PATH path_str [on_empty_on_error]
        | name type EXISTS PATH path_str [on_empty_on_error]
        | NESTED PATH path COLUMNS (columns_list)
      

      on_empty_on_error:
        [ behavior ON EMPTY ] [ behavior ON ERROR ]
      

      behavior:
         ERROR | NULL | DEFAULT <value_expression>
      

      Here, JSON_TABLE(...) is a table function and is allowed wherever a table reference is allowed, except for the context which imply that the referred table is to be modified.

      1.1 Standard compliance

      The above is a subset of the syntax allowed by the SQL Standard, with exception of name type EXISTS PATH path_str which is a non-standard extension in (Oracle database and Oracle MySQL?)

      2. JSON_TABLE as table function

      JSON_TABLE is the first table function - a table whose contents depends on its arguments.

      The arguments may be non-constant (e.g. columns of other tables), which causes the JSON_TABLE table to have LATERAL-like dependencies on its arguments.

      3. Implementing a table function

      3.1 DDL considerations

      A temporary table [definition] will be created with appropriate set of columns with appropriate attributes. This will allow for references to the columns of JSON_TABLE(...) to be resolved using the regular name resolution process.

      3.2 Producing the rows

      In order to produce rows, a Storage Engine will be implemented, ha_json_table. That is, the temporary table will use the ha_json_table engine.
      (This could be generalized to a generic ha_table_function Storage Engine for table function, but it was decided not to do that until we have other examples of table functions).

      The engine will only support full table scans (rnd_init/rnd_next). rnd_init() call will implicitly evaluate the JSON_TABLE's parameters and set the storage engine to produce the rows that come from the parameters.

      3.3 Interfacing with the optimizer

      3.3.1 Estimates

      The optimizer will need to be aware that the table representing the output of JSON_TABLE(...) does not have any contents during the optimization phase. We will still need to provide some numbers for expected #rows and read_time. JSON documents are typically small, so a hard-coded constant describing a reasonably-sized JSON document would be sufficient.

      3.3.2 JSON_TABLE must be after its dependencies

      Second, the join optimizer will need to be aware that
      JSON_TABLE(tbl1.column_x, ... ) can only be accessed when the current row for table tbl1 is available.
      This will be done as follows:
      Table dependencies will be set such that JSON_TABLE(...) depends on tbl1.

      3.3.3 Can't use Join buffering

      Join buffering will be disabled for table JSON_TABLE(...).

      4. Producing JSON_TABLE contents

      4.1 FOR ORDINALITY columns

      The standard says:

      An ordinality column provides a sequential numbering of rows. Row numbering is 1-based.

      The numbering seems to be "global", that is, it doesn't restart across table re-reads.

      4.2 NESTED contents

      5. Links

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              holyfoot Alexey Botchkov
              Reporter:
              diego dupin Diego Dupin
              Votes:
              20 Vote for this issue
              Watchers:
              25 Start watching this issue

                Dates

                Created:
                Updated: