Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23023

Put compatibility functions and data types into namespaces

    XMLWordPrintable

Details

    Description

      This task proposes a simple and consistent way of handling compatibility modes, taking into account the SQL standard.

      It's intended to define implementation directions for:

      • MDEV-10590 sql_mode=ORACLE: Built-in package DBMS_OUTPUT
      • MDEV-19632 Replication aborts with ER_SLAVE_CONVERSION_FAILED upon CREATE ... SELECT in ORACLE mode
      • MDEV-23005 sql_mode mixture: a table with DECODE() in a virtual column refuses to work

      (and other similar tasks and bug fixes)

      Preamble

      Different compatibility modes in MariaDB (e.g. sql_mode=DEFAULT vs sql_mode=ORACLE) translate function and data type names into different implementations.

      Examples:

      • The function DECODE() has two different implementations: the original MySQL DECODE() with sql_mode=DEFAULT, and Oracle-compatible DECODE() with sql_mode=ORACLE.
      • The data type DATE is translated to MariaDB DATE with sql_mode=DEFAULT, or to MariaDB DATETIME with sql_mode=ORACLE.

      The intent of this task is disambiguation: making it possible in any @@sql_mode to specify a certain implementation, independingly of the current @@sql_mode value.

      Proposed syntax for functions

      SELECT DECODE(...); -- sql_mode dependent.
      SELECT mariadb.DECODE(...); -- will always use the MariaDB version of DECODE(), no matter sql_mode is
      SELECT oracle.DECODE(...); -- will always use the Oracle version of DECODE(), no matter sql_mode is
      

      Proposed syntax for data types

      CREATE TABLE t1 (a DATE); -- sql_mode dependent.
      CREATE TABLE t1 (a mariadb.DATE); -- will always refer to the MariaDB DATE, no matter sql_mode is
      CREATE TABLE t1 (a oracle.DATE); -- will always refer to the Oracle-compatible DATE (which is now mapped to MariaDB DATETIME), no matter sql_mode is
      

      Implementations chosen by default

      This statement:

      SET sql_mode=ORACLE;
      

      will switch the order of funciton and data type lookup to: ORACLE,MARIADB.

      This statement:

      SET sql_mode=DEFAULT;
      

      will switch the order of function and data type lookup to: MARIADB,ORACLE (or just to MARIADB).

      Examples:

      SET sql_mode=DEFAULT;
      CREATE TABLE t1 (a DATE DEFAULT DECODE(..)); -- The MariaDB date, and the MariaDB DECODE()
      SET sql_mode=ORACLE;
      CREATE TABLE t1 (a DATE DEFAULT DECODE(..)); -- The Oracle date, and the Oracle DECODE()
      

      SQL Standard compatibility

      Qualified data types and funtions will just look like:

      • user defined data types belonging to a schema
      • functions belonging to a schema

      The proposed syntax for functions will resemble the SQL standard syntax for <schema qualified routine name>.
      From this point of view:

      • DECODE() - is a built-in function, the exact implementation is chosen depending on sql_mode.
      • mariadb.DECODE() - is a function in the schema `mariadb`.
      • oracle.DECODE() - is a function in the schema `oracle`.

      The proposed syntax for data types will resemble the SQL standard for user defined data types:

      <data type> ::=   <predefined type>
                      | <path-resolved user-defined type name>
       
      <path-resolved user-defined type name> ::= <user-defined type name>
       
      <user-defined type name> ::= [ <schema name> <period> ] <qualified identifier>
      

      The idea is that predefined data types "belong" to the entire server, while qualified data types belong to a certain schema.

      From this point of view:

      • DATE - is a predefined data type, the exact implementation is chosen depending on sql_mode.
      • mariadb.DATE - will be a data type in the schema `mariadb` (which maps to the MariaDB DATE)
      • oracle.DATE - will be a data type in the schema `oracle` (which now maps to MariaDB DATETIME as the closest data type, but can have a more Oracle DATE compatible implementation in the future)

      So schemas will naturally play role of namespaces.

      Data type and functions plugins

      The INSTALL PLUGIN statement for data type and function plugins (which were implemented in 10.5) will be extended to optionally specify the database to put the data type or the function to.

      By default, data type and function plugins will register "globally", like built-in types and functions.
      Optionally, it will be possible to put the loaded data type or function pluging into a certain database.

      Compatibility databases

      Compatibility databases, such as `oracle`, will be created at bootstrap time.

      Note, the highly requested Oracle package DBMS_OUTPUT (which will possibly be implemented in 10.6) will reside in the compatibility database `oracle`.

      The advantage of having compatibility namespaces as real databases on disk is that we'll be able to implement compatibility mode specific objects (like functions, procedures and data types) using any ways:

      • hardcoded in the server code
      • using a plugin - a data type plugin, or a function plugin
      • using SQL - a stored data type, or a stored function

      So for example the Oracle DECODE() can eventually become a function plugin or a stored function.

      Similar, the Oracle DATE can become a data type plugin, or a stored data type (when we implement the CREATE TYPE statement).

      The switch between a hardcoded implementation to a database-specific (data type or function) plugin or to a stored (data type or function) object will be very simple for users: syntactically nothing will change on the SQL level.

      Database `mariadb`

      The database `mariadb`, unlike real compatbility databases, does not need any entries in its tables to "register" built-in data types or built-in functions. They can stay hard-coded in the server, as now.
      In the first impelementation we don't even need to create it as bootstap time.

      The order of lookup

      The order of lookup in the first implementation will depend solely on the sql_mode value. So sql_mode=ORACLE will lookup up in the Oracle data types and functions first, then in MariaDB data types and functions.

      Later we can implement the SQL standard SET PATH statement, which will give more flexibility.

      SHOW and I_S

      Statements like:

      • SHOW CREATE TABLE
      • DESCRIBE
      • SELECT * FROM I_S.COLUMNS

      will print qualified identifiers only when it is really necessary, to avoid polluting all dumps with the `mariadb.` qualifier for the built-in data types and functions. So,

      SET sql_mode=DEFAULT;
      CREATE OR REPLACE TABLE t1 (a DATE, b VARCHAR(10) DEFAULT DECODE('a','b'));
      DESCRIBE t1;
      

      will still print non-qualified data types and functions as before:

      +-------+-------------+------+-----+-----------------+-------+
      | Field | Type        | Null | Key | Default         | Extra |
      +-------+-------------+------+-----+-----------------+-------+
      | a     | date        | YES  |     | NULL            |       |
      | b     | varchar(10) | YES  |     | decode('a','b') |       |
      +-------+-------------+------+-----+-----------------+-------+
      

      However, this script:

      SET sql_mode=DEFAULT;
      CREATE OR REPLACE TABLE t1 (a DATE, b VARCHAR(10) DEFAULT DECODE('a','b'));
      SET sql_mode=ORACLE;
      DESCRIBE t1;
      

      will still print qualified versions (in the data type `date` and in the function `decode`):

      +-------+---------------+------+-----+-------------------------+-------+
      | Field | Type          | Null | Key | Default                 | Extra |
      +-------+---------------+------+-----+-------------------------+-------+
      | a     | mariadb.date  | YES  |     | NULL                    |       |
      | b     | varchar(10)   | YES  |     | mariadb.decode('a','b') |       |
      +-------+---------------+------+-----+-------------------------+-------+
      

      Note, `varchar` does not need a qualifier even in compatibility mode, because it refers to the regular MariaDB VARCHAR even in sql_mode=ORACLE.

      The exact compatibility database names

      Database names `mariadb` and `oracle` are subject to a discussion. They can:
      a. stay as such: mariadb, oracle
      b. have an underscore as a prefix: _mariadb, _oracle
      c. have a 'sys_' as a prefix: sys_mariadb, sys_oracle
      and so on.

      Implementation plan

      In 10.5 we need to do only very simple changes:

      1. Add the qualified syntax in the parser, to fix these (and similar) bugs:

      • MDEV-19632 Replication aborts with ER_SLAVE_CONVERSION_FAILED upon CREATE ... SELECT in ORACLE mode
      • MDEV-23005 sql_mode mixture: a table with DECODE() in a virtual column refuses to work

      2. Make the qualified syntax (in the hard-coded way) map as follows:

      • mariadb.TYPE to the built-in TYPE, without sql_mode=ORACLE specific translation
      • mariadb.FUNC() to the built-in FUNC(), without sql_mode=ORACLE specific translation
      • oracle.DATE to the built-in DATETIME
      • oracle.DECODE() to its Oracle version, decode_oracle(). The non-qualified function name decode_oracle() will be deprecated in a later release.

      Advantages

      Whenever we implement a replacement data type or function, e.g. for sql_mode=ORACLE, we won't have to implement disambiguating aliases such as :

      • TYPE_MARIADB
      • TYPE_ORACLE
      • FUNC_MARIADB
      • FUNC_ORACLE

      in sql_yacc.yy for every replaced data type TYPE and function FUNC.

      MariaDB data types and functions will always be available as MARIADB.TYPE and MARIADRIADB.FUNC() for any built-in data type and function.

      Oracle data types and functions will be installed as plugins or stored objects in the corresponding compatibility database `oracle`.

      So no changes in the main server code are needed at all to introduce an sql_mode specific replacement for a data type or a function.

      Eventually, we can add parser plugins (should be very easy to do). So a compatibility mode will just become a set of:

      • parser plugin
      • data type plugins
      • function plugins

      Open questions

      We could use `mysql` as a compatibility database (i.e. as a qualificator) for sql_mode=DEFAULT. But in this case we'll have to change the way how MYSQL.FUNC() calls are processed. For now it always calls a stored function. We'll have to allow to call built-in functions as well. It's not clear what should be the order of resolution. Should it lookup in built-in functions first, and in case of failure switch to SP?

      Users who already have stored function in the database `mysql` with names coinciding
      with built-in functions might have problems. Probably, this is not a very likely scenario.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              bar Alexander Barkov
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.