[MDEV-7199] Introduce the concept of data type handlers Created: 2014-11-25  Updated: 2019-04-23  Resolved: 2019-04-23

Status: Closed
Project: MariaDB Server
Component/s: Data types
Fix Version/s: 10.1.8

Type: Task Priority: Major
Reporter: Alexander Barkov Assignee: Alexander Barkov
Resolution: Fixed Votes: 2
Labels: None

Issue Links:
Blocks
blocks MDEV-4912 Data type plugin API version 1 Closed

 Description   

This is a pre-requisite task for "MDEV-4912 Add a plugin to field types (column types)".

While reviewing the prototype patch for MDEV-4912, Sergei suggested
modifying the code to identify data types in memory using Type_handler pointers
instead of an "enum_field_types" value. This is a large change in itself,
so we'll do it as a separate task and then add the main MDEV-4912 patch on top of it.

This task includes a few steps to introduce the concept of data type handlers.
A few follow-up tasks will be done later (but before adding MDEV-4912).

This task will include the following subtasks:

1. Remove IMPOSSIBLE_RESULT from Item_result. It's an internal server thing
and is needed neither on the client side nor for the UDF API.

2. Introduce a new "enum Item_cmp", which will have similar
values to Item_result:

enum Item_cmp
{
  STRING_CMP=     STRING_RESULT,
  REAL_CMP=       REAL_RESULT,
  INT_CMP=        INT_RESULT,
  ROW_CMP=        ROW_RESULT,
  DECIMAL_CMP=    DECIMAL_RESULT,
  TIME_CMP=       TIME_RESULT,
  IMPOSSIBLE_CMP
};

3. Change Item::cmp_type() to return
Item_cmp instead of Item_result.

This is needed for stricter type control, to avoid
erroneous confusion between cmp_type() and result_type().
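
The stricter type control can be sketched as follows. This is a minimal illustration rather than the server code: the Item_result values below are placeholders chosen for the example, and needs_string_comparison() is a hypothetical helper that does not exist in the source tree.

```cpp
#include <cassert>

// Placeholder stand-in for the server's Item_result (values are assumptions).
enum Item_result
{
  STRING_RESULT= 0, REAL_RESULT, INT_RESULT, ROW_RESULT,
  DECIMAL_RESULT, TIME_RESULT
};

// The new enum reuses the Item_result values but is a distinct C++ type.
enum Item_cmp
{
  STRING_CMP=     STRING_RESULT,
  REAL_CMP=       REAL_RESULT,
  INT_CMP=        INT_RESULT,
  ROW_CMP=        ROW_RESULT,
  DECIMAL_CMP=    DECIMAL_RESULT,
  TIME_CMP=       TIME_RESULT,
  IMPOSSIBLE_CMP
};

// Hypothetical helper: it accepts only an Item_cmp value.
inline bool needs_string_comparison(Item_cmp cmp)
{
  return cmp == STRING_CMP;
}

// The following call would be rejected at compile time, because C++ has no
// implicit conversion from one enum type to another:
//   needs_string_comparison(STRING_RESULT);
```

With cmp_type() returning Item_cmp, code that accidentally passes a result_type() value where a comparison type is expected fails to compile instead of silently working.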

4. Introduce a new base class Type_handler without data members, with only three methods at this point:

class Type_handler
{
public:
  virtual enum_field_types field_type() const = 0;
  virtual Item_result result_type() const = 0;
  virtual Item_cmp cmp_type() const = 0;
};

More methods will be added in later tasks.

5. Create classes for all MYSQL_TYPE_XXX data types, for example:

class Type_handler_longlong: public virtual Type_handler
{
public:
  virtual enum_field_types field_type() const { return MYSQL_TYPE_LONGLONG; }
  virtual Item_result result_type() const { return INT_RESULT; }
  virtual Item_cmp cmp_type() const { return INT_CMP; }
};

6. Derive Item virtually from Type_handler:

class Item: public virtual Type_handler
{
...
};

Notice that both Type_handler_longlong and Item use "public virtual Type_handler".
This introduces so-called diamond inheritance.
The idea is to define the methods once in Type_handler_xxx,
and make all Items of type "xxx" reuse Type_handler_xxx.

For example, deriving from Type_handler_longlong
will automatically add a proper implementation of the three
mentioned methods (field_type, result_type and cmp_type),
without having to duplicate them every time we need an INT_RESULT
item:

class Item_int: public Item_num, public Type_handler_longlong
{
...
};
 
class Item_int_func: public Item_func, public Type_handler_longlong
{
...
};
 
class Item_func_udf_int :public Item_udf_func,
                         public Type_handler_longlong
{
...
};
 
class Item_exists_subselect :public Item_subselect,
                             public Type_handler_longlong
{
...
};
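
The diamond can be tried out in isolation with the reduced sketch below. It assumes single-member stand-ins for the server enums, and the Item and Item_num classes are cut down to a hypothetical val_int() interface; only the inheritance shape follows the description above.

```cpp
#include <cassert>

// Single-member stand-ins for the server enums (assumptions for this sketch).
enum enum_field_types { MYSQL_TYPE_LONGLONG };
enum Item_result { INT_RESULT };
enum Item_cmp { INT_CMP };

class Type_handler
{
public:
  virtual ~Type_handler() {}
  virtual enum_field_types field_type() const = 0;
  virtual Item_result result_type() const = 0;
  virtual Item_cmp cmp_type() const = 0;
};

class Type_handler_longlong: public virtual Type_handler
{
public:
  virtual enum_field_types field_type() const { return MYSQL_TYPE_LONGLONG; }
  virtual Item_result result_type() const { return INT_RESULT; }
  virtual Item_cmp cmp_type() const { return INT_CMP; }
};

// Item leaves the three type methods pure virtual.
class Item: public virtual Type_handler
{
public:
  virtual long long val_int() = 0;   // reduced, hypothetical interface
};

class Item_num: public Item {};

// The shared virtual base makes Type_handler_longlong's overrides the
// final overriders for Item_int, so nothing is repeated here.
class Item_int: public Item_num, public Type_handler_longlong
{
  long long value;
public:
  explicit Item_int(long long v): value(v) {}
  virtual long long val_int() { return value; }
};
```

A call such as item->field_type() made through an Item pointer then resolves to Type_handler_longlong::field_type(), even though Item itself never implements it.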

7. Remove the default implementations for:

  • Item::cmp_type()
  • Item::field_type()
  • Item::result_type()

In diamond inheritance they should stay pure virtual in Item until a Type_handler_xxx
is added as a base of the Item_xxx class. Otherwise, an Item_xxx class would inherit one implementation
from Item and another from Type_handler_xxx, and the compiler would report an error.

So we'll have to temporarily rename Item::cmp_type() to Item::cmp_type_from_field_type()
and use the latter for some items not yet covered by this task.
Note that cmp_type_from_field_type() will be gone as soon as
we switch ALL items to use Type_handlers instead of defining
type-related methods directly. In the end, all items will use Type_handler_xxx::cmp_type() instead.
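
How a not-yet-converted item might use the temporary helper is sketched below. The reduced enums, the mapping inside cmp_type_from_field_type() and the Item_legacy_double class are all assumptions made for this example; the real mapping lives in the server.

```cpp
#include <cassert>

// Reduced stand-ins for the server enums (assumptions for this sketch).
enum enum_field_types { MYSQL_TYPE_LONGLONG, MYSQL_TYPE_DOUBLE, MYSQL_TYPE_VARCHAR };
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT };
enum Item_cmp { STRING_CMP= STRING_RESULT, REAL_CMP= REAL_RESULT, INT_CMP= INT_RESULT };

class Type_handler
{
public:
  virtual ~Type_handler() {}
  virtual enum_field_types field_type() const = 0;
  virtual Item_result result_type() const = 0;
  virtual Item_cmp cmp_type() const = 0;
};

class Item: public virtual Type_handler
{
public:
  // Temporary, non-virtual helper that takes over the role of the old
  // default Item::cmp_type(). The mapping is illustrative only.
  Item_cmp cmp_type_from_field_type() const
  {
    switch (field_type())
    {
    case MYSQL_TYPE_LONGLONG: return INT_CMP;
    case MYSQL_TYPE_DOUBLE:   return REAL_CMP;
    default:                  return STRING_CMP;
    }
  }
};

// Hypothetical item that has not been switched to a Type_handler_xxx yet:
// it still defines the type methods itself and delegates cmp_type().
class Item_legacy_double: public Item
{
public:
  virtual enum_field_types field_type() const { return MYSQL_TYPE_DOUBLE; }
  virtual Item_result result_type() const { return REAL_RESULT; }
  virtual Item_cmp cmp_type() const { return cmp_type_from_field_type(); }
};
```

Once such an item derives from a Type_handler_xxx, its three overrides and the call to the helper can be dropped.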

Things that are not covered by this task, to be done soon separately:

  • Within the scope of this task we'll define only the three aforementioned methods
    in Type_handler. In a separate task later, we'll move all other data type
    specific methods from Item to Type_handler:

    - val_int
    - val_decimal
    - val_real
    - val_str
    - get_date
    - save_in_field
    - make_field(Send_field*)
    - hash_sort()

    and some others.
    This will remove A LOT of duplicate code (e.g. the implementation of the val_str()
    method looks nearly the same for all INT_RESULT Items).

  • Within the scope of this task we'll switch only some of the Items to use Type_handler.
    Complex cases (when an Item has parallel independent result_type() and field_type()
    methods) will be changed in a separate patch, to avoid too many changes in a single patch.
    These complex items include at least:
    • Item_func_hybrid_result_type
    • Item_copy
    • Item_sum_hybrid
    • All Items that have references to the actual data type containers (Items or Fields) e.g. Item_func_rollup_const, Item_field.

At the end, when all of the ongoing tasks are done, we'll have:

  • either field_type() depend on result_type()
  • or result_type() depend on field_type()
  • or all type specific methods depend on an Item or Field reference.
    All cases with parallel independent result_type() and field_type() should be removed.

Further tasks:

  • Remove as much direct use of field_type(), cmp_type(), result_type() as possible.
    Move this code inside methods in Type_handler.
  • Change enum_field_types members to "Type_handler*" in all structures and classes,
    e.g. Create_field, Send_field, CAST related classes, sql_yacc.yy tokens, etc.
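
The last item could look roughly like the sketch below. Example_column_def is a hypothetical, heavily reduced stand-in for a structure such as Create_field, and example_handler_longlong is an assumed name for a shared handler instance; neither is taken from the server source.

```cpp
#include <cassert>

// Single-member stand-ins for the server enums (assumptions for this sketch).
enum enum_field_types { MYSQL_TYPE_LONGLONG };
enum Item_result { INT_RESULT };
enum Item_cmp { INT_CMP };

class Type_handler
{
public:
  virtual ~Type_handler() {}
  virtual enum_field_types field_type() const = 0;
  virtual Item_result result_type() const = 0;
  virtual Item_cmp cmp_type() const = 0;
};

class Type_handler_longlong: public virtual Type_handler
{
public:
  Type_handler_longlong() {}
  virtual enum_field_types field_type() const { return MYSQL_TYPE_LONGLONG; }
  virtual Item_result result_type() const { return INT_RESULT; }
  virtual Item_cmp cmp_type() const { return INT_CMP; }
};

// Handlers carry no data, so one shared instance per type is enough
// (hypothetical name).
static const Type_handler_longlong example_handler_longlong;

// Hypothetical structure that stores a handler pointer where an
// enum_field_types member used to be.
struct Example_column_def
{
  const Type_handler *type_handler;
  explicit Example_column_def(const Type_handler *h): type_handler(h) {}

  // Type questions are answered by the handler, not by switching on an enum.
  bool is_integer() const { return type_handler->cmp_type() == INT_CMP; }
};
```

The enum value stays available through type_handler->field_type() wherever it is still needed, for example when talking to the client protocol.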


 Comments   
Comment by Alexander Barkov [ 2019-04-23 ]

The concept of Type_handler was first introduced by MDEV-8865 in 10.1.8 and further developed in 10.2, 10.3 and 10.4 by numerous subtasks of MDEV-4912.
