[MDEV-17212] Connect Engine ODBC Table_Type - Discovery Process Created: 2018-09-17  Updated: 2018-11-21  Resolved: 2018-10-15

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Connect
Affects Version/s: 10.3.9
Fix Version/s: 10.0.37, 10.3.11, 10.1.37, 10.2.19

Type: Bug Priority: Trivial
Reporter: Juan Telleria Assignee: Olivier Bertrand
Resolution: Done Votes: 0
Labels: None
Environment:

Windows


Attachments: PNG File MySQL Integer Types.PNG    

 Description   

Some minor issues when using MariaDB's Connect ODBC Table_Type Engine with Microsoft Parallel Data Warehouse (PDW):

table_options: TABNAME

When using a Table Definition such as the following for importing a Table:

USE db_name;
 
CREATE OR REPLACE TABLE `t_name_connect`
  ENGINE=CONNECT
  TABLE_TYPE=ODBC 
  CONNECTION='dsn_name'
 OPTION_LIST='UseDSN=Yes,User=db_user_name,Password=********'
  TABNAME='db_name.schema_name.table_name';
 
CREATE OR REPLACE TABLE t_name_mariadb
AS
  SELECT
    *
  FROM
    t_name_connect;

The Query works perfectly (10/10), and I am able to import my tables easily to MariaDB. However, an observed limitation is that in TABNAME option, I can only specify my table as:

  • db_name.schema_name.table_name.
  • And not [db name].[schema name].[table name], which is common syntax for database / schema / table with spaces in TRANSACT-SQL.

table_options: SRCDEF

With Microsoft PDW, using "SQL Server Native Client 11.0" it is not possible to use SRCDEF table option, as otherwise, we would obtain error:

"SQLNumResultCols: [Microsoft][ODBC Driver 11 for SQL Server][SQL Server]Cursor support is not an implemented feature for SQL Server Parallel DataWarehousing TDS endpoint".

This limitation could be avoided with first method, but we are not able create table definitions as the following by means of the Table Discovery Process:

CREATE OR REPLACE TABLE `t_SQLServer_Table`
  ENGINE=CONNECT
  TABLE_TYPE=ODBC 
  CONNECTION='dns_name'
  OPTION_LIST='UseDSN=Yes,User=user_name,Password=*****'
  SRCDEF='
    SELECT
      *
    FROM
      [db_name].[schema_name].[table_name]
  ';

Previous Table definition would only work if we specified manually the Column Definitions.

Datetime Field Precission

Another issue is Datetime Precission (CONNECT Data Types), which can also be handled with querys such as:

CREATE OR REPLACE TABLE `t_name_connect`
(
  `Col_Name` VARCHAR(255)
)
  ENGINE=CONNECT
  TABLE_TYPE=ODBC 
  CONNECTION='dsn_name'
  OPTION_LIST='UseDSN=Yes,User=user_name,Password=******'
  TABNAME='db_name.sch_namel.tb_name';
 
CREATE OR REPLACE TABLE t_name_mariadb
AS
  SELECT
    STR_TO_DATE(
      `Col_Name`,
      '%Y-%m-%d %H:%i:%S.%f' /* datetime2: YYYY-MM-DD HH:MM:SS.fff */
    ) AS `Col_Name`
  FROM
    `t_name_connect`;

Although a little verbose, works.

So:

  1. Could TABNAME accept `[`, `]` Characters?
  2. Any way to make work SRCDEF with Microsoft PDW and the Discovery Process?
  3. Any more comfortable way to handle dates in the way shown without losing precision?


 Comments   
Comment by Olivier Bertrand [ 2018-09-19 ]

1: Accepting brackets is probably specific to some data source. For instance, this syntax is not accepted by PostgreSQL.
2: This should work, the issue is in defining the table name as db_name.sch_namel.tb_name;
3: Cannot reproduce.

Even in some case the database or schema can be specified in the table name, this is not recommended and is also dependant on the data source. For instance:

create or replace table pgtst (id int not null)
engine=CONNECT table_type=ODBC tabname='test.postgres.tst'
connection='PostgreSQL30'
option_list='UseDSN=1';
select * from pgtst;

This does not work because the DSN was specified to use the public database and PosgreSQL does not implement references between databases. Therefore the database should be specified in the connection string:

create or replace table pgtst (n int not null)
engine=CONNECT table_type=ODBC tabname=tst
connection='DSN=PostgreSQL30;Database=test';
select * from pgtst;

This works alright. Note also that in the connection string the database name can include blanks, which solves problem 1.

Comment by Juan Telleria [ 2018-09-19 ]

1. "Note also that in the connection string the database name can include blanks, which solves problem 1.": Perfect. Solved.

2. Still get Error "SQLNumResultCols: [Microsoft][ODBC Driver 11 for SQL Server][SQL Server]Cursor support is not an implemented feature for SQL Server Parallel DataWarehousing TDS endpoint".

Microsft Parallel Data Warehouse is a special multi-cluster Version of SQL Server, that seems not to support cursors. This Query Fails:

      CREATE OR REPLACE TABLE MasterDataTurbine.vOPEC2_Contract_CONNECT
        ENGINE=CONNECT
        TABLE_TYPE=ODBC 
        CONNECTION='SQLServerParallelDataWarehouse'
        OPTION_LIST='UseDSN=Yes,User=*********,Password=************,Database=MasterDataTurbine'
        TABNAME = 'MasterDataTurbine.General.vOPEC2_Contract'
        SRCDEF='SELECT * FROM MasterDataTurbine.General.vOPEC2_Contract';

But this works:

 
      CREATE OR REPLACE TABLE MasterDataTurbine.vOPEC2_Contract_CONNECT
      (
        `ContractID` int(10) DEFAULT NULL,
        `ContractOwnerID` int(10) DEFAULT NULL,
        `Contract_Start` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
        `Contract_End` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
        `Notes` int(10) DEFAULT NULL,
        `Name` varchar(255) COLLATE latin1_spanish_ci DEFAULT NULL,
        `SWPSID` int(10) DEFAULT NULL,
        `WarrentyDuration` int(10) DEFAULT NULL,
        `WarrantyTypeID` int(10) DEFAULT NULL,
        `ParkWakeEffficiency` double DEFAULT NULL,
        `ProjectElectricalEfficiency` double DEFAULT NULL,
        `SeasonalWeightingFactorID` int(10) DEFAULT NULL,
        `NominalWindDistributionID` int(10) DEFAULT NULL,
        `AdjustedParkYBAVariantID` int(10) DEFAULT NULL,
        `ValidProductionModeVariantID` int(10) DEFAULT NULL,
        `ParkCalcID` int(10) DEFAULT NULL,
        `ContractTypeID` int(10) DEFAULT NULL,
        `ParkID` int(10) DEFAULT NULL,
        `CalcId` int(10) DEFAULT NULL,
        `RDStype` int(10) DEFAULT NULL,
        `RDSpackage` int(10) DEFAULT NULL,
        `RDSservice` int(10) DEFAULT NULL,
        `Extended Site Presence` datetime DEFAULT NULL
      )
        ENGINE=CONNECT
        TABLE_TYPE=ODBC 
        CONNECTION='SQLServerParallelDataWarehouse'
        OPTION_LIST='UseDSN=Yes,User=**************,Password=*****************'
        SRCDEF='SELECT * FROM MasterDataTurbine.General.vOPEC2_Contract';
      

3. The Remote Table has a Single Column of Type "Datetime" or "TimeStamp", but I specify it at the Local Table as Character:

 
CREATE OR REPLACE TABLE `t_name_connect`
(
  `Col_Name` VARCHAR(255) -- It is in the Remote Table a TimeStamp, with decimal precision, but I do not want to loose precision, so I save it as a character.
)
  ENGINE=CONNECT
  TABLE_TYPE=ODBC 
  CONNECTION='dsn_name'
  OPTION_LIST='UseDSN=Yes,User=user_name,Password=******'
  TABNAME='db_name.sch_namel.tb_name';
 
CREATE OR REPLACE TABLE t_name_mariadb
AS
  SELECT
    STR_TO_DATE(
      `Col_Name`,
      '%Y-%m-%d %H:%i:%S.%f' /* datetime2: YYYY-MM-DD HH:MM:SS.fff */
    ) AS `Col_Name`
  FROM
    `t_name_connect`;

And by doing this I do not loose precision, at cost of more column bytes in the Connect Table, and not being able to use MariaDB Datetime Functions.

Comment by Olivier Bertrand [ 2018-09-19 ]

2: Indeed SQLNumResultCols is used inside the discovery process. It is used only to check that the SQLColumns function returned the expected number of columns. Perhaps we could ignore this error. Note that this is the only data source that does not implement that function which, by the way, is not related with cursors.

Note also that in these statements the SRCDEF is meaningless and equivalent to specifying the external table name.

3: Well done. Because the MariaDB data type TIMESTAMP does not include the precision, the STR_TO_DATE function does it. This is not related to the CONNECT engine.

Comment by Juan Telleria [ 2018-09-19 ]

2. Thank you.

3. But I have to use such approach not only for a matter of precision, but also from range:

  • Currently, TYPE_DATE, allows dates that range starting from 1970-01-01 to 2038-01-19 03:14:07, which corresponds to a 4 bytes SIGNED!!! Integer, if this was UNSIGNED, we would have a range which would reach to 2106-02-07 06:28:15.
  • And even more! If we had a 8 bytes SIGNED (BIG)INTEGER, it would reach to 219250468-12-04 15:30:07, and if it was an UNSIGNED (BIG) INTEGER, it would reach 219250468-12-04 15:30:07!!!!!

And with only some small temporary size overhead!

Comment by Olivier Bertrand [ 2018-09-19 ]

FYI: CONNECT also uses a 4 bytes signed integer for dates and handles negative values allowing dates to start around 1902.

Comment by Juan Telleria [ 2018-09-20 ]

3. In C++ 11, we can also have data type "long long int", which ranges:

  • from: -9,223,372,036,854,775,807
  • to: 9,223,372,036,854,775,807
Comment by Olivier Bertrand [ 2018-09-23 ]

I have made the call to SQLNumResultCols conditional. Please test whether it solve problem 2.

Comment by Juan Telleria [ 2018-09-24 ]

(3) Ok, thank you: So this shall be included in "Connect 1.06.0006" when released I guess. Than you!

(2) On the other hand, and as regards dates, it has been concluded that it will remain as is.

So when dealing with MariaDB DATE, DATETIME fields, the best approach will continue to be treating dates as Characters in range: '2038-01-19 03:14:07' to '9999-12-31 23:59:59'.

Thank you for all your help!

Juan

Comment by Olivier Bertrand [ 2018-10-15 ]

What is done is testing the implementation by the data source of the function that was causing a crash when not implemented.

Comment by Juan Telleria [ 2018-10-15 ]

Thank you for the Fix.

I attached the PDF Documents of Connect Engine Handler to the MariaDB Knowledge base Article:

https://mariadb.com/kb/en/library/using-connect-written-documentation/

Hope others find them useful

In addition, on the MariaDB github repository, where is the Connect Internal Data Types Definition located? E.g: TYPE_STRING, TYPE_INT, TYPE_DATE, ...

Comment by Juan Telleria [ 2018-10-15 ]

In addition, on the MariaDB github repository, where is the Connect Internal Data Types Definition located? E.g: TYPE_STRING, TYPE_INT, TYPE_DATE, ...

Seems to be on:
https://github.com/MariaDB/server/tree/10.4/storage/connect

But cannot find the specific file...

Comment by Olivier Bertrand [ 2018-10-15 ]

In global.h

Comment by Juan Telleria [ 2018-11-21 ]

Does the already released MariaDB 10.3.11 implement the fix? Or will we have to wait till 10.3.12?

I think the commit has already been pushed... But not sure.

Thank you!

Juan

Comment by Olivier Bertrand [ 2018-11-21 ]

It's hard to tell. I'm not the one who includes updates in new releases and sometimes CONNECT was forgotten.

Generated at Thu Feb 08 08:34:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.