[MDEV-23799] CREATE .. SELECT wrong result on join versioned table Created: 2020-09-23  Updated: 2020-10-20  Resolved: 2020-10-20

Status: Closed
Project: MariaDB Server
Component/s: Versioned Tables
Affects Version/s: 10.5.5, 10.3, 10.4, 10.5
Fix Version/s: 10.3.26, 10.4.16, 10.5.7

Type: Bug Priority: Major
Reporter: Lucas Schons Assignee: Aleksey Midenkov
Resolution: Fixed Votes: 0
Labels: temporal, temporary
Environment:

OS: Arch Linux x86_64
Kernel: 5.8.10-arch1-1
CPU: Intel i5-6300U


Attachments: File temp_table_bug.sql    
Issue Links:
Relates
relates to MDEV-15391 Server crashes in JOIN::fix_all_split... Closed

 Description   

When creating a temporary table with the following statements, the behavior is as expected:

-- standard tables yield the expected result
create table x (id Int);
create table x_p (elementId Int, pkey varchar(20), pvalue varchar(20));
 
insert into x values (1), (2), (3);
insert into x_p values (1, 'gender', 'male');
insert into x_p values (2, 'gender', 'female');
insert into x_p values (3, 'gender', 'male');
 
create temporary table tmp1
    select xgender.pvalue as gender, xtitle.pvalue as title
    from x
        left join x_p as xgender on x.id = xgender.elementId and xgender.pkey = 'gender'
        left join x_p as xtitle on x.id = xtitle.elementId and xtitle.pkey = 'title';
 
select * from tmp1;

The statements above result in the expected temporary table:

 
gender | title
-------+------
male   | NULL
female | NULL
male   | NULL

But issuing the same statement for bi-temporal tables, wich are similar in respect to the data, result in an empty set:

create or replace table t (
    id INT,
    date_1 DATE,
    date_2 DATE,
    row_start TIMESTAMP(6) AS ROW START INVISIBLE,
    row_end TIMESTAMP(6) AS ROW END INVISIBLE,
    PERIOD FOR application_time(date_1, date_2),
    PERIOD FOR system_time(row_start, row_end))
    WITH SYSTEM VERSIONING;
 
create or replace table t_p (
    elementId INT,
    pkey VARCHAR(20),
    pvalue VARCHAR(20),
    date_1 DATE,
    date_2 DATE,
    row_start TIMESTAMP(6) AS ROW START INVISIBLE,
    row_end TIMESTAMP(6) AS ROW END INVISIBLE,
    PERIOD FOR application_time(date_1, date_2),
    PERIOD FOR system_time(row_start, row_end))
    WITH SYSTEM VERSIONING;
 
insert into t values (1, DATE('1000-01-01'), DATE('9999-01-01')), (2, DATE('1000-01-01'), DATE('9999-01-01')), (3,  DATE('1000-01-01'), DATE('9999-01-01'));
insert into t_p values (1, 'gender', 'female', DATE('1000-01-01'), DATE('9999-01-01'));
insert into t_p values (2, 'gender', 'male', DATE('1000-01-01'), DATE('9999-01-01'));
insert into t_p values (3, 'gender', 'female', DATE('1000-01-01'), DATE('9999-01-01'));
 
create temporary table tmp2
    select tgender.pvalue as gender, ttitle.pvalue as title
    from t
        left join t_p as tgender on t.id = tgender.elementId and tgender.pkey = 'gender'
        left join t_p as ttitle on t.id = ttitle.elementId and ttitle.pkey = 'title';
 
select * from tmp2;

I expect that the two queries yield the same result since they have nothing to do with the temporality of the tables.



 Comments   
Comment by Alice Sherepa [ 2020-09-24 ]

Thanks a lot! Repeatable on MariaDB 10.3-10.5.
Returns correct results while creating a view, but not if creating a temporary table or a normal,base table:

create table t (id int) with system versioning;
insert into t values (1), (2), (3);
 
create table t_p (elementid int, pkey varchar(20), pvalue varchar(20)) with system versioning;
insert into t_p values (1, 'gender', 'female'),(2, 'gender', 'male'),(3, 'gender', 'female');
 
create table t1 as
    select tgender.pvalue as gender, ttitle.pvalue as title
    from t
        left join t_p as tgender on t.id = tgender.elementid and tgender.pkey = 'gender'
        left join t_p as ttitle on t.id = ttitle.elementid and ttitle.pkey = 'title';
 
select * from t1;

MariaDB [test]> create table t1 as
    ->     select tgender.pvalue as gender, ttitle.pvalue as title
    ->     from t
    ->         left join t_p as tgender on t.id = tgender.elementid and tgender.pkey = 'gender'
    ->         left join t_p as ttitle on t.id = ttitle.elementid and ttitle.pkey = 'title';
Query OK, 0 rows affected (0.062 sec)
Records: 0  Duplicates: 0  Warnings: 0
 
MariaDB [test]> create view v1 as
    ->     select tgender.pvalue as gender, ttitle.pvalue as title
    ->     from t
    ->         left join t_p as tgender on t.id = tgender.elementid and tgender.pkey = 'gender'
    ->         left join t_p as ttitle on t.id = ttitle.elementid and ttitle.pkey = 'title';
Query OK, 0 rows affected (0.006 sec)
 
MariaDB [test]> select * from v1;
+--------+-------+
| gender | title |
+--------+-------+
| female | NULL  |
| male   | NULL  |
| female | NULL  |
+--------+-------+
3 rows in set (0.002 sec)
 
MariaDB [test]> select * from t1;
Empty set (0.001 sec)

Comment by Aleksey Midenkov [ 2020-10-15 ]

Reproduce

create or replace table x (id Int) with system versioning;
create or replace table x_p (elementId Int, pkey varchar(20), pvalue varchar(20)) with system versioning;
 
insert into x values (1), (2), (3);
insert into x_p values (1, 'gender', 'male');
insert into x_p values (2, 'gender', 'female');
insert into x_p values (3, 'gender', 'male');
 
create table tmp1
select xgender.pvalue as gender, xtitle.pvalue as title
from x
    left join x_p as xgender on x.id = xgender.elementId and xgender.pkey = 'gender'
    left join x_p as xtitle on x.id = xtitle.elementId and xtitle.pkey = 'title';
 
select * from tmp1;
 
drop table tmp1;
drop tables x, x_p;

Result

No rows selected.

Expected

gender | title
-------+------
male   | NULL
female | NULL
male   | NULL

Cause

For join to work correctly versioning condition must be added to table on_expr. Without that JOIN_CACHE gets expression (1)

trigcond(xtitle.row_end = TIMESTAMP'2038-01-19 06:14:07.999999') and
trigcond(xtitle.elementId = x.`id` and xtitle.pkey = 'title')

instead of (2)

trigcond(xtitle.elementId = x.`id` and xtitle.pkey = 'title')

for join_null_complements(). It is NULL-row of xtitle for complementing the join and the above comparisons of course FALSE, but trigcond (Item_func_trig_cond) makes them TRUE via its trig_var property which is bound to some boolean properties of JOIN_TAB.

Expression (2) evaluated to TRUE because its trig_var is bound to first_inner_tab->not_null_compl. The expression (1) does not evaluate correctly because row_end comparison's trig_var is bound to first_inner->found earlier. As a result JOIN_CACHE::check_match() skipped the row for join_null_complements().

When we add versioning condition to table's on_expr the optimizer in make_join_select() distributes conditions differently. tmp_cond inherits on_expr value and in Good case it is full expression

xgender.elementId = x.`id` and xgender.pkey = 'gender' and
xgender.row_end = TIMESTAMP'2038-01-19 06:14:07.999999'

while in Bad case it is only

xgender.elementId = x.`id` and xgender.pkey = 'gender'.

Later in Good row_end condition is optimized out and we get one trigcond in form of (2).

More analysis

Comment by Aleksey Midenkov [ 2020-10-15 ]

https://github.com/midenok/mariadb/commit/c000b7bbede7cc4ba770db6d1039b9bc0ead7be7

Generated at Thu Feb 08 09:25:08 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.