[MDEV-2572] LP:823930 - Wrong result with semijoin materialization and blob fields Created: 2011-08-10  Updated: 2015-02-02  Resolved: 2012-10-04

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Timour Katchaounov (Inactive) Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug823930.xml    

 Description   

Two test cases from subselect_sj_mat that test materialization
with blobs produce incorrect empty result with semijoin
materialization. The two failing test cases are the ones that test
blobs with size 1024, and 1025. Here I extracted and simplified
a bit the test case marked with:

  1. BLOB == 1024 (group_concat_max_len == 1024).
    For length 1025 the wrong result is the same.


 Comments   
Comment by Timour Katchaounov (Inactive) [ 2011-08-10 ]

Re: Wrong result with semijoin materialization and blob fields
create table t1_1024 (a1 blob(1024), a2 blob(1024));
create table t2_1024 (b1 blob(1024), b2 blob(1024));
create table t3_1024 (c1 blob(1024), c2 blob(1024));

insert into t1_1024 values
(concat('1 - 00', repeat('x', 1018)), concat('2 - 00', repeat('x', 1018)));
insert into t1_1024 values
(concat('1 - 01', repeat('x', 1018)), concat('2 - 01', repeat('x', 1018)));
insert into t1_1024 values
(concat('1 - 02', repeat('x', 1018)), concat('2 - 02', repeat('x', 1018)));

insert into t2_1024 values
(concat('1 - 01', repeat('x', 1018)), concat('2 - 01', repeat('x', 1018)));
insert into t2_1024 values
(concat('1 - 02', repeat('x', 1018)), concat('2 - 02', repeat('x', 1018)));
insert into t2_1024 values
(concat('1 - 03', repeat('x', 1018)), concat('2 - 03', repeat('x', 1018)));

insert into t3_1024 values
(concat('1 - 01', repeat('x', 1018)), concat('2 - 01', repeat('x', 1018)));
insert into t3_1024 values
(concat('1 - 02', repeat('x', 1018)), concat('2 - 02', repeat('x', 1018)));
insert into t3_1024 values
(concat('1 - 03', repeat('x', 1018)), concat('2 - 03', repeat('x', 1018)));
insert into t3_1024 values
(concat('1 - 04', repeat('x', 1018)), concat('2 - 04', repeat('x', 1018)));

– Correct result
set @@optimizer_switch='semijoin=off,materialization=on,in_to_exists=off';

select left(a1,7), left(a2,7)
from t1_1024
where a1 in (select substring(b1,1,1024) from t2_1024 where b1 > '0');

– Wrong result
set @@optimizer_switch='semijoin=on,materialization=on,in_to_exists=off';

select left(a1,7), left(a2,7)
from t1_1024
where a1 in (select substring(b1,1,1024) from t2_1024 where b1 > '0');

Comment by Sergei Petrunia [ 2011-09-05 ]

Re: Wrong result with semijoin materialization and blob fields
Looking at EXPLAIN outputs, I see that

  • The query with "Correct result" settings does not use Materialization (even though optimizer_switch value instructs it to do it)
  • The query with "Wrong result" settings does use Materialization.

It seems, the problem is that SJ-Materialization attempts to handle this case, while it should refuse it, like regular materialization did.

Comment by Sergei Petrunia [ 2011-09-06 ]

Re: Wrong result with semijoin materialization and blob fields
In the case of non-semijoin materialization, we proceed as far as entering
subselect_hash_sj_engine::init(). Inside that function, we execute up to this
code:

if (result_sink->create_result_table(thd, tmp_columns, TRUE,
tmp_create_options,
name, TRUE, TRUE))
DBUG_RETURN(TRUE);

tmp_table= result_sink->table;
result= result_sink;

/*
If the subquery has blobs, or the total key lenght is bigger than
some length, or the total number of key parts is more than the
allowed maximum (currently MAX_REF_PARTS == 16), then the created
index cannot be used for lookups and we can't use hash semi
join. If this is the case, delete the temporary table since it
will not be used, and tell the caller we failed to initialize the
engine.
*/
if (tmp_table->s->keys == 0)
{

And then we find out that "tmp_table->s->keys == 0" which causes us not to use
materialization strategy, and switch to IN->EXISTS instead.

Comment by Sergei Petrunia [ 2011-09-06 ]

Re: Wrong result with semijoin materialization and blob fields
It is difficult to reuse this approach for SJ-Materialization, because SJ-Materialization creates temp.table after join optimization has been finished. At that point, it is not easy to go back and redo the join optimization.

A better option would be to make subquery_types_allow_materialization() do its job properly and detect all cases where materialization is not applicable.

Comment by Rasmus Johansson (Inactive) [ 2011-12-13 ]

Launchpad bug id: 823930

Generated at Thu Feb 08 06:42:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.