[MCOL-3747] Regression in 1.4 working_ssb_compareLogOnly/sub/order_limit_sub Created: 2020-01-28  Updated: 2020-03-26  Resolved: 2020-02-11

Status: Closed
Project: MariaDB ColumnStore
Component/s: ExeMgr, MDB Plugin
Affects Version/s: 1.4.1
Fix Version/s: 1.4.3

Type: Task Priority: Blocker
Reporter: David Hall (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Issue split
split to MCOL-3780 Clone of MCOL-3747 for release 1.5.0 Closed
PartOf
includes MCOL-3762 HEX(<decimal>) gets truncated Stalled
is part of MCOL-3594 Failed tests in the 001 suite in 1.4 Closed
Relates
relates to MCOL-3664 Subquery wrapping affects performance Closed
relates to MCOL-3828 UNION: complains about table not in q... Closed
Sprint: 2020-2

 Description   

working_ssb_compareLogOnly/sub/order_limit_sub, the second query gives the wrong answer:

select * from
(select lo_orderkey, count
from lineorder where lo_orderkey in
(select *
from (select lo_orderkey
from lineorder
join dateinfo
on lo_orderdate = d_datekey
where lo_orderkey <= 900
group by lo_orderkey
order by sum(lo_ordtotalprice), lo_orderkey desc
limit 5
) alias1
)
group by lo_orderkey
order by 1, 2 asc
limit 4
) alias2
order by 2, 1 desc;

---------------------+

lo_orderkey count

---------------------+

4 1
2 1
3 6
1 6

---------------------+

In 1.2 we get the correct answer:
---------------------+

lo_orderkey count

---------------------+

421 1
389 1
228 1
162 1

---------------------+

I ran the interior queries separately and got correct answers. It doesn't seem to break until the final wrapper.



 Comments   
Comment by David Hall (Inactive) [ 2020-01-29 ]

This same bug most probably is responsible for failure in q.9.1.3.sql:
select * from (
select lo_orderkey, count from lineorder
where lo_orderkey in ( select * from
( select lo_orderkey
from lineorder join dateinfo on lo_orderdate = d_datekey
group by lo_orderkey
– order by sum(lo_ordtotalprice) desc
order by sum(lo_ordtotalprice) desc, lo_orderkey
limit 5 ) alias1 )
– group by 1 order by 2 asc limit 4 ) alias2
group by 1 order by 2 asc, 1 limit 4 ) alias2
– order by 2 desc;
order by 2 desc, 1;

Comment by Gregory Dorman (Inactive) [ 2020-01-31 ]

The story here is that in the presence of outermost wrapper, the "order by LIMIT 5" inside alias1 is simply ignored. The result is the same as if you remove it altogether.

What seems to be related also is the "where lo_orderkey in" clause in alias2. Below is what works also, this time with the outer query as well as with inner only. Notice that JOIN is redundant and does not change anything, so I removed it for simplicity of the repro. Also, DESC inside alias1 is not contributing since it is placed on the second column (unique too), so I removed it as well.

select * from
(select lo_orderkey, count
from (select lo_orderkey from lineorder
where lo_orderkey <= 900
group by lo_orderkey
order by sum(lo_ordtotalprice), lo_orderkey
limit 5
) alias1
group by lo_orderkey
order by 1, 2 ASC
limit 4
) alias2
;

Comment by Gregory Dorman (Inactive) [ 2020-02-01 ]

And then - my hunch is that it has something to do with:

MariaDB [ssb]> select lo_orderkey, count
-> FROM lineorder WHERE lo_orderkey IN
-> (select lo_orderkey from lineorder
-> where lo_orderkey <= 900
-> group by lo_orderkey
-> order by sum(lo_ordtotalprice), lo_orderkey desc
-> limit 5
-> ) alias1
-> group by lo_orderkey
-> order by 1, 2 ASC
-> limit 4 ;
ERROR 1235 (42000): This version of MariaDB doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'

If it doesn't support it, why it allows to proceed if you pile up enough select * from over it?

Comment by Roman [ 2020-02-04 ]

B/c we don't run IN-INTO-EXISTS if IN is deeper then two levels down.

Comment by Roman [ 2020-02-04 ]

4QA
Here is the simple test case to reproduce the issue:

create table cs1(key_ bigint) engine=columnstore;
insert into cs1 values (42),(43),(45),(666),(777),(333);
 
select key_, count(*) from cs1 where key_ in (select * from (select key_ from cs1 group by key_ order by key_ limit 2) a1) group by key_;
 
select * from (select key_, count(*) from cs1 where key_ in (select * from (select key_ from cs1 group by key_ order by key_ limit 2) a1) group by key_) a2;

The first query works as expected the second one returns onordered full set from cs1.

Comment by Roman [ 2020-02-04 ]

Here is the explanation.
If you take a look into create_columnstore_select_handler() in dbcon/mysql/ha_mcs_pushdown.cpp there will be a call to select_lex->optimize_unflattened_subqueries(). This method must be run recursively for all derived tables. SELECT_LEX::optimize_unflattened_subqueries() finishes IN-INTO-EXISTS rewrite adding equi-JOIN predicate insert into subquery a2.

Comment by Daniel Lee (Inactive) [ 2020-02-10 ]

Build verified: 1.4.3-1

Build verified: 1.4.3-1 source
server
commit 9bd5e14f4de1402c6cd4a3f81564887c1213c9e1
engine
commit 5efa6a4dc52129be2de49fdfc23e44020401b86b

Build tested: 1.5.0-1

server
commit 57950ded281731263f6aa358d43c7b9d51f3dbfb
engine
commit 46f30be5561f65eec488d61e011b727ca358720b

The fixed is not in 1.5.0-1

Comment by Daniel Lee (Inactive) [ 2020-02-11 ]

Close ticket for 1.4.3-1

Fixed for 1.5 is being tracked on MCOL-3780

Generated at Thu Feb 08 02:45:08 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.