[MCOL-2137] Query returning different results at different times and from different UMs. Created: 2019-02-04  Updated: 2020-08-25  Resolved: 2020-02-27

Status: Closed
Project: MariaDB ColumnStore
Component/s: ?
Affects Version/s: 1.2.2
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Juan Assignee: David Hall (Inactive)
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

CentOS 2UM/2PM


Sprint: 2020-1, 2020-2, 2020-3

 Description   

The inconsistency (as shown below) happens when running the query from different UMs as well as running it from the same UM at different times, despite running calFlushCache().

mysql (UM1)> 
with 
	table2 as (select id_col,column1,column2 from location where column1='123456789AB-1001-A01-A-A1'), 
	cands as( select 
			ident,sum(if(column2<0.005,abs(a.column1-b.column1),0)) as val1,
			sum(if(column2>=0.005 and column2<0.2,if(abs(a.column1-b.column1)=0,1,0),0)) as val2,
			count(*) as total 
		from 
			test_quorum a,
			table2 b 
		where 
			a.id_col=b.id_col 
		group by ident 
		having val1<3 and val2<3 and total>40), 
	data as ( select 
			column2,
			abs(a.column1-b.column1) as column1,
			b.ident as ident 
		from 
			table2 a,
			test_quorum b 
		where 
			a.id_col=b.id_col 
		and b.ident in (select distinct ident from cands) 
		and column2<0.2 
		order by b.ident,column2), 
	curve as ( select 
			ident,
			column1,
			lag(column1) over(partition by ident order by column2) as lg 
		from data) 
select 
	a.ident,
	sum(if(ifnull(lg,0)<=column1,0,1)) as factor,
	b.val1,
	b.val2 
from 
	curve a,
	cands b 
where 
	a.ident=b.ident 
group by 
	ident,
	val1,
	sum2 
having 
	factor<=1; 
+------------+-------------------+-----------+-----------+
| ident | factor | val1 | val2 |
+------------+-------------------+-----------+-----------+
| 833NN98765 | 1 | 2 | 0 |
| 133NN43210 | 1 | 0 | 1 |
+------------+-------------------+-----------+-----------+
2 rows in set
Time: 0.718s
 
mysql (UM2)> 
with 
	table2 as (select id_col,column1,column2 from location where column1='123456789AB-1001-A01-A-A1'), 
	cands as(select 
			ident,
			sum(if(column2<0.005,abs(a.column1-b.column1),0)) as val1, 
			sum(if(column2>=0.005 and column2<0.2,if(abs(a.column1-b.column1)=0,1,0),0)) as val2,
			count(*) as total 
		from 
			test_quorum a,
			table2 b 
		where a.id_col=b.id_col group by ident having val1<3 and val2<3 and total>40), 
	data as (select 
			column2,
			abs(a.column1-b.column1) as column1,
			b.ident as ident 
		from 
			table2 a,
			test_quorum b 
		where 
			a.id_col=b.id_col 
		and b.ident in (select distinct ident from cands) 
		and column2<0.2 order by b.ident,column2), 
	curve as ( select 
			ident,column1,
			lag(column1) over(partition by ident order by column2) as lg 
		from data) 
select 
	a.ident,
	sum(if(ifnull(lg,0)<=column1,0,1)) as factor,
	b.val1,
	b.val2 
from 
	curve a,
	cands b 
where 
	a.ident=b.ident 
group by 
	ident,
	val1,
	sum2 
having factor<=1; 
+------------+-------------------+-----------+-----------+
| ident | factor | val1 | val2 |
+------------+-------------------+-----------+-----------+
| 133NN43210 | 1 | 0 | 1 |
+------------+-------------------+-----------+-----------+
1 row in set
Time: 0.691s



 Comments   
Comment by David Hall (Inactive) [ 2020-02-27 ]

I have confirmed that this phenomenon occurs in a single server setup.

Comment by David Hall (Inactive) [ 2020-02-27 ]

n the phrase

Consider
curve as
(select ident,column1,lag(column1) over(partition by ident order by column2) as lg

if column2 is not unique, then the order is not deterministic.

If there are two records with a ident the same and column2 the same, depending on the order these end up, it may result in one or two that satisfy lg > column1.

Since the final having clause is looking for records where the count of lg > column1 is 1 or less, when it happens to be two, you get no answer for that ident.

Generated at Thu Feb 08 02:34:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.