[MDEV-5539] Empty results in UNION with Sphinx engine Created: 2014-01-18  Updated: 2015-01-06  Resolved: 2014-12-30

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - SphinxSE
Affects Version/s: 5.3.12, 5.5.34, 10.0.7
Fix Version/s: 10.0.15

Type: Bug Priority: Major
Reporter: Laurent Lafage Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: sphinx, upstream-fixed
Environment:

Debian wheezy



 Description   

Hello,

We have a strange behaviour with a query with 2 UNION.

SELECT r.path,a.count,r.label
FROM
(
SELECT si.branchid AS id,_sph_distinct AS count
FROM sphinxindexad si
WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:branchid;distinct=adid;limit=1000000;maxmatches=500'
) AS a
INNER JOIN reference r ON r.id=a.id
UNION
SELECT r.path,a.count,r.label
FROM
(
SELECT si.criteriaid AS id,_sph_distinct AS count
FROM sphinxindexad si
WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:criteriaid;distinct=adid;limit=1000000;maxmatches=500'
) AS a
INNER JOIN reference r ON r.id=a.id
UNION
SELECT IF(r.path IS NULL,'/',r.path) as path,a.count,IF(r.label IS NULL,'test',r.label)
FROM
(
SELECT si.subcriteriaid AS id,_sph_distinct AS count
FROM sphinxindexad si
WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:subcriteriaid;distinct=adid;limit=1000000;maxmatches=500'
) AS a
INNER JOIN reference r ON r.id=a.id;

The first UNION returns 439 rows.
The last query returns an empty sets.

As a result of this 2 UNION we have an empty set instead of 439 rows.

If we move the last query in first position, it's ok, we have our 439 rows.

SELECT IF(r.path IS NULL,'/',r.path) as path,a.count,IF(r.label IS NULL,'test',r.label)
FROM
(
SELECT si.subcriteriaid AS id,_sph_distinct AS count
FROM sphinxindexad si
WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:subcriteriaid;distinct=adid;limit=1000000;maxmatches=500'
) AS a
INNER JOIN reference r ON r.id=a.id
UNION
SELECT r.path,a.count,r.label
FROM
(
SELECT si.branchid AS id,_sph_distinct AS count
FROM sphinxindexad si
WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:branchid;distinct=adid;limit=1000000;maxmatches=500'
) AS a
INNER JOIN reference r ON r.id=a.id
UNION
SELECT r.path,a.count,r.label
FROM
(
SELECT si.criteriaid AS id,_sph_distinct AS count
FROM sphinxindexad si
WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:criteriaid;distinct=adid;limit=1000000;maxmatches=500'
) AS a
INNER JOIN reference r ON r.id=a.id;



 Comments   
Comment by Sergei Golubchik [ 2014-01-20 ]

Can you try to create a test case for this? I suppose, your database is too large, but, perhaps, you could repeat the bug on a smaller dataset? Or without SphinxSE at all?

Comment by Laurent Lafage [ 2014-01-20 ]

Hello,

Here is a generic test case if you use sphinxsearch:

This query returns around 500 results:

-- empty set
SELECT a.*
FROM
(
SELECT *
FROM sphinxindexad si
WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
) AS a
UNION
-- 500 rows in set 
SELECT b.*
FROM
(
SELECT *
FROM sphinxindexad si
WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
) AS b
;
 
-> 500 rows in set 

if we invert the 2 WHERE clauses, we have an empty result, the following query will return an empty result:

-- 500 rows in set 
SELECT a.*
FROM
(
SELECT *
FROM sphinxindexad si
WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
) AS a
UNION
-- Empty set
SELECT b.*
FROM
(
SELECT *
FROM sphinxindexad si
WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
) AS b
;
 
-> Empty set

Hope this will help.

With best regards

Comment by Elena Stepanova [ 2014-01-28 ]

To reproduce in MTR, put the following test case into suite/sphinx folder:

eval create table ts ( id bigint unsigned not null, w int not null, query varchar(255) not null, index(query) ) engine=sphinx connection="sphinx://127.0.0.1:$SPHINXSEARCH_PORT/*";
 
SET optimizer_switch = REPLACE( @@optimizer_switch, '=on', '=off' );
 
SELECT a.*
FROM
(
SELECT *
FROM ts si
WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
) AS a
UNION
SELECT b.*
FROM
(
SELECT *
FROM ts si
WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
) AS b
;
 
SELECT a.*
FROM
(
SELECT *
FROM ts si
WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
) AS a
UNION
SELECT b.*
FROM
(
SELECT *
FROM ts si
WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
) AS b
;
 
drop table ts;

The first query returns an empty set, and the second one returns 4 rows.
Same happens with views instead of subqueries.
In 5.2, both queries return 4 rows, so I assume the problem is not on sphinx side.

EXPLAINs:

id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	<derived2>	ALL	NULL	NULL	NULL	NULL	3	100.00	
2	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')
3	UNION	<derived4>	ALL	NULL	NULL	NULL	NULL	3	100.00	
4	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')
NULL	UNION RESULT	<union1,3>	ALL	NULL	NULL	NULL	NULL	NULL	NULL	
Warnings:
Note	1003	select `a`.`id` AS `id`,`a`.`w` AS `w`,`a`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')) `a` union select `b`.`id` AS `id`,`b`.`w` AS `w`,`b`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')) `b`

id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	<derived2>	ALL	NULL	NULL	NULL	NULL	3	100.00	
2	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')
3	UNION	<derived4>	ALL	NULL	NULL	NULL	NULL	3	100.00	
4	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')
NULL	UNION RESULT	<union1,3>	ALL	NULL	NULL	NULL	NULL	NULL	NULL	
Warnings:
Note	1003	select `a`.`id` AS `id`,`a`.`w` AS `w`,`a`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')) `a` union select `b`.`id` AS `id`,`b`.`w` AS `w`,`b`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')) `b`

Comment by Sergei Golubchik [ 2014-02-14 ]

It's a bug in Sphinx. It can only handle one sphinx-query per connection at any given point in time. But in this UNION query you have two. So MariaDB

  1. pushes the first query (say, @* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500) to SphinxSE. It remembers it.
  2. pushes the second query (;mode=extended2;limit=1000000;maxmatches=500) into SphinxSE. It remembers it in the same slot, overwriting the first query.

So, instead of performing two different sphinx searches and merging the results, it only performs one ­— whichever happens to be the last.

Why it worked in 5.2? Perhaps in 5.2 the execution sequence was different, and the server 1) prepared the first part of the UNION, 2) executed it, 3) prepared the second part of the UNION 4) executed it, etc. That would've worked in SphinxSE. And in 5.3 MariaDB first prepares all parts in a UNION, and then executes them all. That doesn't work very well in SphinxSE.

Anyway, this fix seem to require big changes in SphinxSE, we cannot do it. Please, report this bug to Sphinx developers — when they fix it, we will merge a fixed version into MariaDB.

Comment by Elena Stepanova [ 2014-11-16 ]

Both upstream (Sphinx) issues were fixed in 2.1.10. We currently have 2.1.9, the latest downloadable release at sphinxsearch is 2.2.6.

Comment by Michael Widenius [ 2014-12-30 ]

Serg upgraded sphinx to 2.2.6 in MariaDB 10.0, which fixed the issue.

Generated at Thu Feb 08 07:05:08 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.