Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5539

Empty results in UNION with Sphinx engine

Details

    Description

      Hello,

      We have a strange behaviour with a query with 2 UNION.

      SELECT r.path,a.count,r.label
      FROM
      (
      SELECT si.branchid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:branchid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id
      UNION
      SELECT r.path,a.count,r.label
      FROM
      (
      SELECT si.criteriaid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:criteriaid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id
      UNION
      SELECT IF(r.path IS NULL,'/',r.path) as path,a.count,IF(r.label IS NULL,'test',r.label)
      FROM
      (
      SELECT si.subcriteriaid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:subcriteriaid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id;

      The first UNION returns 439 rows.
      The last query returns an empty sets.

      As a result of this 2 UNION we have an empty set instead of 439 rows.

      If we move the last query in first position, it's ok, we have our 439 rows.

      SELECT IF(r.path IS NULL,'/',r.path) as path,a.count,IF(r.label IS NULL,'test',r.label)
      FROM
      (
      SELECT si.subcriteriaid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:subcriteriaid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id
      UNION
      SELECT r.path,a.count,r.label
      FROM
      (
      SELECT si.branchid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:branchid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id
      UNION
      SELECT r.path,a.count,r.label
      FROM
      (
      SELECT si.criteriaid AS id,_sph_distinct AS count
      FROM sphinxindexad si
      WHERE si.query=';mode=extended2;range=paymin,0,1000000000;groupby=attr:criteriaid;distinct=adid;limit=1000000;maxmatches=500'
      ) AS a
      INNER JOIN reference r ON r.id=a.id;

      Attachments

        Activity

          Can you try to create a test case for this? I suppose, your database is too large, but, perhaps, you could repeat the bug on a smaller dataset? Or without SphinxSE at all?

          serg Sergei Golubchik added a comment - Can you try to create a test case for this? I suppose, your database is too large, but, perhaps, you could repeat the bug on a smaller dataset? Or without SphinxSE at all?
          llafage Laurent Lafage added a comment - - edited

          Hello,

          Here is a generic test case if you use sphinxsearch:

          This query returns around 500 results:

          -- empty set
          SELECT a.*
          FROM
          (
          SELECT *
          FROM sphinxindexad si
          WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
          ) AS a
          UNION
          -- 500 rows in set 
          SELECT b.*
          FROM
          (
          SELECT *
          FROM sphinxindexad si
          WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
          ) AS b
          ;
           
          -> 500 rows in set 

          if we invert the 2 WHERE clauses, we have an empty result, the following query will return an empty result:

          -- 500 rows in set 
          SELECT a.*
          FROM
          (
          SELECT *
          FROM sphinxindexad si
          WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
          ) AS a
          UNION
          -- Empty set
          SELECT b.*
          FROM
          (
          SELECT *
          FROM sphinxindexad si
          WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
          ) AS b
          ;
           
          -> Empty set

          Hope this will help.

          With best regards

          llafage Laurent Lafage added a comment - - edited Hello, Here is a generic test case if you use sphinxsearch: This query returns around 500 results: -- empty set SELECT a.* FROM ( SELECT * FROM sphinxindexad si WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500' ) AS a UNION -- 500 rows in set SELECT b.* FROM ( SELECT * FROM sphinxindexad si WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500' ) AS b ;   -> 500 rows in set if we invert the 2 WHERE clauses, we have an empty result, the following query will return an empty result: -- 500 rows in set SELECT a.* FROM ( SELECT * FROM sphinxindexad si WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500' ) AS a UNION -- Empty set SELECT b.* FROM ( SELECT * FROM sphinxindexad si WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500' ) AS b ;   -> Empty set Hope this will help. With best regards

          To reproduce in MTR, put the following test case into suite/sphinx folder:

          eval create table ts ( id bigint unsigned not null, w int not null, query varchar(255) not null, index(query) ) engine=sphinx connection="sphinx://127.0.0.1:$SPHINXSEARCH_PORT/*";
           
          SET optimizer_switch = REPLACE( @@optimizer_switch, '=on', '=off' );
           
          SELECT a.*
          FROM
          (
          SELECT *
          FROM ts si
          WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
          ) AS a
          UNION
          SELECT b.*
          FROM
          (
          SELECT *
          FROM ts si
          WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
          ) AS b
          ;
           
          SELECT a.*
          FROM
          (
          SELECT *
          FROM ts si
          WHERE si.query='@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500'
          ) AS a
          UNION
          SELECT b.*
          FROM
          (
          SELECT *
          FROM ts si
          WHERE si.query=';mode=extended2;limit=1000000;maxmatches=500'
          ) AS b
          ;
           
          drop table ts;

          The first query returns an empty set, and the second one returns 4 rows.
          Same happens with views instead of subqueries.
          In 5.2, both queries return 4 rows, so I assume the problem is not on sphinx side.

          EXPLAINs:

          id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
          1	PRIMARY	<derived2>	ALL	NULL	NULL	NULL	NULL	3	100.00	
          2	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')
          3	UNION	<derived4>	ALL	NULL	NULL	NULL	NULL	3	100.00	
          4	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')
          NULL	UNION RESULT	<union1,3>	ALL	NULL	NULL	NULL	NULL	NULL	NULL	
          Warnings:
          Note	1003	select `a`.`id` AS `id`,`a`.`w` AS `w`,`a`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')) `a` union select `b`.`id` AS `id`,`b`.`w` AS `w`,`b`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')) `b`

          id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
          1	PRIMARY	<derived2>	ALL	NULL	NULL	NULL	NULL	3	100.00	
          2	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')
          3	UNION	<derived4>	ALL	NULL	NULL	NULL	NULL	3	100.00	
          4	DERIVED	si	ref	query	query	257	const	3	100.00	Using where with pushed condition: (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')
          NULL	UNION RESULT	<union1,3>	ALL	NULL	NULL	NULL	NULL	NULL	NULL	
          Warnings:
          Note	1003	select `a`.`id` AS `id`,`a`.`w` AS `w`,`a`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')) `a` union select `b`.`id` AS `id`,`b`.`w` AS `w`,`b`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')) `b`

          elenst Elena Stepanova added a comment - To reproduce in MTR, put the following test case into suite/sphinx folder: eval create table ts ( id bigint unsigned not null , w int not null , query varchar (255) not null , index (query) ) engine=sphinx connection = "sphinx://127.0.0.1:$SPHINXSEARCH_PORT/*" ;   SET optimizer_switch = REPLACE ( @@optimizer_switch, '=on' , '=off' );   SELECT a.* FROM ( SELECT * FROM ts si WHERE si.query= ';mode=extended2;limit=1000000;maxmatches=500' ) AS a UNION SELECT b.* FROM ( SELECT * FROM ts si WHERE si.query= '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500' ) AS b ;   SELECT a.* FROM ( SELECT * FROM ts si WHERE si.query= '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500' ) AS a UNION SELECT b.* FROM ( SELECT * FROM ts si WHERE si.query= ';mode=extended2;limit=1000000;maxmatches=500' ) AS b ;   drop table ts; The first query returns an empty set, and the second one returns 4 rows. Same happens with views instead of subqueries. In 5.2, both queries return 4 rows, so I assume the problem is not on sphinx side. EXPLAINs: id select_type table type possible_keys key key_len ref rows filtered Extra 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 3 100.00 2 DERIVED si ref query query 257 const 3 100.00 Using where with pushed condition: (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500') 3 UNION <derived4> ALL NULL NULL NULL NULL 3 100.00 4 DERIVED si ref query query 257 const 3 100.00 Using where with pushed condition: (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500') NULL UNION RESULT <union1,3> ALL NULL NULL NULL NULL NULL NULL Warnings: Note 1003 select `a`.`id` AS `id`,`a`.`w` AS `w`,`a`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')) `a` union select `b`.`id` AS `id`,`b`.`w` AS `w`,`b`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')) `b` id select_type table type possible_keys key key_len ref rows filtered Extra 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 3 100.00 2 DERIVED si ref query query 257 const 3 100.00 Using where with pushed condition: (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500') 3 UNION <derived4> ALL NULL NULL NULL NULL 3 100.00 4 DERIVED si ref query query 257 const 3 100.00 Using where with pushed condition: (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500') NULL UNION RESULT <union1,3> ALL NULL NULL NULL NULL NULL NULL Warnings: Note 1003 select `a`.`id` AS `id`,`a`.`w` AS `w`,`a`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = '@* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500')) `a` union select `b`.`id` AS `id`,`b`.`w` AS `w`,`b`.`query` AS `query` from (select `test`.`si`.`id` AS `id`,`test`.`si`.`w` AS `w`,`test`.`si`.`query` AS `query` from `test`.`ts` `si` where (`test`.`si`.`query` = ';mode=extended2;limit=1000000;maxmatches=500')) `b`

          It's a bug in Sphinx. It can only handle one sphinx-query per connection at any given point in time. But in this UNION query you have two. So MariaDB

          1. pushes the first query (say, @* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500) to SphinxSE. It remembers it.
          2. pushes the second query (;mode=extended2;limit=1000000;maxmatches=500) into SphinxSE. It remembers it in the same slot, overwriting the first query.

          So, instead of performing two different sphinx searches and merging the results, it only performs one ­— whichever happens to be the last.

          Why it worked in 5.2? Perhaps in 5.2 the execution sequence was different, and the server 1) prepared the first part of the UNION, 2) executed it, 3) prepared the second part of the UNION 4) executed it, etc. That would've worked in SphinxSE. And in 5.3 MariaDB first prepares all parts in a UNION, and then executes them all. That doesn't work very well in SphinxSE.

          Anyway, this fix seem to require big changes in SphinxSE, we cannot do it. Please, report this bug to Sphinx developers — when they fix it, we will merge a fixed version into MariaDB.

          serg Sergei Golubchik added a comment - It's a bug in Sphinx. It can only handle one sphinx-query per connection at any given point in time. But in this UNION query you have two. So MariaDB pushes the first query (say, @* 123nothingtofind123;mode=extended2;limit=1000000;maxmatches=500 ) to SphinxSE. It remembers it. pushes the second query ( ;mode=extended2;limit=1000000;maxmatches=500 ) into SphinxSE. It remembers it in the same slot, overwriting the first query. So, instead of performing two different sphinx searches and merging the results, it only performs one ­— whichever happens to be the last. Why it worked in 5.2? Perhaps in 5.2 the execution sequence was different, and the server 1) prepared the first part of the UNION, 2) executed it, 3) prepared the second part of the UNION 4) executed it, etc. That would've worked in SphinxSE. And in 5.3 MariaDB first prepares all parts in a UNION, and then executes them all. That doesn't work very well in SphinxSE. Anyway, this fix seem to require big changes in SphinxSE, we cannot do it. Please, report this bug to Sphinx developers — when they fix it, we will merge a fixed version into MariaDB.

          Both upstream (Sphinx) issues were fixed in 2.1.10. We currently have 2.1.9, the latest downloadable release at sphinxsearch is 2.2.6.

          elenst Elena Stepanova added a comment - Both upstream (Sphinx) issues were fixed in 2.1.10. We currently have 2.1.9, the latest downloadable release at sphinxsearch is 2.2.6.

          Serg upgraded sphinx to 2.2.6 in MariaDB 10.0, which fixed the issue.

          monty Michael Widenius added a comment - Serg upgraded sphinx to 2.2.6 in MariaDB 10.0, which fixed the issue.

          People

            Unassigned Unassigned
            llafage Laurent Lafage
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.