[MDEV-6309] incorrect result of EXISTS subquery with InnoDB & FTS Created: 2014-06-05  Updated: 2022-09-12  Resolved: 2022-09-12

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Affects Version/s: 10.0.11
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Oleksandr Byelkin Assignee: Oleksandr Byelkin
Resolution: Won't Fix Votes: 1
Labels: upstream

Sprint: 10.0.20, 10.0.24, 10.0.25, 10.0.26

 Description   

innodb_fts: fulltext.rsult

# should return 0
SELECT count(*) FROM t1 WHERE 
not exists(
SELECT 1 FROM t2, t3
WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
);
count(*)
1

(pay attention to the comment)

test suite:

--source include/have_innodb.inc
 
set optimizer_switch='exists_to_in=off';
 
CREATE TABLE t1 (a int) ENGINE = InnoDB;
INSERT INTO t1 VALUES (1), (2);
 
CREATE TABLE t2 (a int, b2 char(10), FULLTEXT KEY b2 (b2)) ENGINE = InnoDB;
INSERT INTO t2 VALUES (1,'Scargill');
 
CREATE TABLE t3 (a int, b int) ENGINE = InnoDB;
INSERT INTO t3 VALUES (1,1), (2,1);
 
--echo # t2 should use full text index
EXPLAIN
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t2, t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
# INNODB_FTS: INVESTIGATE
--echo # should return 0
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t2, t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
--error ER_FT_MATCHING_KEY_NOT_FOUND
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t2 IGNORE INDEX (b2), t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
DROP TABLE t1,t2,t3;



 Comments   
Comment by Oleksandr Byelkin [ 2014-06-08 ]

With in2exists it returns corrrect result. Incorrect result was written in the test before.

Comment by Oleksandr Byelkin [ 2016-02-10 ]

Result of the query depends on engine implementation of FULLTEXT.

InnoDB return 1, MyISAM return 0

with exists_to_in=on the subquery is materialized so in fact it is MyISAM and so result is 0 (as for MyISAM)

Comment by Oleksandr Byelkin [ 2016-02-10 ]

Somehow results with MyISAM are not correct, despite the fact that simple query with patch work correctly. New test suite:

Original test suite is correct

set optimizer_switch='exists_to_in=off';
 
CREATE TABLE t1 (a int) ENGINE = MyISAM;
INSERT INTO t1 VALUES (1), (2);
 
CREATE TABLE t2 (a int, b2 char(10), FULLTEXT KEY b2 (b2)) ENGINE = MyISAM;
INSERT INTO t2 VALUES (1,'Scargill');
 
CREATE TABLE t3 (a int, b int) ENGINE = MyISAM;
INSERT INTO t3 VALUES (1,1), (2,1);
 
#internal query result exists for value 1
 
SELECT 1 FROM t2, t3
   WHERE t3.a=1 AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE);
 
#
#but following queries still return 0 which ois not correct
#
--echo # t2 should use full text index
EXPLAIN
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t2, t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
# INNODB_FTS: INVESTIGATE
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t2, t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
set optimizer_switch='exists_to_in=on';
--echo # t2 should use full text index
EXPLAIN
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t2, t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
# INNODB_FTS: INVESTIGATE
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t2, t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
 
DROP TABLE t1,t2,t3;

-

Comment by Oleksandr Byelkin [ 2016-02-11 ]

It looks like MATCH itself works more or less OK (at least calculate non zero val_real())

Comment by Oleksandr Byelkin [ 2016-02-15 ]

oops I was wrong. MyISAM result is OK but innoDB is wrong

Comment by Oleksandr Byelkin [ 2016-02-15 ]

So the problem is that in case of not patch myisam return 0, but innodb:
(gdb) p res
$4 = 1.0609979043758178e-314
(gdb)

Comment by Oleksandr Byelkin [ 2016-02-15 ]

So problem is on innodb side. in binary mode it should return clear zero on mismatch.

Innodb create different parser for binary more but how it work is not clear.

Comment by Oleksandr Byelkin [ 2016-02-16 ]

New test suite. Bug repeatable on MySQL 5.7 (set optimizer switch should be removed, but without it 5.7 acts as MariaDB, because 5.7 has no exists2in).

--source include/have_innodb.inc
 
set optimizer_switch='exists_to_in=off';
 
CREATE TABLE t1 (a int) ENGINE = InnoDB;
INSERT INTO t1 VALUES (1), (2);
 
CREATE TABLE t2 (a int, b2 char(10), FULLTEXT KEY b2 (b2)) ENGINE = InnoDB;
INSERT INTO t2 VALUES (1,'Scargill');
CREATE TABLE t21 (a int, b2 char(10), FULLTEXT KEY b2 (b2)) ENGINE = MyISAM;
INSERT INTO t21 VALUES (1,'Scargill');
 
CREATE TABLE t3 (a int, b int) ENGINE = InnoDB;
INSERT INTO t3 VALUES (1,1), (2,1);
 
#
#but following queries still return 0 which ois not correct
#
--echo # t2 should use full text index
EXPLAIN
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t2, t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
# INNODB_FTS: INVESTIGATE
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t2, t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
--echo # t2 should use full text index
EXPLAIN
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t21, t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
# INNODB_FTS: INVESTIGATE
SELECT count(*) FROM t1 WHERE 
  not exists(
   SELECT 1 FROM t21, t3
   WHERE t3.a=t1.a AND MATCH(b2) AGAINST('scargill' IN BOOLEAN MODE)
  );
 
 
DROP TABLE t1,t2,t3,t21;

Comment by Oleksandr Byelkin [ 2016-06-20 ]

Conclusion:

1. if change Scargill -> scargill (or make all upper case) both engine report exact match so boolean match work correctly in this case
2. if change in other way it will mismatch for both engines
3. So difference is inside INNODB and probably related to case sensitive/insensitive.

Comment by Oleksandr Byelkin [ 2017-01-03 ]

The problem as I can see it in incorrect reaction of innodb in boolean mode (should return 1/0 but not something other). (of course there is a probability that I am wrong)

Comment by Marko Mäkelä [ 2017-01-09 ]

I am not very familiar with the InnoDB fulltext search implementation.
I made the following observations so far:

  1. If FT_BOOL is the only way of detecting BOOLEAN search, InnoDB is ignoring it. (This could be the root cause of the whole bug.)
  2. fts_query_calculate_idf() is evaluating log10(1.0001). The result will be assigned to word_freq->idf.
  3. Finally, fts_query_prepare_result() will multiply the reasonable-looking ranking->rank=1 by word_freq->idf*word_freq->idf.

The float data type aka fts_rank_t should have no problem representing the values 0 and 1 accurately. should have no problem representing the values 0 and 1 accurately.

I did not check if or how the numbers would get mangled further on they way towards the server layer.

Comment by Marko Mäkelä [ 2017-01-09 ]

In http://dev.mysql.com/doc/internals/en/full-text-search.html, I cannot see any justification for the calculation

			ranking->rank = static_cast<fts_rank_t>(
				ranking->rank * word_freq->idf * word_freq->idf);

which could be written shorter as

			ranking->rank *= word_freq->idf * word_freq->idf;

I guess we should not only make InnoDB aware of FT_BOOL, but also make all calculations match those of MyISAM. And in the long term, remove all the duplicated code.

Comment by Sergei Golubchik [ 2017-01-10 ]

It doesn't look like InnoDB FTS issue. Set a breakpoint on Item_func_match::val_int — you'll see that both InnoDB and MyISAM return 1.

But why Item_func_match::val_int is called twice for MyISAM and only once for InnoDB?

Comment by Sergei Golubchik [ 2022-09-12 ]

10.0 was EOLed in March 2019

Generated at Thu Feb 08 07:10:54 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.