[MDEV-3589] LP:777597 - Wrong result with multipart keys, in_to_exists=on, NOT IN in mwl#89 Created: 2011-05-05  Updated: 2015-02-02  Resolved: 2012-10-04

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Philip Stoev (Inactive) Assignee: Timour Katchaounov (Inactive)
Resolution: Fixed Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug777597.xml    

 Description   

Not repeatable in maria-5.3, maria-5.2. The following query

SELECT * FROM t1 WHERE ( 6 ) NOT IN ( SELECT t2.f3 FROM t2 JOIN t3 ON t3.f10 = t2.f10);;

returns rows when executed with in-to-exists even though the subquery returns (6),(6) which means that the NOT IN predicate should be FALSE, making the entire WHERE condition FALSE.

The following things seem to be required:

  • multipart index
  • 1 row in the other table in the subquery

Innodb is required for this particular test case, but the bug was just observed with MyISAM.

EXPLAIN:

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t1 ALL NULL NULL NULL NULL 4
2 DEPENDENT SUBQUERY t3 system NULL NULL NULL NULL 1
2 DEPENDENT SUBQUERY t2 ref_or_null f10 f10 10 const,const 2 Using where; Using index

test case:

--source include/have_innodb.inc

CREATE TABLE t1 ( f4 int);
INSERT IGNORE INTO t1 VALUES (2),(2),(2),(2);

CREATE TABLE t2 ( f3 int, f10 int, KEY (f10,f3) ) ENGINE=InnoDB;
INSERT IGNORE INTO t2 VALUES (6, 1), (6, 1);

CREATE TABLE t3 ( f10 int );
INSERT IGNORE INTO t3 VALUES (1);

SET SESSION optimizer_switch='in_to_exists=on,materialization=off';
SELECT * FROM t1 WHERE ( 6 ) NOT IN ( SELECT t2.f3 FROM t2 JOIN t3 ON t3.f10 = t2.f10);

bzr version-info:

revision-id: <email address hidden>
date: 2011-05-05 01:35:03 +0300
build-date: 2011-05-05 08:59:05 +0300
revno: 2981
branch-nick: maria-5.3-mwl89



 Comments   
Comment by Philip Stoev (Inactive) [ 2011-05-05 ]

Re: Wrong result with multipart keys, in_to_exists=on, NOT IN in maria-5.3-mwl89
Bug was just observed with MyISAM. I have corrected the title and the description.

Comment by Timour Katchaounov (Inactive) [ 2011-05-10 ]

Re: Wrong result with multipart keys, in_to_exists=on, NOT IN in maria-5.3-mwl89
Analysis:
--------------------------------------------

In the call chain during optimization of the subquery:

#0 Item_cache::is_null (this=0x7f060c0b8ca8) at item.h:3519
#1 0x00000000005c296f in Item_direct_ref::is_null (this=0x7f060c0b8dd0) at item.cc:6715
#2 0x0000000000730f9d in merge_key_fields (start=0x7f060c0c9468, new_fields=0x7f060c0c9498, end=0x7f060c0c94c8, and_level=3) at sql_select.cc:3490
#3 0x0000000000731dc4 in add_key_fields (join=0x7f060c0c2140, key_fields=0x7fff115817a0, and_level=0x7fff115817c0, cond=0x7f060c0c9370, usable_tables=18446744073709551615, sargables=0x7fff11581830) at sql_select.cc:3880
#4 0x00000000007334bf in update_ref_and_keys (thd=0x7f060c015368, keyuse=0x7fff11581810, join_tab=0x7f060c0b9c58, tables=2, cond=0x7f060c0c9370, normal_tables=18446744073709551615, select_lex=0x7f060c0a24a8, sargables=0x7fff11581830) at sql_select.cc:4382

The call to is_null() returns TRUE, which is wrong. This results in
Item_null replacing the field 'f3' in the KEY_FIELD, then this Item_null
is used for index access, producing a wrong result.

The inspecion of the Item_direct_ref->Item_cache_str object shows that:
value_cached = false,
null_value = true,
value = NULL,
while the cache "source" Item_cache_str::example contains the string 'bbb'.
Therefore somehow the Item_cache_str is not updated from its value.

Let's investigate where/how Item_cache_str is updated.

  • In 5.3 the relevant calls happen in the following order:

outer_join->optimize()
make_join_select()
if (const_cond && !const_cond->val_int())
Item_func_not::val_int
............
Item_in_optimizer::val_int
cache->store(args[0]);
cache->cache_value(); <== This is where the cache is updated
......
Item_in_subselect::exec
JOIN::optimize
make_join_statistics
update_ref_and_keys
....
add_key_fields
merge_key_fields <== At this point the cache is available

  • In 5.3-MWL#89
  • first the optimizer optimizes all subqueries
  • expensive constant conditions are not evaluated during optimization,
    so we don't call Item_in_optimizer::val_int()
  • call make_join_statistics -> add_key_fields -> merge_key_fields,
    at this point the cache is not updated
  • only after that execution will call Item_in_optimizer::val_int
    which will evaluate the cache too late
Comment by Rasmus Johansson (Inactive) [ 2011-12-13 ]

Launchpad bug id: 777597

Generated at Thu Feb 08 06:49:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.