[MDEV-6859] scalar subqueries in a comparison produced unexpected result Created: 2014-10-09  Updated: 2016-02-09  Resolved: 2016-02-09

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.3.12, 5.5.40, 10.0.14
Fix Version/s: 5.5.48

Type: Bug Priority: Major
Reporter: Kenneth Cruz Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Attachments: File database_dump_test_mdb.sql     Text File test_results.txt    
Sprint: 5.5.44, 5.5.45, 5.5.47-1, 5.5.48-0

 Description   

A query with scalar subquery in a comparison returns no result when using "ORDER BY" and "LIMIT" in the scalar subquery. (https://mariadb.com/kb/en/sql-99-complete-really/31-searching-with-subqueries/scalar-subqueries/#comment_1325)

Sample SQL statement:

SELECT SQL_NO_CACHE a.project_number FROM projects a WHERE ( SELECT z.country FROM projects_history z WHERE z.project_number = a.project_number AND z.history_date <= '2014-09-01' ORDER BY z.id DESC LIMIT 1 ) IN ( SELECT r.country FROM region r WHERE r.region = 'eame' );

The query was executed in MariaDB (10.0.3), MySQL(5.6.17) and MySQL(5.0.96). The MySQL(5.0.96) returns the expected result.

See attached files for details (db dump, explain, actual result and expected result)



 Comments   
Comment by Elena Stepanova [ 2014-10-09 ]

Test case (the same as in the description, it's perfect as it is, just put the data and query together and removed comments for MTR's benefit):

Test case

CREATE TABLE `projects` (
  `project_number` varchar(50) NOT NULL,
  PRIMARY KEY (`project_number`)
) ENGINE=MyISAM;
 
INSERT INTO `projects` (`project_number`) VALUES ('aaa'),('bbb');
 
CREATE TABLE IF NOT EXISTS `projects_history` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `project_number` varchar(50) NOT NULL,
  `history_date` date NOT NULL,
  `country` varchar(50) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM;
 
INSERT INTO `projects_history` (`id`, `project_number`, `history_date`, `country`) VALUES
(1, 'aaa', '2014-08-09', 'france'),(2, 'aaa', '2014-09-09', 'singapore');
 
CREATE TABLE IF NOT EXISTS `region` (
  `region` varchar(50) NOT NULL,
  `country` varchar(50) NOT NULL
) ENGINE=MyISAM;
 
INSERT INTO `region` (`region`, `country`) VALUES ('apac', 'singapore'),('eame', 'france');
 
SELECT SQL_NO_CACHE a.project_number FROM projects a 
WHERE ( 
  SELECT z.country FROM projects_history z 
  WHERE z.project_number = a.project_number 
    AND z.history_date <= '2014-09-01' 
  ORDER BY z.id DESC LIMIT 1 
) IN ( 
  SELECT r.country FROM region r 
  WHERE r.region = 'eame' 
);

Comment by Sergei Petrunia [ 2014-10-14 ]

MariaDB [test]> SELECT r.country FROM region r WHERE r.region = 'eame';
+---------+
| country |
+---------+
| france  |
+---------+

select *, (SELECT z.country FROM projects_history z    WHERE z.project_number = a.project_number      AND z.history_date <= '2014-09-01'    ORDER BY z.id DESC LIMIT 1  ) as S from projects a;
+----------------+--------+
| project_number | S      |
+----------------+--------+
| aaa            | france |
| bbb            | NULL   |
+----------------+--------+

So, it should have returned "aaa".

Comment by Sergei Petrunia [ 2014-10-14 ]

EXPLAIN:

+------+--------------------+-------------+--------+---------------+--------------+---------+------+------+-------------+
| id   | select_type        | table       | type   | possible_keys | key          | key_len | ref  | rows | Extra       |
+------+--------------------+-------------+--------+---------------+--------------+---------+------+------+-------------+
|    1 | PRIMARY            | a           | index  | NULL          | PRIMARY      | 52      | NULL |    2 | Using index |
|    1 | PRIMARY            | <subquery3> | eq_ref | distinct_key  | distinct_key | 53      | func |    1 | Using where |
|    3 | MATERIALIZED       | r           | ALL    | NULL          | NULL         | NULL    | NULL |    2 | Using where |
|    2 | DEPENDENT SUBQUERY | z           | index  | NULL          | PRIMARY      | 4       | NULL |    1 | Using where |
+------+--------------------+-------------+--------+---------------+--------------+---------+------+------+-------------+

if I set optimizer_switch='materialization=off', I get a correct result.

Comment by Sergei Petrunia [ 2014-10-14 ]

Debugging...

  • SJ-Materialization works ok, 'france' is written into temp.table
  • but then, lookup in the temptable doesnt find it.
Comment by Sergei Petrunia [ 2014-10-14 ]

Indeed the lookup key looks like this:

(gdb) x/8x key
  0x7fff94090b88:	0x32	0x00	0x8f	0x8f	0x8f	0x8f	0x8f	0x8f

stepping through Item::save_in_field also shows it's not a NULL value.

Comment by Sergei Petrunia [ 2014-10-14 ]

This is where we get the lookup value from:

(gdb) wher
  #0  Item_field::str_result (this=0x7fff94008ae8, str=0x7fff9400db50) at /home/psergey/dev2/5.5/sql/item.cc:2602
  #1  0x00000000007f6808 in Item_cache_str::cache_value (this=0x7fff9400da50) at /home/psergey/dev2/5.5/sql/item.cc:9154
  #2  0x00000000007fc1d8 in Item_cache::has_value (this=0x7fff9400da50) at /home/psergey/dev2/5.5/sql/item.h:4124
  #3  0x00000000007f6bb9 in Item_cache_str::save_in_field (this=0x7fff9400da50, field=0x7fff94029988, no_conversions=true) at /home/psergey/dev2/5.5/sql/item.cc:9225
  #4  0x000000000068241c in store_key_item::copy_inner (this=0x7fff94029950) at /home/psergey/dev2/5.5/sql/sql_select.h:1664
  #5  0x000000000068200b in store_key::copy (this=0x7fff94029950) at /home/psergey/dev2/5.5/sql/sql_select.h:1554
  #6  0x000000000067421c in cp_buffer_from_ref (thd=0x2f5bcb0, table=0x7fff9402e268, ref=0x7fff940284f8) at /home/psergey/dev2/5.5/sql/sql_select.cc:20357
  #7  0x0000000000674155 in cmp_buffer_with_ref (thd=0x2f5bcb0, table=0x7fff9402e268, tab_ref=0x7fff940284f8) at /home/psergey/dev2/5.5/sql/sql_select.cc:20339
  #8  0x000000000066d22d in join_read_key2 (thd=0x2f5bcb0, tab=0x7fff940282e8, table=0x7fff9402e268, table_ref=0x7fff940284f8) at /home/psergey/dev2/5.5/sql/sql_select.cc:17522
  #9  0x000000000066d18e in join_read_key (tab=0x7fff940282e8) at /home/psergey/dev2/5.5/sql/sql_select.cc:17500
  #10 0x000000000066bbae in sub_select (join=0x7fff9400c610, join_tab=0x7fff940282e8, end_of_records=false) at /home/psergey/dev2/5.5/sql/sql_select.cc:16874
  #11 0x000000000066c2d7 in evaluate_join_record (join=0x7fff9400c610, join_tab=0x7fff94027fc8, error=0) at /home/psergey/dev2/5.5/sql/sql_select.cc:17096
  #12 0x000000000066bc10 in sub_select (join=0x7fff9400c610, join_tab=0x7fff94027fc8, end_of_records=false) at /home/psergey/dev2/5.5/sql/sql_select.cc:16877
  #13 0x000000000066b477 in do_select (join=0x7fff9400c610, fields=0x2f5f7c0, table=0x0, procedure=0x0) at /home/psergey/dev2/5.5/sql/sql_select.cc:16539
  #14 0x0000000000649dff in JOIN::exec (this=0x7fff9400c610) at /home/psergey/dev2/5.5/sql/sql_select.cc:2873
  #15 0x000000000064a60f in mysql_select (thd=0x2f5bcb0, rref_pointer_array=0x2f5f918, tables=0x7fff94007ab0, wild_num=0, fields=..., conds=0x7fff9400c368, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7fff9400c5f0, unit=0x2f5efd0, select_lex=0x2f5f6a8) at /home/psergey/dev2/5.5/sql/sql_select.cc:3094

We copy the data from "z.country" field:

(gdb) p this->field_name
  $125 = 0x7fff94016d20 "country"
(gdb) p this
  $126 = (Item_field *) 0x7fff94008ae8
(gdb) p this->field->table->alias.Ptr
  $127 = 0x7fff94012440 "z"

the problem is, we never executed the subquery. A breakpoint in JOIN::exec() was only hit for select_number==1

Comment by Sergei Petrunia [ 2014-10-14 ]

The store_key_item object was created in setup_sj_materialization_part2:

    for (i= 0; i < tmp_key_parts; i++, cur_key_part++, ref_key++)
    {
      tab_ref->items[i]= emb_sj_nest->sj_subq_pred->left_expr->element_index(i);
      int null_count= test(cur_key_part->field->real_maybe_null());
      *ref_key= new store_key_item(thd, cur_key_part->field,

(gdb) p tab_ref->items[0]
  $142 = (Item_cache_str *) 0x7fff9400db00

Comment by Sergei Petrunia [ 2014-10-14 ]

Can't really understand the code in bool Item_cache_str::cache_value()

Trying this query

MariaDB [test]> SELECT SQL_NO_CACHE a.project_number FROM projects a  WHERE a.project_number > (SELECT z.country FROM projects_history z    WHERE z.project_number = a.project_number      AND z.history_date <= '2014-09-01'    ORDER BY z.id DESC LIMIT 1);

and seeing:

(gdb) print example
  $158 = (Item_singlerow_subselect *) 0x7fff98008838

while in the bug's testcase it is:

(gdb) print example
  $162 = (Item_field *) 0x7fff94008ae8

Comment by Sergei Petrunia [ 2014-10-15 ]

Discussed with sanja. The interesting part is this code in setup_sj_materialization_part2:

emb_sj_nest->sj_subq_pred->left_expr->element_index(i);

Here :

(gdb) p emb_sj_nest->sj_subq_pred->left_expr
  $332 = (Item_singlerow_subselect *) 0x7fff94042810
(gdb) p emb_sj_nest->sj_subq_pred->left_expr->element_index(0)
  $333 = (Item_cache_str *) 0x7fff94045750
(gdb) p emb_sj_nest->sj_subq_pred->left_expr->element_index(0)->example
  $334 = (Item_field *) 0x7fff94008ad8

That is, left_expr is an Item_singlerow_subselect. When one calls left_expr->val_XXX() (provided the subquery returns a single value, not tuple) , the subquery gets re-executed. However, left_expr->element_index() is an Item_cache(Item_field). So, if you just call left_expr->element_index(0)->val_int(), subquery is not executed, and you don't get the new value.

According to Sanja, this is a common problem for "row-like" items, and the convention is to call row_item->bring_value() before calling elements of the row_item.

We didn't have this problem with ref access before, because row items are not used for ref access. (Except for sj-materialization)

Comment by Sergei Petrunia [ 2014-10-15 ]

This patch fixes the problem and passes the testsuite. It's not very elegant, though: subquery code in the middle of eq_ref access code:

=== modified file 'sql/sql_select.cc'
--- sql/sql_select.cc   2014-10-14 11:11:06 +0000
+++ sql/sql_select.cc   2014-10-14 22:20:05 +0000
@@ -17517,8 +17517,18 @@ int join_read_key2(THD *thd, JOIN_TAB *t
       return 1;
     }
   }
-
+  
+  /*
+    The following is needed when one makes ref (or eq_ref) access from row
+    comparisons: one must call row->bring_value() to get the new values.
+  */
+  if (tab && tab->bush_children)
+  {
+    TABLE_LIST *emb_sj_nest= tab->bush_children->start->emb_sj_nest;
+    emb_sj_nest->sj_subq_pred->left_expr->bring_value();
+  }
   /* TODO: Why don't we do "Late NULLs Filtering" here? */
+
   if (cmp_buffer_with_ref(thd, table, table_ref) ||
       (table->status & (STATUS_GARBAGE | STATUS_NO_PARENT | STATUS_NULL_ROW)))
   {

Generated at Thu Feb 08 07:15:07 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.