[MDEV-6231] Wrong result (extra rows) via cursor in a stored procedure Created: 2014-05-12  Updated: 2017-10-11  Resolved: 2017-10-11

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Affects Version/s: 5.5.37, 10.0.11
Fix Version/s: 5.5.49

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Oleksandr Byelkin
Resolution: Cannot Reproduce Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-287 CHEAP SQ: A query with subquery in SE... Closed
Sprint: 5.5.44, 5.5.58

 Description   

Initially reported as http://bugs.skysql.com/show_bug.cgi?id=429

Test case:

DROP TABLE IF EXISTS test_a, test_b, test_result;
DROP PROCEDURE IF EXISTS sp_test;
 
CREATE TABLE test_a (num INT);
CREATE TABLE test_b (num INT);
CREATE TABLE test_result (num INT, cnt INT);
 
--delimiter |
CREATE PROCEDURE sp_test() 
BEGIN
DECLARE v_num  INT;
DECLARE v_done   INT DEFAULT 0;
 
/* Declare cursors */
DECLARE v_cursor CURSOR FOR
    SELECT A.num
    FROM   test_a A
    WHERE  ( SELECT COUNT(*)    FROM   test_b    WHERE  num = A.num ) = 0;
 
DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_done = 1;
 
START TRANSACTION;
 
OPEN v_cursor;
    DELETE FROM test_result;
v_loop: LOOP
 
    FETCH v_cursor INTO v_num;
    /* No more rows*/
    IF v_done = 1 THEN
        LEAVE v_loop;
    END IF;
 
    INSERT INTO test_result VALUES (v_num, (SELECT count(*) FROM test_b where num = v_num));
 
END LOOP v_loop;
CLOSE v_cursor;
COMMIT;
    SELECT * FROM test_result;
END |
--delimiter ;
 
INSERT INTO test_a VALUES (1),(2),(3),(4),(5);
INSERT INTO test_b VALUES (2),(3);
 
CALL sp_test();
CALL sp_test();
 
DROP TABLE IF EXISTS test_a, test_b, test_result;
DROP PROCEDURE IF EXISTS sp_test;

Actual result:

CALL sp_test();
num     cnt
1       0
4       0
5       0
CALL sp_test();
num     cnt
1       0
2       1
3       1
4       0
5       0

Expected result:

CALL sp_test();
num     cnt
1       0
4       0
5       0
CALL sp_test();
num     cnt
1       0
4       0
5       0

The problem appeared in 5.5 tree with the following revision:

            revno: 3402.1.4
            revision-id: timour@askmonty.org-20120529211853-hww47vl7d4u4ae23
            parent: timour@askmonty.org-20120524110828-r0mm8sm1vn8a095e
            committer: timour@askmonty.org
            branch nick: 5.5-lpb944706
            timestamp: Wed 2012-05-30 00:18:53 +0300
            message:
              Patch for mdev-287: CHEAP SQ: A query with subquery in SELECT list, EXISTS, inner joins takes hundreds t
imes longer
              
              Analysis:
              
              The fix for lp:944706 introduces early subquery optimization.
              While a subquery is being optimized some of its predicates may be
              removed. In the test case, the EXISTS subquery is constant, and is
              evaluated to TRUE. As a result the whole OR is TRUE, and thus the
              correlated condition "b = alias1.b" is optimized away. The subquery
              becomes non-correlated.
              
              The subquery cache is designed to work only for correlated subqueries.
              If constant subquery optimization is disallowed, then the constant
              subquery is not evaluated, the subquery remains correlated, and its
              execution is cached. As a result execution is fast.
              
              However, when the constant subquery was optimized away, it was neither
              cached by the subquery cache, nor it was cached by the internal subquery
              caching. The latter was due to the fact that the subquery still appeared
              as correlated to the subselect_XYZ_engine::exec methods, and they
              re-executed the subquery on each call to Item_subselect::exec.
              
             Solution:
              
              The solution is to update the correlated status of the subquery after it has
              been optimized. This status consists of:
              - st_select_lex::is_correlated
              - Item_subselect::is_correlated
              - SELECT_LEX::uncacheable
              - SELECT_LEX_UNIT::uncacheable
              The status is updated by st_select_lex::update_correlated_cache(), and its
              caller st_select_lex::optimize_unflattened_subqueries. The solution relies
              on the fact that the optimizer already called
              st_select_lex::update_used_tables() for each subquery. This allows to
              efficiently update the correlated status of each subquery without walking
              the whole subquery tree.
              
              Notice that his patch is an improvement over MySQL 5.6 and older, where
              subqueries are not pre-optimized, and the above analysis is not possible.



 Comments   
Comment by shin [ 2014-05-15 ]

Is this bug will be resolved on 5.5.3.8 ?
And also this bug is occured on MariaDB 10.

Comment by Sergei Petrunia [ 2014-05-29 ]

Debugging, one can see that st_select_lex::update_correlated_cache() changes the subquery from correlated to uncorrelated on the second CALL statement.

(gdb) p dbug_print_item(sl->master_unit()->item)
$173 = 0x152da20 "(select count(0) from j1.test_b where (j1.A.num = j1.test_b.num))"

Note: making PS instead of cursor and re-running it produces a correct result each time.

Comment by Sergei Petrunia [ 2014-05-29 ]

When we are in st_select_lex::update_correlated_cache() during the first CALL:

(gdb) p ((Item*)join->conds)
  $230 = (Item_func_eq *) 0x7fffcd0f2960
(gdb) p dbug_print_item(join->conds)
  $231 = 0x152da20 "(`j1`.`test_b`.`num` = `j1`.`A`.`num`)"
(gdb) p/x join->conds->used_tables()
  $232 = 0x4000000000000001

Everything is OK: the WHERE condition contains a reference to outside. This
causes the subquery to be correlated.

Some details about the condition:

(gdb) p ((Item*)join->conds)->args[0]
  $233 = (Item_field *) 0x7fffcd0efdd0
(gdb) p dbug_print_item(((Item*)join->conds)->args[0])
  $234 = 0x152da20 "`j1`.`test_b`.`num`"
(gdb) p (((Item*)join->conds)->args[0])->used_tables()
  $235 = 1
(gdb)  
(gdb) p ((Item*)join->conds)->args[1]
  $236 = (Item_field *) 0x7fffcd0efed0
(gdb) p dbug_print_item(((Item*)join->conds)->args[1])
  $237 = 0x152da20 "`j1`.`A`.`num`"
(gdb) p/x (((Item*)join->conds)->args[1])->used_tables()
  $239 = 0x4000000000000000

(I am a little bit surprised that there is an Item_field that has
used_tables()=OUTER_REF_TABLE_BIT. Shouldn't it be an Item_ref?)

Comment by Sergei Petrunia [ 2014-05-29 ]

When we are in st_select_lex::update_correlated_cache() during the second CALL:

(gdb) p/x join->conds->used_tables()
  $217 = 0x1
(gdb) p dbug_print_item(join->conds)
  $218 = 0x152da20 "(j1.A.num = j1.test_b.num)"

The condition's used_tables() doesn't include OUTER_REF_TABLE_BIT. This is
incorrect.

Let's explore the details:

(gdb) p ((Item*)join->conds)->args[0]
  $221 = (Item_field *) 0x7fffcd0efed0
(gdb) p dbug_print_item(((Item*)join->conds)->args[0])
  $222 = 0x152da20 "j1.A.num"
(gdb) p (((Item*)join->conds)->args[0])->used_tables()
  $223 = 1
 
(gdb) p ((Item*)join->conds)->args[1]
  $224 = (Item_field *) 0x7fffcd0efdd0
(gdb) p dbug_print_item(((Item*)join->conds)->args[1])
  $225 = 0x152da20 "j1.test_b.num"
(gdb) p (((Item*)join->conds)->args[1])->used_tables()
  $226 = 1

Comment by Oleksandr Byelkin [ 2017-10-11 ]

It looks like fixed, checking when:

REATE TABLE test_a (num INT);
CREATE TABLE test_b (num INT);
CREATE TABLE test_result (num INT, cnt INT);
CREATE PROCEDURE sp_test() 
BEGIN
DECLARE v_num  INT;
DECLARE v_done   INT DEFAULT 0;
/* Declare cursors */
DECLARE v_cursor CURSOR FOR
SELECT A.num
FROM   test_a A
WHERE  ( SELECT COUNT(*)    FROM   test_b    WHERE  num = A.num ) = 0;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_done = 1;
START TRANSACTION;
OPEN v_cursor;
DELETE FROM test_result;
v_loop: LOOP
FETCH v_cursor INTO v_num;
/* No more rows*/
IF v_done = 1 THEN
LEAVE v_loop;
END IF;
INSERT INTO test_result VALUES (v_num, (SELECT count(*) FROM test_b where num = v_num));
END LOOP v_loop;
CLOSE v_cursor;
COMMIT;
SELECT * FROM test_result;
END |
INSERT INTO test_a VALUES (1),(2),(3),(4),(5);
INSERT INTO test_b VALUES (2),(3);
CALL sp_test();
num	cnt
1	0
4	0
5	0
CALL sp_test();
num	cnt
1	0
4	0
5	0
DROP TABLE test_a, test_b, test_result;
DROP PROCEDURE sp_test;

Comment by Oleksandr Byelkin [ 2017-10-11 ]

It was fixed somewhere before 5.5.49 (can't build earlier version)

Generated at Thu Feb 08 07:10:20 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.