1. Create a function, which selects data from any table. It is not enough to just return some constant value.
2. Create a view, which uses the function created in step 1.
3. Create a procedure, which runs for-loop. This for loop must query view created on step 2. However, it should not use any columns returned from the query, but a constant, eg. 1, for each row.
4. Call procedure twice from the same connection. First run success normally. The second one crashes the server without any explanation.
Minimal Reproducible Example is on attachment server_crash.sql.
All these details seem to be required to cause the server to crash. Interestingly enough, dropping some of them causes a different issue. If I select any value on cursor definition or skip the view and call the function directly, the first run results an error Table 'test_table' doesn't exist. Example of this is on file table_does_not_exist.sql.
The problem is that we try to iterarte tables till last "own" table (thd->lex->first_not_own_table()) which was created in sp_head::add_used_tables_to_table_list but somehow missed on the second execution in the list of all tables (removed? replaced?).
Oleksandr Byelkin
added a comment - The problem is that we try to iterarte tables till last "own" table (thd->lex->first_not_own_table()) which was created in sp_head::add_used_tables_to_table_list but somehow missed on the second execution in the list of all tables (removed? replaced?).
it looks like open cursor do not do correct prelocking to make prelock slots (the error in first run and crash in the second)
Oleksandr Byelkin
added a comment - - edited it looks like open cursor do not do correct prelocking to make prelock slots (the error in first run and crash in the second)
sp_instr_cursor_copy_struct removes prelocking tables from list and then sp_instr_copen trys to use prelocking tables.
Oleksandr Byelkin
added a comment - sp_instr_cursor_copy_struct removes prelocking tables from list and then sp_instr_copen trys to use prelocking tables.
Usually when we execute instruction, prelock-tables added to the tail of table list and then removed (with savimg its address to add them back and remove on re-execution)
the specific of the cursor is that it use 2 instructions instead of 1:
sp_instr_cursor_copy_struct
sp_instr_copen
sp_instr_cursor_copy_struct uses temporary cursor to fill some data, it add prelocking tables and ten remove them which should roll back everything, but the last table add there left in Query_tables_list structure.
When sp_instr_copen add prelock-tables to the list it uses adress of the last table in Query_tables_list structure changed by sp_instr_cursor_copy_struct, and so add the tables in the wrong place. and real table list is left without added tables which lead to all problem we have including crash on attempt to access NULL pointer on unexisting "prelock-tables-tail".
Oleksandr Byelkin
added a comment - Usually when we execute instruction, prelock-tables added to the tail of table list and then removed (with savimg its address to add them back and remove on re-execution)
the specific of the cursor is that it use 2 instructions instead of 1:
sp_instr_cursor_copy_struct
sp_instr_copen
sp_instr_cursor_copy_struct uses temporary cursor to fill some data, it add prelocking tables and ten remove them which should roll back everything, but the last table add there left in Query_tables_list structure.
When sp_instr_copen add prelock-tables to the list it uses adress of the last table in Query_tables_list structure changed by sp_instr_cursor_copy_struct, and so add the tables in the wrong place. and real table list is left without added tables which lead to all problem we have including crash on attempt to access NULL pointer on unexisting "prelock-tables-tail".
on debug build: fails a debug assertion on the first execution of the SP, which didn't happen before,
on non-debug build: fails same way as it did before, crashing in check_grant.
Also, it's not exactly specific to a view – as often happens, the view in that test case can be replaced with a subquery with the same effect, as in
declare _cur cursorforselect 1 from (select get_name(id) from t1) sq;
So the patch may indeed need further improvement.
Elena Stepanova
added a comment - - edited With the patch the test case with views
on debug build: fails a debug assertion on the first execution of the SP, which didn't happen before,
on non-debug build: fails same way as it did before, crashing in check_grant.
Also, it's not exactly specific to a view – as often happens, the view in that test case can be replaced with a subquery with the same effect, as in
declare _cur cursor for select 1 from ( select get_name(id) from t1) sq;
So the patch may indeed need further improvement.
JFYI rewritten without FOR it still cause error where should not (but does not crash)
CREATE TABLE t1 ( id int, name varchar(24));
INSERT INTO t1 values (1, 'x'), (2, 'y'), (3, 'z');
create function get_name(_id int) returns varchar(24)
return (select name from t1 where id = _id);
select get_name(id) from t1;
delimiter ^^;
create procedure test_proc()
begin
DECLARE done INT DEFAULT FALSE;
DECLARE rec ROW TYPE OF _cur;
declare _cur cursor for select get_name(id) from t1;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
open _cur;
read_loop: LOOP
fetch _cur into rec;
IF done THEN
LEAVE read_loop;
END IF;
select 1;
end LOOP;
end;
^^
delimiter ;^^
--error 1146
call test_proc();
--error 1146
call test_proc();
Oleksandr Byelkin
added a comment - JFYI rewritten without FOR it still cause error where should not (but does not crash)
CREATE TABLE t1 ( id int, name varchar(24));
INSERT INTO t1 values (1, 'x'), (2, 'y'), (3, 'z');
create function get_name(_id int) returns varchar(24)
return (select name from t1 where id = _id);
select get_name(id) from t1;
delimiter ^^;
create procedure test_proc()
begin
DECLARE done INT DEFAULT FALSE;
DECLARE rec ROW TYPE OF _cur;
declare _cur cursor for select get_name(id) from t1;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
open _cur;
read_loop: LOOP
fetch _cur into rec;
IF done THEN
LEAVE read_loop;
END IF;
select 1;
end LOOP;
end;
^^
delimiter ;^^
--error 1146
call test_proc();
--error 1146
call test_proc();
My first patch does not work with views because view also add tables. So I come up with one lined fix which work with views.
Oleksandr Byelkin
added a comment - My first patch does not work with views because view also add tables. So I come up with one lined fix which work with views.
MDEV-26009 Server crash when calling twice procedure using FOR-loop
The problem was that instructions sp_instr_cursor_copy_struct and
sp_instr_copen uses the same lex, adding and removing "tail" of
prelocked tables and forgetting that tail of all tables is kept in
LEX::query_tables_last. If the LEX used only by one instruction
or the query do not have prelocked tables it is not important.
But to work correctly in all cases LEX::query_tables_last should
be reset to make new tables added in the correct list (after last
table in the LEX instead after last table of the prelocking "tail"
which was cut).
Oleksandr Byelkin
added a comment -
commit d8e915a88fcdebd0b42f10b2f6d2165296056dd1 (HEAD -> bb-10.3-MDEV-26009, origin/bb-10.3-MDEV-26009)
Author: Oleksandr Byelkin <sanja@mariadb.com>
Date: Fri Mar 18 11:13:09 2022 +0100
MDEV-26009 Server crash when calling twice procedure using FOR-loop
The problem was that instructions sp_instr_cursor_copy_struct and
sp_instr_copen uses the same lex, adding and removing "tail" of
prelocked tables and forgetting that tail of all tables is kept in
LEX::query_tables_last. If the LEX used only by one instruction
or the query do not have prelocked tables it is not important.
But to work correctly in all cases LEX::query_tables_last should
be reset to make new tables added in the correct list (after last
table in the LEX instead after last table of the prelocking "tail"
which was cut).
Also reproducible on 10.6.2 (Docker).