[MDEV-30081] Crash with splitting from constant mergeable derived table Created: 2022-11-23  Updated: 2023-03-13  Resolved: 2023-01-25

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9
Fix Version/s: 10.11.2, 10.3.38, 10.4.28, 10.5.19, 10.6.12, 10.7.8, 10.8.7, 10.9.5, 10.10.3

Type: Bug Priority: Critical
Reporter: Kyle Hutchinson Assignee: Igor Babaev
Resolution: Fixed Votes: 1
Labels: not-10.10, split_materialized
Environment:

CentOS 7


Attachments: File CS0457258.err     File MDEV-30081.test     File schema.sql    
Issue Links:
Relates
relates to MDEV-27748 Split_materialized: Crash in best_acc... Closed
relates to MDEV-28901 SIGSEGV in optimize_keyuse on INSERT Closed
relates to MDEV-30711 The mysqld service core occurs when t... Confirmed

 Description   

Select from complex view using where clause gets SIGNAL 11 crash with derived_merge=on in versions 10.6 - 10.9. The issue does not happen in 10.5 or 10.10. If derived_merge=off then the issue does not happen. If derived_merge=on and there is no where clause used against the view, the issue does not happen. The statement below causes the issue every time when derived_merge=on.

A customer reported this and does not want to have to turn derived_merge off because they don't want to lose out on query performance. They noticed the issue after upgrading from 10.5 to 10.6. I tested out the other versions to verify the scope of the issue.
Recreation steps:

  1. Load the attached sql script to create the schema, and populate it with data
  2. Run the below select statement
    This should return

    ERROR 2013 (HY000): Lost connection to server during query
    

    In the error log it reports a SIGNAL 11.

SELECT this_.WORKFLOW_ID as y0_, this_.WORKFLOW_TYPE_ID as y1_, this_.COMPLETED_BY as y2_, this_.STATUS as y3_, this_.WORKFLOW as y4_, this_.WORKFLOW_NAME as y5_ 
FROM ABC_CAM_VW_WORKFLOW_TAB this_ 
WHERE this_.WORKFLOW_ID = 7 
ORDER BY y0_ ASC 
LIMIT 1; 



 Comments   
Comment by Alice Sherepa [ 2022-11-25 ]

Could you please attach the error log?
There is some problem with the query - PCS_CAM_VW_WORKFLOW_TAB does not exist (should it be ABC_CAM_VW_WORKFLOW_TAB?) and no backticks ?

Comment by Alice Sherepa [ 2022-11-28 ]

Thank you!
I repeated on 10.6-10.9 with InnoDB, no crash on 10.10.

 
--source include/have_innodb.inc 
 
CREATE TABLE t1 ( id int, id1 int, PRIMARY KEY (id), KEY id1 (id1)) engine=innodb;
insert into t1 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t2 ( id int PRIMARY KEY) engine=innodb;
insert into t2 values (3),(4),(7);
 
CREATE TABLE t3 ( id int, abc int, PRIMARY KEY (id), KEY (abc)) engine=innodb;
insert into t3 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t4 ( id int, abc int, PRIMARY KEY (id)) engine=innodb;
insert into t4 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t5 ( id int, abc int, PRIMARY KEY (id), KEY abc (abc) ) engine=innodb;
insert into t5 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t6 ( id int, abc int, PRIMARY KEY (id), KEY abc (abc) ) engine=innodb;
insert into t6 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t7 ( id int, wid int, PRIMARY KEY (id), KEY (wid) ) engine=innodb;
insert into t7 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t8 ( id int, wid int, PRIMARY KEY (id), KEY wid (wid)) engine=innodb;
insert into t8 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t9 ( id int, abc int, wid int, PRIMARY KEY (id), KEY (wid), KEY id (id)) engine=innodb;
insert into t9 values (4,4,6),(7,7,7);
 
CREATE TABLE t10 ( id int, abc int, PRIMARY KEY (id), KEY (abc) ) engine=innodb;
insert into t10 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t11 ( id int, id1 int, PRIMARY KEY (id), KEY id1 (id1) ) engine=innodb;
insert into t11 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t12 ( id int PRIMARY KEY) engine=innodb;
insert into t12 values (3),(4),(7);
 
CREATE TABLE t13 ( id int,  id1 int, wid int, PRIMARY KEY (id), KEY (id1), KEY (wid)) engine=innodb;
insert into t13 values (4,4,6),(7,7,7);
 
CREATE TABLE t14 ( id int PRIMARY KEY ) engine=innodb;
insert into t14 values (3),(4),(7);
 
CREATE TABLE t15 ( id int, abc int, PRIMARY KEY (id), KEY (abc) ) engine=innodb;
insert into t15 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t16 ( id int, id1 int, PRIMARY KEY (id), KEY id1 (id1)) engine=innodb;
insert into t16 values (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t17 ( id int PRIMARY KEY) engine=innodb;
insert into t17 values (3),(4),(7);
 
CREATE TABLE t18 ( wtid int) engine=innodb;
insert into t18 values (3),(4),(7);
 
CREATE TABLE t19 ( wtid int PRIMARY KEY) engine=innodb;
insert into t19 values (1),(2),(7);
 
CREATE TABLE t20 ( wid int, KEY wid (wid)) engine=innodb;
insert into t20 values (42),(13),(25),(5);
 
CREATE TABLE t21 ( wid int, wtid int, otid int, oid int, PRIMARY KEY (wid), KEY (wtid), KEY (otid), KEY (oid) ) engine=innodb;
insert into t21 values (6,30,6,6),(7,17,7,7);
 
create view cte as (SELECT id, id1 FROM t16 GROUP BY id1);
 
select * from (
 SELECT t21.*
 FROM (((((((((((((((((((((t21 
                           JOIN t14 on(t21.oid = t14.id))
                          JOIN t15 on(t21.oid = t15.abc))
                         JOIN t17  on(t21.oid = t17.id))
                        JOIN t19  on(t21.wtid = t19.wtid))
                       JOIN t20 on(t21.wid = t20.wid))
                      JOIN t6 on(t21.oid = t6.id))
                     JOIN t5  on(t21.oid = t5.id))
                    JOIN t10  on(t21.oid = t10.id))
                   JOIN t8  on(t21.wid = t8.wid))
                  JOIN t11  on(t21.oid = t11.id))
                 JOIN t1  on(t21.oid = t1.id ))
                JOIN t13  on(t21.oid = t13.id))
               JOIN t9  on(t21.oid = t9.id))
              JOIN t4  on(t21.oid = t4.id))
            LEFT JOIN (SELECT t3.* FROM ((t3 JOIN t14 on(t3.abc = t14.id)) )) dt on(t21.oid = dt.id AND t21.otid = 14))
            JOIN t2  on(t21.oid = t2.id))
           JOIN t12  on(t21.oid = t12.id))
          JOIN t7  on(t7.wid = t21.wid))
         JOIN cte  on(cte.id1 = t14.id))
       LEFT JOIN cte cte_1 on(cte_1.id1 = dt.abc))
      LEFT JOIN cte cte_2 on(cte_2.id1 = t13.id1)) ) dt1
WHERE dt1.wid = 7;

Version: '10.6.12-MariaDB-debug-log'  
221128 16:42:11 [ERROR] mysqld got signal 11 ;
 
Server version: 10.6.12-MariaDB-debug-log
 
sql/signal_handler.cc:236(handle_fatal_signal)[0x55f9df511a22]
sigaction.c:0(__restore_rt)[0x7f5f4637e420]
sql/sql_select.cc:7360(optimize_keyuse(JOIN*, st_dynamic_array*))[0x55f9ded9225e]
sql/opt_split.cc:776(JOIN::add_keyuses_for_splitting())[0x55f9df33e8c2]
sql/opt_split.cc:802(st_join_table::add_keyuses_for_splitting())[0x55f9df33ed34]
sql/sql_select.cc:5785(make_join_statistics(JOIN*, List<TABLE_LIST>&, st_dynamic_array*))[0x55f9ded83b8d]
sql/sql_select.cc:2506(JOIN::optimize_inner())[0x55f9ded620c9]
sql/sql_select.cc:1845(JOIN::optimize())[0x55f9ded5b11f]
sql/sql_select.cc:5039(mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x55f9ded7caa6]
sql/sql_select.cc:555(handle_select(THD*, LEX*, select_result*, unsigned long))[0x55f9ded4cb01]
sql/sql_parse.cc:6256(execute_sqlcom_select(THD*, TABLE_LIST*))[0x55f9decb0d3c]
sql/sql_parse.cc:3946(mysql_execute_command(THD*, bool))[0x55f9dec9f8ea]
sql/sql_parse.cc:8016(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x55f9decbbf97]
sql/sql_parse.cc:1898(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool))[0x55f9dec92196]
sql/sql_parse.cc:1409(do_command(THD*, bool))[0x55f9dec8eeed]
sql/sql_connect.cc:1415(do_handle_one_connection(CONNECT*, bool))[0x55f9df0fdb85]
sql/sql_connect.cc:1319(handle_one_connection)[0x55f9df0fd4e2]
perfschema/pfs.cc:2203(pfs_spawn_thread)[0x55f9dfd799fc]
nptl/pthread_create.c:478(start_thread)[0x7f5f46372609]
??:0(clone)[0x7f5f45f43133]
 
 
Query (0x62b0000c42a8): select * from (
SELECT t21.*
FROM (((((((((((((((((((((t21 
JOIN t14 on(t21.oid = t14.id))
JOIN t15 on(t21.oid = t15.abc))
JOIN t17  on(t21.oid = t17.id))
JOIN t19  on(t21.wtid = t19.wtid))
JOIN t20 on(t21.wid = t20.wid))
JOIN t6 on(t21.oid = t6.id))
JOIN t5  on(t21.oid = t5.id))
JOIN t10  on(t21.oid = t10.id))
JOIN t8  on(t21.wid = t8.wid))
JOIN t11  on(t21.oid = t11.id))
JOIN t1  on(t21.oid = t1.id ))
JOIN t13  on(t21.oid = t13.id))
JOIN t9  on(t21.oid = t9.id))
JOIN t4  on(t21.oid = t4.id))
LEFT JOIN (SELECT t3.* FROM ((t3 JOIN t14 on(t3.abc = t14.id)) )) dt on(t21.oid = dt.id AND t21.otid = 14))
JOIN t2  on(t21.oid = t2.id))
JOIN t12  on(t21.oid = t12.id))
JOIN t7  on(t7.wid = t21.wid))
JOIN cte  on(cte.id1 = t14.id))
LEFT JOIN cte cte_1 on(cte_1.id1 = dt.abc))
LEFT JOIN cte cte_2 on(cte_2.id1 = t13.id1)) ) dt1
WHERE dt1.wid = 7

Comment by Alice Sherepa [ 2022-11-28 ]

During simplification, I've got a version of the test, that fails on 10.4-10.9, please find MDEV-30081.sql attached
I guess it is also the same as MDEV-27748 (closed as cannot reproduced).
on 10.4-10.5 if fails with sig11:

10.4 8535189f32b8265bb0fa65

Version: '10.4.28-MariaDB-debug-log' 
221128 16:52:27 [ERROR] mysqld got signal 11 ;
 
 
Server version: 10.4.28-MariaDB-debug-log
 
sql/signal_handler.cc:232(handle_fatal_signal)[0x56264483bb27]
sigaction.c:0(__restore_rt)[0x7ff591398420]
sql/sql_select.cc:7088(optimize_keyuse(JOIN*, st_dynamic_array*))[0x5626441807b6]
sql/opt_split.cc:777(JOIN::add_keyuses_for_splitting())[0x5626446719af]
sql/opt_split.cc:803(st_join_table::add_keyuses_for_splitting())[0x562644671e24]
sql/sql_select.cc:5518(make_join_statistics(JOIN*, List<TABLE_LIST>&, st_dynamic_array*))[0x562644171e90]
sql/sql_select.cc:2342(JOIN::optimize_inner())[0x562644150f4e]
sql/sql_select.cc:1685(JOIN::optimize())[0x562644149f65]
sql/sql_select.cc:4781(mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x56264416b021]
sql/sql_select.cc:437(handle_select(THD*, LEX*, select_result*, unsigned long))[0x56264413be26]
sql/sql_parse.cc:6450(execute_sqlcom_select(THD*, TABLE_LIST*))[0x5626440a9f50]
sql/sql_parse.cc:3964(mysql_execute_command(THD*))[0x56264409787d]
sql/sql_parse.cc:7982(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x5626440b332f]
sql/sql_parse.cc:1860(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x562644089c08]
sql/sql_parse.cc:1378(do_command(THD*))[0x562644086708]
sql/sql_connect.cc:1419(do_handle_one_connection(CONNECT*))[0x562644488225]
sql/sql_connect.cc:1324(handle_one_connection)[0x562644487ac9]
perfschema/pfs.cc:1871(pfs_spawn_thread)[0x562645112b7b]
nptl/pthread_create.c:478(start_thread)[0x7ff59138c609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7ff590f5d133]
 
Query (0x62b0000a1290): SELECT *
FROM v1 
WHERE v1.workflow_id = 7
LIMIT 1

on 10.6-10.9 - Assertion `(key_dependent & s->key_dependent) == key_dependent' in best_access_path

Version: '10.6.12-MariaDB-debug-log' 
mariadbd: /10.6/src/sql/sql_select.cc:8324: void best_access_path(JOIN*, JOIN_TAB*, table_map, const POSITION*, uint, bool, double, POSITION*, POSITION*): Assertion `(key_dependent & s->key_dependent) == key_dependent' failed.
221128 16:58:01 [ERROR] mysqld got signal 6 ;
 
Server version: 10.6.12-MariaDB-debug-log
 
??:0(__assert_fail)[0x7fccd7d66fd6]
sql/sql_select.cc:8334(best_access_path(JOIN*, st_join_table*, unsigned long long, POSITION const*, unsigned int, bool, double, POSITION*, POSITION*))[0x561dac57670d]
sql/sql_select.cc:10100(best_extension_by_limited_search(JOIN*, unsigned long long, unsigned int, double, double, unsigned int, unsigned int, unsigned int))[0x561dac580258]
sql/sql_select.cc:9241(greedy_search(JOIN*, unsigned long long, unsigned int, unsigned int, unsigned int))[0x561dac57bf3d]
sql/sql_select.cc:8800(choose_plan(JOIN*, unsigned long long))[0x561dac579c8f]
sql/opt_split.cc:982(st_join_table::choose_best_splitting(double, unsigned long long))[0x561dacb1eb42]
sql/sql_select.cc:7772(best_access_path(JOIN*, st_join_table*, unsigned long long, POSITION const*, unsigned int, bool, double, POSITION*, POSITION*))[0x561dac57318c]
sql/sql_select.cc:10100(best_extension_by_limited_search(JOIN*, unsigned long long, unsigned int, double, double, unsigned int, unsigned int, unsigned int))[0x561dac580258]
sql/sql_select.cc:10204(best_extension_by_limited_search(JOIN*, unsigned long long, unsigned int, double, double, unsigned int, unsigned int, unsigned int))[0x561dac580da8]
sql/sql_select.cc:10204(best_extension_by_limited_search(JOIN*, unsigned long long, unsigned int, double, double, unsigned int, unsigned int, unsigned int))[0x561dac580da8]
sql/sql_select.cc:10204(best_extension_by_limited_search(JOIN*, unsigned long long, unsigned int, double, double, unsigned int, unsigned int, unsigned int))[0x561dac580da8]
sql/sql_select.cc:9241(greedy_search(JOIN*, unsigned long long, unsigned int, unsigned int, unsigned int))[0x561dac57bf3d]
sql/sql_select.cc:8800(choose_plan(JOIN*, unsigned long long))[0x561dac579c8f]
sql/sql_select.cc:5947(make_join_statistics(JOIN*, List<TABLE_LIST>&, st_dynamic_array*))[0x561dac564bf1]
sql/sql_select.cc:2506(JOIN::optimize_inner())[0x561dac5410c9]
sql/sql_select.cc:1845(JOIN::optimize())[0x561dac53a11f]
sql/sql_select.cc:5039(mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x561dac55baa6]
sql/sql_select.cc:555(handle_select(THD*, LEX*, select_result*, unsigned long))[0x561dac52bb01]
sql/sql_parse.cc:6256(execute_sqlcom_select(THD*, TABLE_LIST*))[0x561dac48fd3c]
sql/sql_parse.cc:3946(mysql_execute_command(THD*, bool))[0x561dac47e8ea]
sql/sql_parse.cc:8016(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x561dac49af97]
sql/sql_parse.cc:1898(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool))[0x561dac471196]
sql/sql_parse.cc:1409(do_command(THD*, bool))[0x561dac46deed]
sql/sql_connect.cc:1415(do_handle_one_connection(CONNECT*, bool))[0x561dac8dcb85]
sql/sql_connect.cc:1319(handle_one_connection)[0x561dac8dc4e2]
perfschema/pfs.cc:2203(pfs_spawn_thread)[0x561dad5589fc]
nptl/pthread_create.c:478(start_thread)[0x7fccd8281609]
??:0(clone)[0x7fccd7e52133]
 
Query (0x62b0000c42a8): SELECT *
FROM v1 
WHERE v1.workflow_id = 7
LIMIT 1

Comment by Rex Johnston [ 2022-12-30 ]

There seems to be 2 issues here, both related to table-map handling.

Comment by Rex Johnston [ 2023-01-04 ]

I've found the backport needed fix this issue.

Comment by Sergei Petrunia [ 2023-01-10 ]

The problem with backporting the entire opt_table_elimination.cc file is that you're backporting the MDEV-26278, which is a feature in 10.10... I don't think we should do that.

Comment by Rex Johnston [ 2023-01-11 ]

Hi Sergei. Explanation is in the git commit.

Comment by Sergei Petrunia [ 2023-01-12 ]

Interesting, I thought the first variant of the fix happened to work because it backported MDEV-26278, but now I see that this bug's original testcase doesn't fail on 10.11 and does fail on 10.6 regardless of the optimizer_switch='table_elimination=ON|OFF' setting.

Comment by Sergei Petrunia [ 2023-01-12 ]

Looking at the latest patch, commit dc4a9457487be4b0d5ce8af6ccd32a88dc2fff2a...

This condition looks odd:

        if (tablenr >= join->table_count)
        {
          keyuse->ref_table_rows=100;
        }

Debugging...
Indeed, in optimize_keyuse() we end up with:

join->table_count=1
tablenr=15

and join->table[tablenr] points to garbage data.

tablenr=15, because

keyuse->used_tables == keyuse->val->used_tables() == (1 << 15) 

Avoiding the crash by adding if (tablenr >= join->table_count) is not a good solution. We have an Item and KEYUSE objects with invalid attributes. Need to investigate this and fix.

Comment by Sergei Petrunia [ 2023-01-12 ]

The crash happens here:

  #0  optimize_keyuse (join=0x7fff7caac680, keyuse_array=0x7fff7caac9a0) at /home/psergey/dev-git2/10.6-fix/sql/sql_select.cc:7357
  #1  0x000055555616093d in JOIN::add_keyuses_for_splitting (this=0x7fff7caac680) at /home/psergey/dev-git2/10.6-fix/sql/opt_split.cc:774
  #2  0x0000555556160af5 in st_join_table::add_keyuses_for_splitting (this=0x7fff7cac06b8) at /home/psergey/dev-git2/10.6-fix/sql/opt_split.cc:801
  #3  0x0000555555f0d605 in make_join_statistics (join=0x7fff7caa3ad0, tables_list=..., keyuse_array=0x7fff7caa3df0) at /home/psergey/dev-git2/10.6-fix/sql/sql_select.cc:5776
  #4  0x0000555555f0186e in JOIN::optimize_inner (this=0x7fff7caa3ad0) at /home/psergey/dev-git2/10.6-fix/sql/sql_select.cc:2504
  #5  0x0000555555eff0fe in JOIN::optimize (this=0x7fff7caa3ad0) at /home/psergey/dev-git2/10.6-fix/sql/sql_select.cc:1843
  #6  0x0000555555f0acd7 in mysql_select (thd=0x7fff7c000d78, tables=0x7fff7ca8dec0, fields=..., conds=0x7fff7ca8e758, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select

Frames #3-#5: optimizing the top-level join.
Frame #2: the st_join_table represents table cte_1. We're adding KEYUSE objects for the split_materialized optimization.

The testcase has:

create view cte as (SELECT id, id1 FROM t16 GROUP BY id1);
...
       LEFT JOIN cte cte_1 on(cte_1.id1 = dt.abc))

Frame #0-#1: JOIN::add_keyuses_for_splitting() - here the JOIN object is the child join (this->select_lex->select_number=5)

There is only one keyuse object:

KEYUSE {
  table= t16
  key=1
  keypart=0
  used_tables=32768 = (1 << 15)
  val = Item_direct_view_ref( ref[0]=Item_field(t3.abc))
}

Observations on the above:

  • Generally, the contents agree with the view and query fragment posted..
  • The val->used_tables() (and keyuse->used_tables) is clearly wrong.
  • Why does it use Item_direct_view_ref? Shouldn't it use an Item_outer_ref ? (I'm not familiar with split_materialized code to tell)

A certain peculiarity: The referred-to item is a constant:

(gdb) p dbug_print_item(keyuse->val)
  $127 = 0x555557a549e0 <dbug_item_print_buf> "NULL"

The Item_field is a constant:

(gdb) p keyuse->val->ref[0]->used_tables()
  $125 = 0
(gdb) p keyuse->val->ref[0]->const_item()
  $126 = true 

Howver the Item_direct_view_ref object is not:

(gdb) p keyuse->val->const_item()
  $128 = false

Comment by Sergei Petrunia [ 2023-01-12 ]

For comparison, a random non-crashing testcase with Split-Materialized:

CREATE TABLE t1 (
 n1 int(10) NOT NULL,
 n2 int(10) NOT NULL,
 c1 char(1) NOT NULL,
 KEY c1 (c1),
 KEY n1_c1_n2 (n1,c1,n2)
) ENGINE=InnoDB;
INSERT INTO t1 VALUES (0, 2, 'a'), (1, 3, 'a');
insert into t1 select seq+1,seq+2,'c' from seq_1_to_1000;
ANALYZE TABLE t1;
create table t1i like t1;
insert into t1i select * from t1;
analyze table t1i persistent for all;
 
 
EXPLAIN SELECT t1.n1 FROM t1, (SELECT n1, n2 FROM t1i WHERE c1 = 'a' GROUP BY n1) as t WHERE t.n1 = t1.n1 AND t.n2 = t1.n2 AND c1 = 'a' GROUP BY n1;
+------+-----------------+------------+------+---------------+----------+---------+-------------------+------+---------------------------------------------------------------------+
| id   | select_type     | table      | type | possible_keys | key      | key_len | ref               | rows | Extra                                                               |
+------+-----------------+------------+------+---------------+----------+---------+-------------------+------+---------------------------------------------------------------------+
|    1 | PRIMARY         | t1         | ref  | c1,n1_c1_n2   | c1       | 1       | const             | 2    | Using index condition; Using where; Using temporary; Using filesort |
|    1 | PRIMARY         | <derived2> | ref  | key0          | key0     | 8       | j1.t1.n1,j1.t1.n2 | 2    |                                                                     |
|    2 | LATERAL DERIVED | t1i        | ref  | c1,n1_c1_n2   | n1_c1_n2 | 4       | j1.t1.n1          | 1    | Using where; Using index                                            |
+------+-----------------+------------+------+---------------+----------+---------+-------------------+------+---------------------------------------------------------------------+

(gdb) p dbug_print_item(keyuse->val)
  $207 = 0x555557a549e0 <dbug_item_print_buf> "t1.n1"
(gdb) p/x keyuse->val->used_tables()
  $209 = 0x4000000000000000
(gdb) p/x keyuse->used_tables
  $210 = 0x4000000000000000

used_tables returns OUTER_REF_TABLE_BIT, because:

(gdb) p keyuse->val->depended_from
  $212 = (st_select_lex *) 0x7fffb0017af0

Which is set in TABLE::add_splitting_info_for_key_field():

   if (left_item && right_item)
    {
      right_item->walk(&Item::set_fields_as_dependent_processor,
                       false, join->select_lex);

Comment by Sergei Petrunia [ 2023-01-12 ]

Getting back to the original testcase:

Looking how the code in TABLE::add_splitting_info_for_key_field() works for it:

    Item *right_item= key_field->val->build_clone(thd);

Here, key_field->val is a

 Item_direct_view_ref( ref[0]=Item_field(t3.abc))

and the clone method clones both Item_direct_view_ref and Item_field object.
Good so far.

      right_item->walk(&Item::set_fields_as_dependent_processor,
                       false, join->select_lex);

There is only one non-empty implementation of set_fields_as_dependent_processor:

class Item_field {
  bool set_fields_as_dependent_processor(void *arg) override
  {
    if (!(used_tables() & OUTER_REF_TABLE_BIT))
    {
      depended_from= (st_select_lex *) arg;
      item_equal= NULL;
    }
    return 0;
  }

It gets invoked.

Note that Item_direct_view_ref also has depended_from, it is defined in class Item_ident. But it was and remains NULL.

Comment by Sergei Petrunia [ 2023-01-12 ]

Take another look at:

  table_map Item_direct_view_ref::used_tables() const
  {
    DBUG_ASSERT(fixed());
    
9   if (get_depended_from())                                                                                                                          
      return OUTER_REF_TABLE_BIT;
  
    if (view->is_merged_derived() || view->merged || !view->table)
    {
      table_map used= (*ref)->used_tables();
=>    return (used ?                                                                                                                                  
              used :
              ((null_ref_table != NO_NULL_TABLE) ?
               null_ref_table->map :
               (table_map)0 ));
    }
    return view->table->map;
  } 

.
We have

  • get_depended_from()==NULL
  • view->is_merged_derived()==TRUE

However, as long as (*ref)->used_tables() returns OUTER_REF_TABLE_BIT, this function will return it.

But in this testcase, *ref becomes a constant item, with used_tables()=0.
We have (null_ref_table != NO_NULL_TABLE), so we return null_ref_table->map, even if it's a table from the outer select.

Comment by Sergei Petrunia [ 2023-01-12 ]

Variant 2 of the fix (totally different from Rex' patches): https://github.com/MariaDB/server/commit/c8e7288e7332962bdebc5ae658bbd839d0b26254

sanja please review.

Comment by Igor Babaev [ 2023-01-13 ]

Observation:
The crash can be easily reproduced on the current 10.3 (debug build) with the original test case from alice and with the following simplification of this test case:

CREATE TABLE t3 ( id int, abc int, PRIMARY KEY (id), KEY (abc) );
INSERT INTO t3 VALUES (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t13 (
  id int,  id1 int, wid int, PRIMARY KEY (id), KEY (id1), KEY (wid)
);
INSERT INTO t13 VALUES (4,4,6),(7,7,7);
 
CREATE TABLE t14 ( id int PRIMARY KEY );
INSERT INTO t14 VALUES (3),(4),(7);
 
CREATE TABLE t16 ( id int, id1 int, PRIMARY KEY (id), KEY id1 (id1));
INSERT INTO t16 VALUES (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t21 (
  wid int, wtid int, otid int, oid int,
  PRIMARY KEY (wid), KEY (wtid), KEY (otid), KEY (oid)
);
INSERT INTO t21 VALUES (6,30,6,6),(7,17,7,7);
 
CREATE VIEW v1 AS (SELECT id1 FROM t16 GROUP BY id1);
SELECT t21.*, t14.id AS t14_id, t13.id AS t13_id, dt.*, v1.*
FROM 
  t14, t13, t21
  LEFT JOIN
  (SELECT t3.* FROM t3 WHERE t3.abc=3) dt
  ON t21.oid = dt.id AND t21.otid = 14
  LEFT JOIN v1
  ON (v1.id1 = dt.abc)
WHERE t21.oid = t14.id AND t21.oid = t13.id AND t21.wid = 7;
 
DROP VIEW v1;
DROP TABLE t3,t13,t14,t16,t21;

The crash always happens in the function optimize_keyuse() when we try to build a plan with LATERAL derived tables. The crash happens in this code:

    if (keyuse->used_tables &
	(map= (keyuse->used_tables & ~join->const_table_map &
	       ~OUTER_REF_TABLE_BIT)))
    {
      uint n_tables= my_count_bits(map);
      if (n_tables == 1)			// Only one table
      {
        DBUG_ASSERT(!(map & PSEUDO_TABLE_BITS)); // Must be a real table
        Table_map_iterator it(map);
        int tablenr= it.next_bit();
        DBUG_ASSERT(tablenr != Table_map_iterator::BITMAP_END);
	TABLE *tmp_table=join->table[tablenr];
        if (tmp_table) // already created
          keyuse->ref_table_rows= MY_MAX(tmp_table->file->stats.records, 100);
      }
 }
 

in the line

          keyuse->ref_table_rows= MY_MAX(tmp_table->file->stats.records, 100);

when keyuse is the KEYUSE produced by the equality v1.id1 = dt.abc injected into v1. For such a keyuse the field 'used_tables should contain only OUTER_REF_TABLE_BIT. The crash happens when:

 
(gdb) p/x keyuse->used_tables
$134 = 0x8
(gdb) p tablenr
$136 = 3

and

join->table[tablenr]

points to some memory outside of the array join->table because v1 contains only one table.

Comment by Igor Babaev [ 2023-01-13 ]

Switching to MyISAM tables after running

analyze table t3,t13,t14,t16,t21;

causes a crash in the same line of code as for InnoDB tables.

Comment by Igor Babaev [ 2023-01-16 ]

For the injected equality v1.id1 = dt.abc keyuse->used_tables is set in add_ext_keyuse_for_splitting() in the line:

  keyuse_ext.used_tables= keyuse_ext.val->used_tables();

where keyuse_ext.val is the item for dt.abc. This item is of the class Item_direct_view_ref as dt is a mergeable derived table. So Item_direct_view_ref::used_tables()

table_map Item_direct_view_ref::used_tables() const
{
  DBUG_ASSERT(fixed);
 
  if (get_depended_from())
    return OUTER_REF_TABLE_BIT;
 
  if (view->is_merged_derived() || view->merged || !view->table)
  {
    table_map used= (*ref)->used_tables();
    return (used ?
            used :
            ((null_ref_table != NO_NULL_TABLE) ?
             null_ref_table->map :
             (table_map)0 ));
  }
  return view->table->map;
}

is called to get the value for keyuse_ext.used_tables.
As get_depended_from() is NULL we come to the line

    table_map used= (*ref)->used_tables();

and Item_field::used_tables() is called for the the field abc of the table t16 (dt has been merged merged!):

table_map Item_field::used_tables() const
{
  if (field->table->const_table)
    return 0;					// const item
  return (get_depended_from() ? OUTER_REF_TABLE_BIT : field->table->map);
}

As with the data set of the test case table t16 happens to be a constant table the call returns 0.
At the same time t16 happens to be an inner table of an outer join and null_ref_table != NO_NULL_TABLE for this table. As a result 0x8 is returned by the call of Item_direct_view_ref::used_tables() though OUTER_REF_TABLE_BIT is expected.

Comment by Igor Babaev [ 2023-01-16 ]

Note that if we change the used data set:

INSERT INTO t14 VALUES (6);
INSERT INTO t13 VALUES (6,4,6);
INSERT INTO t16 SELECT id+10, id1+10 FROM t16;
INSERT INTO t16 SELECT id+20, id1+20 FROM t16;
INSERT INTO t16 SELECT id+40, id1+40 FROM t16;
ANALYZE TABLE t3,t13,t14,t16,t21;

and run the query:

SELECT t21.*, t14.id AS t14_id, t13.id AS t13_id, dt.*, v1.*
FROM 
  t14, t13, t21
  LEFT JOIN
  (SELECT t3.* FROM t3 WHERE t3.abc < 40) dt
  ON t21.oid = dt.id AND t21.otid = 14
  LEFT JOIN v1
  ON (v1.id1 = dt.abc)
WHERE t21.oid = t14.id AND t21.oid = t13.id AND t21.wid <= 7;

we have no problems as t16 is not constant in the join and the calls of Item_field::used_tables() and of
Item_direct_view_ref::used_tables() returns expected OUTER_REF_TABLE_BIT for dt.abc

MariaDB [test]> SELECT t21.*, t14.id AS t14_id, t13.id AS t13_id, dt.*, v1.*
    -> FROM 
    ->   t14, t13, t21
    ->   LEFT JOIN
    ->   (SELECT t3.* FROM t3 WHERE t3.abc < 40) dt
    ->   ON t21.oid = dt.id AND t21.otid = 14
    ->   LEFT JOIN v1
    ->   ON (v1.id1 = dt.abc)
    -> WHERE t21.oid = t14.id AND t21.oid = t13.id AND t21.wid <= 7;
+-----+------+------+------+--------+--------+------+------+------+
| wid | wtid | otid | oid  | t14_id | t13_id | id   | abc  | id1  |
+-----+------+------+------+--------+--------+------+------+------+
|   6 |   30 |    6 |    6 |      6 |      6 | NULL | NULL | NULL |
|   7 |   17 |    7 |    7 |      7 |      7 | NULL | NULL | NULL |
+-----+------+------+------+--------+--------+------+------+------+
 
MariaDB [test]> EXPLAIN
    -> SELECT t21.*, t14.id AS t14_id, t13.id AS t13_id, dt.*, v1.*
    -> FROM 
    ->   t14, t13, t21
    ->   LEFT JOIN
    ->   (SELECT t3.* FROM t3 WHERE t3.abc < 40) dt
    ->   ON t21.oid = dt.id AND t21.otid = 14
    ->   LEFT JOIN v1
    ->   ON (v1.id1 = dt.abc)
    -> WHERE t21.oid = t14.id AND t21.oid = t13.id AND t21.wid <= 7;
+------+-----------------+------------+--------+---------------+---------+---------+--------------+------+------------------------------------+
| id   | select_type     | table      | type   | possible_keys | key     | key_len | ref          | rows | Extra                              |
+------+-----------------+------------+--------+---------------+---------+---------+--------------+------+------------------------------------+
|    1 | PRIMARY         | t21        | range  | PRIMARY,oid   | PRIMARY | 4       | NULL         |    2 | Using index condition; Using where |
|    1 | PRIMARY         | t13        | eq_ref | PRIMARY       | PRIMARY | 4       | test.t21.oid |    1 | Using index                        |
|    1 | PRIMARY         | t14        | eq_ref | PRIMARY       | PRIMARY | 4       | test.t21.oid |    1 | Using index                        |
|    1 | PRIMARY         | t3         | eq_ref | PRIMARY,abc   | PRIMARY | 4       | test.t21.oid |    1 | Using where                        |
|    1 | PRIMARY         | <derived3> | ref    | key0          | key0    | 5       | test.t3.abc  |    2 | Using where                        |
|    3 | LATERAL DERIVED | t16        | ref    | id1           | id1     | 5       | test.t3.abc  |    1 | Using index                        |
+------+-----------------+------------+--------+---------------+---------+---------+--------------+------+------------------------------------+

Comment by Igor Babaev [ 2023-01-16 ]

At the first glance the patch
https://github.com/MariaDB/server/commit/c8e7288e7332962bdebc5ae658bbd839d0b26254
fixes the problem as the call of Item_direct_view_ref::used_tables() returns the expected OUTER_REF_TABLE_BIT. However if we consider this simple query where no mergeable derived table is used :

select * from t3, v1 where t3.id=7 and v1.id1=t3.abc;

keyuse_ext.used_tables is set to 0 rather than to OUTER_REF_TABLE_BIT as for the equality v1.id1=dt.abc. This is not good.

Comment by Igor Babaev [ 2023-01-17 ]

The function add_ext_keyuse_for_splitting() is called indirectly when make_join_statistics() is executed for the top join of the query. It is called at the moment when all constant tables have been already substituted in the join. It means that that all references to the columns of these tables have been already replaced with its values. So the return value 0 rather than OUTER_REF_TABLE_BIT by the function Item_field::used_tables() for the fields from constant tables looks quite correct. Now we have to ensure that if such Item_field item is wrapped into Item_direct_view_ref then the method used_tables() for the wrapper would return 0 as well. This can be achieved with the following patch:

diff --git a/sql/item.cc b/sql/item.cc
index 02220fd..9ecbefa 100644
--- a/sql/item.cc
+++ b/sql/item.cc
@@ -10838,7 +10838,7 @@ table_map Item_direct_view_ref::used_tables() const
     table_map used= (*ref)->used_tables();
     return (used ?
             used :
-            ((null_ref_table != NO_NULL_TABLE) ?
+            (null_ref_table != NO_NULL_TABLE && !null_ref_table->const_table ?
              null_ref_table->map :
              (table_map)0 ));
   }

Comment by Igor Babaev [ 2023-01-17 ]

It would be better if the test case provided with the fix looked like this:

  CREATE TABLE t1 ( id int PRIMARY KEY ) ENGINE=MyISAM;
INSERT INTO t1 VALUES (3),(4),(7);
 
CREATE TABLE t2 (
  id int,  id1 int, wid int, PRIMARY KEY (id), KEY (id1), KEY (wid)
) ENGINE=MyISAM;
INSERT INTO t2 VALUES (4,4,6),(7,7,7);
 
CREATE TABLE t3 (
  wid int, wtid int, otid int, oid int,
  PRIMARY KEY (wid), KEY (wtid), KEY (otid), KEY (oid)
) ENGINE=MyISAM;
INSERT INTO t3 VALUES (6,30,6,6),(7,17,7,7);
 
CREATE TABLE t4 ( id int, a int, PRIMARY KEY (id), KEY (a) )  ENGINE=MyISAM;
INSERT INTO t4 VALUES (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
CREATE TABLE t5 (
  id int, id1 int, PRIMARY KEY (id), KEY id1 (id1)
) ENGINE=MyISAM ;
INSERT INTO t5 VALUES (1,17),(2,15),(3,49),(4,3),(5,45),(6,38),(7,17);
 
ANALYZE TABLE t1,t2,t3,t4,t5;
 
CREATE VIEW v1 AS (SELECT id1 FROM t5 GROUP BY id1);
 
SELECT t3.*, t1.id AS t1_id, t2.id AS t2_id, dt.*, v1.*
FROM 
  t1, t2, t3
  LEFT JOIN
  (SELECT t4.* FROM t4 WHERE t4.a=3) dt
  ON t3.oid = dt.id AND t3.otid = 14
  LEFT JOIN v1
  ON (v1.id1 = dt.a)
WHERE t3.oid = t1.id AND t3.oid = t2.id AND t3.wid = 7;
 
EXPLAIN
SELECT t3.*, t1.id AS t1_id, t2.id AS t2_id, dt.*, v1.*
FROM 
  t1, t2, t3
  LEFT JOIN
  (SELECT t4.* FROM t4 WHERE t4.a=3) dt
  ON t3.oid = dt.id AND t3.otid = 14
  LEFT JOIN v1
  ON (v1.id1 = dt.a)
WHERE t3.oid = t1.id AND t3.oid = t2.id AND t3.wid = 7;
 
DROP VIEW v1;
DROP TABLE t1,t2,t3,t4,t5;

Comment by Igor Babaev [ 2023-01-17 ]

psergei
See my analysis in my comments. If you don't mind I could take this bug.

Comment by Sergei Petrunia [ 2023-01-18 ]

igor yes, please take over.

Comment by Oleksandr Byelkin [ 2023-01-24 ]

OK to push

Comment by Igor Babaev [ 2023-01-25 ]

A fix for this bug was pushed into 10.3. It should be merged upstream as it is.

Generated at Thu Feb 08 10:13:30 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.