[MDEV-23723] Crash when test_if_skip_sort_order() is checked for derived table subject to split Created: 2020-09-11  Updated: 2021-09-02  Resolved: 2021-04-30

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.3.21, 10.3.23, 10.5.5, 10.3, 10.4, 10.5
Fix Version/s: 10.3.29, 10.4.19, 10.5.10, 10.6.1

Type: Bug Priority: Blocker
Reporter: Rob Schwyzer Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: split_materialized
Environment:

Tested on both AWS CentOS 7 VM and MariaDB Docker container


Issue Links:
Duplicate
is duplicated by MDEV-25442 Lateral derived optimization causes s... Closed
Relates
relates to MDEV-23192 Crash in row_search_mvcc() probably r... Closed

 Description   

Partial stack trace:

Thread 1 (Thread 0x7fbf90110700 (LWP 12354)):
#0  0x00007fbf973e5aa1 in __pthread_kill (threadid=<optimized out>, signo=11) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:59
        __arg2 = 12354
        _a3 = 11
        _a1 = 12335
        resultvar = <optimized out>
        __arg3 = 11
        __arg1 = 12335
        _a2 = 12354
        pd = <optimized out>
        tid = 12354
        val = <optimized out>
#1  0x0000556c5e03967e in handle_fatal_signal (sig=11) at /usr/src/debug/MariaDB-10.5.5/src_0/sql/signal_handler.cc:330
        curr_time = 1599787104
        tm = {tm_sec = 24, tm_min = 18, tm_hour = 1, tm_mday = 11, tm_mon = 8, tm_year = 120, tm_wday = 5, tm_yday = 254, tm_isdst = 0, tm_gmtoff = 0, tm_zone = 0x556c603ab540 "UTC"}
        thd = 0x7fbf58000da8
        print_invalid_query_pointer = false
#2  <signal handler called>
No locals.
#3  row_search_mvcc (buf=<optimized out>, mode=PAGE_CUR_GE, prebuilt=0x7fbf582b3430, match_mode=<optimized out>, direction=0) at /usr/src/debug/MariaDB-10.5.5/src_0/storage/innobase/row/row0sel.cc:4747
        heap = 0x0
        set_also_gap_locks = <optimized out>
        err = DB_SUCCESS
        offsets_ = {300, 0 <repeats 23 times>, 65535, 65535, 65535, 32751, 0, 0, 0, 0, 2112, 22541, 32703, 0, 65535, 65535, 65535, 32751, 3496, 22528, 32703, 0, 65535, 65535, 65535, 32751, 0, 0, 0, 0, 65535, 65535, 65535, 32751, 0, 0, 0, 16368, 29576, 22541, 32703, 0, 65535, 65535, 65535, 32751, 0, 0, 0, 0, 14544, 22541, 32703, 0 <repeats 13 times>, 56416, 36880, 32703, 0, 14544, 22541, 32703, 0, 56320, 36880, 32703, 0, 16024, 22541, 0, 0, 0, 0, 0, 0, 0, 0, 21868, 0, 56512, 22572, 32703, 0, 65533, 65535, 65535, 32751, 0, 0, 0, 16404, 0, 0, 0, 0, 62243, 37008, 30037, 16600, 0, 0, 60416, 16557, 0, 0, 0, 0, 0, 0, 0, 0, 4353, 24303, 21868, 0, 1, 0 <repeats 11 times>, 513, 0, 0, 0, 4528, 24303, 21868, 0, 0, 0, 0, 0, 0, 0, 0, 0, 55809, 36880, 32703, 0, 56160, 36880, 32703, 0, 56160, 36880, 32703, 0, 14544, 22541, 32703, 0, 2, 0, 0, 0, 3, 0, 0, 0, 56512, 22572, 32703, 0, 15976, 22541, 32703, 0, 56768, 36880, 32703, 0, 58923, 24040, 21868, 0, 16024...}
        offsets = 0x7fbf9010d9b0
        index = 0x7fbf58069690
        clust_index = <optimized out>
        mtr_has_extra_clust_latch = 0
        moves_up = <optimized out>
        table_lock_waited = 0
        search_tuple = <optimized out>
        pcur = <optimized out>
        trx = 0x7fbf7d91c1c0
        rec = 0x0
        vrow = 0x0
        unique_search = <optimized out>
        same_user_rec = 140460732503440
        need_vrow = <optimized out>
        did_semi_consistent_read = false
        comp = 1
        clust_rec = 0x7fbf9010d600 "؍\016X\277\177"
        next_buf = 0x0
        spatial_search = <optimized out>
        mtr = {m_last = 0x0, m_last_offset = 0, m_log_mode = 0, m_modifications = 0, m_made_dirty = 0, m_inside_ibuf = 0, m_freed_in_system_tablespace = 0, m_trim_pages = 0, m_memo = {m_heap = 0x0, m_list = {<ilist<mtr_buf_t::block_t, void>> = {sentinel_ = {next = 0x7fbf9010dc48, prev = 0x7fbf9010dc48}}, size_ = 1}, m_size = 0, m_first_block = {<ilist_node<void>> = {next = 0x7fbf9010dc28, prev = 0x7fbf9010dc28}, m_data = "\200\215\062^lU\000\000 \336\020\220\277\177\000\000\022 3^lU\000\000\310\344\rX\277\177\000\000\060\064+X\277\177\000\000\060\064+X\277\177\000\000x\237\aX\277\177\000\000\300\334\020\220\277\177\000\000C\275\364]lU\000\000\300\344\rX\277\177\000\000 \347\rX\277\177\000\000\260f\000X\277\177\000\000*\353`^lU\000\000\000\335\020\220\277\177\000\000\272\355\364]lU\000\000@\263\rX\277\177\000\000`\240\rX\277\177\000\000\000\335\020\220\277\177\000\000\033\354\364]lU\000\000\200\253\rX\277\177\000\000`\240\rX\006\000\000\000\060\337\020\220\277\177\000\000\252\377\347]lU\000\000\060\335\020\220\277\177\000\000\b\343\rX\277\177\000\000"..., m_used = 0}}, m_log = {m_heap = 0x0, m_list = {<ilist<mtr_buf_t::block_t, void>> = {sentinel_ = {next = 0x7fbf9010de78, prev = 0x7fbf9010de78}}, size_ = 1}, m_size = 0, m_first_block = {<ilist_node<void>> = {next = 0x7fbf9010de58, prev = 0x7fbf9010de58}, m_data = "04+X\277\177\000\000\060\064+X\277\177\000\000x\237\aX\277\177\000\000\060\064+X\277\177\000\000\002\000\000\000\000\000\000\000\220\226\006X\277\177\000\000P\022+X\277\177\000\000\320\340\020\220\277\177\000\000\ngA^lU\000\000|\000\000\000\277\177\000\000\220e\006X\277\177\000\000\000\000\000\000\277\177\000\000\000\000\000\000\000\000\000\000\003\000\000\000\000\000\000\300\370\067+X\277\177\000\000\070\314\rX\277\177\000\000\f\000\000\000\000\000\000\000\360\065+X\277\177\000\000\001", '\000' <repeats 23 times>, "\070\314\rX\277\177\000\000\r\271\361]lU\000\000\020\341\020\220\277\177\000\000\200\215\062^lU\000\000"..., m_used = 0}}, m_user_space = 0x0, m_commit_lsn = 0, m_freed_pages = 0x0}
        result_rec = <optimized out>
        row_sel_get_clust_rec_for_mysql = {cached_clust_rec = 0x0, cached_old_vers = 0x0}
        next_offs = <optimized out>

Part of query which crashes performs a JOIN of a real InnoDB table with a temporary table generated via CTE earlier in the query. The ON condition directly compares like columns from the real table to like columns from the temporary table with =. There are four such comparisons in the ON clause, joined together via AND logic.

Note that removing any one of the comparisons from the ON conditional returns NULL and avoids a crash. Due to the nature of AND, we expect the result of all four conditions being evaluated to also be NULL. Database is crashing while attempting to evaluate a condition it does not need to return a valid result for the given query.



 Comments   
Comment by Elena Stepanova [ 2020-09-14 ]

--source include/have_innodb.inc
 
CREATE TABLE t1 (a INT, b INT, KEY (a), KEY (a,b)) ENGINE=InnoDB;
CREATE TABLE t2 (c INT, KEY (c)) ENGINE=InnoDB;
 
SELECT * FROM t1 t1a JOIN t1 t1b;
 
INSERT INTO t2 VALUES (1),(2);
INSERT INTO t1 VALUES (1,2),(3,4),(5,6),(7,8),(9,10),(11,12);
 
WITH cte AS
   (SELECT t1.a, t1.b FROM t1 JOIN t2 ON t1.b = t2.c GROUP BY t1.a, t1.b)
 SELECT * FROM t1 JOIN cte ON t1.a = cte.a;
 
# Cleanup
DROP TABLE t1, t2;

10.3 non-debug 7e07e38c

#3  <signal handler called>
#4  row_search_mvcc (buf=buf@entry=0x7f3af4114898 "\377", mode=mode@entry=PAGE_CUR_GE, prebuilt=0x7f3af40d8868, match_mode=match_mode@entry=1, direction=direction@entry=0) at /data/src/10.3/storage/innobase/include/rem0rec.ic:577
#5  0x000055d8b6f078d9 in ha_innobase::index_read (this=0x7f3af40d5530, buf=0x7f3af4114898 "\377", key_ptr=<optimized out>, key_len=<optimized out>, find_flag=<optimized out>) at /data/src/10.3/storage/innobase/handler/ha_innodb.cc:9235
#6  0x000055d8b6d5b11d in handler::ha_index_read_map (this=0x7f3af40d5530, buf=0x7f3af4114898 "\377", key=0x7f3af411d6b8 "", keypart_map=0, find_flag=find_flag@entry=HA_READ_KEY_EXACT) at /data/src/10.3/sql/handler.cc:2904
#7  0x000055d8b6bac20c in join_read_always_key (tab=0x7f3af411b9c0) at /data/src/10.3/sql/sql_select.cc:20472
#8  0x000055d8b6b9818e in sub_select (end_of_records=false, join_tab=0x7f3af411b9c0, join=0x7f3af4014428) at /data/src/10.3/sql/sql_select.cc:19711
#9  sub_select (join=0x7f3af4014428, join_tab=0x7f3af411b9c0, end_of_records=false) at /data/src/10.3/sql/sql_select.cc:19649
#10 0x000055d8b6bc236e in do_select (procedure=<optimized out>, join=0x7f3af4014428) at /data/src/10.3/sql/sql_select.cc:19254
#11 JOIN::exec_inner (this=this@entry=0x7f3af4014428) at /data/src/10.3/sql/sql_select.cc:4116
#12 0x000055d8b6bc25e7 in JOIN::exec (this=this@entry=0x7f3af4014428) at /data/src/10.3/sql/sql_select.cc:3910
#13 0x000055d8b6bc2732 in mysql_select (thd=thd@entry=0x7f3af4000c48, tables=0x7f3af4010750, wild_num=0, fields=..., conds=0x7f3af411ab90, og_num=2, order=0x0, group=0x7f3af4012088, having=0x0, proc_param=0x0, select_options=2416184064, result=0x7f3af4014340, unit=0x7f3af400fd30, select_lex=0x7f3af400f910) at /data/src/10.3/sql/sql_select.cc:4315
#14 0x000055d8b6b244ac in mysql_derived_fill (thd=<optimized out>, lex=0x7f3af4004890, derived=<optimized out>) at /data/src/10.3/sql/sql_derived.cc:1156
#15 0x000055d8b6b2417c in mysql_handle_single_derived (lex=0x7f3af4004890, derived=derived@entry=0x7f3af4012b60, phases=phases@entry=96) at /data/src/10.3/sql/sql_derived.cc:199
#16 0x000055d8b6b97f50 in st_join_table::preread_init (this=this@entry=0x7f3af411b120) at /data/src/10.3/sql/sql_select.cc:12840
#17 0x000055d8b6b98130 in sub_select (end_of_records=false, join_tab=0x7f3af411b120, join=0x7f3af4013dd8) at /data/src/10.3/sql/sql_select.cc:19756
#18 sub_select (join=0x7f3af4013dd8, join_tab=0x7f3af411b120, end_of_records=false) at /data/src/10.3/sql/sql_select.cc:19649
#19 0x000055d8b6b8a479 in evaluate_join_record (join=join@entry=0x7f3af4013dd8, join_tab=join_tab@entry=0x7f3af411ad90, error=<optimized out>) at /data/src/10.3/sql/sql_select.cc:19934
#20 0x000055d8b6b981a3 in sub_select (end_of_records=false, join_tab=0x7f3af411ad90, join=0x7f3af4013dd8) at /data/src/10.3/sql/sql_select.cc:19714
#21 sub_select (join=0x7f3af4013dd8, join_tab=0x7f3af411ad90, end_of_records=false) at /data/src/10.3/sql/sql_select.cc:19649
#22 0x000055d8b6bc236e in do_select (procedure=<optimized out>, join=0x7f3af4013dd8) at /data/src/10.3/sql/sql_select.cc:19254
#23 JOIN::exec_inner (this=this@entry=0x7f3af4013dd8) at /data/src/10.3/sql/sql_select.cc:4116
#24 0x000055d8b6bc25e7 in JOIN::exec (this=this@entry=0x7f3af4013dd8) at /data/src/10.3/sql/sql_select.cc:3910
#25 0x000055d8b6bc2732 in mysql_select (thd=0x7f3af4000c48, tables=0x7f3af40124b8, wild_num=1, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7f3af4013db0, unit=0x7f3af4004950, select_lex=0x7f3af40050d8) at /data/src/10.3/sql/sql_select.cc:4315
#26 0x000055d8b6bc306b in handle_select (thd=thd@entry=0x7f3af4000c48, lex=lex@entry=0x7f3af4004890, result=result@entry=0x7f3af4013db0, setup_tables_done_option=setup_tables_done_option@entry=0) at /data/src/10.3/sql/sql_select.cc:370
#27 0x000055d8b6b53a81 in execute_sqlcom_select (thd=0x7f3af4000c48, all_tables=0x7f3af40124b8) at /data/src/10.3/sql/sql_parse.cc:6286
#28 0x000055d8b6b61556 in mysql_execute_command (thd=0x7f3af4000c48) at /data/src/10.3/sql/sql_parse.cc:3812
#29 0x000055d8b6b642f3 in mysql_parse (thd=0x7f3af4000c48, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>, is_com_multi=<optimized out>, is_next_command=<optimized out>) at /data/src/10.3/sql/sql_parse.cc:7810
#30 0x000055d8b6b666dd in dispatch_command (command=COM_QUERY, thd=0x7f3af4000c48, packet=<optimized out>, packet_length=<optimized out>, is_com_multi=<optimized out>, is_next_command=<optimized out>) at /data/src/10.3/sql/sql_class.h:1137
#31 0x000055d8b6b6867d in do_command (thd=0x7f3af4000c48) at /data/src/10.3/sql/sql_parse.cc:1393
#32 0x000055d8b6c4f016 in do_handle_one_connection (connect=connect@entry=0x55d8b98b86c8) at /data/src/10.3/sql/sql_connect.cc:1403
#33 0x000055d8b6c4f1ef in handle_one_connection (arg=arg@entry=0x55d8b98b86c8) at /data/src/10.3/sql/sql_connect.cc:1308
#34 0x000055d8b7224a06 in pfs_spawn_thread (arg=0x55d8b9b8d3c8) at /data/src/10.3/storage/perfschema/pfs.cc:1869
#35 0x00007f3b49ec5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#36 0x00007f3b49aba103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

10.3 debug 7e07e38c

mysqld: /data/src/10.3/sql/sql_select.cc:10057: bool create_ref_for_key(JOIN*, JOIN_TAB*, KEYUSE*, bool, table_map): Assertion `length > 0' failed.
200915  1:58:00 [ERROR] mysqld got signal 6 ;
 
#7  0x00007fa3d5187f36 in __GI___assert_fail (assertion=0x55ba1f72f245 "length > 0", file=0x55ba1f72da10 "/data/src/10.3/sql/sql_select.cc", line=10057, function=0x55ba1f72f250 "bool create_ref_for_key(JOIN*, JOIN_TAB*, KEYUSE*, bool, table_map)") at assert.c:101
#8  0x000055ba1eacff25 in create_ref_for_key (join=0x7fa3780177d0, j=0x7fa3780737c8, org_keyuse=0x7fa3780b41d8, allow_full_scan=false, used_tables=4611686018427387904) at /data/src/10.3/sql/sql_select.cc:10057
#9  0x000055ba1eaf05d4 in test_if_skip_sort_order (tab=0x7fa3780737c8, order=0x7fa3780155a8, select_limit=18446744073709551615, no_changes=false, map=0x7fa3780a9728) at /data/src/10.3/sql/sql_select.cc:22549
#10 0x000055ba1eab9859 in JOIN::optimize_stage2 (this=0x7fa3780177d0) at /data/src/10.3/sql/sql_select.cc:2604
#11 0x000055ba1eab5966 in JOIN::optimize (this=0x7fa3780177d0) at /data/src/10.3/sql/sql_select.cc:1490
#12 0x000055ba1ea1fa42 in mysql_derived_optimize (thd=0x7fa378000d90, lex=0x7fa378004b98, derived=0x7fa378015f08) at /data/src/10.3/sql/sql_derived.cc:939
#13 0x000055ba1ea1dc15 in mysql_handle_single_derived (lex=0x7fa378004b98, derived=0x7fa378015f08, phases=4) at /data/src/10.3/sql/sql_derived.cc:199
#14 0x000055ba1eba063e in TABLE_LIST::handle_derived (this=0x7fa378015f08, lex=0x7fa378004b98, phases=4) at /data/src/10.3/sql/table.cc:8363
#15 0x000055ba1ea37f8c in LEX::handle_list_of_derived (this=0x7fa378004b98, table_list=0x7fa378015860, phases=4) at /data/src/10.3/sql/sql_lex.h:4004
#16 0x000055ba1ea442c2 in st_select_lex::handle_derived (this=0x7fa3780053e0, lex=0x7fa378004b98, phases=4) at /data/src/10.3/sql/sql_lex.cc:4159
#17 0x000055ba1eab76bd in JOIN::optimize_stage2 (this=0x7fa378017180) at /data/src/10.3/sql/sql_select.cc:2001
#18 0x000055ba1eab756d in JOIN::optimize_inner (this=0x7fa378017180) at /data/src/10.3/sql/sql_select.cc:1977
#19 0x000055ba1eab599a in JOIN::optimize (this=0x7fa378017180) at /data/src/10.3/sql/sql_select.cc:1497
#20 0x000055ba1eabfa8e in mysql_select (thd=0x7fa378000d90, tables=0x7fa378015860, wild_num=1, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7fa378017158, unit=0x7fa378004c58, select_lex=0x7fa3780053e0) at /data/src/10.3/sql/sql_select.cc:4301
#21 0x000055ba1eab114e in handle_select (thd=0x7fa378000d90, lex=0x7fa378004b98, result=0x7fa378017158, setup_tables_done_option=0) at /data/src/10.3/sql/sql_select.cc:370
#22 0x000055ba1ea77cfc in execute_sqlcom_select (thd=0x7fa378000d90, all_tables=0x7fa378015860) at /data/src/10.3/sql/sql_parse.cc:6286
#23 0x000055ba1ea6e463 in mysql_execute_command (thd=0x7fa378000d90) at /data/src/10.3/sql/sql_parse.cc:3812
#24 0x000055ba1ea7c066 in mysql_parse (thd=0x7fa378000d90, rawbuf=0x7fa378012ab8 "WITH cte AS\n(SELECT t1.a, t1.b FROM t1 JOIN t2 ON t1.b = t2.c GROUP BY t1.a, t1.b)\nSELECT * FROM t1 JOIN cte ON t1.a = cte.a", length=124, parser_state=0x7fa3ca2fb5c0, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:7810
#25 0x000055ba1ea68827 in dispatch_command (command=COM_QUERY, thd=0x7fa378000d90, packet=0x7fa378008f11 "WITH cte AS\n(SELECT t1.a, t1.b FROM t1 JOIN t2 ON t1.b = t2.c GROUP BY t1.a, t1.b)\nSELECT * FROM t1 JOIN cte ON t1.a = cte.a", packet_length=124, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:1847
#26 0x000055ba1ea671c7 in do_command (thd=0x7fa378000d90) at /data/src/10.3/sql/sql_parse.cc:1393
#27 0x000055ba1ebe4768 in do_handle_one_connection (connect=0x55ba214e1430) at /data/src/10.3/sql/sql_connect.cc:1403
#28 0x000055ba1ebe44c4 in handle_one_connection (arg=0x55ba214e1430) at /data/src/10.3/sql/sql_connect.cc:1308
#29 0x000055ba1f5ac2b1 in pfs_spawn_thread (arg=0x55ba214e8b00) at /data/src/10.3/storage/perfschema/pfs.cc:1869
#30 0x00007fa3d5699609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#31 0x00007fa3d5273103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Reproducible on 10.3-10.5 (also older releases of 10.3).
Not reproducible on 10.2.
The intermediate SELECT isn't there by mistake, on some reason it is important for reproducible the problem.

Comment by Marko Mäkelä [ 2020-11-30 ]

The test case is crashing for me already outside InnoDB:

10.5 e34e53b554e8c3e05e3bd918204cf7ba494b1ae3

Version: '10.5.9-MariaDB-debug-log'  socket: '/dev/shm/10.5m/mysql-test/var/tmp/mysqld.1.sock'  port: 16000  Source distribution
mariadbd: /mariadb/10.5m/sql/sql_select.cc:10719: bool create_ref_for_key(JOIN *, JOIN_TAB *, KEYUSE *, bool, table_map): Assertion `length > 0' failed.

The test passes for Aria, MyISAM, and RocksDB. For InnoDB, the `length` will remain 0 because of the following:

  else
  {
    keyparts=length=0;
    uint found_part_ref_or_null= 0;
    /*
      Calculate length for the used key
      Stop if there is a missing key part or when we find second key_part
      with KEY_OPTIMIZE_REF_OR_NULL
    */
    do
    {
      if (!(~used_tables & keyuse->used_tables) &&
          (!keyuse->validity_ref || *keyuse->validity_ref) &&
	  j->keyuse_is_valid_for_access_in_chosen_plan(join, keyuse))
 

The condition does not hold, because keyuse->validity_ref == false.

I do not know this code, so it needs to be debugged by an optimizer developer.

Comment by Sergei Petrunia [ 2020-12-01 ]

Working on Elena's testcase from https://jira.mariadb.org/browse/MDEV-23723?focusedCommentId=165953&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-165953 :

It fails an assert; it fails in the same way if I switch from using CTE to using a derived table:

SELECT * 
FROM 
  t1 JOIN 
  (SELECT t1.a, t1.b 
   FROM t1 JOIN t2 ON t1.b = t2.c 
   GROUP BY t1.a, t1.b
   ) as cte ON t1.a = cte.a;

Comment by Sergei Petrunia [ 2020-12-01 ]

Optimizer trace contents when the crash happens is:
https://gist.github.com/spetrunia/e6db650bb2d5cadc67027b376ce004f1

Comment by Sergei Petrunia [ 2020-12-01 ]

The problem is with the "Split Materialized" optimization.

The optimizer is trying to construct this query plan :

+------+-----------------+------------+-------+---------------+------+---------+-----------+------+----------------------------------------------+
| id   | select_type     | table      | type  | possible_keys | key  | key_len | ref       | rows | Extra                                        |
+------+-----------------+------------+-------+---------------+------+---------+-----------+------+----------------------------------------------+
|    1 | PRIMARY         | t1         | index | a,a_2         | a_2  | 10      | NULL      | 5    | Using where; Using index                     |
|    1 | PRIMARY         | <derived2> | ref   | key0          | key0 | 5       | test.t1.a | 2    |                                              |
|    2 | LATERAL DERIVED | t1         | ref   | a,a_2         | a    | 5       | test.t1.a | 1    | Using where; Using temporary; Using filesort |
|    2 | LATERAL DERIVED | t2         | ref   | c             | c    | 5       | test.t1.b | 1    | Using index                                  |
+------+-----------------+------------+-------+---------------+------+---------+-----------+------+----------------------------------------------+

Let's look at the table t1 in the subquery.
It has indexes:

CREATE TABLE t1 (a INT, b INT, KEY (a), KEY (a,b));

Split-Materialized code decides to use the index KEY(a) (I am not sure why it's not KEY(a,b), let's just accept the choice).

Then, subquery optimization arrives into test_if_skip_sort_order() function, where it tries to use the index KEY(a,b). However, the injected KEYUSE objects for that index are disabled (KEYUSE::validity_ref points to 0), so it fails to do that. This causes the assertion failure.

Comment by Sergei Petrunia [ 2020-12-01 ]

Questions:

  • Why does LATERAL DERIVED optimization pick KEY(a), not KEY(a,b) ?
  • Why does it need to do sorting at all? The subquery enumerating one GROUPBY group, so sorting should not be necessary at all?
  • Should sorting code avoid using keys for which KEYUSE objects are disabled? Or should all KEYUSEs be enabled?
Comment by Sergei Petrunia [ 2020-12-01 ]

On a side note, optimizer trace does not cover LATERAL DERIVED optimization at all. This is a gap in the optimizer trace and it should be fixed. (Filed MDEV-24325 for this)

Comment by Sergei Petrunia [ 2021-04-29 ]

Taking another look at this.

Why does LATERAL DERIVED optimization pick KEY(a), not KEY(a,b) ?

This is because column a can be "bound" by the top select with the equality "t1.a=cte.a", while column "b" cannot be.

Why does it need to do sorting at all? The subquery enumerating one GROUPBY group, so sorting should not be necessary at all?

As t1.b is not bound, the subquery will enumerate multiple GROUP BY groups.

Sorting is not necessary though, as grouping is done using temp.table (and not by the sort-then-group algorithm)

Should sorting code avoid using keys for which KEYUSE objects are disabled? Or should all KEYUSEs be enabled?

This query tries to switch from KEY(a) to KEY(a,b) while the plan to do splitting assumes KEY(a) is used.

We should either A. make adjustments to the splitting choice, or B. disallow changing the used index in such cases.

Comment by Sergei Petrunia [ 2021-04-29 ]

Patch implementing B: https://github.com/MariaDB/server/commit/3c564ae0e1235990794f2d0ab6c9b5760386d19f.diff

Igor, please review

Comment by Igor Babaev [ 2021-04-29 ]

Ok to push into 10.3 after addressing the notes in the review feedback sent by email in:
"Review feedback for the fix of MDEV-23723".

Generated at Thu Feb 08 09:24:33 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.