[MDEV-28268] Server crashes in Expression_cache_tracker::fetch_current_stats Created: 2022-04-08  Updated: 2022-04-29  Resolved: 2022-04-28

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.2, 10.3, 10.4, 10.5, 10.6, 10.7
Fix Version/s: 10.2.44, 10.3.35, 10.4.25, 10.5.16, 10.6.8, 10.7.4, 10.8.3

Type: Bug Priority: Blocker
Reporter: Elena Stepanova Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: regression

Issue Links:
Relates
relates to MDEV-28432 Subquery Items have wrong used_tables... Open

 Description   

colored text

CREATE TABLE t (a INT, b INT);
INSERT INTO t VALUES (1,2),(3,4);
 
ANALYZE FORMAT=JSON SELECT DISTINCT (SELECT MIN(a) FROM t WHERE b <= ANY (SELECT a FROM t)) AS f FROM t;
 
# Cleanup
DROP TABLE t;

10.2 debug 5a8766a9

#3  <signal handler called>
#4  0x0000557cd775fad1 in Expression_cache_tracker::fetch_current_stats (this=0x7f938418e870) at /data/src/10.2/sql/sql_expression_cache.h:97
#5  0x0000557cd77591aa in Explain_node::print_explain_json_cache (this=0x7f9384098448, writer=0x7f939490d4c0, is_analyze=true) at /data/src/10.2/sql/sql_explain.cc:685
#6  0x0000557cd77598f4 in Explain_select::print_explain_json (this=0x7f9384098448, query=0x7f9384017600, writer=0x7f939490d4c0, is_analyze=true) at /data/src/10.2/sql/sql_explain.cc:825
#7  0x0000557cd775911b in Explain_node::print_explain_json_for_children (this=0x7f938418e9c8, query=0x7f9384017600, writer=0x7f939490d4c0, is_analyze=true) at /data/src/10.2/sql/sql_explain.cc:671
#8  0x0000557cd775a321 in Explain_basic_join::print_explain_json_interns (this=0x7f938418e9c8, query=0x7f9384017600, writer=0x7f939490d4c0, is_analyze=true) at /data/src/10.2/sql/sql_explain.cc:1019
#9  0x0000557cd7759d49 in Explain_select::print_explain_json (this=0x7f938418e9c8, query=0x7f9384017600, writer=0x7f939490d4c0, is_analyze=true) at /data/src/10.2/sql/sql_explain.cc:913
#10 0x0000557cd7757594 in Explain_query::print_explain_json (this=0x7f9384017600, output=0x7f938418f010, is_analyze=true) at /data/src/10.2/sql/sql_explain.cc:224
#11 0x0000557cd77572d1 in Explain_query::send_explain (this=0x7f9384017600, thd=0x7f9384000d90) at /data/src/10.2/sql/sql_explain.cc:167
#12 0x0000557cd75d7b89 in execute_sqlcom_select (thd=0x7f9384000d90, all_tables=0x7f93840155a0) at /data/src/10.2/sql/sql_parse.cc:6283
#13 0x0000557cd75ce663 in mysql_execute_command (thd=0x7f9384000d90) at /data/src/10.2/sql/sql_parse.cc:3582
#14 0x0000557cd75db7e2 in mysql_parse (thd=0x7f9384000d90, rawbuf=0x7f9384012708 "ANALYZE FORMAT=JSON SELECT DISTINCT (SELECT MIN(a) FROM t WHERE b <= ANY (SELECT a FROM t)) AS f FROM t", length=103, parser_state=0x7f939490e5e0, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:7793
#15 0x0000557cd75c9b85 in dispatch_command (command=COM_QUERY, thd=0x7f9384000d90, packet=0x7f9384008b61 "", packet_length=103, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:1827
#16 0x0000557cd75c869e in do_command (thd=0x7f9384000d90) at /data/src/10.2/sql/sql_parse.cc:1381
#17 0x0000557cd77204a4 in do_handle_one_connection (connect=0x557cdab9c630) at /data/src/10.2/sql/sql_connect.cc:1336
#18 0x0000557cd7720218 in handle_one_connection (arg=0x557cdab9c630) at /data/src/10.2/sql/sql_connect.cc:1241
#19 0x0000557cd7f286eb in pfs_spawn_thread (arg=0x557cdab883d0) at /data/src/10.2/storage/perfschema/pfs.cc:1869
#20 0x00007f939ac76ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#21 0x00007f939a87bdef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

10.2 relwithdebinfo asan 5a8766a9

==3523040==ERROR: AddressSanitizer: use-after-poison on address 0x6290000d7f78 at pc 0x564d49ef342c bp 0x7f6302886a30 sp 0x7f6302886a28
READ of size 8 at 0x6290000d7f78 thread T5
    #0 0x564d49ef342b in Expression_cache_tracker::fetch_current_stats() /data/src/10.2/sql/sql_expression_cache.h:97
    #1 0x564d49ef342b in Explain_node::print_explain_json_cache(Json_writer*, bool) /data/src/10.2/sql/sql_explain.cc:685
    #2 0x564d49effd21 in Explain_select::print_explain_json(Explain_query*, Json_writer*, bool) /data/src/10.2/sql/sql_explain.cc:825
    #3 0x564d49ef2c05 in Explain_node::print_explain_json_for_children(Explain_query*, Json_writer*, bool) /data/src/10.2/sql/sql_explain.cc:671
    #4 0x564d49effb3d in Explain_basic_join::print_explain_json_interns(Explain_query*, Json_writer*, bool) /data/src/10.2/sql/sql_explain.cc:1019
    #5 0x564d49f00174 in Explain_select::print_explain_json(Explain_query*, Json_writer*, bool) /data/src/10.2/sql/sql_explain.cc:913
    #6 0x564d49eeff49 in Explain_query::print_explain_json(select_result_sink*, bool) /data/src/10.2/sql/sql_explain.cc:215
    #7 0x564d49ef0899 in Explain_query::send_explain(THD*) /data/src/10.2/sql/sql_explain.cc:167
    #8 0x564d49b75959 in execute_sqlcom_select /data/src/10.2/sql/sql_parse.cc:6283
    #9 0x564d49b9bd2d in mysql_execute_command(THD*) /data/src/10.2/sql/sql_parse.cc:3582
    #10 0x564d49ba4537 in mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) /data/src/10.2/sql/sql_parse.cc:7793
    #11 0x564d49baaabe in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) /data/src/10.2/sql/sql_parse.cc:1827
    #12 0x564d49bae95d in do_command(THD*) /data/src/10.2/sql/sql_parse.cc:1381
    #13 0x564d49e91e76 in do_handle_one_connection(CONNECT*) /data/src/10.2/sql/sql_connect.cc:1336
    #14 0x564d49e925aa in handle_one_connection /data/src/10.2/sql/sql_connect.cc:1241
    #15 0x564d4b023284 in pfs_spawn_thread /data/src/10.2/storage/perfschema/pfs.cc:1869
    #16 0x7f630d492ea6 in start_thread nptl/pthread_create.c:477
    #17 0x7f630d097dee in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfddee)
 
0x6290000d7f78 is located 3448 bytes inside of 16352-byte region [0x6290000d7200,0x6290000db1e0)
allocated by thread T5 here:
    #0 0x7f630da09e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x564d4b0db842 in my_malloc /data/src/10.2/mysys/my_malloc.c:101
    #2 0x564d4b0c76c3 in alloc_root /data/src/10.2/mysys/my_alloc.c:243
    #3 0x564d4b0c7ae7 in multi_alloc_root /data/src/10.2/mysys/my_alloc.c:312
    #4 0x564d49cb9816 in make_join_statistics /data/src/10.2/sql/sql_select.cc:3951
    #5 0x564d49cb9816 in JOIN::optimize_inner() /data/src/10.2/sql/sql_select.cc:1597
    #6 0x564d49cc7015 in JOIN::optimize() /data/src/10.2/sql/sql_select.cc:1127
    #7 0x564d49cc7015 in JOIN::optimize() /data/src/10.2/sql/sql_select.cc:1119
    #8 0x564d49ccf1ff in mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) /data/src/10.2/sql/sql_select.cc:3835
    #9 0x564d49ccff6a in handle_select(THD*, LEX*, select_result*, unsigned long) /data/src/10.2/sql/sql_select.cc:361
    #10 0x564d49b7628f in execute_sqlcom_select /data/src/10.2/sql/sql_parse.cc:6271
    #11 0x564d49b9bd2d in mysql_execute_command(THD*) /data/src/10.2/sql/sql_parse.cc:3582
    #12 0x564d49ba4537 in mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) /data/src/10.2/sql/sql_parse.cc:7793
    #13 0x564d49baaabe in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) /data/src/10.2/sql/sql_parse.cc:1827
    #14 0x564d49bae95d in do_command(THD*) /data/src/10.2/sql/sql_parse.cc:1381
    #15 0x564d49e91e76 in do_handle_one_connection(CONNECT*) /data/src/10.2/sql/sql_connect.cc:1336
    #16 0x564d49e925aa in handle_one_connection /data/src/10.2/sql/sql_connect.cc:1241
    #17 0x564d4b023284 in pfs_spawn_thread /data/src/10.2/storage/perfschema/pfs.cc:1869
    #18 0x7f630d492ea6 in start_thread nptl/pthread_create.c:477
 
Thread T5 created by T0 here:
    #0 0x7f630d9b52a2 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:214
    #1 0x564d4b02850a in spawn_thread_v1 /data/src/10.2/storage/perfschema/pfs.cc:1919
    #2 0x564d4999fe6f in inline_mysql_thread_create /data/src/10.2/include/mysql/psi/mysql_thread.h:1246
    #3 0x564d4999fe6f in create_thread_to_handle_connection(CONNECT*) /data/src/10.2/sql/mysqld.cc:6580
    #4 0x564d499b020c in create_new_thread /data/src/10.2/sql/mysqld.cc:6650
    #5 0x564d499b020c in handle_connections_sockets() /data/src/10.2/sql/mysqld.cc:6908
    #6 0x564d499b273f in mysqld_main(int, char**) /data/src/10.2/sql/mysqld.cc:6199
    #7 0x7f630cfc0d09 in __libc_start_main ../csu/libc-start.c:308
 
SUMMARY: AddressSanitizer: use-after-poison /data/src/10.2/sql/sql_expression_cache.h:97 in Expression_cache_tracker::fetch_current_stats()
Shadow bytes around the buggy address:
  0x0c5280012f90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c5280012fa0: 00 00 00 00 00 00 00 00 00 00 00 f7 f7 f7 f7 f7
  0x0c5280012fb0: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0c5280012fc0: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0c5280012fd0: f7 f7 f7 f7 f7 f7 f7 00 00 00 00 00 00 00 00 00
=>0x0c5280012fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f7[f7]
  0x0c5280012ff0: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0c5280013000: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0c5280013010: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 00 00 00 00 f7
  0x0c5280013020: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0c5280013030: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==3523040==ABORTING

The failure apparently started happening on 10.2 after this commit:

commit 3a52569499e2f0c4d1f25db1e81617a9d9755400
Author: Sergei Petrunia
Date:   Thu Feb 10 14:23:20 2022 +0300
 
    MDEV-25636: Bug report: abortion in sql/sql_parse.cc:6294
    
    The asserion failure was caused by this query

Set to critical as a recent regression.



 Comments   
Comment by Sergei Petrunia [ 2022-04-26 ]

Investigation.

in void Expression_cache_tmptable::init(), execution takes this branch:

    if (items.elements == 0)
    {
=>    DBUG_PRINT("info", ("All parameters were removed by optimizer."));
      DBUG_VOID_RETURN;
    }

Because of that Expression_cache_tmptable::cache_table is never initialized and remains NULL.

Then, in Expression_cache_tmptable::~Expression_cache_tmptable()
execution takes the else branch here:

  if (cache_table)
    disable_cache();
  else
  {
    update_tracker();
    tracker= NULL;
  }

Note that tracker->cache points to the Expression_cache_tmptable object.

The if-branch calls disable_cache() which clears tracker->cache:

void Expression_cache_tmptable::disable_cache()
{
  if (tracker)
    tracker->cache= NULL;
  ...
}

while the else branch doesn't.

Eventually we arrive at Explain_node::print_explain_json_cache which
calls

cache_tracker->fetch_current_stats();

which tries to access the cache which is deleted already:

  void fetch_current_stats()
  {
    if (cache)
      cache->update_tracker();
  }

Comment by Sergei Petrunia [ 2022-04-26 ]

https://github.com/MariaDB/server/commit/3c54d8f13a140723d74bc4686c95a093465f55a3

sanja, please review.

Comment by Sergei Petrunia [ 2022-04-26 ]

The above patch is useful (and needs a review), but how come a fix for MDEV-25636 has caused a crash like this?

Debugging this MDEV's testcase again, looking at the code touched by patch for MDEV-25636

CREATE TABLE t (a INT, b INT);
ANALYZE FORMAT=JSON
SELECT DISTINCT (
  SELECT MIN(a) FROM t WHERE b <= ANY (SELECT a FROM t)
) AS f FROM t;

and in st_select_lex::optimize_unflattened_subqueries(), I see that the subselect

  SELECT MIN(a) FROM t WHERE b <= ANY (SELECT a FROM t)

changes from being uncorrelated to being correlated!
Narrowing down the point where things go wrong:
they do in SELECT_LEX::update_used_tables():

  $15 = 0x5555570e3a40 <dbug_item_print_buf> "<nop>(<in_optimizer>(t.b,(subquery#3) >= <cache>(t.b)))"
(gdb) p/x join->conds->used_tables()
  $16 = 0x1
(gdb) next

this executes:

    join->conds->update_used_tables();

and then

(gdb) p/x join->conds->used_tables()
  $17 = 0x4000000000000001

Comment by Sergei Petrunia [ 2022-04-27 ]

This looks like another bug... to be fixed outside of the scope of this MDEV.

Comment by Oleksandr Byelkin [ 2022-04-27 ]

OK to push

Generated at Thu Feb 08 09:59:24 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.