[MDEV-26249] Crash in in Explain_node::print_explain_for_children while writing to the slow query log Created: 2021-07-27  Updated: 2022-04-21  Resolved: 2022-01-20

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.3.22, 10.3, 10.4, 10.5, 10.6
Fix Version/s: 10.3.33, 10.4.23, 10.5.14, 10.6.6

Type: Bug Priority: Critical
Reporter: Valerii Kravchuk Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: excluded-select-and-explain, split_materialized

Issue Links:
Relates
relates to MDEV-26987 Server crashes in Explain_node::print... Closed
relates to MDEV-27957 Select from view with subselect fails... Closed

 Description   

The following crash happens:

210721 1:33:14 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.3.22-MariaDB-log
key_buffer_size=33554432
read_buffer_size=2097152
max_used_connections=205
max_threads=65546
thread_count=28
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 402770140 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f7d1cb5d4e8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f86307edd58 thread_stack 0x49000
mysys/stacktrace.c:270(my_print_stacktrace)[0x7f865dddcc5b]
sql/signal_handler.cc:209(handle_fatal_signal)[0x7f865d880387]
/lib64/libpthread.so.0(+0x3de160f7e0)[0x7f865cc5e7e0]
sql/sql_explain.cc:640(Explain_node::print_explain_for_children(Explain_query*, select_result_sink*, unsigned char, bool))[0x7f865d7aac1c]
sql/sql_explain.cc:825(Explain_select::print_explain(Explain_query*, select_result_sink*, unsigned char, bool))[0x7f865d7afea6]
sql/sql_explain.cc:262(Explain_query::print_explain_str(THD*, String*, bool))[0x7f865d7aa9fe]
sql/log.cc:3075(MYSQL_QUERY_LOG::write(THD*, long, char const*, unsigned long, unsigned long long, unsigned long long, bool, char const*, unsigned long))[0x7f865d93e897]
sql/log.cc:1086(Log_to_file_event_handler::log_slow(THD*, my_hrtime_t, char const*, unsigned long, unsigned long long, unsigned long long, bool, char const*, unsigned long))[0x7f865d93ec80]
sql/log.cc:1358(LOGGER::slow_log_print(THD*, char const*, unsigned long, unsigned long long))[0x7f865d9399bd]
sql/sql_parse.cc:2559(log_slow_statement(THD*))[0x7f865d6b638f]
sql/sql_class.h:1825(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x7f865d6c3de2]
sql/sql_parse.cc:1404(do_command(THD*))[0x7f865d6c560e]
sql/threadpool_common.cc:359(threadpool_process_request)[0x7f865d7c97c8]
sql/threadpool_generic.cc:1601(worker_main)[0x7f865d856438]
perfschema/pfs.cc:1872(pfs_spawn_thread)[0x7f865d9f90bd]
/lib64/libpthread.so.0(+0x3de1607aa1)[0x7f865cc56aa1]
/lib64/libc.so.6(clone+0x6d)[0x7f865bbd5aad]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f7d1cb45a50): SELECT * FROM ( SELECT ...  FROM t WHERE ... GROUP BY c1, c2 ) tmp WHERE 1=0 

EXPLAIN for the same SELECT does not give a crash. The table t is a partitioned InnoDB table, not a view, no union involved. So, this may be different from MDEV-23160



 Comments   
Comment by Alice Sherepa [ 2021-07-27 ]

I've repeated on 10.3-10.6, not reproducible on 10.2

-- source include/have_innodb.inc
 
SET GLOBAL slow_query_log = 1;
SET long_query_time = 0.000001;
SET log_slow_verbosity = 'explain';
 
CREATE TABLE t1 ( id varchar(50), KEY (id)) engine=innodb;
 
SELECT * FROM (SELECT id FROM t1 GROUP BY id) dt WHERE 1=0;
 
select 1;

10.3 f52d39369a6e3f1a9ce8918

210727 10:03:36 [ERROR] mysqld got signal 11 ;
 
Server version: 10.3.31-MariaDB-debug-log
 
mysys/stacktrace.c:174(my_print_stacktrace)[0x564982463b45]
sql/signal_handler.cc:221(handle_fatal_signal)[0x5649810bc898]
sigaction.c:0(__restore_rt)[0x7fd1441463c0]
sql/sql_explain.cc:640(Explain_node::print_explain_for_children(Explain_query*, select_result_sink*, unsigned char, bool))[0x564980d8ba3d]
sql/sql_explain.cc:824(Explain_select::print_explain(Explain_query*, select_result_sink*, unsigned char, bool))[0x564980d8cd94]
sql/sql_explain.cc:208(Explain_query::print_explain(select_result_sink*, unsigned char, bool))[0x564980d88ae1]
sql/sql_explain.cc:262(Explain_query::print_explain_str(THD*, String*, bool))[0x564980d894b4]
sql/sql_explain.cc:247(print_explain_for_slow_log(LEX*, THD*, String*))[0x564980d89218]
sql/log.cc:3071(MYSQL_QUERY_LOG::write(THD*, long, char const*, unsigned long, unsigned long long, unsigned long long, bool, char const*, unsigned long))[0x56498139380e]
sql/log.cc:1084(Log_to_file_event_handler::log_slow(THD*, my_hrtime_t, char const*, unsigned long, unsigned long long, unsigned long long, bool, char const*, unsigned long))[0x56498138697d]
sql/log.cc:1357(LOGGER::slow_log_print(THD*, char const*, unsigned long, unsigned long long))[0x5649813886ce]
sql/log.cc:6550(slow_log_print(THD*, char const*, unsigned int, unsigned long long))[0x5649813a981f]
sql/sql_parse.cc:2555(log_slow_statement(THD*))[0x564980940a03]
sql/sql_parse.cc:2429(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x56498093fb87]
sql/sql_parse.cc:1398(do_command(THD*))[0x564980939018]
sql/sql_connect.cc:1403(do_handle_one_connection(CONNECT*))[0x564980d08403]
sql/sql_connect.cc:1309(handle_one_connection)[0x564980d07cbd]
perfschema/pfs.cc:1871(pfs_spawn_thread)[0x56498233b92f]
nptl/pthread_create.c:478(start_thread)[0x7fd14413a609]
x86_64/clone.S:97(__GI___clone)[0x7fd144061293]
 
Query (0x62b000000290): SELECT * FROM (SELECT id FROM t1 GROUP BY id) dt WHERE 1=0

Comment by Sergei Petrunia [ 2021-11-23 ]

Analysis:

The crash happens, because EXPLAIN data structure has the data for the select with id=1. It has a pointer to the child, select with id=2. However, there's no query plan for the select with id=2.

How do we end up in this situation?

  • Optimization starts for select with id=2.
    • The select is set for two-phase optimization, because it has a potential LATERAL DERIVED plan.
  • Optimization starts for select with id=1.
    • The optimizer arrives at "Impossible WHERE" and short-cuts the optimization.

Thus, phase #2 of the optimization of select with id=2 is never done.

Comment by Sergei Petrunia [ 2021-11-23 ]

If I change "1=0" in the WHERE to some condition that can be satisfied (e.g. 1>0), I can see the optimization of the second select to be finished here:

(gdb) p select_lex->select_number
  $23 = 2
(gdb) wher
  #0  JOIN::optimize_stage2 (this=0x7fff98017338) at /home/psergey/dev-git/10.3/sql/sql_select.cc:2024
  #1  0x0000555555c96de8 in JOIN::optimize (this=0x7fff98017338) at /home/psergey/dev-git/10.3/sql/sql_select.cc:1512
  #2  0x0000555555c0103d in mysql_derived_optimize (thd=0x7fff98000d50, lex=0x7fff98004b40, derived=0x7fff98016300) at /home/psergey/dev-git/10.3/sql/sql_derived.cc:962
  #3  0x0000555555bff1ab in mysql_handle_single_derived (lex=0x7fff98004b40, derived=0x7fff98016300, phases=4) at /home/psergey/dev-git/10.3/sql/sql_derived.cc:193
  #4  0x0000555555d82116 in TABLE_LIST::handle_derived (this=0x7fff98016300, lex=0x7fff98004b40, phases=4) at /home/psergey/dev-git/10.3/sql/table.cc:8414
  #5  0x0000555555c19810 in LEX::handle_list_of_derived (this=0x7fff98004b40, table_list=0x7fff98016300, phases=4) at /home/psergey/dev-git/10.3/sql/sql_lex.h:4028
  #6  0x0000555555c259d0 in st_select_lex::handle_derived (this=0x7fff98005398, lex=0x7fff98004b40, phases=4) at /home/psergey/dev-git/10.3/sql/sql_lex.cc:4180
  #7  0x0000555555c98b85 in JOIN::optimize_stage2 (this=0x7fff98016ce8) at /home/psergey/dev-git/10.3/sql/sql_select.cc:2027
  #8  0x0000555555c98a41 in JOIN::optimize_inner (this=0x7fff98016ce8) at /home/psergey/dev-git/10.3/sql/sql_select.cc:2003

That is, JOIN::optimize_stage2() is called for select with id=1, and it makes this call:

  if (select_lex->handle_derived(thd->lex, DT_OPTIMIZE))
    DBUG_RETURN(1);

which calls JOIN::optimize() for the subquery which figures that phase #1 of the optimization is done and finishes with phase #2.

Comment by Sergei Petrunia [ 2021-11-23 ]

There are two ways one can fix this.

When the select is optimized to "Impossible WHERE" (or some other reason why the select is degenerate), we can:

A. "Complete" the optimization of derived tables.
B. Do not put the links to derived table suqueries into the EXPLAIN data structure.

B. seems to make more sense.

Comment by Sergei Petrunia [ 2021-11-23 ]

bb-10.3-mdev26249. http://lists.askmonty.org/pipermail/commits/2021-November/014782.html

Comment by Sergei Krivonos (Inactive) [ 2021-11-24 ]

Reviewing it. Which test causes the stacktrace?

Comment by Sergei Petrunia [ 2021-11-25 ]

sergei.krivonos Ok, I've accidentally committed a wrong testcase, it has " 1>0" instead of "1=0".

Now pushed the fixed one: bb-10.3-mdev26249 , http://lists.askmonty.org/pipermail/commits/2021-November/014786.html

Comment by Sergei Petrunia [ 2021-11-25 ]

Got this input:

Igor> re MDEV-26249 it's not clear why if we have regular subqueries (i.g. in WHERE)

Igor> we produce plans for them even in the case when we have 'Impossible
where' in the main query. And you suggest not to produced plans subqueries in
FROM lists

Igor> Looks asymetrical.

My response is that we already do not show:

  • hanging CTEs
  • eliminated tables
  • subqueries that are in the ON clauses of eliminated tables.

It seems to me that the direction "do not show unused parts of the query" (listed above as B) is better than A.

Comment by Sergei Petrunia [ 2021-11-25 ]

=== PostgreSQL ===
 
test=# create table t10 (a int, b int);
CREATE TABLE
test=# insert into t10 values (1,1),(2,2),(3,3);
INSERT 0 3
 
test=# create table t11 (a int, b int);
CREATE TABLE
test=# insert into t11 values (1,1),(2,2),(3,3);
INSERT 0 3
test=# explain select * from t10 where 1>2 and t10.a > (select a from t11 where t11.b> t10.b)
test-# ;
                QUERY PLAN                
------------------------------------------
 Result  (cost=0.00..0.00 rows=0 width=0)
   One-Time Filter: false
(2 rows)
 
 
=== MySQL 8 ====
 
create table t10 (a int, b int);
create table t11 (a int, b int);
insert into t10 values (1,1),(2,2),(3,3);
insert into t11 values (1,1),(2,2),(3,3);
 
mysql> explain select * from t10 where 1>2 and t10.a > (select a from t11 where t11.b> t10.b);
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra            |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | Impossible WHERE |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
1 row in set, 2 warnings (0.00 sec)
 
mysql> explain format=json select * from t10 where 1>2 and t10.a > (select a from t11 where t11.b> t10.b);
+----------------------------------------------------------------------------------+
| EXPLAIN                                                                          |
+----------------------------------------------------------------------------------+
| {
  "query_block": {
    "select_id": 1,
    "message": "Impossible WHERE"
  }
} |
+----------------------------------------------------------------------------------+
1 row in set, 2 warnings (0.00 sec)

Comment by Sergei Krivonos (Inactive) [ 2021-12-20 ]

looks ok to push

Comment by Sergei Petrunia [ 2022-01-19 ]

bb-10.3-mdev26249-v2, http://lists.askmonty.org/pipermail/commits/2022-January/014858.html - this is an alternative patch based on the the review input. igor, please review.

Comment by Sergei Petrunia [ 2022-01-19 ]

Updated patch: http://lists.askmonty.org/pipermail/commits/2022-January/014860.html

Comment by Igor Babaev [ 2022-01-19 ]

Ok to push into 10.3

Generated at Thu Feb 08 09:43:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.