[MDEV-26446] Crash on st_join_table::save_explain_data Created: 2021-08-20  Updated: 2023-09-05  Resolved: 2023-07-31

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.4.21, 10.5.12, 10.6.4, 10.4, 10.5, 10.6
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Przemek Assignee: Valerii Kravchuk
Resolution: Incomplete Votes: 1
Labels: None

Attachments: File msandbox.err     File mycrash_1.sql    
Issue Links:
Problem/Incident
is caused by MDEV-27262 Unexpected index intersection with fu... Closed
Relates
relates to MDEV-19720 Server crash in st_join_table::save_e... Stalled
relates to MDEV-30651 SIGSEGV in st_join_table::save_explai... Open

 Description   

Simple SELECT and also just EXPLAIN crash MariaDB server in certain data condition.
Example crash trace:

stack_bottom = 0x7f982069adb0 thread_stack 0x49000
mysys/stacktrace.c:213(my_print_stacktrace)[0x55972f6abc1e]
sql/signal_handler.cc:224(handle_fatal_signal)[0x55972f088aa7]
sigaction.c:0(__restore_rt)[0x7f98364b73c0]
sql/sql_select.cc:26822(st_join_table::save_explain_data(Explain_table_access*, unsigned long long, bool, st_join_table*))[0x55972eecb64f]
sql/sql_select.cc:27378(JOIN::save_explain_data_intern(Explain_query*, bool, bool, bool, char const*))[0x55972eecd18b]
sql/sql_select.cc:4256(JOIN::save_explain_data(Explain_query*, bool, bool, bool, bool))[0x55972eecd2f0]
sql/sql_select.cc:1617(JOIN::build_explain())[0x55972eecd54c]
sql/sql_select.cc:1674(JOIN::optimize())[0x55972eee2091]
sql/sql_select.cc:4758(mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x55972eee2198]
sql/sql_select.cc:27511(mysql_explain_union(THD*, st_select_lex_unit*, select_result*))[0x55972eee2ff2]
sql/sql_parse.cc:6256(execute_sqlcom_select(THD*, TABLE_LIST*))[0x55972ed45807]
sql/sql_parse.cc:4005(mysql_execute_command(THD*))[0x55972ee87fce]
sql/sql_parse.cc:8100(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55972ee8a35c]
sql/sql_parse.cc:1951(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55972ee8cd5a]
sql/sql_parse.cc:1375(do_command(THD*))[0x55972ee8e07f]
sql/sql_connect.cc:1418(do_handle_one_connection(CONNECT*, bool))[0x55972ef797e2]
sql/sql_connect.cc:1318(handle_one_connection)[0x55972ef79ab4]
perfschema/pfs.cc:2204(pfs_spawn_thread)[0x55972f30087d]
nptl/pthread_create.c:478(start_thread)[0x7f98364ab609]
??:0(clone)[0x7f9836099293]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f97a4012940): EXPLAIN SELECT COUNT(*) FROM mycrash WHERE (((p = 'foo' AND er != 4) OR er = 4 )) AND ((es >= 4)) limit 1

Consistently reproducible with the example SQL dump attached. Sometimes the EXPLAIN needs to be run 3-4 times before it starts crashing.
Reproduced on latest 10.4, 10.5 and 10.6 versions with:

$ dbdeployer deploy single 10.6.4
$ msb_10_6_4/use test < mycrash_1.sql
$ msb_10_6_4/use test 
mysql [localhost:10604] {msandbox} (test) > SELECT COUNT(*) FROM mycrash WHERE (((p = 'foo' AND er != 4) OR er = 4 )) AND ((es >= 4)) limit 1;
ERROR 2013 (HY000): Lost connection to server during query

Problem similar I guess to MDEV-19720 and MDEV-22160.
I am very interested if there is any workaround to avoid these crashes.



 Comments   
Comment by Sami Ahlroos [ 2021-08-20 ]

set optimizer_switch="index_merge_sort_intersection=on"; seems to avoid this crash:

mysql> set optimizer_switch="index_merge_sort_intersection=on";
Query OK, 0 rows affected (0.000 sec)
 
mysql> SELECT COUNT(*) FROM mycrash WHERE (((p = 'foo' AND er != 4) OR er = 4 )) AND ((es >= 4)) limit 1;
+----------+
| COUNT(*) |
+----------+
|     2456 |
+----------+
1 row in set (0.027 sec)

Comment by Alice Sherepa [ 2021-08-24 ]

Thanks!
I repeated as described on 10.4-10.6:

10.4 2b66cd249384221e2a892f

 
mysqld: /10.4/src/sql/sql_select.cc:1744: bool JOIN::make_range_rowid_filters(): Assertion `sel->quick' failed.
210824 11:46:37 [ERROR] mysqld got signal 6 ;
 
Server version: 10.4.22-MariaDB-debug-log
 
:0(__GI___assert_fail)[0x7ffb50852f36]
sql/sql_select.cc:1746(JOIN::make_range_rowid_filters())[0x55ec5c78bbbd]
sql/sql_select.cc:2366(JOIN::optimize_stage2())[0x55ec5c792818]
sql/sql_select.cc:2342(JOIN::optimize_inner())[0x55ec5c792245]
sql/sql_select.cc:1659(JOIN::optimize())[0x55ec5c78ae38]
sql/sql_select.cc:4740(mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x55ec5c7abb5f]
sql/sql_select.cc:436(handle_select(THD*, LEX*, select_result*, unsigned long))[0x55ec5c77ce9a]
sql/sql_parse.cc:6449(execute_sqlcom_select(THD*, TABLE_LIST*))[0x55ec5c6ebdea]
sql/sql_parse.cc:3963(mysql_execute_command(THD*))[0x55ec5c6d96b7]
sql/sql_parse.cc:7995(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55ec5c6f52d7]
sql/sql_parse.cc:1860(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55ec5c6cba53]
sql/sql_parse.cc:1373(do_command(THD*))[0x55ec5c6c84d0]
sql/sql_connect.cc:1420(do_handle_one_connection(CONNECT*))[0x55ec5cac1b39]
sql/sql_connect.cc:1317(handle_one_connection)[0x55ec5cac1292]
perfschema/pfs.cc:1871(pfs_spawn_thread)[0x55ec5e192df9]
nptl/pthread_create.c:478(start_thread)[0x7ffb50d69609]
x86_64/clone.S:97(__GI___clone)[0x7ffb5093e293]
 
Query (0x62b0000a1290): SELECT COUNT(*) FROM mycrash WHERE (((p = 'foo' AND er != 4) OR er = 4 )) AND ((es >= 4)) limit 1

Comment by Igor Babaev [ 2021-12-14 ]

With the setting

set optimizer_switch="index_merge_sort_intersection=on";

the optimizer chooses for the query

SELECT COUNT(*) FROM mycrash WHERE (((p = 'foo' AND er != 4) OR er = 4 )) AND ((es >= 4)) limit 1;

the following execution plan

MariaDB [test]> EXPLAIN EXTENDED SELECT COUNT(*) FROM mycrash WHERE ((p = 'foo' AND er != 4) OR er = 4) AND (es >= 4) limit 1;
+------+-------------+---------+-------------+---------------+-------+---------+------+------+----------+------------------------------------------+
| id   | select_type | table   | type        | possible_keys | key   | key_len | ref  | rows | filtered | Extra                                    |
+------+-------------+---------+-------------+---------------+-------+---------+------+------+----------+------------------------------------------+
|    1 | SIMPLE      | mycrash | index_merge | es,er,p       | er,es | 0,1     | NULL | 1831 |   100.00 | Using sort_intersect(er,es); Using where |
+------+-------------+---------+-------------+---------------+-------+---------+------+------+----------+------------------------------------------+

This already looks strange because the condition

 
 ((p = 'foo' AND er != 4) OR er = 4)

is equivalent to the condition

 ((p = 'foo' OR er=4) AND (er != 4 OR er = 4))

The latter is equivalent to

(p = 'foo' OR er=4) AND TRUE

Thus using index er does not make sense for index intersection.
Such plan we have for 10.2 and and higher versions.

Comment by Igor Babaev [ 2021-12-20 ]

Here's the explanation how we come to an assertion failure when executing

SELECT COUNT(*) FROM mycrash WHERE (((p = 'foo' AND er != 4) OR er = 4 )) AND ((es >= 4)) limit 1;

with the optimizer switch 'rowids_filter' set to 'on' .

First the optimizer tries to estimate the cardinality of ranges that could be used in ranges scans for all indexes. Due to the bug MDEV-27262 the full index scan for the index er is considered as a possible range scan. Due to a defect of the implementation of records_in_range() for InnoDB the cardinality of this range is estimated as 50% of the number records in the table (it is always so for big ranges). As a result the optimizer chooses to use a range filter over er. When trying to build this filter it discovers that there could be no range scan over er: with only one index er enabled the code of Range Optimizer does not consider the full index scan as range scan.

Comment by Igor Babaev [ 2021-12-25 ]

I will not close this bug until MDEV-27262 is merged into 10.4. After this I plan to add a test case showing that with rowid_filter='on' the optimizer te successfully produces a plan using a rowid filter for the above query.

Comment by Alice Sherepa [ 2023-01-13 ]

Curently no rowid filter is used for this query:

10.4 71e8e4934db06c02db1b51716

MariaDB [test]> explain extended SELECT COUNT(*) FROM mycrash WHERE (((p = 'foo' AND er != 4) OR er = 4 )) AND ((es >= 4)) limit 1;
+------+-------------+---------+-------+---------------+------+---------+------+------+----------+------------------------------------+
| id   | select_type | table   | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                              |
+------+-------------+---------+-------+---------------+------+---------+------+------+----------+------------------------------------+
|    1 | SIMPLE      | mycrash | range | es,er,p       | es   | 1       | NULL | 2947 |   100.00 | Using index condition; Using where |
+------+-------------+---------+-------+---------------+------+---------+------+------+----------+------------------------------------+
1 row in set, 1 warning (0,001 sec)
 
Note (Code 1003): select count(0) AS `COUNT(*)` from `test`.`mycrash` where (`test`.`mycrash`.`p` = 'foo' and `test`.`mycrash`.`er` <> 4 or `test`.`mycrash`.`er` = 4) and `test`.`mycrash`.`es` >= 4 limit 1

Comment by Roel Van de Paar [ 2023-02-14 ]

See newly linked related bugs

Generated at Thu Feb 08 09:45:23 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.