[MDEV-25269] Crash: exception 0xc0000005 Created: 2021-03-26  Updated: 2021-12-14  Resolved: 2021-12-14

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.5.9
Fix Version/s: N/A

Type: Bug
Priority: Critical
Reporter: Clark Merchant
Assignee: Unassigned
Resolution: Duplicate
Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates MDEV-24953: 10.5.9 crashes with large IN() list (Closed)
duplicates MDEV-24995: Server crash on query (Closed)

 Description   

We are getting a reproducible crash triggered by a specific query run against a proprietary data set (so I can't post details here yet). The storage engine is InnoDB. This appears to be a regression in 10.5.9: the query works fine in 10.5.8 and in our current production release, 10.1.10.

The query DOES contain multiple large IN / NOT IN lists, so this is possibly the same issue as MDEV-24953, which was presumably related to whatever fix was pushed for MDEV-9750.

Note that this particular query works fine on smaller data sets (which, again, means smaller IN / NOT IN value lists), but on a larger database it crashes every time.

From the .err log:

Server version: 10.5.9-MariaDB
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=22
max_threads=65537
thread_count=3
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 154448 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x1b98a3bd028
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
server.dll!and_all_keys()[opt_range.cc:9816]
server.dll!key_and_with_limit()[opt_range.cc:10066]
server.dll!and_range_trees()[opt_range.cc:9151]
server.dll!tree_and()[opt_range.cc:9263]
server.dll!Item_cond_and::get_mm_tree()[opt_range.cc:8315]
server.dll!SQL_SELECT::test_quick_select()[opt_range.cc:2880]
server.dll!get_quick_record_count()[sql_select.cc:4763]
server.dll!make_join_statistics()[sql_select.cc:5494]
server.dll!JOIN::optimize_inner()[sql_select.cc:2256]
server.dll!JOIN::optimize()[sql_select.cc:1629]
server.dll!st_select_lex_unit::optimize()[sql_union.cc:2126]
server.dll!st_select_lex_unit::exec()[sql_union.cc:2157]
server.dll!mysql_union()[sql_union.cc:41]
server.dll!handle_select()[sql_select.cc:433]
server.dll!execute_sqlcom_select()[sql_parse.cc:6282]
server.dll!mysql_execute_command()[sql_parse.cc:3978]
server.dll!mysql_parse()[sql_parse.cc:8067]
server.dll!dispatch_command()[sql_parse.cc:1892]
server.dll!do_command()[sql_parse.cc:1370]
server.dll!threadpool_process_request()[threadpool_common.cc:363]
server.dll!tp_callback()[threadpool_common.cc:194]
ntdll.dll!RtlInitializeCriticalSection()
ntdll.dll!RtlReleaseSRWLockExclusive()
KERNEL32.DLL!BaseThreadInitThunk()
ntdll.dll!RtlUserThreadStart()

If we get time, we can try to generate a sanitized version of the dataset, but we're trying to get a relatively important release out the door. Can offer a remote session if it helps someone debug more quickly (and can show the query and results directly).



 Comments   
Comment by Alice Sherepa [ 2021-03-26 ]

This is the same problem as MDEV-24953 (MDEV-24995).
There is a workaround mentioned in MDEV-24953 (setting optimizer_max_sel_arg_weight) that might help.

Comment by Clark Merchant [ 2021-03-26 ]

Can confirm that setting optimizer_max_sel_arg_weight to a number less than 16000 does work. I chose (arbitrarily) 15000, although some guidance as to a reasonable value would be helpful.

So it seems this is indeed a duplicate bug. Can you provide guidance on which fix or feature added in 10.5.9 caused this to break? It is not clear from the linked bugs why this broke, and I would like to avoid stumbling into whatever problem in 10.5.8 that change was ostensibly attempting to fix.
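
For anyone else hitting this, here is a minimal sketch of how the workaround discussed above might be applied. It assumes a server where optimizer_max_sel_arg_weight is available (as it is on the 10.5.9 server in this report) and that the variable can be set per session as well as globally. The value 15000 is simply the arbitrary number chosen above, not a recommended setting.

```sql
-- Inspect the current value before changing anything.
SHOW VARIABLES LIKE 'optimizer_max_sel_arg_weight';

-- Lower the limit for the current connection only.
-- 15000 is the reporter's arbitrary choice (anything below 16000 avoided
-- the crash in this report); it is not a tuned or recommended value.
SET SESSION optimizer_max_sel_arg_weight = 15000;

-- Alternatively, apply it server-wide. This needs sufficient privileges
-- and only affects connections opened after the change.
SET GLOBAL optimizer_max_sel_arg_weight = 15000;
```

Setting it at session level first makes it easy to confirm that the crashing query now runs before changing the value for the whole server.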
