[MDEV-9127] Crash reporter often fails to show the query that caused the crash Created: 2015-11-12  Updated: 2016-07-23  Resolved: 2016-07-12

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Affects Version/s: 10.1
Fix Version/s: 10.1.16, 10.0.27

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Vicențiu Ciorbaru
Resolution: Fixed Votes: 0
Labels: None


 Description   

For example, see

The first example has:

Query (0x7f97ab53b020): is an invalid pointer
Connection ID (thread ID): 591
Status: NOT_KILLED

The second one has:

Nov 18 16:22:21 thorstenknbl1 mysqld: Query (0x7f4ecd82d020): is an invalid pointer
Nov 18 16:22:21 thorstenknbl1 mysqld: Connection ID (thread ID): 6
Nov 18 16:22:21 thorstenknbl1 mysqld: Status: NOT_KILLED

Note that the second one is just an assertion failure. Stack or THD structure
are not corrupted. However, the query pointer is still not printed.



 Comments   
Comment by Sergei Petrunia [ 2015-11-12 ]

Interesting.. I rolled back the tree to revision f9448bcb2153aac614ecb260a4dd7f27b04f17a2 , running testcase from MDEV-7316.

Debug build crashes like this:

Server version: 10.1.3-MariaDB-wsrep-debug-log
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467319 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7f44f2361070
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f450f3bfe80 thread_stack 0x48000
addr2line: './mysqld': No such file
./mysqld(my_print_stacktrace+0x38)[0x7f450ed46a4f]
./mysqld(handle_fatal_signal+0x351)[0x7f450e713720]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10d10)[0x7f450c745d10]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f450bd89267]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7f450bd8aeca]
/lib/x86_64-linux-gnu/libc.so.6(+0x2e03d)[0x7f450bd8203d]
/lib/x86_64-linux-gnu/libc.so.6(+0x2e0f2)[0x7f450bd820f2]
./mysqld(+0x4dbf57)[0x7f450e53ff57]
./mysqld(_Z11choose_planP4JOINy+0x225)[0x7f450e53f447]
./mysqld(+0x4d47d7)[0x7f450e5387d7]
./mysqld(_ZN4JOIN14optimize_innerEv+0xfcd)[0x7f450e52e9a1]
./mysqld(_ZN4JOIN8optimizeEv+0x26)[0x7f450e52d8c6]
./mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x2c9)[0x7f450e535b16]
./mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x17c)[0x7f450e52baa7]
./mysqld(+0x4992f6)[0x7f450e4fd2f6]
./mysqld(_Z21mysql_execute_commandP3THD+0x160f)[0x7f450e4f3ada]
./mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x255)[0x7f450e500889]
./mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xcba)[0x7f450e4eff27]
./mysqld(_Z10do_commandP3THD+0x719)[0x7f450e4eed4c]
./mysqld(_Z24do_handle_one_connectionP3THD+0x1e6)[0x7f450e61d018]
./mysqld(handle_one_connection+0x33)[0x7f450e61cd70]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76aa)[0x7f450c73c6aa]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f450be5aeed]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f44de422088): SELECT * FROM t1, t2 WHERE b IN ( SELECT 2 UNION SELECT 3 ) AND a <> 'USARussian' AND c IS NULL
Connection ID (thread ID): 4
Status: NOT_KILLED

That is, the crashing query is visible.

Comment by Sergei Petrunia [ 2015-11-12 ]

Same on release build:

Server version: 10.1.3-MariaDB-wsrep-log
key_buffer_size=134217728
read_buffer_size=131072
....
Query (0x7efbd681f020): select * from t1, t2, t3  where t1.child_user_id=t3.id and t1.child_group_id is null and t2.lower_group_name='foo' and t1.parent_id=t2.id and t2.directory_id=10
Connection ID (thread ID): 4
Status: NOT_KILLED

Comment by Sergei Petrunia [ 2015-11-25 ]

A new example: MDEV-9187 . It is on RHEL. The crash is not easily repoducible, though.

Comment by Sergei Petrunia [ 2015-11-25 ]

For some reason, people dislike making core dumps in production (is it because mysqld process usually allocates a lot of memory and writing core dump will take a lot of time?).

Then, I've had an idea about attaching gdb as soon as we caught the SIGSEGV. I am not the first, see: https://github.com/pozdnychev/sigattach/blob/master/sigattach.c . Didn't try this, yet.

Comment by Sergei Petrunia [ 2015-12-03 ]

About debug info not being available for some packages: see MDEV-4646 (for RPMs) and MDEV-658 (for debs).

The issue with debug info is, however, orthogonal to the problem of why we keep getting Query (0x...): is an invalid pointer so often.

Comment by Vicențiu Ciorbaru [ 2016-05-30 ]

Hi Sergei,

Can you review a patch for this problem? The fix is targeted for 10.0. But I think it should work for any version as we're not introducing anything that is "unstable".

http://lists.askmonty.org/pipermail/commits/2016-May/009405.html

Thanks!

CC: psergey

Comment by Vicențiu Ciorbaru [ 2016-07-12 ]

Fixed with:
7d4a7d8c5861e6587176052ea71c30ab12a49084

Comment by Elena Stepanova [ 2016-07-22 ]

I don't see any improvement, we are still getting the same "invalid pointer" in reports from the real-life world:

Server version: 10.1.16-MariaDB-1~trusty
key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=23
max_threads=502
thread_count=13
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1118998 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7f57a83d6008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f61375fedf0 thread_stack 0x80000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f65391b28fe]
/usr/sbin/mysqld(handle_fatal_signal+0x2d5)[0x7f6538cd9e75]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f653722f330]
/usr/sbin/mysqld(+0x4812d4)[0x7f6538baa2d4]
/usr/sbin/mysqld(_ZN13st_select_lex5printEP3THDP6String15enum_query_type+0x4b2)[0x7f6538baac02]
/usr/sbin/mysqld(_ZN14Item_subselect5printEP6String15enum_query_type+0x5f)[0x7f6538d69faf]
/usr/sbin/mysqld(_ZN22Item_func_conv_charset5printEP6String15enum_query_type+0x44)[0x7f6538d57e64]
/usr/sbin/mysqld(_ZN9Item_func10print_argsEP6Stringj15enum_query_type+0x4f)[0x7f6538d3a9df]
/usr/sbin/mysqld(_ZN9Item_func5printEP6String15enum_query_type+0x4d)[0x7f6538d3aabd]
/usr/sbin/mysqld(_ZN9Item_func10print_argsEP6Stringj15enum_query_type+0x4f)[0x7f6538d3a9df]
/usr/sbin/mysqld(_ZN9Item_func5printEP6String15enum_query_type+0x4d)[0x7f6538d3aabd]
/usr/sbin/mysqld(_ZN4Item17print_item_w_nameEP6String15enum_query_type+0x1c)[0x7f6538cf0fdc]
/usr/sbin/mysqld(_ZN13st_select_lex5printEP3THDP6String15enum_query_type+0x226)[0x7f6538baa976]
/usr/sbin/mysqld(_ZN18st_select_lex_unit5printEP6String15enum_query_type+0x57)[0x7f6538b4c267]
/usr/sbin/mysqld(_Z29mysqld_show_create_get_fieldsP3THDP10TABLE_LISTP4ListI4ItemEP6String+0x302)[0x7f6538bb9eb2]
/usr/sbin/mysqld(_Z18mysqld_show_createP3THDP10TABLE_LIST+0xc0)[0x7f6538bba8e0]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3fbc)[0x7f6538b5aecc]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x26d)[0x7f6538b6073d]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x2460)[0x7f6538b63a80]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x169)[0x7f6538b64239]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x18a)[0x7f6538c2889a]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x7f6538c28a70]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7f6537227184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f653674637d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f57a70fe020): is an invalid pointer
Connection ID (thread ID): 478

Comment by Vicențiu Ciorbaru [ 2016-07-23 ]

Hi elenst,

Can you paste the full crash message? The added string printing is at the end. No change in behaviour is done at the Query (address): <string>.

Generated at Thu Feb 08 07:32:20 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.