[MDEV-6464] MariaDB 10.0.12 crash while using Zabbix Created: 2014-07-20  Updated: 2015-06-04  Resolved: 2015-06-04

Status: Closed
Project: MariaDB Server
Component/s: Partitioning
Affects Version/s: 10.0.12
Fix Version/s: N/A

Type: Bug
Priority: Major
Reporter: Sergii
Assignee: Elena Stepanova
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

Oracle Linux 6.5;
3.8.13-35.1.2.el6uek.x86_64 x86_64

mysqld
Ver 10.0.12-MariaDB for Linux on x86_64 (MariaDB Server);

PHP 5.4.30 (cli) (built: Jun 27 2014 11:59:31)
Copyright (c) 1997-2014 The PHP Group
Zend Engine v2.4.0, Copyright (c) 1998-2014 Zend Technologies

CPU 2x8 Xeon, 4 HDD x 10k - RAID10, 64 GB RAM;
partitioning in use.


Attachments: File my.cnf.single     File schema.sql    
Issue Links:
Relates
relates to MDEV-6970 MariaDB 10.0.13 crash while using par... Closed

 Description   

140719  4:00:01 [ERROR] mysqld got signal 11 ;
...
Server version: 10.0.12-MariaDB
key_buffer_size=2097152
read_buffer_size=2097152
max_used_connections=218
max_threads=258
thread_count=94
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1592516 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7f8d72554008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f8cbf002d10 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xb6c14b]
/usr/sbin/mysqld(handle_fatal_signal+0x398)[0x7255f8]
/lib64/libpthread.so.0[0x32c300f710]
/usr/sbin/mysqld(_ZN7handler15read_range_nextEv+0xa7)[0x72a9f7]
/usr/sbin/mysqld[0xb46d65]
/usr/sbin/mysqld(_ZN7handler21multi_range_read_nextEPPv+0xb2)[0x6bc222]
/usr/sbin/mysqld(_ZN18QUICK_RANGE_SELECT8get_nextEv+0x52)[0x7f3762]
/usr/sbin/mysqld[0x810668]
/usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x1b2)[0x5f56a2]
/usr/sbin/mysqld[0x60c76d]
/usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0x6da)[0x61f26a]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x11)[0x621441]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x1dd)[0x61e00d]
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x28d)[0x62179d]
/usr/sbin/mysqld[0x5c9346]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x4c4f)[0x5d400f]
/usr/sbin/mysqld[0x5d5b12]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1b20)[0x5d7cd0]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x453)[0x6950e3]
/usr/sbin/mysqld(handle_one_connection+0x42)[0x6951b2]
/lib64/libpthread.so.0[0x32c30079d1]
/lib64/libc.so.6(clone+0x6d)[0x32c28e8b5d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f8d95058020): is an invalid pointer
Connection ID (thread ID): 9559312
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
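
The backtrace in this report mixes frames that carry a mangled C++ symbol with frames that carry only an address. The following is a minimal Python sketch, assuming the frame layout seen in this log (`binary(symbol+offset)[address]` or `binary[address]`), that splits each frame into its parts so the unresolved addresses can be collected. The names `FRAME_RE` and `parse_frames` are illustrative and not part of MariaDB.

```python
import re

# Matches frames such as:
#   /usr/sbin/mysqld(_ZN7handler15read_range_nextEv+0xa7)[0x72a9f7]
#   /usr/sbin/mysqld[0xb46d65]
FRAME_RE = re.compile(
    r"^(?P<binary>\S+?)"                                         # binary or library path
    r"(?:\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\))?"   # optional symbol+offset
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"                          # absolute address
)


def parse_frames(text):
    """Return one dict per stack frame found in `text`.

    Lines that do not look like a frame are skipped. For frames without
    symbol information, 'symbol' and 'offset' are None.
    """
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


def unresolved_addresses(frames):
    """Addresses of frames that carry no symbol name."""
    return [f["address"] for f in frames if f["symbol"] is None]
```

With frames parsed this way, the mangled names can be passed to a demangler such as `c++filt` (for example, `_ZN7handler15read_range_nextEv` is `handler::read_range_next()`), and the unresolved addresses can be looked up with a tool such as `addr2line`, provided a matching binary with symbols is available.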



 Comments   
Comment by Elena Stepanova [ 2014-07-20 ]

Hi,

Is it reproducible?
Would you be able to locate the query that causes the crash?
(If it's reproducible, you can enable general_log temporarily; after the crash occurs, the last query in the indicated connection before the crash would most likely be the one that caused it.)
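
As a rough illustration of that lookup, here is a minimal Python sketch that scans a general log for the last statement issued by a given connection ID (the "Connection ID (thread ID)" value printed in the crash report). It assumes the classic general log file layout of an optional timestamp, the connection ID, a command name, a tab, and the argument; that layout, the sample lines in the usage note, and the names `ENTRY_RE` and `last_statement` are assumptions made for this example, so check them against a real log before relying on it.

```python
import re

# Assumed entry layout: [YYMMDD HH:MM:SS]<whitespace><id> <command>\t<argument>
ENTRY_RE = re.compile(
    r"^(?:\d{6}\s+\d{1,2}:\d{2}:\d{2})?"   # optional timestamp
    r"\s*(?P<id>\d+) "                      # connection (thread) ID
    r"(?P<command>[A-Za-z ]+)\t"            # command, e.g. Query, Connect, Init DB
    r"(?P<argument>.*)$"                    # statement text or other argument
)


def last_statement(log_lines, connection_id):
    """Return the last log entry written by `connection_id`, or None.

    Lines that do not start a new entry are treated as continuations of
    the previous entry, since a statement may span several lines.
    """
    last = None
    current = None
    for raw in log_lines:
        line = raw.rstrip("\n")
        match = ENTRY_RE.match(line)
        if match:
            current = {
                "id": int(match.group("id")),
                "command": match.group("command").strip(),
                "argument": match.group("argument"),
            }
            if current["id"] == connection_id:
                last = current
        elif current is not None:
            current["argument"] += "\n" + line
    return last
```

For example, `last_statement(open("/path/to/general.log"), 9559312)` would return the final entry logged for connection 9559312; the path here is a placeholder for the configured general log file.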

Comment by Sergii [ 2014-07-20 ]

Hi

I don't think I can reproduce this situation. So far, the error has not recurred.
Thank you. I will enable general_log and watch.

Comment by Elena Stepanova [ 2014-07-20 ]

Please be aware that if the problem is not easily reproducible, enabling the general log can be expensive, especially if you have a high-traffic server. It logs every query that goes to the server, so it might affect performance, and the log file will also grow large. It would be great if you could do it and catch the offending query this way, but most people can't afford it in production unless it's just for a very short time.

Comment by Elena Stepanova [ 2014-08-24 ]

Closing for now as Incomplete. If you have new information, please comment to re-open the report.

Comment by Sergii [ 2014-10-06 ]

Hi, Elena. There was another crash. I assume that it is due to the partitioning that runs at 4 am. Part of the log file after the crash:

141004  4:00:03 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.0.12-MariaDB
key_buffer_size=2097152
read_buffer_size=2097152
max_used_connections=346
max_threads=502
thread_count=186
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3096700 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7f34401de008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f330f2dad10 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xb6c14b]
/usr/sbin/mysqld(handle_fatal_signal+0x398)[0x7255f8]
/lib64/libpthread.so.0[0x32c300f710]
/usr/sbin/mysqld(_ZN7handler15read_range_nextEv+0xa7)[0x72a9f7]
/usr/sbin/mysqld[0xb46d65]
/usr/sbin/mysqld(_ZN7handler21multi_range_read_nextEPPv+0xb2)[0x6bc222]
/usr/sbin/mysqld(_ZN18QUICK_RANGE_SELECT8get_nextEv+0x52)[0x7f3762]
/usr/sbin/mysqld[0x810668]
/usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x1b2)[0x5f56a2]
/usr/sbin/mysqld[0x60c76d]
/usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0x6da)[0x61f26a]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x11)[0x621441]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x1dd)[0x61e00d]
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x28d)[0x62179d]
/usr/sbin/mysqld[0x5c9346]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x4c4f)[0x5d400f]
/usr/sbin/mysqld[0x5d5b12]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1b20)[0x5d7cd0]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x453)[0x6950e3]
/usr/sbin/mysqld(handle_one_connection+0x42)[0x6951b2]
/lib64/libpthread.so.0[0x32c30079d1]
/lib64/libc.so.6(clone+0x6d)[0x32c28e8b5d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f33ecfee020): is an invalid pointer
Connection ID (thread ID): 23641383
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
141004 04:00:10 mysqld_safe Number of processes running now: 0
141004 04:00:10 mysqld_safe mysqld restarted

Comment by Elena Stepanova [ 2014-10-06 ]

Hi Sergii,

What exactly do you mean by "partitioning that occurs at 4 am"? What is being partitioned, and how?
Did you happen to have the general log enabled when the crash occurred?

Thanks.

Comment by Sergii [ 2014-10-06 ]

We are using partitioning for some tables. Partitioning is started by the event scheduler at 4 am. The procedure that we use is described at the following link: https://www.zabbix.org/wiki/Docs/howto/mysql_partitioning
Unfortunately, we had turned off logging because there had been no crashes for a few months.

Comment by Elena Stepanova [ 2014-10-14 ]

Hi Sergii,

Do you have some other activity on the server at this time? Maybe some kind of monitoring, statistics, analytics,...? Are there active external connections?
The server reported lots of running threads at the moment of the crash (94 and 186), but I don't know whether they were active or idle at the moment.

Obviously, the fact that both times it happened at 04:00:0x cannot be a coincidence; but I don't see right away what in the described partitioning procedure could cause a crash like that, as there aren't that many SELECTs in there. Besides, if it were just the partitioning, it would probably be happening more often. So, something else might be playing a role.
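
The time-of-day pattern can be checked mechanically across a whole error log. Below is a minimal Python sketch, assuming crash lines have the form seen in this report (`YYMMDD H:MM:SS [ERROR] mysqld got signal N`) and that the two-digit year means 20YY; `CRASH_RE` and `crash_times` are illustrative names.

```python
import re
from datetime import datetime

# Matches crash lines such as:
#   140719  4:00:01 [ERROR] mysqld got signal 11 ;
CRASH_RE = re.compile(
    r"^(\d{2})(\d{2})(\d{2})\s+(\d{1,2}):(\d{2}):(\d{2}) "
    r"\[ERROR\] mysqld got signal (\d+)"
)


def crash_times(lines):
    """Return a list of (datetime, signal_number) for each crash line."""
    crashes = []
    for line in lines:
        match = CRASH_RE.match(line)
        if match:
            yy, mo, dd, hh, mi, ss, sig = (int(g) for g in match.groups())
            # Assumption: two-digit years belong to the 2000s.
            crashes.append((datetime(2000 + yy, mo, dd, hh, mi, ss), sig))
    return crashes


def times_of_day(crashes):
    """Time-of-day strings, useful for spotting a recurring schedule."""
    return [when.strftime("%H:%M:%S") for when, _ in crashes]
```

Applied to the two crash lines quoted in this report, `times_of_day` gives `04:00:01` and `04:00:03`, which is what points at a job scheduled for 04:00.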

Would you be able to provide the structure of the schema (better still, the full dump if possible, but at least the structures)?
Also, could you please attach your cnf file(s) or SHOW VARIABLES output from the crashing server?

And btw, which of the described partitioning procedures are you using – the one via the event/SPs, or via an external script?

Thanks.

Comment by Sergii [ 2014-10-15 ]

Hi, Elena.
The main activity is the Zabbix server. Additional activity comes from scripts that monitor the state of the database and scripts that synchronize the status of equipment interfaces. Most of them work through the Zabbix API. By the way, the connections from the Zabbix server are external. Similar incidents have happened more than a few times, and all of them happened at 04:00:0x.

Unfortunately, we can't reproduce this situation again because we now use MariaDB 10.0.13 with Galera Cluster.

We are using the one via the event.

See attachment.

Comment by Sergii [ 2014-10-15 ]

Config and schema.

Comment by Elena Stepanova [ 2015-06-04 ]

Since MDEV-6970 contains more information, I'm closing this issue as its duplicate, even though this one was actually created earlier.

I re-attached the cnf and schema.sql to MDEV-6970, and also added a comment with a link to the partitioning procedure.
