[MDEV-8737] Crash -- MDL_lock Created: 2015-09-03  Updated: 2020-03-12  Resolved: 2020-03-12

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.0.21-galera
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Trevor Nelson Assignee: Jan Lindström (Inactive)
Resolution: Won't Fix Votes: 1
Labels: None
Environment:

Ubuntu 12.04 x86_64
deb http://nyc2.mirrors.digitalocean.com/mariadb/repo/10.0/ubuntu precise main
deb http://repo.percona.com/apt precise main (for percona-xtrabackup)

3x mariadb-galera-server-10.0 cluster


Attachments: File cluster.cnf     File my.conf    
Issue Links:
Duplicate
duplicates MDEV-10857 mysqld got signal 11 (MariaDB 10.1.17... Closed
Relates
relates to MDEV-10264 mariadb crash randomly in MDL_lock::T... Closed
relates to MDEV-21898 POSIBLE BUG Closed
Sprint: 10.1.8-3, 10.1.8-4, 10.2.2-1, 10.2.2-2, 10.2.2-3

 Description   

The issue has been sporadic and I am unable to replicate the crash at will.

No errors were reported in the log prior to the start of the signal 11.

150903  8:28:06 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.0.21-MariaDB-1~precise-wsrep-log
key_buffer_size=8388608
read_buffer_size=1048576
max_used_connections=204
max_threads=802
thread_count=179
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1666845 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7fb6fb389008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fb7146dadd0 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xba3a0b]
/usr/sbin/mysqld(handle_fatal_signal+0x398)[0x747ac8]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7fb7a92fdcb0]
/usr/sbin/mysqld(_ZN8MDL_lock11Ticket_list13remove_ticketEP10MDL_ticket+0x11)[0x6aec21]
/usr/sbin/mysqld(_ZN8MDL_lock13remove_ticketEMS_NS_11Ticket_listEP10MDL_ticket+0x41)[0x6af181]
/usr/sbin/mysqld(_ZN11MDL_context27release_locks_stored_beforeE17enum_mdl_durationP10MDL_ticket+0x3a)[0x6b001a]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x88d)[0x5ddd4d]
/usr/sbin/mysqld[0x5e5797]
/usr/sbin/mysqld[0x5e6173]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x199b)[0x5e804b]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x25a)[0x5e898a]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x55b)[0x6a5e9b]
/usr/sbin/mysqld(handle_one_connection+0x42)[0x6a5f92]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7fb7a92f5e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fb7a799038d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7fb715966020): SELECT `data` FROM `tablename` WHERE `id` = '0000000000000000000000000000'
Connection ID (thread ID): 26904
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file

(gdb) bt full
#0  0x00007fb7a92faf8c in pthread_kill () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x0000000000747af2 in handle_fatal_signal ()
No symbol table info available.
#2  <signal handler called>
No symbol table info available.
#3  0x00000000006aec21 in MDL_lock::Ticket_list::remove_ticket(MDL_ticket*) ()
No symbol table info available.
#4  0x00000000006af181 in MDL_lock::remove_ticket(MDL_lock::Ticket_list MDL_lock::*, MDL_ticket*) ()
No symbol table info available.
#5  0x00000000006b001a in MDL_context::release_locks_stored_before(enum_mdl_duration, MDL_ticket*) ()
No symbol table info available.
#6  0x00000000005ddd4d in mysql_execute_command(THD*) ()
No symbol table info available.
#7  0x00000000005e5797 in ?? ()
No symbol table info available.
#8  0x00000000005e6173 in ?? ()
No symbol table info available.
#9  0x00000000005e804b in dispatch_command(enum_server_command, THD*, char*, unsigned int) ()
No symbol table info available.
#10 0x00000000005e898a in do_command(THD*) ()
No symbol table info available.
#11 0x00000000006a5e9b in do_handle_one_connection(THD*) ()
No symbol table info available.
#12 0x00000000006a5f92 in handle_one_connection ()
No symbol table info available.
#13 0x00007fb7a92f5e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#14 0x00007fb7a799038d in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#15 0x0000000000000000 in ?? ()
No symbol table info available.
(gdb)



 Comments   
Comment by Elena Stepanova [ 2015-09-04 ]

tnelson,

You've set the affected version to galera. Is that what you are really using? Are you running a Galera cluster?

Is it a production server? How busy is it, and how much information can you share?
E.g. would you be able to enable general log temporarily till the next crash and then provide it? (General log might require a lot of disk space, can affect performance somewhat, and also will contain queries which are run on the server, including inserted values etc.)

Would you be able to provide a datadump of the table which is being selected from when the server crashes? Is it always the same table?

All information that you can share can be uploaded to our ftp.askmonty.org/private, only MariaDB developers will have access to it.

Please also attach your cnf file(s).
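
For the general log requested above, a minimal sketch of switching it on at runtime could look like the following. It assumes `log_output` includes `FILE` and that the account used has the privileges to set global variables; the log path and the `.sql` file names are placeholders, not taken from this report.

```shell
#!/bin/sh
# Write the statements to a file so they can be reviewed before running them.
# The log path is a placeholder, not taken from this report.
cat > enable_general_log.sql <<'EOF'
SET GLOBAL general_log_file = '/var/log/mysql/general.log';
SET GLOBAL general_log = 'ON';
EOF

# Matching statement to switch the log off again after the next crash.
cat > disable_general_log.sql <<'EOF'
SET GLOBAL general_log = 'OFF';
EOF

# Apply with the command-line client, for example:
#   mysql < enable_general_log.sql
```

Keeping the statements in files makes it easy to apply the same change on each node of the cluster and to revert it once a crash has been captured.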

Comment by Trevor Nelson [ 2015-09-09 ]

This is a production mariadb-galera-server-10.0 cluster. Because of that, I am going to be limited in what information I can share, but I will do my best to get you guys everything that you need.

With this in mind I have temporarily enabled the general log for the next crash and I have also added some more information to the environment section and will add the configuration files with sensitive information omitted.

The crash happens on all servers in the cluster, but only on one at a time. Thankfully, it has not hit all three at once yet.

Comment by Elena Stepanova [ 2015-09-09 ]

nirbhay_c,

MDL, 10.0, Galera – chances are it's galera-related, could you please take a look?

Comment by Trevor Nelson [ 2015-09-09 ]

I lost the core file for the latest crash, shown below, but a common trend appears to be CodeIgniter sessions (http://www.codeigniter.com).

150908 12:40:08 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.0.21-MariaDB-1~precise-wsrep-log
key_buffer_size=8388608
read_buffer_size=1048576
max_used_connections=251
max_threads=802
thread_count=195
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1666845 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7f45f6f90008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f45ff248dd0 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xba3a0b]
/usr/sbin/mysqld(handle_fatal_signal+0x398)[0x747ac8]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f46aaca2cb0]
/usr/sbin/mysqld(_ZN11MDL_context12release_lockE17enum_mdl_durationP10MDL_ticket+0x32)[0x6affa2]
/usr/sbin/mysqld(_ZN22Item_func_release_lock7val_intEv+0x112)[0x79b582]
/usr/sbin/mysqld(_ZN4Item4sendEP8ProtocolP6String+0x16c)[0x75a03c]
/usr/sbin/mysqld(_ZN8Protocol19send_result_set_rowEP4ListI4ItemE+0xf5)[0x561d85]
/usr/sbin/mysqld(_ZN11select_send9send_dataER4ListI4ItemE+0x60)[0x5ac990]
/usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0x1215)[0x631c45]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x11)[0x633231]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x1dd)[0x62fe8d]
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x28d)[0x63358d]
/usr/sbin/mysqld[0x5d6df9]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x55d8)[0x5e2a98]
/usr/sbin/mysqld[0x5e5797]
/usr/sbin/mysqld[0x5e6173]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x199b)[0x5e804b]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x25a)[0x5e898a]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x55b)[0x6a5e9b]
/usr/sbin/mysqld(handle_one_connection+0x42)[0x6a5f92]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f46aac9ae9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f46a933538d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f4604f81020): SELECT RELEASE_LOCK('fdbd732aafd99fdbd732aafd99fdbd732aafd99') AS ci_session_lock
Connection ID (thread ID): 145846
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
 

Comment by Trevor Nelson [ 2015-09-15 ]

150915 13:58:20 870419 Connect  db_user@cluster.hostname.com as anonymous on db_name
                870419 Query    SET NAMES utf8
                870419 Query    SELECT GET_LOCK('7a17062fbf8ea96786c3e7643b5a74138c72aaaf', 300) AS ci_session_lock
                870419 Query    SELECT `data`
FROM `table_name`
WHERE `id` = '7a17062fbf8ea96786c3e7643b5a74138c72aaaf'
 

table_name, db_name, db_user were replaced obviously.

150915 13:58:20 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see http://kb.askmonty.org/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

Server version: 10.0.21-MariaDB-1~precise-wsrep-log
key_buffer_size=8388608
read_buffer_size=1048576
max_used_connections=266
max_threads=802
thread_count=181
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1666845 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x7f6e3ceef008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f6eb6874dd0 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xba3a0b]
/usr/sbin/mysqld(handle_fatal_signal+0x398)[0x747ac8]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f6ee5d0fcb0]
/usr/sbin/mysqld(_ZN8MDL_lock11Ticket_list13remove_ticketEP10MDL_ticket+0x11)[0x6aec21]
/usr/sbin/mysqld(_ZN8MDL_lock13remove_ticketEMS_NS_11Ticket_listEP10MDL_ticket+0x41)[0x6af181]
/usr/sbin/mysqld(_ZN11MDL_context27release_locks_stored_beforeE17enum_mdl_durationP10MDL_ticket+0x3a)[0x6b001a]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x88d)[0x5ddd4d]
/usr/sbin/mysqld[0x5e5797]
/usr/sbin/mysqld[0x5e6173]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x199b)[0x5e804b]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x25a)[0x5e898a]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x55b)[0x6a5e9b]
/usr/sbin/mysqld(handle_one_connection+0x42)[0x6a5f92]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f6ee5d07e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f6ee43a238d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f6eb4a33020): is an invalid pointer
Connection ID (thread ID): 870419
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

Comment by Trevor Nelson [ 2015-09-15 ]

Table structure:
http://www.codeigniter.com/user_guide/libraries/sessions.html

CREATE TABLE IF NOT EXISTS `ci_sessions` (
        `id` varchar(40) NOT NULL,
        `ip_address` varchar(45) NOT NULL,
        `timestamp` int(10) unsigned DEFAULT 0 NOT NULL,
        `data` blob NOT NULL,
        PRIMARY KEY (id),
        KEY `ci_sessions_timestamp` (`timestamp`)
);

Comment by Nirbhay Choubey (Inactive) [ 2015-10-01 ]

Hi tnelson!
Do you, by any chance, have server error logs for these crashes? Will it be possible for you to run server with wsrep_debug=ON?

Comment by Trevor Nelson [ 2015-10-01 ]

I am a little mixed up: what is attached in this Jira already comes from the error logs.

I have turned on wsrep_debug and will update the next time this happens.

Comment by Nirbhay Choubey (Inactive) [ 2015-10-02 ]

tnelson, I am interested in looking at the logs right before the crash.

Comment by Nirbhay Choubey (Inactive) [ 2015-10-02 ]

>I have turned on wsrep_debug and will update the next time this happens

Thanks

Comment by Jan Lindström (Inactive) [ 2015-10-14 ]

If this repeats, we would need full stack for all threads from the core file.

Comment by Anton [ 2015-12-24 ]

This problem repeats every day (I tested on provider versions 3.12.2/3.13).

>If this repeats, we would need full stack for all threads from the core file.
Can you provide more information on how to do that?
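
For reference, one common way to get a full stack for all threads out of a core file is to run gdb in batch mode with `thread apply all bt full`. The sketch below is an illustration under assumptions, not a procedure confirmed in this thread: `dump_all_threads` is a hypothetical helper name, and the core file and output paths are placeholders. The output is far more useful when debug symbols for the exact server build are installed, since the backtrace in the description shows "No symbol table info available" for every frame.

```shell
#!/bin/sh
# dump_all_threads BINARY CORE OUTFILE
# Hypothetical helper: writes a full backtrace of every thread in CORE to OUTFILE.
dump_all_threads() {
    bin="$1"; core="$2"; out="$3"
    # Refuse to run if the core file is missing or unreadable.
    if [ ! -r "$core" ]; then
        echo "core file not readable: $core" >&2
        return 1
    fi
    # --batch exits after the commands; pagination is disabled so output is not paused.
    gdb --batch \
        -ex 'set pagination off' \
        -ex 'thread apply all bt full' \
        "$bin" "$core" > "$out" 2>&1
}

# Example call; the core file and output paths are placeholders.
# dump_all_threads /usr/sbin/mysqld /var/lib/mysql/core all_threads_bt.txt
```

The resulting text file can then be attached to the issue in place of the single-thread `bt full` output.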

Comment by Nirbhay Choubey (Inactive) [ 2016-02-27 ]

Hett, can you try reproducing this on a debug build with --wsrep_debug=ON?

Comment by Beat Jörg [ 2016-08-11 ]

I see the same issue.
Server version: 10.1.16-MariaDB-1~trusty in combination with codeigniter sessions

Comment by Jan Lindström (Inactive) [ 2020-03-12 ]

Support for Galera 10.0 has ended.

Generated at Thu Feb 08 07:29:24 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.