[MDEV-9728] Hard crash in metadata_lock_info Created: 2016-03-15  Updated: 2017-11-07  Resolved: 2016-06-21

Status: Closed
Project: MariaDB Server
Component/s: Locking
Affects Version/s: 10.1.12, 10.0, 10.1
Fix Version/s: 10.0.26, 10.1.15

Type: Bug Priority: Major
Reporter: Sander Pilon Assignee: Sergey Vojtovich
Resolution: Fixed Votes: 1
Labels: None

Issue Links:
Problem/Incident
causes MDEV-14307 LOCK_DURATION always NULL in METADATA... Closed
Relates
relates to MDEV-9486 Crash in metadata lock plugin Closed
relates to MDEV-12523 MariaDB server crash with METADATA_LO... Closed
Sprint: 10.0.26

 Description   

Our server crashed today with the following stack trace:

Server version: 10.1.12-MariaDB-1~trusty
key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=227
max_threads=502
thread_count=59
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1118998 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7fd005084008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fcfe1e6bdf0 thread_stack 0x80000
*** buffer overflow detected ***: /usr/sbin/mysqld terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7338f)[0x7fde8d17c38f]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7fde8d213c9c]
/lib/x86_64-linux-gnu/libc.so.6(+0x109b60)[0x7fde8d212b60]
/lib/x86_64-linux-gnu/libc.so.6(+0x10abe7)[0x7fde8d213be7]
/usr/sbin/mysqld(my_addr_resolve+0x48)[0x7fde8fa78af8]
/usr/sbin/mysqld(my_print_stacktrace+0x1c2)[0x7fde8fa651f2]
/usr/sbin/mysqld(handle_fatal_signal+0x38d)[0x7fde8f5921bd]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7fde8dae8340]
/usr/sbin/mysqld(_ZN11MDL_context11find_ticketEP11MDL_requestP17enum_mdl_duration+0x68)[0x7fde8f4e9948]
/usr/lib/mysql/plugin/metadata_lock_info.so(_Z31i_s_metadata_lock_info_fill_rowP10MDL_ticketPv+0x74)[0x7fde88dfdfa4]
/usr/sbin/mysqld(+0x5057cd)[0x7fde8f4e77cd]
/usr/sbin/mysqld(lf_hash_iterate+0x193)[0x7fde8fa6ec03]
/usr/sbin/mysqld(_Z11mdl_iteratePFiP10MDL_ticketPvES1_+0x8f)[0x7fde8f4e7a6f]
/usr/lib/mysql/plugin/metadata_lock_info.so(_Z33i_s_metadata_lock_info_fill_tableP3THDP10TABLE_LISTP4Item+0x27)[0x7fde88dfe197]
/usr/sbin/mysqld(_Z24get_schema_tables_resultP4JOIN23enum_schema_table_state+0x28e)[0x7fde8f48222e]
/usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0x6bd)[0x7fde8f46885d]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x54)[0x7fde8f46ace4]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x112)[0x7fde8f467392]
/usr/sbin/mysqld(_Z18mysql_derived_fillP3THDP3LEXP10TABLE_LIST+0x11b)[0x7fde8f3ef86b]
/usr/sbin/mysqld(_Z27mysql_handle_single_derivedP3LEXP10TABLE_LISTj+0xe4)[0x7fde8f3f0b74]
/usr/sbin/mysqld(_ZN13st_join_table12preread_initEv+0xcf)[0x7fde8f44b0df]
/usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x2f8)[0x7fde8f44b568]
/usr/sbin/mysqld(+0x476fed)[0x7fde8f458fed]
/usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0xb4c)[0x7fde8f468cec]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x54)[0x7fde8f46ace4]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x112)[0x7fde8f467392]
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x245)[0x7fde8f467e75]
/usr/sbin/mysqld(+0x428351)[0x7fde8f40a351]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x61e4)[0x7fde8f416854]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x26d)[0x7fde8f419fed]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x2460)[0x7fde8f41d330]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x169)[0x7fde8f41dae9]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x18a)[0x7fde8f4e10fa]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x7fde8f4e12d0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7fde8dae0182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fde8d20347d]

My.cnf files can be found in my other issues about other crashes.

We are monitoring the server status with this query:

SELECT
  `information_schema`.`metadata_lock_info`.`THREAD_ID`     AS `THREAD_ID`,
  CONCAT(`information_schema`.`processlist`.`USER`,'@',`information_schema`.`processlist`.`HOST`) AS `USER`,
  REPLACE(`information_schema`.`processlist`.`INFO`,'\n',' ') AS `QUERY`,
  `information_schema`.`metadata_lock_info`.`LOCK_MODE`     AS `LOCK_MODE`,
  `information_schema`.`metadata_lock_info`.`LOCK_TYPE`     AS `LOCK_TYPE`,
  GROUP_CONCAT(CONCAT(`information_schema`.`metadata_lock_info`.`TABLE_SCHEMA`,'.',`information_schema`.`metadata_lock_info`.`TABLE_NAME`) SEPARATOR ',') AS `LOCK_TABLES`,
  `information_schema`.`metadata_lock_info`.`LOCK_DURATION` AS `LOCK_DURATION`,
  ROUND((`information_schema`.`processlist`.`TIME_MS` / 1000),2) AS `DURATION`
FROM (`information_schema`.`metadata_lock_info`
   JOIN `information_schema`.`processlist`
     ON ((`information_schema`.`metadata_lock_info`.`THREAD_ID` = `information_schema`.`processlist`.`ID`)))
GROUP BY `information_schema`.`metadata_lock_info`.`THREAD_ID`,`information_schema`.`metadata_lock_info`.`LOCK_MODE`,`information_schema`.`metadata_lock_info`.`LOCK_TYPE`$$

But the server didn't crash when I was executing this query, so it might be something else. Come to think of it, there might have been a very small network hiccup. (Which is why it crashed on an addr_resolve)



 Comments   
Comment by Elena Stepanova [ 2016-03-15 ]

From what I see, it most certainly crashed upon SELECT involving metadata_lock_info, specifically somewhere in MDL. It got to my_addr_resolve after the crash, while printing the stack trace.
If you mean that you tried to re-execute the query afterwards, and it didn't crash, unfortunately it doesn't mean much, the crash could have been (and most likely was) caused by a race condition.

Comment by Elena Stepanova [ 2016-03-17 ]

Somewhat better stack trace:

Stack trace from 10.1 commit 9b53d84d14a9b031d193f6beae382a232aa738e3 RelWithDebInfo build

Thread 1 (Thread 0x7f652a3ec700 (LWP 34003)):
#0  0x00007f6661efef8c in pthread_kill () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f6663b98d55 in handle_fatal_signal (sig=11) at /home/elenst/git/10.1-rel/sql/signal_handler.cc:273
#2  <signal handler called>
#3  MDL_context::find_ticket (this=this@entry=0x7f6536b6e100, mdl_request=mdl_request@entry=0x7f652a3e9090, result_duration=result_duration@entry=0x7f652a3e908c) at /home/elenst/git/10.1-rel/sql/mdl.cc:1752
#4  0x00007f666111af7a in i_s_metadata_lock_info_fill_row (mdl_ticket=<optimised out>, arg=<optimised out>) at /home/elenst/git/10.1-rel/plugin/metadata_lock_info/metadata_lock_info.cc:91
#5  0x00007f6663af8e97 in mdl_iterate_lock (lock=0x7f6533511228, arg=0x7f652a3e93b0) at /home/elenst/git/10.1-rel/sql/mdl.cc:643
#6  0x00007f6663fda2d9 in l_find (head=0x7f66608be0d8, head@entry=0x7f6660871c10, cs=cs@entry=0x0, hashnr=hashnr@entry=0, key=key@entry=0x7f652a3e93b0 "\020\257\021af\177", keylen=keylen@entry=0, cursor=cursor@entry=0x7f652a3e9360, pins=<optimised out>, pins@entry=0x7f665dc17200, callback=<optimised out>, callback@entry=0x7f6663af8e00 <mdl_iterate_lock(MDL_lock*, mdl_iterate_arg*)>) at /home/elenst/git/10.1-rel/mysys/lf_hash.c:126
#7  0x00007f6663fdacbf in lf_hash_iterate (hash=0x7f6664819160 <mdl_locks>, pins=0x7f665dc17200, action=0x7f6663af8e00 <mdl_iterate_lock(MDL_lock*, mdl_iterate_arg*)>, argument=0x7f652a3e93b0) at /home/elenst/git/10.1-rel/mysys/lf_hash.c:518
#8  0x00007f6663af9126 in mdl_iterate (callback=callback@entry=0x7f666111af10 <i_s_metadata_lock_info_fill_row(MDL_ticket*, void*)>, arg=arg@entry=0x7f652a3e93e0) at /home/elenst/git/10.1-rel/sql/mdl.cc:662
#9  0x00007f666111b163 in i_s_metadata_lock_info_fill_table (thd=<optimised out>, tables=<optimised out>, cond=<optimised out>) at /home/elenst/git/10.1-rel/plugin/metadata_lock_info/metadata_lock_info.cc:128
#10 0x00007f6663a99e2f in get_schema_tables_result (join=join@entry=0x7f665f139ea0, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /home/elenst/git/10.1-rel/sql/sql_show.cc:8111
#11 0x00007f6663a816ed in JOIN::exec_inner (this=0x7f665f139ea0) at /home/elenst/git/10.1-rel/sql/sql_select.cc:2670
#12 0x00007f6663a83984 in JOIN::exec (this=0x7f665f139ea0) at /home/elenst/git/10.1-rel/sql/sql_select.cc:2518
#13 0x00007f6663a80224 in mysql_select (thd=thd@entry=0x7f6531b6d008, rref_pointer_array=rref_pointer_array@entry=0x7f6531b71258, tables=0x7f665f038218, wild_num=<optimised out>, fields=..., conds=<optimised out>, og_num=3, order=0x0, group=0x7f665f039c40, having=0x0, proc_param=0x0, select_options=2684619520, result=result@entry=0x7f665f139e80, unit=unit@entry=0x7f6531b708e0, select_lex=select_lex@entry=0x7f6531b70fe0) at /home/elenst/git/10.1-rel/sql/sql_select.cc:3451
#14 0x00007f6663a80d44 in handle_select (thd=thd@entry=0x7f6531b6d008, lex=lex@entry=0x7f6531b70818, result=result@entry=0x7f665f139e80, setup_tables_done_option=setup_tables_done_option@entry=0) at /home/elenst/git/10.1-rel/sql/sql_select.cc:384
#15 0x00007f6663a2c699 in execute_sqlcom_select (thd=thd@entry=0x7f6531b6d008, all_tables=0x7f665f038218) at /home/elenst/git/10.1-rel/sql/sql_parse.cc:5936
#16 0x00007f6663a38961 in mysql_execute_command (thd=thd@entry=0x7f6531b6d008) at /home/elenst/git/10.1-rel/sql/sql_parse.cc:2962
#17 0x00007f6663a3bcb6 in mysql_parse (thd=0x7f6531b6d008, rawbuf=<optimised out>, length=<optimised out>, parser_state=0x7f652a3eb1a0) at /home/elenst/git/10.1-rel/sql/sql_parse.cc:7336
#18 0x00007f6663a3edfd in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f6531b6d008, packet=packet@entry=0x7f6531ba6009 "SELECT `information_schema`.`metadata_lock_info`.`THREAD_ID` AS `THREAD_ID`, CONCAT(`information_schema`.`processlist`.`USER`,'@',`information_schema`.`processlist`.`HOST`) AS `USER`, REPLACE(`informa"..., packet_length=packet_length@entry=1083) at /home/elenst/git/10.1-rel/sql/sql_parse.cc:1488
#19 0x00007f6663a3f5b1 in do_command (thd=0x7f6531b6d008) at /home/elenst/git/10.1-rel/sql/sql_parse.cc:1109
#20 0x00007f6663af2c54 in do_handle_one_connection (thd_arg=thd_arg@entry=0x7f6531b6d008) at /home/elenst/git/10.1-rel/sql/sql_connect.cc:1349
#21 0x00007f6663af2e27 in handle_one_connection (arg=0x7f6531b6d008) at /home/elenst/git/10.1-rel/sql/sql_connect.cc:1261
#22 0x00007f6661ef9e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#23 0x00007f6661626cbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#24 0x0000000000000000 in ?? ()

I got it while running a concurrent (30 threads) OLTP-like transactional workflow and two threads executing the provided query from I_S.metadata_lock_info. It's not easily reproducible. So far I had it once out of ~20 1-hour test runs.

And here is a variation that I also got twice while running the same tests:

Stack trace from 10.1 commit 9b53d84d14a9b031d193f6beae382a232aa738e3 RelWithDebInfo build

#2  <signal handler called>
#3  0x00007f7794579038 in i_s_metadata_lock_info_fill_row (mdl_ticket=<optimised out>, arg=<optimised out>) at /home/elenst/git/10.1-rel/plugin/metadata_lock_info/metadata_lock_info.cc:102
#4  0x00007f7796f56e81 in mdl_iterate_lock (lock=0x7f7793c20a80, arg=arg@entry=0x7f765d7e93b0) at /home/elenst/git/10.1-rel/sql/mdl.cc:643
#5  0x00007f7796f570d7 in mdl_iterate (callback=callback@entry=0x7f7794578f10 <i_s_metadata_lock_info_fill_row(MDL_ticket*, void*)>, arg=arg@entry=0x7f765d7e93e0) at /home/elenst/git/10.1-rel/sql/mdl.cc:659
#6  0x00007f7794579163 in i_s_metadata_lock_info_fill_table (thd=<optimised out>, tables=<optimised out>, cond=<optimised out>) at /home/elenst/git/10.1-rel/plugin/metadata_lock_info/metadata_lock_info.cc:128
#7  0x00007f7796ef7e2f in get_schema_tables_result (join=join@entry=0x7f7669f75ea0, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /home/elenst/git/10.1-rel/sql/sql_show.cc:8111
#8  0x00007f7796edf6ed in JOIN::exec_inner (this=0x7f7669f75ea0) at /home/elenst/git/10.1-rel/sql/sql_select.cc:2670
#9  0x00007f7796ee1984 in JOIN::exec (this=0x7f7669f75ea0) at /home/elenst/git/10.1-rel/sql/sql_select.cc:2518
#10 0x00007f7796ede224 in mysql_select (thd=thd@entry=0x7f7664ebe008, rref_pointer_array=rref_pointer_array@entry=0x7f7664ec2258, tables=0x7f7664eea218, wild_num=<optimised out>, fields=..., conds=<optimised out>, og_num=3, order=0x0, group=0x7f7664eebc40, having=0x0, proc_param=0x0, select_options=2684619520, result=result@entry=0x7f7669f75e80, unit=unit@entry=0x7f7664ec18e0, select_lex=select_lex@entry=0x7f7664ec1fe0) at /home/elenst/git/10.1-rel/sql/sql_select.cc:3451
#11 0x00007f7796eded44 in handle_select (thd=thd@entry=0x7f7664ebe008, lex=lex@entry=0x7f7664ec1818, result=result@entry=0x7f7669f75e80, setup_tables_done_option=setup_tables_done_option@entry=0) at /home/elenst/git/10.1-rel/sql/sql_select.cc:384
#12 0x00007f7796e8a699 in execute_sqlcom_select (thd=thd@entry=0x7f7664ebe008, all_tables=0x7f7664eea218) at /home/elenst/git/10.1-rel/sql/sql_parse.cc:5936
#13 0x00007f7796e96961 in mysql_execute_command (thd=thd@entry=0x7f7664ebe008) at /home/elenst/git/10.1-rel/sql/sql_parse.cc:2962
#14 0x00007f7796e99cb6 in mysql_parse (thd=0x7f7664ebe008, rawbuf=<optimised out>, length=<optimised out>, parser_state=0x7f765d7eb1a0) at /home/elenst/git/10.1-rel/sql/sql_parse.cc:7336
#15 0x00007f7796e9cdfd in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f7664ebe008, packet=packet@entry=0x7f7664ec4009 "SELECT `information_schema`.`metadata_lock_info`.`THREAD_ID` AS `THREAD_ID`, CONCAT(`information_schema`.`processlist`.`USER`,'@',`information_schema`.`processlist`.`HOST`) AS `USER`, REPLACE(`informa"..., packet_length=packet_length@entry=1083) at /home/elenst/git/10.1-rel/sql/sql_parse.cc:1488
#16 0x00007f7796e9d5b1 in do_command (thd=0x7f7664ebe008) at /home/elenst/git/10.1-rel/sql/sql_parse.cc:1109
#17 0x00007f7796f50c54 in do_handle_one_connection (thd_arg=thd_arg@entry=0x7f7664ebe008) at /home/elenst/git/10.1-rel/sql/sql_connect.cc:1349
#18 0x00007f7796f50e27 in handle_one_connection (arg=0x7f7664ebe008) at /home/elenst/git/10.1-rel/sql/sql_connect.cc:1261
#19 0x00007f7795357e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#20 0x00007f7794a84cbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#21 0x0000000000000000 in ?? ()

Comment by Elena Stepanova [ 2016-03-18 ]

No luck on simplifying the test, so it comes as is.

RQG command line

# Use RQG from lp:~elenst/randgen/mariadb-patches (revno >= 1065).
# Make sure metadata_lock_info plugin is available in your plugindir
# I could only reproduce the problem on a non-debug build
 
runall-new.pl \
--grammar=mdev9728.yy \
--gendata=conf/mariadb/oltp.zz \
--threads=32 \
--duration=3600 \
--queries=100M \
--rows=10000,10000,10000 \
--genconfig=mdev9728.cnf \
--basedir=<your basedir> \
--vardir=<your vardir>

mdev9738.yy

thread1:
SELECT
  `information_schema`.`metadata_lock_info`.`THREAD_ID`     AS `THREAD_ID`,
  CONCAT(`information_schema`.`processlist`.`USER`,'@',`information_schema`.`processlist`.`HOST`) AS `USER`,
  REPLACE(`information_schema`.`processlist`.`INFO`,'\n',' ') AS `QUERY`,
  `information_schema`.`metadata_lock_info`.`LOCK_MODE`     AS `LOCK_MODE`,
  `information_schema`.`metadata_lock_info`.`LOCK_TYPE`     AS `LOCK_TYPE`,
  GROUP_CONCAT(CONCAT(`information_schema`.`metadata_lock_info`.`TABLE_SCHEMA`,'.',`information_schema`.`metadata_lock_info`.`TABLE_NAME`) SEPARATOR ',') AS `LOCK_TABLES`,
  `information_schema`.`metadata_lock_info`.`LOCK_DURATION` AS `LOCK_DURATION`,
  ROUND((`information_schema`.`processlist`.`TIME_MS` / 1000),2) AS `DURATION`
FROM (`information_schema`.`metadata_lock_info`
   JOIN `information_schema`.`processlist`
     ON ((`information_schema`.`metadata_lock_info`.`THREAD_ID` = `information_schema`.`processlist`.`ID`)))
GROUP BY `information_schema`.`metadata_lock_info`.`THREAD_ID`,`information_schema`.`metadata_lock_info`.`LOCK_MODE`,`information_schema`.`metadata_lock_info`.`LOCK_TYPE`;
 
thread3:
    thread1;
 
query:
    dml | dml | dml | dml | dml | dml | dml |
    START TRANSACTION | COMMIT ;
 
dml:
    select | 
    update |
    delete |
    insert
;
 
insert:
    INSERT IGNORE INTO _table ( _field_pk ) VALUES ( NULL ) |
    INSERT IGNORE INTO _table ( _field_int ) VALUES ( _smallint_unsigned ) |
    INSERT IGNORE INTO _table ( _field_char ) VALUES ( _string ) |
    INSERT IGNORE INTO _table ( _field_pk, _field_int)  VALUES ( NULL, _int ) |
    INSERT IGNORE INTO _table ( _field_pk, _field_char ) VALUES ( NULL, _string ) 
;
 
update:
    index_update |
    non_index_update
;
 
delete:
    DELETE FROM _table WHERE _field_pk = _smallint_unsigned ;
 
index_update:
    UPDATE IGNORE _table SET _field_int_indexed = _field_int_indexed + 1 WHERE _field_pk = _smallint_unsigned ;
 
# It relies on char fields being unindexed. 
# If char fields happen to be indexed in the table spec, then this update can be indexed as well. No big harm though. 
non_index_update:
    UPDATE _table SET _field_char = _string WHERE _field_pk = _smallint_unsigned ;
 
select:
    point_select |
    simple_range |
    sum_range |
    order_range |
    distinct_range 
;
 
point_select:
    SELECT _field FROM _table WHERE _field_pk = _smallint_unsigned ;
 
simple_range:
    SELECT _field FROM _table WHERE _field_pk BETWEEN _smallint_unsigned AND _smallint_unsigned ;
 
sum_range:
    SELECT SUM(_field) FROM _table WHERE _field_pk BETWEEN _smallint_unsigned AND _smallint_unsigned ;
 
order_range:
    SELECT _field FROM _table WHERE _field_pk BETWEEN _smallint_unsigned AND _smallint_unsigned ORDER BY _field ;
 
distinct_range:
    SELECT DISTINCT _field FROM _table WHERE _field_pk BETWEEN _smallint_unsigned AND _smallint_unsigned ORDER BY _field ;

mdev9728.cnf

[mysqld]                                                                                                                                                                       
plugin-load-add=metadata_lock_info                                                                                                                                             
replicate-wild-ignore-table = %.Voorafgaand%_Klanten                                                                                                                           
replicate-wild-ignore-table = tmp.%                                                                                                                                            
transaction-isolation =  READ-COMMITTED                                                                                                                                        
skip-external-locking                                                                                                                                                          
innodb_stats_on_metadata = 0                                                                                                                                                   
lock_wait_timeout = 20
innodb_lock_wait_timeout = 20
master_info_repository = table
relay_log_info_repository = table
master-verify-checksum = on
slave-sql-verify-checksum = on
key_buffer              = 16M
#max_allowed_packet     = 16M
thread_stack            = 512K
thread_cache_size       = 8
myisam-recover         = BACKUP
max_allowed_packet             = 64M
max_connect_errors             = 1000000
sysdate_is_now                 = 1
max_connections=500
event_scheduler=ON
table_cache     = 3000
tmp_table_size                 = 64M
max_heap_table_size            = 64M
query_cache_type               = 1
query_cache_size               = 8M
max_connections                = 500
thread_cache_size              = 50
open_files_limit               = 65535
table_definition_cache         = 4096
table_open_cache               = 4096
max_sp_recursion_depth         = 2
back_log=1500
character_set_server = utf8
collation_server = utf8_general_ci
long_query_time         = 1
log-queries-not-using-indexes
log-output              = TABLE
log_warnings            = 2
sync_binlog             = 0
binlog_checksum         = crc32
binlog_commit_wait_usec = 50000
binlog_commit_wait_count= 20
server-id               = 765
log_bin                 = mysql-bin.log
binlog_format           = MIXED
expire_logs_days        = 7
max_binlog_size         = 1G
log-slave-updates
slave_compressed_protocol = 1
relay-log               = mysql-relay.log
innodb_log_files_in_group      = 2
innodb_flush_log_at_trx_commit = 0
innodb_buffer_pool_size        = 4G
innodb_file_per_table
innodb_log_file_size=1G
innodb_flush_method=O_DIRECT
innodb_file_format=barracuda
innodb_io_capacity = 20000
innodb_buffer_pool_instances=10
innodb_log_buffer_size=32M
group_concat_max_len = 8096

Stack trace from 10.0 commit 96a7e74ed350063d47b2b625cbfbad1e65706da2

#2  <signal handler called>
#3  MDL_context::find_ticket (this=this@entry=0x7ff2e61a7c08, mdl_request=mdl_request@entry=0x7ff41124a690, result_duration=result_duration@entry=0x7ff41124a68c) at /home/mariadb/git/10.0-mdev9728/sql/mdl.cc:2015
#4  0x00007ff2de9f3f7a in i_s_metadata_lock_info_fill_row (mdl_ticket=<optimised out>, arg=<optimised out>) at /home/mariadb/git/10.0-mdev9728/plugin/metadata_lock_info/metadata_lock_info.cc:91
#5  0x00000000006708c9 in mdl_iterate_lock (arg=0x7ff41124a940, callback=0x7ff2de9f3f10 <i_s_metadata_lock_info_fill_row(MDL_ticket*, void*)>, lock=0x7ff410820d00) at /home/mariadb/git/10.0-mdev9728/sql/mdl.cc:713
#6  mdl_iterate (callback=callback@entry=0x7ff2de9f3f10 <i_s_metadata_lock_info_fill_row(MDL_ticket*, void*)>, arg=arg@entry=0x7ff41124a940) at /home/mariadb/git/10.0-mdev9728/sql/mdl.cc:726
#7  0x00007ff2de9f4163 in i_s_metadata_lock_info_fill_table (thd=<optimised out>, tables=<optimised out>, cond=<optimised out>) at /home/mariadb/git/10.0-mdev9728/plugin/metadata_lock_info/metadata_lock_info.cc:128
#8  0x00000000006166b7 in get_schema_tables_result (join=join@entry=0x7ff2e88cdf68, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /home/mariadb/git/10.0-mdev9728/sql/sql_show.cc:8168
#9  0x00000000005fed15 in JOIN::exec_inner (this=this@entry=0x7ff2e88cdf68) at /home/mariadb/git/10.0-mdev9728/sql/sql_select.cc:2538
#10 0x0000000000600c79 in JOIN::exec (this=this@entry=0x7ff2e88cdf68) at /home/mariadb/git/10.0-mdev9728/sql/sql_select.cc:2373
#11 0x00000000005fd934 in mysql_select (thd=thd@entry=0x7ff2e5727008, rref_pointer_array=rref_pointer_array@entry=0x7ff2e572b1a8, tables=0x7ff2e8890358, wild_num=<optimised out>, fields=..., conds=<optimised out>, og_num=3, order=0x0, group=0x7ff2e8891df0, having=0x0, proc_param=0x0, select_options=2684619520, result=result@entry=0x7ff2e88cdf48, unit=unit@entry=0x7ff2e572a840, select_lex=select_lex@entry=0x7ff2e572af30) at /home/mariadb/git/10.0-mdev9728/sql/sql_select.cc:3308
#12 0x00000000005fe3b4 in handle_select (thd=thd@entry=0x7ff2e5727008, lex=lex@entry=0x7ff2e572a778, result=result@entry=0x7ff2e88cdf48, setup_tables_done_option=setup_tables_done_option@entry=0) at /home/mariadb/git/10.0-mdev9728/sql/sql_select.cc:373
#13 0x00000000005aec57 in execute_sqlcom_select (thd=thd@entry=0x7ff2e5727008, all_tables=0x7ff2e8890358) at /home/mariadb/git/10.0-mdev9728/sql/sql_parse.cc:5304
#14 0x00000000005b9ee6 in mysql_execute_command (thd=thd@entry=0x7ff2e5727008) at /home/mariadb/git/10.0-mdev9728/sql/sql_parse.cc:2562
#15 0x00000000005bcb78 in mysql_parse (thd=0x7ff2e5727008, rawbuf=<optimised out>, length=1083, parser_state=0x7ff41124c680) at /home/mariadb/git/10.0-mdev9728/sql/sql_parse.cc:6565
#16 0x00000000005be94f in dispatch_command (command=COM_QUERY, thd=0x7ff2e5727008, packet=<optimised out>, packet_length=<optimised out>) at /home/mariadb/git/10.0-mdev9728/sql/sql_parse.cc:1308
#17 0x00000000005bf3b0 in do_command (thd=<optimised out>) at /home/mariadb/git/10.0-mdev9728/sql/sql_parse.cc:998
#18 0x000000000066a6c4 in do_handle_one_connection (thd_arg=thd_arg@entry=0x7ff2e5727008) at /home/mariadb/git/10.0-mdev9728/sql/sql_connect.cc:1377
#19 0x000000000066a728 in handle_one_connection (arg=0x7ff2e5727008) at /home/mariadb/git/10.0-mdev9728/sql/sql_connect.cc:1292
#20 0x00007ff4127eae9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#21 0x00007ff4116dacbd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Comment by Sergey Vojtovich [ 2016-06-15 ]

elenst, please test patch for this fix (patch is at commits@mariadb.org). If testing is successful please reassign to Sergei for review.

Comment by Elena Stepanova [ 2016-06-15 ]

Given that the issue is on the sprint, and that with some bad luck review can become a bottleneck, it makes sense to do both in parallel. I'll run the tests, but meanwhile please proceed with the review.

Comment by Sergey Vojtovich [ 2016-06-16 ]

serg, in APC based solution requestor thread has to hold rdlock(MDL_lock::m_rwlock) to iterate through granted tickets list (see mdl_iterate_lock()). It may issue request against APC thread waiting gor wrlock(MDL_lock::m_rwlock) and deadlock.

I don't really see easy way to workaround this. Any thoughts?

Comment by Sergey Vojtovich [ 2016-06-16 ]

OTOH we may iterate threads instead of mdl and gather tickets from thread-local lists, will try this.

Comment by Elena Stepanova [ 2016-06-17 ]

Nevermind my previous comment (deleted now), false alarm.

Comment by Sergey Vojtovich [ 2016-06-17 ]

serg, please review APC based solution.

Comment by Elena Stepanova [ 2016-06-19 ]

I've run tests on the first patch, haven't had any crashes.
I had some deadlocks, both before and after the patch, they might well be unrelated, I'll look at them separately.

svoj, do I need to run tests on the second patch as well?

Comment by Sergey Vojtovich [ 2016-06-19 ]

elenst, not necessary. I only needed proof that I'm fixing the right thing and apparently it is.

Comment by Nirbhay Choubey (Inactive) [ 2016-06-22 ]

svoj I appears the patch would also fix MDEV-9486.

Generated at Thu Feb 08 07:36:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.