[MDEV-14414] mysqld got signal 11 Created: 2017-11-15  Updated: 2018-07-25  Resolved: 2018-05-30

Status: Closed
Project: MariaDB Server
Component/s: Query Cache, Server, Storage Engine - Federated
Affects Version/s: 10.1.28
Fix Version/s: 10.1.31

Type: Bug Priority: Major
Reporter: Steve Fatula Assignee: Unassigned
Resolution: Fixed Votes: 1
Labels: None
Environment:

Centos 7.4.1708 64 bit 64GB ram, hardware raid array


Attachments: Text File crash.log    
Issue Links:
Duplicate
duplicates MDEV-16822 mysqld crash restart Closed
is duplicated by MDEV-15617 [ERROR] mysqld got signal 11 ; Closed
Relates
relates to MDEV-14272 Mariadb crashes with signal 11 when u... Closed
relates to MDEV-12951 Server crash [mysqld got exception 0x... Closed

 Description   
In the middle of the day, when usage is actually low (we run massive updates overnight, when the machine is considerably busier with many threads running), mariadb died. CPU use by mariadb was around 12% on one core at the time. The log gives no indication why, so I am including it here. From reading the info, it would appear there is no dump, right? I see no reference to any fix in 10.1.29.

mysqld would have been started with the following arguments:
--datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mariadb/mariadb.pid --symbolic-links=0 --innodb_file_per_table=1 --datadir=/home/mysql --socket=/var/lib/mysql/mysql.sock --tmpdir=/home/mysqltemp --log-bin=/var/lib/mysqllogs/binlog --expire_logs_days=3 --log_error=/var/log/mariadb/mariadb.log --slow_query_log=1 --slow_query_log_file=/var/log/mariadb/slow-query.log --long_query_time=30 --log_slow_admin_statements=1 --user=mysql --binlog-format=MIXED --collation_server=latin1_general_cs --group_concat_max_len=2M --server_id=9 --slave_compressed_protocol=1 --symbolic-links=0 --max_connections=250 --thread_handling=pool-of-threads --extra_port=3307 --connect_timeout=50 --max_heap_table_size=512M --tmp_table_size=512M --max_allowed_packet=64M --query_cache_size=128M --query_prealloc_size=499712 --net_read_timeout=300 --net_write_timeout=600 --table_open_cache=10000 --innodb_flush_log_at_trx_commit=0 --innodb_log_file_size=1G --innodb_buffer_pool_size=24G --innodb_lock_wait_timeout=500 --innodb_flush_method=O_DIRECT --innodb_file_format=Barracuda --innodb_status_output=0 --innodb_status_output_locks=1 --innodb_stats_sample_pages=12 --innodb_read_io_threads=8 --key_buffer_size=256M --join_buffer_size=512K --bulk_insert_buffer_size=512M --read_rnd_buffer_size=1M --aria_pagecache_buffer_size=512M --aria_sort_buffer_size=256M



 Comments   
Comment by Daniel Black [ 2017-11-16 ]

For easy reference:

171115 12:09:44 [ERROR] mysqld got signal 11 ;
...
Thread pointer: 0x7f85d5ae7008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f8b80233ff8 thread_stack 0x48400
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55ffd482c92e]
/usr/sbin/mysqld(handle_fatal_signal+0x305)[0x55ffd434fce5]
/lib64/libpthread.so.0(+0xf5e0)[0x7f8c8946f5e0]
/usr/sbin/mysqld(_ZN13QUERY_PROFILE10new_statusEPKcS1_S1_j+0x70)[0x55ffd429e000]
/usr/sbin/mysqld(set_thd_stage_info+0xd4)[0x55ffd4191da4]
/usr/sbin/mysqld(_ZN11Query_cache8try_lockEP3THDNS_19Cache_try_lock_modeE+0x91)[0x55ffd4189031]
/usr/sbin/mysqld(_ZN11Query_cache6insertEP3THDP15Query_cache_tlsPKcmj+0x3f)[0x55ffd418d1ff]
/usr/sbin/mysqld(net_real_write+0x45)[0x55ffd4136e05]
/usr/sbin/mysqld(net_flush+0x23)[0x55ffd4137143]
/usr/sbin/mysqld(net_write_command+0x3fb)[0x55ffd41378ab]
/usr/sbin/mysqld(cli_advanced_command+0xe2)[0x55ffd432f782]
/usr/sbin/mysqld(mysql_send_query+0x31)[0x55ffd432ce61]
/usr/sbin/mysqld(mysql_real_query+0x11)[0x55ffd432ce81]
/usr/lib64/mysql/plugin/ha_federated.so(_ZN12ha_federated4infoEj+0x13c)[0x7f8c7b5f85fc]
/usr/sbin/mysqld(+0x484215)[0x55ffd4214215]
/usr/sbin/mysqld(_ZN4JOIN14optimize_innerEv+0x72f)[0x55ffd421cc2f]
/usr/sbin/mysqld(_ZN4JOIN8optimizeEv+0x2f)[0x55ffd421f51f]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x8f)[0x55ffd421f65f]
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x255)[0x55ffd42201e5]
/usr/sbin/mysqld(+0x376297)[0x55ffd4106297]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x6cb0)[0x55ffd41ca9b0]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x342)[0x55ffd41cd702]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x280a)[0x55ffd41d106a]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x136)[0x55ffd41d19c6]
/usr/sbin/mysqld(_Z26threadpool_process_requestP3THD+0xfb)[0x55ffd42dad2b]
/usr/sbin/mysqld(+0x598735)[0x55ffd4328735]
/lib64/libpthread.so.0(+0x7e25)[0x7f8c89467e25]
/lib64/libc.so.6(clone+0x6d)[0x7f8c8780b34d]

Comment by Steve Fatula [ 2017-11-16 ]

It appears my earlier comment on this issue did not save, so I will try again. This problem is almost certainly a duplicate of MDEV-12951: after the server crashed another time, I noticed that the query reported in both cases was a federated query, just as the other issue states. I would suggest closing this case. I am going to try switching to FederatedX as the other bug report suggests, and perhaps the crash will also go away.

Comment by Elena Stepanova [ 2017-11-16 ]

sfatula, I will keep it open for a while longer, please let us know how it goes with FederatedX. If it works for you, then we'll close it.

Comment by Steve Fatula [ 2017-11-20 ]

The problem appears "resolved" by switching from federated to federatedx. We have no good reason to keep using federated, so federatedx is fine for us.
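For reference, the switch described above can be sketched roughly as follows. This is a hedged sketch, not part of the original report: plugin file names assume a stock MariaDB 10.1 package layout, and behavior should be verified on a test server first.

```sql
-- Sketch only: swap the legacy Federated engine for FederatedX.
-- Plugin file names (ha_federated / ha_federatedx) assume a
-- standard MariaDB 10.1 installation.
UNINSTALL SONAME 'ha_federated';
INSTALL SONAME 'ha_federatedx';

-- FederatedX registers under the same engine name, FEDERATED,
-- so existing federated tables should continue to resolve;
-- confirm which plugin is active with:
SHOW ENGINES;
```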

Comment by Elena Stepanova [ 2017-11-20 ]

Thank you. Then indeed it's the same problem as MDEV-12951, which was fixed in 10.2, but still exists in 10.1. The test case from MDEV-12951 applies:

--source include/have_innodb.inc
 
INSTALL SONAME 'ha_federated';
 
SET GLOBAL query_cache_size= 16*1024*1024;
SET GLOBAL query_cache_type= 1;
 
CREATE TABLE t1 (i INT) ENGINE=InnoDB;
eval
CREATE TABLE t2 (i INT) ENGINE=FEDERATED 
  CONNECTION="mysql://root@localhost:$MASTER_MYPORT/test/t1";
 
SELECT * FROM t2;
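Since the crash path above runs through Query_cache::insert (the federated connection's network write re-enters the query cache and takes the cache lock), disabling the query cache is a plausible interim mitigation on affected 10.1 servers. This is a hedged suggestion based on the stack trace, not a fix proposed in the original report:

```sql
-- Interim mitigation sketch: turn the query cache off so writes
-- on the federated connection never enter the cache-lock path.
SET GLOBAL query_cache_type = 0;
SET GLOBAL query_cache_size = 0;
```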

10.1 0f43279cc472d8

#3  <signal handler called>
#4  0x000056407f0277f3 in QUERY_PROFILE::new_status (this=0x8f8f8f8f8f8f8f8f, status_arg=0x56407f82192a "Waiting for query cache lock", function_arg=0x56407f836cdb <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", file_arg=0x56407f834420 "/data/src/10.1/sql/sql_cache.cc", line_arg=603) at /data/src/10.1/sql/sql_profile.cc:312
#5  0x000056407ee31df8 in PROFILING::status_change (this=0x7ff3283ca578, status_arg=0x56407f82192a "Waiting for query cache lock", function_arg=0x56407f836cdb <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", file_arg=0x56407f834420 "/data/src/10.1/sql/sql_cache.cc", line_arg=603) at /data/src/10.1/sql/sql_profile.h:312
#6  0x000056407ee323f2 in THD::enter_stage (this=0x7ff3283c7070, stage=0x56407ffd2db0 <stage_waiting_for_query_cache_lock>, calling_func=0x56407f836cdb <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", calling_file=0x56407f834420 "/data/src/10.1/sql/sql_cache.cc", calling_line=603) at /data/src/10.1/sql/sql_class.h:2066
#7  0x000056407eead4d4 in set_thd_stage_info (thd_arg=0x7ff3283c7070, new_stage=0x56407ffd2db0 <stage_waiting_for_query_cache_lock>, old_stage=0x7ff333a42a68, calling_func=0x56407f836cdb <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", calling_file=0x56407f834420 "/data/src/10.1/sql/sql_cache.cc", calling_line=603) at /data/src/10.1/sql/sql_class.cc:557
#8  0x000056407eeab8b4 in Query_cache_wait_state::Query_cache_wait_state (this=0x7ff333a42a60, thd=0x7ff3283c7070, func=0x56407f836cdb <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", file=0x56407f834420 "/data/src/10.1/sql/sql_cache.cc", line=603) at /data/src/10.1/sql/sql_cache.cc:432
#9  0x000056407ee9dfee in Query_cache::try_lock (this=0x5640801ce4c0 <query_cache>, thd=0x7ff3283c7070, mode=Query_cache::WAIT) at /data/src/10.1/sql/sql_cache.cc:603
#10 0x000056407ee9efeb in Query_cache::insert (this=0x5640801ce4c0 <query_cache>, thd=0x7ff3283c7070, query_cache_tls=0x7ff3283c7328, packet=0x7ff31ba6b070 "\001", length=5, pkt_nr=1) at /data/src/10.1/sql/sql_cache.cc:1082
#11 0x000056407ee9ef1b in query_cache_insert (thd_arg=0x7ff3283c7070, packet=0x7ff31ba6b070 "\001", length=5, pkt_nr=1) at /data/src/10.1/sql/sql_cache.cc:1057
#12 0x000056407ee3380a in net_real_write (net=0x7ff31b869c70, packet=0x7ff31ba6b070 "\001", len=5) at /data/src/10.1/sql/net_serv.cc:606
#13 0x000056407ee33139 in net_flush (net=0x7ff31b869c70) at /data/src/10.1/sql/net_serv.cc:363
#14 0x000056407ee3354e in net_write_command (net=0x7ff31b869c70, command=1 '\001', header=0x0, head_len=0, packet=0x0, len=0) at /data/src/10.1/sql/net_serv.cc:501
#15 0x000056407f0f09a7 in cli_advanced_command (mysql=0x7ff31b869c70, command=COM_QUIT, header=0x0, header_length=0, arg=0x0, arg_length=0, skip_check=1 '\001', stmt=0x0) at /data/src/10.1/sql-common/client.c:701
#16 0x000056407f0f7c85 in mysql_close_slow_part (mysql=0x7ff31b869c70) at /data/src/10.1/sql-common/client.c:3950
#17 0x000056407f0f7cfe in mysql_close (mysql=0x7ff31b869c70) at /data/src/10.1/sql-common/client.c:3962
#18 0x00007ff33182aec3 in ha_federated::close (this=0x7ff31b8b0888) at /data/src/10.1/storage/federated/ha_federated.cc:1693
#19 0x000056407f134091 in handler::ha_close (this=0x7ff31b8b0888) at /data/src/10.1/sql/handler.cc:2596
#20 0x000056407efe5053 in closefrm (table=0x7ff31b881c70, free_share=true) at /data/src/10.1/sql/table.cc:3054
#21 0x000056407ee864d9 in intern_close_table (table=0x7ff31b881c70) at /data/src/10.1/sql/sql_base.cc:354
#22 0x000056407f0b5aab in tc_purge (mark_flushed=true) at /data/src/10.1/sql/table_cache.cc:204
#23 0x000056407ee868fc in close_cached_tables (thd=0x0, tables=0x0, wait_for_refresh=false, timeout=31536000) at /data/src/10.1/sql/sql_base.cc:485
#24 0x000056407f0b6422 in tdc_start_shutdown () at /data/src/10.1/sql/table_cache.cc:460
#25 0x000056407ee22e12 in clean_up (print_message=true) at /data/src/10.1/sql/mysqld.cc:2120
#26 0x000056407ee22a38 in unireg_end () at /data/src/10.1/sql/mysqld.cc:2002
#27 0x000056407ee2294f in kill_server (sig_ptr=0x0) at /data/src/10.1/sql/mysqld.cc:1930
#28 0x000056407ee22977 in kill_server_thread (arg=0x7ff333b6b1d0) at /data/src/10.1/sql/mysqld.cc:1953
#29 0x000056407f3dfd46 in pfs_spawn_thread (arg=0x7ff31b02c0f0) at /data/src/10.1/storage/perfschema/pfs.cc:1861
#30 0x00007ff333766494 in start_thread (arg=0x7ff333a43b00) at pthread_create.c:333
#31 0x00007ff331b1f93f in clone () from /lib/x86_64-linux-gnu/libc.so.6

Comment by Henk Plessius [ 2018-04-12 ]

I have the same problem, but I am already using ha_federatedx.so.
Server version: 10.1.26-MariaDB-0+deb9u1
It is just a simple query with a local union; acc_vpn1 is the federated table.

We think the query pointer is invalid, but we will try to print it anyway.
Query: delete from acc_vpn1 where `time` < current_date - interval 5 day and callid in (select callid from acc union all select callid from acc_backup) and time_hires in (select time_hires from acc union all select time_hires from acc_backup)

Last part of the stack trace:
/usr/sbin/mysqld(mysql_real_query+0x11)[0x558f058d5b61]
/usr/lib/x86_64-linux-gnu/mariadb18/plugin/ha_federated.so(_ZN12ha_federated4infoEj+0x14c)[0x7f7fdadf806c]
/usr/sbin/mysqld(_Z12mysql_deleteP3THDP10TABLE_LISTP4ItemP10SQL_I_ListI8st_orderEyyP13select_result+0x676)[0x558f05a130b6]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x2e58)[0x558f05767158]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x311)[0x558f0576d251]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x27d7)[0x558f05770bc7]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x148)[0x558f057713d8]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x1d2)[0x558f0583f552]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x558f0583f6b0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7494)[0x7f7fe053e494]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f7fdeef8acf]

If more data is needed, I'm happy to provide it. The remote DB is old (5.5.56).
Unfortunately, the data cannot be made public.

Comment by Elena Stepanova [ 2018-05-30 ]

It was fixed in 10.1.31 in the scope of MDEV-14272.

Generated at Thu Feb 08 08:13:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.