[MDEV-7938] MariaDB Crashes Suddenly while writing binlogs Created: 2015-04-08  Updated: 2015-04-27  Resolved: 2015-04-27

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 5.5, 10.0, 10.1
Fix Version/s: 5.5.43, 10.0.18, 10.1.5

Type: Bug Priority: Major
Reporter: Nik Vyzas Assignee: Sergei Golubchik
Resolution: Fixed Votes: 0
Labels: verified
Environment:

CentOS release 6.4 (Final)



 Description   

The stack trace indicates that the issue was caused while writing the server's own binlogs (this is verified by the binlogs which were not written properly):

150408 12:13:53 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.0.17-MariaDB-log
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=226
max_threads=302
thread_count=200
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 794455 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7ee81d936008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7ee79d89dd30 thread_stack 0x30000
(my_addr_resolve failure: fork)
/usr/sbin/mysqld(my_print_stacktrace+0x2b) [0xb743db]
/usr/sbin/mysqld(handle_fatal_signal+0x398) [0x726a38]
/lib64/libpthread.so.0() [0x3891e0f500]
/lib64/libc.so.6(memcpy+0xa0) [0x3891a89900]
/usr/sbin/mysqld(_my_b_write+0x4f) [0xb5a36f]
/usr/sbin/mysqld(Log_event::wrapper_my_b_safe_write(st_io_cache*, unsigned char const*, unsigned long)+0x5a) [0x7ded3a]
/usr/sbin/mysqld(User_var_log_event::write(st_io_cache*)+0xfd) [0x7df76d]
/usr/sbin/mysqld(MYSQL_BIN_LOG::write(Log_event*, char*)+0x40b) [0x7d271b]
/usr/sbin/mysqld(close_temporary_tables(THD*)+0x389) [0x591e29]
/usr/sbin/mysqld(THD::cleanup()+0x3d) [0x5a6cfd]
/usr/sbin/mysqld(unlink_thd(THD*)+0x13) [0x54bd43]
/usr/sbin/mysqld(one_thread_per_connection_end(THD*, bool)+0x10) [0x54bf50]
/usr/sbin/mysqld(do_handle_one_connection(THD*)+0x29b) [0x69621b]
/usr/sbin/mysqld(handle_one_connection+0x42) [0x6964a2]
/lib64/libpthread.so.0() [0x3891e07851]
/lib64/libc.so.6(clone+0x6d) [0x3891ae894d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): is an invalid pointer
Connection ID (thread ID): 62268
Status: KILL_CONNECTION
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=off,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=off,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=on,mrr_cost_based=on,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=off,join_cache_hashed=off,join_cache_bka=off,optimize_join_buffer_size=off,table_elimination=off,extended_keys=on,exists_to_in=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.



 Comments   
Comment by Nik Vyzas [ 2015-04-08 ]

Some additional details:
1.) This node has no slave nodes connected - binlogs are written in case point in time recovery is needed
2.) This is a slave of 2x hosts and uses multi-source replication - the last transaction received from one of the hosts was a rollback after a DROP TEMPORARY TABLE xxx / generated by server /
3.) Thread pool is not being used

Comment by Elena Stepanova [ 2015-04-14 ]

According to the error log, the problem happened on KILL_CONNECTION.
Do you know what was it – somebody tried to kill the thread from another connection, or was it server shutdown, or...?
Did it happen once, or is it reproducible?
Do you happen to know the query it was trying to write?
Could you please attach your cnf files and SHOW ALL SLAVES STATUS output?
Thanks.

Comment by Nik Vyzas [ 2015-04-16 ]

The issue occurs several times per day however it is unclear under which situations this occurs. Also the last 3 recorded binlog statements are:

COMMIT
DROP TEMPORARY TABLE IF EXISTS `<some_schema_name>`.`<some_table_name>` / generated by server / # this is the second last statement before crash
ROLLBACK

Please find the requested information below:

#
# These groups are read by MariaDB server.
# Use it for options that only the server (but not clients) should see
#
# See the examples of server my.cnf files in /usr/share/mysql/
#
 
# this is read by the standalone daemon and embedded servers
[server]
 
# this is only for the mysqld standalone daemon
[mysqld]
default-time-zone=SYSTEM
server-id=244
 
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
tmpdir=/var/lib/mysql-tmp
user=mysql
#general_log
symbolic-links=0
binlog-format=MIXED
 
#innodbFiles
innodb_file_per_table
innodb_flush_method = O_DIRECT
innodb_buffer_pool_size = 180G
innodb_log_buffer_size = 64M
 
innodb_log_file_size = 3G
innodb_read_io_threads = 64
innodb_write_io_threads = 64
max_connections = 300
max_user_connections = 2000
key_buffer = 128M
#join_buffer_size = 2M
#read_buffer_size = 40M
#sort_buffer_size = 60M
thread_cache_size = 192K
max_allowed_packet =4096M
# Disabled query cache on 13/04/2015
query_cache_size = 0
query_cache_type = 0
#query_cache_limit = 2M
#query_cache_size = 256M
#query_cache_type = 1
interactive_timeout=700
wait_timeout=60
connect_timeout=60
tmp_table_size = 1024M
thread_stack = 192K
max_heap_table_size=320M
 
# BI1
host_cache_size=428
innodb_lock_wait_timeout=1000000
innodb_open_files=2000
long_query_time=0.001000
max_relay_log_size=0
#slave_skip_errors=1062,1452,1054,1091,1060,1050,1146,0,1005
slave_transaction_retries=1000000
table_definition_cache=1400
table_open_cache=2000
thread_cache_size=16384
 
#Optimizers
optimizer_use_condition_selectivity=1
#optimizer_switch=index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=off,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=off,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=on,mrr_cost_based=on,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=off,extended_keys=on,exists_to_in=on
 
optimizer_switch=index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=off,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=off,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=on,mrr_cost_based=on,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=off,join_cache_hashed=off,join_cache_bka=off,optimize_join_buffer_size=off,table_elimination=off,extended_keys=on,exists_to_in=on
 
 
#optimizer_use_condition_selectivity=2
character_set_server = utf8
collation_server = utf8_general_ci
init-connect = 'SET NAMES utf8'
 
# enable binary logs - eventum ticket #50072
log_bin = /var/log/mysql/mysql-bin.log
log-bin-index = /var/log/mysql/mysql-bin.index
 
# allow unsafe events to be stored in bin log
log_bin_trust_function_creators = 1
 
#Repl Filters
replicate_wild_ignore_table=XXXXXXXXXXXX.%
replicate_wild_ignore_table=XXXXXXXXXXXX.%
replicate_wild_ignore_table=XXXXXXXXXXXX.%
replicate_wild_ignore_table=XXXXXXXXXXXX.%
replicate_wild_ignore_table=XXXXXXXXXXXX.%
replicate_wild_ignore_table=XXXXXXXXXXXX.%
replicate_wild_ignore_table=XXXXXXXXXXXX%.%
replicate_wild_ignore_table=XXXXXXXXXXXX%.%
replicate_wild_ignore_table=XXXXXXXXXXXX.%
replicate_wild_ignore_table=XXXXXXXXXXXX.%
replicate_wild_ignore_table=XXXXXXXXXXXX%.%
replicate_wild_ignore_table=XXXXXXXXXXXX%.%
replicate_wild_ignore_table=XXXXXXXXXXXX.%
replicate_wild_ignore_table=%.XXXXXXXXXXXX
replicate_wild_ignore_table=%.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
replicate_ignore_table=XXXXXXXXXXXX.XXXXXXXXXXXX
 
# this is only for embedded server
[embedded]
 
# This group is only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
 
# This group is only read by MariaDB-10.0 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mariadb-10.0]
 
 
#########################################
 
 
MariaDB [(none)]> show all slaves status\G
*************************** 1. row ***************************
Connection_name: DB123
Slave_SQL_State: updating
Slave_IO_State: Waiting for master to send event
Master_Host: XXX.XXX.XXX.1
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000218
Read_Master_Log_Pos: 64018426
Relay_Log_File: 669910-bi2-relay-bin-db01.000381
Relay_Log_Pos: 63348500
Relay_Master_Log_File: mysql-bin.000218
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table: XXXXXXXXXXXXX
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: XXXXXXXXXXXXX
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 63348213
Relay_Log_Space: 64019063
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 19
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1000
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Retried_transactions: 0
Max_relay_log_size: 1073741824
Executed_log_entries: 1845792
Slave_received_heartbeats: 0
Slave_heartbeat_period: 1800.000
Gtid_Slave_Pos: 0-244-295793368
*************************** 2. row ***************************
Connection_name: DB456
Slave_SQL_State: updating
Slave_IO_State: Waiting for master to send event
Master_Host: XXX.XXX.XXX.2
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000375
Read_Master_Log_Pos: 144514571
Relay_Log_File: 669910-bi2-relay-bin-db03.000967
Relay_Log_Pos: 207428781
Relay_Master_Log_File: mysql-bin.000374
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table: XXXXXXXXXX
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: XXXXXXXXXX
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 207428494
Relay_Log_Space: 681903439
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 3000
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Retried_transactions: 8
Max_relay_log_size: 1073741824
Executed_log_entries: 1314494
Slave_received_heartbeats: 0
Slave_heartbeat_period: 1800.000
Gtid_Slave_Pos: 0-244-295793368

Comment by Elena Stepanova [ 2015-04-17 ]

Thanks for the information.

Test case

--source include/have_log_bin.inc
--source include/have_binlog_format_mixed.inc
 
--connect(con1,localhost,root,,)
create table t1 (i int);
create trigger tr after insert on t1 for each row set @b=@a;
 
create temporary table tmp like t1;
insert into t1 values (1);
--disconnect con1
 
--connection default
select * from t1;

Stack trace from 5.5 commit 1115a59d3f

#2  0x00000000007e7edd in handle_fatal_signal (sig=11) at 5.5/sql/signal_handler.cc:262
#3  <signal handler called>
#4  0x00000000008b5e7a in MYSQL_BIN_LOG::write (this=0x15705a0, event_info=0x7feb43ae5770, with_annotate=0x0) at 5.5/sql/log.cc:5198
#5  0x00000000005d5ce8 in close_temporary_tables (thd=0x7feb3d614060) at 5.5/sql/sql_base.cc:1799
#6  0x00000000005fb690 in THD::cleanup (this=0x7feb3d614060) at 5.5/sql/sql_class.cc:1397
#7  0x0000000000587e8b in thd_cleanup (thd=0x7feb3d614060) at 5.5/sql/mysqld.cc:2450
#8  0x0000000000587f47 in unlink_thd (thd=0x7feb3d614060) at 5.5/sql/mysqld.cc:2484
#9  0x0000000000588328 in one_thread_per_connection_end (thd=0x7feb3d614060, put_in_cache=true) at 5.5/sql/mysqld.cc:2605
#10 0x00000000007341bc in do_handle_one_connection (thd_arg=0x7feb3d614060) at 5.5/sql/sql_connect.cc:1277
#11 0x0000000000733bd5 in handle_one_connection (arg=0x7feb3d614060) at 5.5/sql/sql_connect.cc:1181
#12 0x0000000000b6c5e5 in pfs_spawn_thread (arg=0x7feb3d5a5d00) at 5.5/storage/perfschema/pfs.cc:1015
#13 0x00007feb436d6b50 in start_thread (arg=<optimized out>) at pthread_create.c:304
#14 0x00007feb4198c95d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

If it doesn't crash, try building and running with valgrind.

Generated at Thu Feb 08 07:23:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.