[MDEV-15796] MariaDB crashes on startup with semisync master enabled Created: 2018-04-06  Updated: 2018-04-14  Resolved: 2018-04-14

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.3.5
Fix Version/s: 10.3.6

Type: Bug Priority: Blocker
Reporter: Michaël de groot Assignee: Sergey Vojtovich
Resolution: Fixed Votes: 0
Labels: None


 Description   

After a clean install and minimum configuration, MariaDB crashes on start up with semisync enabled.

bash# curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash -s -- --mariadb-server-version=mariadb-10.3
bash# yum install MariaDB-server

[mysqld]
log_bin
binlog_format=ROW
server_id=1000
rpl_semi_sync_master_enabled=1
rpl_semi_sync_slave_enabled=1

bash# systemctl start MariaDB

Log output:

Apr  6 17:20:41 ip-10-0-1-250 mysqld: Status: NOT_KILLED
Apr  6 17:20:41 ip-10-0-1-250 mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
Apr  6 17:20:41 ip-10-0-1-250 mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Apr  6 17:20:41 ip-10-0-1-250 mysqld: information that should help you find out what is causing the crash.
Apr  6 17:20:46 ip-10-0-1-250 mysqld: 2018-04-06 17:20:46 0 [Note] /usr/sbin/mysqld (mysqld 10.3.5-MariaDB-log) starting as process 12508 ...
Apr  6 17:20:46 ip-10-0-1-250 mysqld: 2018-04-06 17:20:46 0 [Note] Semi-sync replication initialized for transactions.
Apr  6 17:20:46 ip-10-0-1-250 mysqld: 2018-04-06 17:20:46 0 [Note] Semi-sync replication enabled on the master.
Apr  6 17:20:46 ip-10-0-1-250 mysqld: 2018-04-06 17:20:46 0 [Warning] No argument was provided to --log-bin and neither --log-basename or --log-bin-index where used;  This may cause repliction to break when this server acts as a master and has its hostname changed! Please use '--log-basename=ip-10-0-1-250.ec2.internal' or '--log-bin=ip-10-0-1-250-bin' to avoid this problem.
Apr  6 17:20:46 ip-10-0-1-250 mysqld: 180406 17:20:46 [ERROR] mysqld got signal 11 ;
Apr  6 17:20:46 ip-10-0-1-250 mysqld: This could be because you hit a bug. It is also possible that this binary
Apr  6 17:20:46 ip-10-0-1-250 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
Apr  6 17:20:46 ip-10-0-1-250 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
Apr  6 17:20:46 ip-10-0-1-250 mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Apr  6 17:20:46 ip-10-0-1-250 mysqld: We will try our best to scrape up some info that will hopefully help
Apr  6 17:20:46 ip-10-0-1-250 mysqld: diagnose the problem, but since we have already crashed,
Apr  6 17:20:46 ip-10-0-1-250 mysqld: something is definitely wrong and this may fail.
Apr  6 17:20:46 ip-10-0-1-250 mysqld: Server version: 10.3.5-MariaDB-log
Apr  6 17:20:46 ip-10-0-1-250 mysqld: key_buffer_size=0
Apr  6 17:20:46 ip-10-0-1-250 mysqld: read_buffer_size=131072
Apr  6 17:20:46 ip-10-0-1-250 mysqld: max_used_connections=0
Apr  6 17:20:46 ip-10-0-1-250 mysqld: max_threads=153
Apr  6 17:20:46 ip-10-0-1-250 mysqld: thread_count=0
Apr  6 17:20:46 ip-10-0-1-250 mysqld: It is possible that mysqld could use up to
Apr  6 17:20:46 ip-10-0-1-250 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 336312 K  bytes of memory
Apr  6 17:20:46 ip-10-0-1-250 mysqld: Hope that's ok; if not, decrease some variables in the equation.
Apr  6 17:20:46 ip-10-0-1-250 mysqld: Thread pointer: 0x7fd3000009a8
Apr  6 17:20:46 ip-10-0-1-250 mysqld: Attempting backtrace. You can use the following information to find out
Apr  6 17:20:46 ip-10-0-1-250 mysqld: where mysqld died. If you see no messages after this, something went
Apr  6 17:20:46 ip-10-0-1-250 mysqld: terribly wrong...
Apr  6 17:20:46 ip-10-0-1-250 mysqld: stack_bottom = 0x0 thread_stack 0x49000
Apr  6 17:20:46 ip-10-0-1-250 mysqld: mysys/stacktrace.c:270(my_print_stacktrace)[0x7fd31b4dfcce]
Apr  6 17:20:46 ip-10-0-1-250 mysqld: sql/signal_handler.cc:168(handle_fatal_signal)[0x7fd31af6dd17]
Apr  6 17:20:46 ip-10-0-1-250 mysqld: sigaction.c:0(__restore_rt)[0x7fd31a3da100]
Apr  6 17:20:46 ip-10-0-1-250 mysqld: sql/sql_plugin.cc:970(intern_plugin_lock)[0x7fd31adad374]
Apr  6 17:20:46 ip-10-0-1-250 mysqld: sql/sql_class.cc:1286(THD::init(bool))[0x7fd31ad618e8]
Apr  6 17:20:46 ip-10-0-1-250 mysqld: sql/sql_profile.h:318(PROFILING::set_thd(THD*))[0x7fd31ad62a5f]
Apr  6 17:20:46 ip-10-0-1-250 mysqld: sql/semisync_master_ack_receiver.cc:209(Ack_receiver::run())[0x7fd31aebdf47]
Apr  6 17:20:46 ip-10-0-1-250 mysqld: sql/semisync_master_ack_receiver.cc:36(ack_receive_handler)[0x7fd31aebe589]
Apr  6 17:20:46 ip-10-0-1-250 mysqld: pthread_create.c:0(start_thread)[0x7fd31a3d2dc5]
Apr  6 17:20:46 ip-10-0-1-250 mysqld: /lib64/libc.so.6(clone+0x6d)[0x7fd318a2a28d]
Apr  6 17:20:46 ip-10-0-1-250 mysqld: Trying to get some variables.
Apr  6 17:20:46 ip-10-0-1-250 mysqld: Some pointers may be invalid and cause the dump to abort.
Apr  6 17:20:46 ip-10-0-1-250 mysqld: Query (0x0):
Apr  6 17:20:46 ip-10-0-1-250 mysqld: Connection ID (thread ID): 1
Apr  6 17:20:46 ip-10-0-1-250 mysqld: Status: NOT_KILLED



 Comments   
Comment by Elena Stepanova [ 2018-04-06 ]

--rpl_semi_sync_master_enabled=1 alone on startup is already enough to get the crash. Setting it at runtime, as all MTR tests seem to do, doesn't cause the trouble.

10.3 d8da97b09ab

#3  <signal handler called>
#4  0x000055aa6a26b89e in intern_plugin_lock (lex=0x0, rc=0x0, state_mask=14) at /data/src/10.3/sql/sql_plugin.cc:970
#5  0x000055aa6a27240c in plugin_thdvar_init (thd=0x7f0ab4001d50) at /data/src/10.3/sql/sql_plugin.cc:3177
#6  0x000055aa6a1f0e51 in THD::init (this=0x7f0ab4001d50, skip_lock=true) at /data/src/10.3/sql/sql_class.cc:1282
#7  0x000055aa6a1ef974 in THD::THD (this=0x7f0ab4001d50, id=1, is_wsrep_applier=false, skip_global_sys_var_lock=true) at /data/src/10.3/sql/sql_class.cc:904
#8  0x000055aa6a4309cd in Ack_receiver::run (this=0x55aa6b9c41c0 <ack_receiver>) at /data/src/10.3/sql/semisync_master_ack_receiver.cc:209
#9  0x000055aa6a4300c8 in ack_receive_handler (arg=0x55aa6b9c41c0 <ack_receiver>) at /data/src/10.3/sql/semisync_master_ack_receiver.cc:35
#10 0x00007f0ac9236494 in start_thread (arg=0x7f0ac6c36700) at pthread_create.c:333
#11 0x00007f0ac761c93f in clone () from /lib/x86_64-linux-gnu/libc.so.6

Comment by Andrei Elkin [ 2018-04-12 ]

Sergei has commented on the initial patch which I am refining accordingly.

Comment by Andrei Elkin [ 2018-04-14 ]

95811f01f34 pushed to 10.3.

Generated at Thu Feb 08 08:24:05 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.