Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.1.9
-
Server version: 10.1.9-MariaDB-log MariaDB Server
CentOS 6.6
2.6.32-504.el6.x86_64 #1 SMP Wed Oct 15 04:27:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
3 nodes of mariadb galera cluster
-
10.1.18
Description
Updated at 2016/1/15
From the stack trace and the code, it seems wsrep thread is trying to abort a QUERY_COMMITING?
I'm not sure what does that mean, from the literal, wsrep is aborting a committing tran? So what's happened?
Does the alter table fail cause this?(current node fail, but another node unique index created), Or
alter table add unique index has been succefully commited but not flushed to disk yet, then wsrep
failed to abort but hit signal 11 that caused the current node remains unchange?
wsrep_grant_mdl_exception()
|
... ...
|
else if (granted_thd->wsrep_query_state == QUERY_COMMITTING) |
{
|
WSREP_DEBUG("MDL granted, but committing thd abort scheduled"); |
ticket->wsrep_report(wsrep_debug);
|
mysql_mutex_unlock(&granted_thd->LOCK_wsrep_thd);
|
wsrep_abort_thd((void*)request_thd, (void*)granted_thd, 1); |
ret = FALSE;
|
ha_abort_transaction()
|
... ...
|
THD_TRANS *trans= &victim_thd->transaction.all;
|
Ha_trx_info *ha_info= trans->ha_list, *ha_info_next;
|
 |
for (; ha_info; ha_info= ha_info_next) |
{
|
handlerton *hton= ha_info->ht();
|
if (!hton->abort_transaction) // <--- signal 11 should occur here |
I assume we are trying to abort the wsrep thread for this alter table command replication.
It seems we don't regist the corrent storage engine info info the transaction
==============================================================================
In an online enviroment, I need to modify the index type to be unique, so I fire an alter
command
ALTER TABLE `ht_user_loginfo` |
DROP INDEX `Index_uniqueid`, |
ADD UNIQUE INDEX `Index_uniqueid` (`uniqueid`) |
to my mariadb galera cluster 10.1.9 with 3 nodes (two data nodes and one arbitrator node),
unfortunately, the mariadb server crashed, from the errorlog,I can see
it htis signal 11,
Then I check another data node of cluster, I found the unique index is created successfully,
while current node iis failed.
The status showed below is captured after I recovered the cluster from the crash.
MariaDB [(none)]> show status like 'wsrep%';
|
+------------------------------+-----------------------------------------------------+
|
| Variable_name | Value |
|
+------------------------------+-----------------------------------------------------+
|
| wsrep_apply_oooe | 0.695295 |
|
| wsrep_apply_oool | 0.001338 |
|
| wsrep_apply_window | 7.136138 |
|
| wsrep_causal_reads | 0 |
|
| wsrep_cert_deps_distance | 116.807631 |
|
| wsrep_cert_index_size | 85 |
|
| wsrep_cert_interval | 8.518146 |
|
| wsrep_cluster_conf_id | 3 |
|
| wsrep_cluster_size | 3 |
|
| wsrep_cluster_state_uuid | 41c2f17c-b8dd-11e5-a2f0-d29d6194f4fd |
|
| wsrep_cluster_status | Primary |
|
| wsrep_commit_oooe | 0.000000 |
|
| wsrep_commit_oool | 0.000000 |
|
| wsrep_commit_window | 6.429097 |
|
| wsrep_connected | ON |
|
| wsrep_evs_delayed | |
|
| wsrep_evs_evict_list | |
|
| wsrep_evs_repl_latency | 0.000330851/0.000559662/0.00195228/0.000179164/1643 |
|
| wsrep_evs_state | OPERATIONAL |
|
| wsrep_flow_control_paused | 0.009547 |
|
| wsrep_flow_control_paused_ns | 1819437628239 |
|
| wsrep_flow_control_recv | 534776 |
|
| wsrep_flow_control_sent | 0 |
|
| wsrep_gcomm_uuid | c6e2e2f9-ad43-11e5-874e-e29556fbb355 |
|
| wsrep_incoming_addresses | 192.168.6.101:3306,192.168.6.21:3306, |
|
| wsrep_last_committed | 63491248 |
|
| wsrep_local_bf_aborts | 0 |
|
| wsrep_local_cached_downto | 63263077 |
|
| wsrep_local_cert_failures | 13 |
|
| wsrep_local_commits | 63491216 |
|
| wsrep_local_index | 1 |
|
| wsrep_local_recv_queue | 0 |
|
| wsrep_local_recv_queue_avg | 0.005204 |
|
| wsrep_local_recv_queue_max | 4 |
|
| wsrep_local_recv_queue_min | 0 |
|
| wsrep_local_replays | 0 |
|
| wsrep_local_send_queue | 0 |
|
| wsrep_local_send_queue_avg | 1.841741 |
|
| wsrep_local_send_queue_max | 62 |
|
| wsrep_local_send_queue_min | 0 |
|
| wsrep_local_state | 4 |
|
| wsrep_local_state_comment | Synced |
|
| wsrep_local_state_uuid | 41c2f17c-b8dd-11e5-a2f0-d29d6194f4fd |
|
| wsrep_protocol_version | 7 |
|
| wsrep_provider_name | Galera |
|
| wsrep_provider_vendor | Codership Oy <info@codership.com> |
|
| wsrep_provider_version | 25.3.9(r3385) |
|
| wsrep_ready | ON |
|
| wsrep_received | 620329 |
|
| wsrep_received_bytes | 4963695 |
|
| wsrep_repl_data_bytes | 31427131711 |
|
| wsrep_repl_keys | 272911424 |
|
| wsrep_repl_keys_bytes | 3643668604 |
|
| wsrep_repl_other_bytes | 0 |
|
| wsrep_replicated | 63491247 |
|
| wsrep_replicated_bytes | 39134240123 |
|
| wsrep_thread_count | 2 |
|
+------------------------------+-----------------------------------------------------+
|
below is the errorlog
2016-01-12 9:35:24 140579879008000 [Warning] IP address '168.8.44.132' could not be resolved: Temporary failure in name resolution
|
2016-01-12 11:29:15 140579879008000 [Note] InnoDB: Online DDL : Start
|
2016-01-12 11:29:15 140579879008000 [Note] InnoDB: Online DDL : Start reading clustered index of the table and create temporary files
|
2016-01-12 11:29:16 140579879008000 [Note] InnoDB: Online DDL : End of reading clustered index of the table and create temporary files
|
2016-01-12 11:29:16 140579879008000 [Note] InnoDB: Online DDL : Start merge-sorting index `Index_uniqueid`--temporary-- (1 / 1), estimated cost : 20.0000
|
2016-01-12 11:29:16 140579879008000 [Note] InnoDB: Online DDL : merge-sorting has estimated 32 runs
|
2016-01-12 11:29:16 140579879008000 [Note] InnoDB: Online DDL : merge-sorting current run 1 estimated 32 runs
|
2016-01-12 11:29:16 140579879008000 [Note] InnoDB: Online DDL : merge-sorting current run 2 estimated 16 runs
|
2016-01-12 11:29:16 140579879008000 [Note] InnoDB: Online DDL : merge-sorting current run 3 estimated 8 runs
|
2016-01-12 11:29:16 140579879008000 [Note] InnoDB: Online DDL : merge-sorting current run 4 estimated 4 runs
|
2016-01-12 11:29:16 140579879008000 [Note] InnoDB: Online DDL : merge-sorting current run 5 estimated 2 runs
|
2016-01-12 11:29:16 140579879008000 [Note] InnoDB: Online DDL : End of merge-sorting index `Index_uniqueid`--temporary-- (1 / 1)
|
2016-01-12 11:29:16 140579879008000 [Note] InnoDB: Online DDL : Start building index `Index_uniqueid`--temporary-- (1 / 1), estimated cost : 30.0000
|
2016-01-12 11:29:17 140579879008000 [Note] InnoDB: Online DDL : End of building index `Index_uniqueid`--temporary-- (1 / 1)
|
2016-01-12 11:29:17 140579879008000 [Note] InnoDB: Online DDL : Start applying row log
|
2016-01-12 11:29:17 140579879008000 [Note] InnoDB: Online DDL : End of applying row log
|
2016-01-12 11:29:17 140579879008000 [Note] InnoDB: Online DDL : Completed
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
2016-01-12 11:29:17 140579879008000 [Warning] WSREP: Cannot abort transaction.
|
 |
160112 11:29:17 [ERROR] mysqld got signal 11 ;
|
This could be because you hit a bug. It is also possible that this binary
|
or one of the libraries it was linked against is corrupt, improperly built,
|
or misconfigured. This error can also be caused by malfunctioning hardware.
|
 |
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
|
 |
Server version: 10.1.9-MariaDB-log
|
key_buffer_size=67108864
|
read_buffer_size=1048576
|
max_used_connections=171
|
max_threads=3002
|
thread_count=152
|
It is possible that mysqld could use up to
|
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 27793254 K bytes of memory
|
Hope that's ok; if not, decrease some variables in the equation.
|
 |
 |
Thread pointer: 0x0x7fdb78869008
|
Attempting backtrace. You can use the following information to find out
|
where mysqld died. If you see no messages after this, something went
|
terribly wrong...
|
stack_bottom = 0x7fdb4dc02dd8 thread_stack 0x48000
|
mysys/stacktrace.c:247(my_print_stacktrace)[0xbf822e]
|
sql/signal_handler.cc:160(handle_fatal_signal)[0x760e24]
|
/lib64/libpthread.so.0[0x35c500f710]
|
sql/handler.cc:6084(ha_abort_transaction(THD*, THD*, char))[0x7637ca]
|
sql/wsrep_thd.cc:603(wsrep_abort_thd(void*, void*, char))[0x6ff448]
|
sql/wsrep_mysqld.cc:1732(wsrep_grant_mdl_exception(MDL_context*, MDL_ticket*, MDL_key const*))[0x6f2423]
|
sql/mdl.cc:1173(MDL_lock::Ticket_list::add_ticket(MDL_ticket*))[0x6928cf]
|
psi/mysql_thread.h:662(inline_mysql_mutex_lock)[0x695580]
|
sql/mdl.cc:2241(MDL_context::upgrade_shared_lock(MDL_ticket*, enum_mdl_type, double))[0x695c94]
|
sql/sql_base.cc:1882(wait_while_table_is_used)[0x5612c7]
|
sql/sql_table.cc:7068(mysql_inplace_alter_table)[0x637846]
|
sql/sql_table.cc:8874(mysql_alter_table(THD*, char*, char*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool))[0x63fb0a]
|
sql/sql_alter.cc:327(Sql_cmd_alter_table::execute(THD*))[0x68dff3]
|
sql/sql_parse.cc:5677(mysql_execute_command(THD*))[0x5afa9c]
|
sql/sql_parse.cc:7302(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x5b7ed9]
|
sql/sql_parse.cc:7126(wsrep_mysql_parse)[0x5b7f68]
|
sql/sql_parse.cc:1489(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x5ba035]
|
sql/sql_parse.cc:1111(do_command(THD*))[0x5bb2f7]
|
sql/sql_connect.cc:1349(do_handle_one_connection(THD*))[0x689883]
|
sql/sql_connect.cc:1263(handle_one_connection)[0x689aa2]
|
perfschema/pfs.cc:1863(pfs_spawn_thread)[0xafa679]
|
 |
Trying to get some variables.
|
Some pointers may be invalid and cause the dump to abort.
|
Query (0x7fdb690c1020): is an invalid pointer
|
Connection ID (thread ID): 5105
|
Status: NOT_KILLED
|
 |
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
|
 |
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
|
information that should help you find out what is causing the crash.
|
160112 11:29:19 mysqld_safe Number of processes running now: 0
|
160112 11:29:19 mysqld_safe WSREP: not restarting wsrep node automatically
|
160112 11:29:19 mysqld_safe mysqld from pid file /mysqldata/3306/mysqld.pid ended
|
Attachments
Issue Links
- duplicates
-
MDEV-11147 Crash during ALTER TABLE DROP PARTITION
- Closed
-
MDEV-11266 MariaDB crashes every other day
- Closed
- is duplicated by
-
MDEV-10768 Galera segfault on transaction abort after failed wsrep_grant_mdl_exception()
- Closed
-
MDEV-10835 Denial Of Service - Crash Any Galera Node - Race Condition: CREATE TABLE IF NOT EXISTS / INSERT
- Closed
- relates to
-
MDEV-10732 buffer overflow detected - mariadb crashing regularly
- Closed
- links to