Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Duplicate
-
10.3.12
-
None
-
CentOS Linux 7
Description
While trying to ALTER a very large table (23M records), The server , and all galera linked servers crash. The peer server can never recover and and I have to do a manual rsync to get them to restart.
2019-02-02 22:52:09 0x7f4960194700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.3.12/storage/innobase/btr/btr0pcur.cc line 461
|
InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == btr_pcur_get_block(cursor)->page.id.page_no()
|
InnoDB: We intentionally generate a memory trap.
|
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
|
InnoDB: If you get repeated assertion failures or crashes, even
|
InnoDB: immediately after the mysqld startup, there may be
|
InnoDB: corruption in the InnoDB tablespace. Please refer to
|
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
|
InnoDB: about forcing recovery.
|
190202 22:52:09 [ERROR] mysqld got signal 6 ;
|
This could be because you hit a bug. It is also possible that this binary
|
or one of the libraries it was linked against is corrupt, improperly built,
|
or misconfigured. This error can also be caused by malfunctioning hardware.
|
|
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
|
|
We will try our best to scrape up some info that will hopefully help
|
diagnose the problem, but since we have already crashed,
|
something is definitely wrong and this may fail.
|
|
Server version: 10.3.12-MariaDB-log
|
key_buffer_size=134217728
|
read_buffer_size=131072
|
max_used_connections=2
|
max_threads=153
|
thread_count=10
|
It is possible that mysqld could use up to
|
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467389 K bytes of memory
|
Hope that's ok; if not, decrease some variables in the equation.
|
|
Thread pointer: 0x7f491c000ac8
|
Attempting backtrace. You can use the following information to find out
|
where mysqld died. If you see no messages after this, something went
|
terribly wrong...
|
stack_bottom = 0x7f4960193d30 thread_stack 0x49000
|
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55d09923dd1e]
|
/usr/sbin/mysqld(handle_fatal_signal+0x357)[0x55d098cded97]
|
sigaction.c:0(__restore_rt)[0x7f4975ac55d0]
|
:0(__GI_raise)[0x7f4973d99207]
|
:0(__GI_abort)[0x7f4973d9a8f8]
|
ut/ut0rbt.cc:218(rbt_tree_add_child(ib_rbt_t const*, ib_rbt_bound_t*, ib_rbt_node_t*) [clone .isra.4] [clone .part.5])[0x55d098a29946]
|
btr/btr0pcur.cc:459(btr_pcur_move_to_next_page(btr_pcur_t*, mtr_t*))[0x55d09902b152]
|
include/btr0pcur.ic:340(btr_pcur_move_to_next(btr_pcur_t*, mtr_t*))[0x55d098a27c6b]
|
row/row0sel.cc:5589(row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long))[0x55d098f94201]
|
handler/ha_innodb.cc:9537(ha_innobase::general_fetch(unsigned char*, unsigned int, unsigned int))[0x55d098ebced8]
|
sql/handler.cc:2765(handler::ha_rnd_next(unsigned char*))[0x55d098ce3bc7]
|
sql/records.cc:481(rr_sequential(READ_RECORD*))[0x55d098e0785c]
|
sql/records.h:73(READ_RECORD::read_record())[0x55d098a1cdd0]
|
sql/sql_table.cc:9881(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool))[0x55d098b9aaac]
|
sql/sql_alter.cc:497(Sql_cmd_alter_table::execute(THD*))[0x55d098be7db3]
|
sql/sql_parse.cc:6285(mysql_execute_command(THD*))[0x55d098b09b05]
|
sql/sql_parse.cc:8093(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55d098b1182b]
|
sql/sql_parse.cc:7887(wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55d098b121c1]
|
sql/sql_parse.cc:1915(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55d098b13a43]
|
sql/sql_parse.cc:1398(do_command(THD*))[0x55d098b1532e]
|
sql/sql_connect.cc:1402(do_handle_one_connection(CONNECT*))[0x55d098be512a]
|
sql/sql_connect.cc:1310(handle_one_connection)[0x55d098be524d]
|
pthread_create.c:0(start_thread)[0x7f4975abddd5]
|
/lib64/libc.so.6(clone+0x6d)[0x7f4973e60ead]
|
|
Trying to get some variables.
|
Some pointers may be invalid and cause the dump to abort.
|
Query (0x7f491c011730): ALTER TABLE contents ADD publisher_id INT UNSIGNED, ADD cr_holder_id INT UNSIGNED, ADD INDEX IDX_content_publishers(publisher_id), ADD INDEX IDX_content_copyright(cr_holder_id), ADD CONSTRAINT Refpublishers1691 FOREIGN KEY (publisher_id) REFERENCES publishers(publisher_id), ADD CONSTRAINT Refcopyright_holders1701 FOREIGN KEY (cr_holder_id) REFERENCES copyright_holders(cr_holder_id)
|
Connection ID (thread ID): 11
|
Status: NOT_KILLED
|
|
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
|
|
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
|
information that should help you find out what is causing the crash.
|
2019-02-02 22:52:22 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
|
2019-02-02 22:52:22 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
|
2019-02-02 22:52:22 0 [Note] WSREP: wsrep_load(): Galera 25.3.25(r3836) by Codership Oy <info@codership.com> loaded successfully.
|
2019-02-02 22:52:22 0 [Note] WSREP: CRC-32C: using hardware acceleration.
|
2019-02-02 22:52:22 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0
|
2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: 6e8ceebd-277d-11e9-9b83-13112382c60b, offset: -1
|
2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer initial scan... 0.0% ( 0/134217768 bytes) complete.
|
2019-02-02 22:52:22 0 [Note] WSREP: Attempt to reuse the same seqno: 1. New ptr = 0x7f1087734ce8, seqno_g: 1, seqno_d: 0, size: 920, ctx: 0x56110d49a548, flags: 1. store: 1, cs: 00000000 00000000 00000000 00000000, previous ptr = 0x7f107fb99528, seqno_g: 1, seqno_d: 0, size: 936, ctx: 0x56110d49a548, flags: 1. store: 1, cs: 00000000 00000000 00000000 00000000
|
2019-02-02 22:52:22 0 [Note] WSREP: Contents differ. Discarding both.
|
2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (134217768/134217768 bytes) complete.
|
2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: found gapless sequence 2-2
|
2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: discarding seqnos 1-1
|
2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer unused buffers scan... 0.0% ( 0/936 bytes) complete.
|
2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer unused buffers scan...100.0% (936/936 bytes) complete.
|
2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): found 0/1 locked buffers
|
2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): used space: 936/134217728
|
2019-02-02 22:52:22 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.50.216; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT
|
2019-02-02 22:52:22 0 [Note] WSREP: GCache history reset: 6e8ceebd-277d-11e9-9b83-13112382c60b:2 -> 00000000-0000-0000-0000-000000000000:-1
|
2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): full reset
|
2019-02-02 22:52:22 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
|
2019-02-02 22:52:22 0 [Note] WSREP: wsrep_sst_grab()
|
2019-02-02 22:52:22 0 [Note] WSREP: Start replication
|
2019-02-02 22:52:22 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
|
2019-02-02 22:52:22 0 [Note] WSREP: protonet asio version 0
|
2019-02-02 22:52:22 0 [Note] WSREP: Using CRC-32C for message checksums.
|
2019-02-02 22:52:22 0 [Note] WSREP: backend: asio
|
2019-02-02 22:52:22 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
|
2019-02-02 22:52:22 0 [Note] WSREP: restore pc from disk successfully
|
2019-02-02 22:52:22 0 [Note] WSREP: GMCast version 0
|
2019-02-02 22:52:22 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
|
2019-02-02 22:52:22 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
|
2019-02-02 22:52:22 0 [Note] WSREP: EVS version 0
|
2019-02-02 22:52:22 0 [Note] WSREP: gcomm: connecting to group '9jr DB Cluster', peer '192.168.50.30:,192.168.50.65:,192.168.50.213:,192.168.50.214:,192.168.50.225:,192.168.50.55:'
|
2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.213:4567 timed out, no messages seen in PT3S
|
2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.225:4567 timed out, no messages seen in PT3S
|
2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.55:4567 timed out, no messages seen in PT3S
|
2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.65:4567 timed out, no messages seen in PT3S
|
2019-02-02 22:52:25 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
|
2019-02-02 22:52:25 0 [Note] WSREP: view(view_id(NON_PRIM,f766f12e,4) memb {
|
f766f12e,0
|
} joined {
|
} left {
|
} partitioned {
|
})
|
2019-02-02 22:52:25 0 [Note] WSREP: gcomm: connected
|
2019-02-02 22:52:25 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
|
2019-02-02 22:52:25 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
|
2019-02-02 22:52:25 0 [Note] WSREP: Opened channel '9jr DB Cluster'
|
2019-02-02 22:52:25 0 [Note] WSREP: Waiting for SST to complete.
|
2019-02-02 22:52:25 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
|
2019-02-02 22:52:25 0 [Note] WSREP: Flow-control interval: [16, 16]
|
2019-02-02 22:52:25 0 [Note] WSREP: Trying to continue unpaused monitor
|
2019-02-02 22:52:25 0 [Note] WSREP: Received NON-PRIMARY.
|
2019-02-02 22:52:25 2 [Note] WSREP: New cluster view: global state: :-1, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version -1
|
2019-02-02 22:52:25 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
|
2019-02-02 22:52:26 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50278S), skipping check
|
2019-02-02 22:52:29 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.213:4567 timed out, no messages seen in PT3S
|
2019-02-02 22:52:32 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection established to 119c0d79 tcp://192.168.50.30:4567
|
2019-02-02 22:52:32 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
|
2019-02-02 22:52:32 0 [Note] WSREP: declaring 119c0d79 at tcp://192.168.50.30:4567 stable
|
2019-02-02 22:52:32 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
|
2019-02-02 22:52:32 0 [Note] WSREP: view(view_id(NON_PRIM,119c0d79,5) memb {
|
119c0d79,0
|
f766f12e,0
|
} joined {
|
} left {
|
} partitioned {
|
})
|
2019-02-02 22:52:32 0 [Note] WSREP: promote to primary component
|
2019-02-02 22:52:32 0 [Note] WSREP: view(view_id(PRIM,119c0d79,5) memb {
|
119c0d79,0
|
f766f12e,0
|
} joined {
|
} left {
|
} partitioned {
|
})
|
2019-02-02 22:52:32 0 [Note] WSREP: save pc into disk
|
2019-02-02 22:52:32 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 1, memb_num = 2
|
2019-02-02 22:52:32 0 [Note] WSREP: Flow-control interval: [23, 23]
|
2019-02-02 22:52:32 0 [Note] WSREP: Trying to continue unpaused monitor
|
2019-02-02 22:52:32 0 [Note] WSREP: Received NON-PRIMARY.
|
2019-02-02 22:52:32 2 [Note] WSREP: New cluster view: global state: :-1, view# -1: non-Primary, number of nodes: 2, my index: 1, protocol version -1
|
2019-02-02 22:52:32 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
|
2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.213:4567
|
2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.214:4567
|
2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.225:4567
|
2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr proto entry 0x7f1070001240
|
2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.55:4567
|
2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.65:4567
|
2019-02-02 22:52:32 0 [Note] WSREP: clear restored view
|
2019-02-02 22:52:32 0 [Warning] WSREP: non weight changing install in S_PRIM: pcmsg{ type=INSTALL, seq=0, flags= 2, node_map { 119c0d79,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,119c0d79,2),to_seq=-1,weight=1,segment=0
|
f766f12e,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,119c0d79,2),to_seq=-1,weight=1,segment=0
|
}}
|
2019-02-02 22:52:32 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = yes, my_idx = 1, memb_num = 2
|
2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
|
2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 47e64989-2780-11e9-a986-ef5ad4566067
|
2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: got state msg: 47e64989-2780-11e9-a986-ef5ad4566067 from 0 (kvm.amar.com)
|
2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: got state msg: 47e64989-2780-11e9-a986-ef5ad4566067 from 1 (db.9jr.com)
|
2019-02-02 22:52:32 0 [Warning] WSREP: Quorum: No node with complete state:
|
|
Version : 4
|
Flags : 0x5
|
Protocols : 0 / 9 / 3
|
State : NON-PRIMARY
|
Desync count : 0
|
Prim state : NON-PRIMARY
|
Prim UUID : 00000000-0000-0000-0000-000000000000
|
Prim seqno : -1
|
First seqno : -1
|
Last seqno : -1
|
Prim JOINED : 0
|
State UUID : 47e64989-2780-11e9-a986-ef5ad4566067
|
Group UUID : 00000000-0000-0000-0000-000000000000
|
Name : 'kvm.amar.com'
|
Incoming addr: '192.168.50.30:3306'
|
|
Version : 4
|
Flags : 0x4
|
Protocols : 0 / 9 / 3
|
State : NON-PRIMARY
|
Desync count : 0
|
Prim state : NON-PRIMARY
|
Prim UUID : 00000000-0000-0000-0000-000000000000
|
Prim seqno : -1
|
First seqno : -1
|
Last seqno : -1
|
Prim JOINED : 0
|
State UUID : 47e64989-2780-11e9-a986-ef5ad4566067
|
Group UUID : 00000000-0000-0000-0000-000000000000
|
Name : 'db.9jr.com'
|
Incoming addr: '192.168.50.216:3306'
|
|
2019-02-02 22:52:32 0 [Warning] WSREP: No re-merged primary component found.
|
2019-02-02 22:52:32 0 [Note] WSREP: Bootstrapped primary 00000000-0000-0000-0000-000000000000 found: 2.
|
2019-02-02 22:52:32 0 [Note] WSREP: Quorum results:
|
version = 4,
|
component = PRIMARY,
|
conf_id = -1,
|
members = 2/2 (joined/total),
|
act_id = -1,
|
last_appl. = -1,
|
protocols = 0/9/3 (gcs/repl/appl),
|
group UUID = 00000000-0000-0000-0000-000000000000
|
2019-02-02 22:52:32 0 [Note] WSREP: Flow-control interval: [23, 23]
|
2019-02-02 22:52:32 0 [Note] WSREP: Trying to continue unpaused monitor
|
2019-02-02 22:52:32 0 [Note] WSREP: Restored state OPEN -> JOINED (-1)
|
2019-02-02 22:52:32 2 [Note] WSREP: New cluster view: global state: :-1, view# 0: Primary, number of nodes: 2, my index: 1, protocol version 3
|
2019-02-02 22:52:32 0 [ERROR] WSREP: SST failed: 1 (Operation not permitted)
|
2019-02-02 22:52:32 0 [ERROR] Aborting
|
|
Error in my_thread_global_end(): 1 threads didn't exit
|
Attachments
Issue Links
- relates to
-
MDEV-13542 Crashing on a corrupted page is unhelpful
-
- Closed
-
-
MDEV-34118 fsp_alloc_free_extent() fails to flag DB_OUT_OF_FILE_SPACE
-
- Closed
-
Activity
Field | Original Value | New Value |
---|---|---|
Description |
While trying to ALTER a very large table (23M records), The server , and all galera linked servers crash. The peer server can never recover and and I have to do a manual rsync to get them to restart.
2019-02-02 22:52:09 0x7f4960194700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.3.12/storage/innobase/btr/btr0pcur.cc line 461 InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == btr_pcur_get_block(cursor)->page.id.page_no() InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/ InnoDB: about forcing recovery. 190202 22:52:09 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see https://mariadb.com/kb/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 10.3.12-MariaDB-log key_buffer_size=134217728 read_buffer_size=131072 max_used_connections=2 max_threads=153 thread_count=10 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467389 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x7f491c000ac8 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7f4960193d30 thread_stack 0x49000 /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55d09923dd1e] /usr/sbin/mysqld(handle_fatal_signal+0x357)[0x55d098cded97] sigaction.c:0(__restore_rt)[0x7f4975ac55d0] :0(__GI_raise)[0x7f4973d99207] :0(__GI_abort)[0x7f4973d9a8f8] ut/ut0rbt.cc:218(rbt_tree_add_child(ib_rbt_t const*, ib_rbt_bound_t*, ib_rbt_node_t*) [clone .isra.4] [clone .part.5])[0x55d098a29946] btr/btr0pcur.cc:459(btr_pcur_move_to_next_page(btr_pcur_t*, mtr_t*))[0x55d09902b152] include/btr0pcur.ic:340(btr_pcur_move_to_next(btr_pcur_t*, mtr_t*))[0x55d098a27c6b] row/row0sel.cc:5589(row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long))[0x55d098f94201] handler/ha_innodb.cc:9537(ha_innobase::general_fetch(unsigned char*, unsigned int, unsigned int))[0x55d098ebced8] sql/handler.cc:2765(handler::ha_rnd_next(unsigned char*))[0x55d098ce3bc7] sql/records.cc:481(rr_sequential(READ_RECORD*))[0x55d098e0785c] sql/records.h:73(READ_RECORD::read_record())[0x55d098a1cdd0] sql/sql_table.cc:9881(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool))[0x55d098b9aaac] sql/sql_alter.cc:497(Sql_cmd_alter_table::execute(THD*))[0x55d098be7db3] sql/sql_parse.cc:6285(mysql_execute_command(THD*))[0x55d098b09b05] sql/sql_parse.cc:8093(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55d098b1182b] sql/sql_parse.cc:7887(wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55d098b121c1] sql/sql_parse.cc:1915(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55d098b13a43] sql/sql_parse.cc:1398(do_command(THD*))[0x55d098b1532e] sql/sql_connect.cc:1402(do_handle_one_connection(CONNECT*))[0x55d098be512a] sql/sql_connect.cc:1310(handle_one_connection)[0x55d098be524d] pthread_create.c:0(start_thread)[0x7f4975abddd5] /lib64/libc.so.6(clone+0x6d)[0x7f4973e60ead] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x7f491c011730): ALTER TABLE contents ADD publisher_id INT UNSIGNED, ADD cr_holder_id INT UNSIGNED, ADD INDEX IDX_content_publishers(publisher_id), ADD INDEX IDX_content_copyright(cr_holder_id), ADD CONSTRAINT Refpublishers1691 FOREIGN KEY (publisher_id) REFERENCES publishers(publisher_id), ADD CONSTRAINT Refcopyright_holders1701 FOREIGN KEY (cr_holder_id) REFERENCES copyright_holders(cr_holder_id) Connection ID (thread ID): 11 Status: NOT_KILLED Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash. 2019-02-02 22:52:22 0 [Note] WSREP: Read nil XID from storage engines, skipping position init 2019-02-02 22:52:22 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so' 2019-02-02 22:52:22 0 [Note] WSREP: wsrep_load(): Galera 25.3.25(r3836) by Codership Oy <info@codership.com> loaded successfully. 2019-02-02 22:52:22 0 [Note] WSREP: CRC-32C: using hardware acceleration. 2019-02-02 22:52:22 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0 2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: 6e8ceebd-277d-11e9-9b83-13112382c60b, offset: -1 2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer initial scan... 0.0% ( 0/134217768 bytes) complete. 2019-02-02 22:52:22 0 [Note] WSREP: Attempt to reuse the same seqno: 1. New ptr = 0x7f1087734ce8, seqno_g: 1, seqno_d: 0, size: 920, ctx: 0x56110d49a548, flags: 1. store: 1, cs: 00000000 00000000 00000000 00000000, previous ptr = 0x7f107fb99528, seqno_g: 1, seqno_d: 0, size: 936, ctx: 0x56110d49a548, flags: 1. store: 1, cs: 00000000 00000000 00000000 00000000 2019-02-02 22:52:22 0 [Note] WSREP: Contents differ. Discarding both. 2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (134217768/134217768 bytes) complete. 2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: found gapless sequence 2-2 2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: discarding seqnos 1-1 2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer unused buffers scan... 0.0% ( 0/936 bytes) complete. 2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer unused buffers scan...100.0% (936/936 bytes) complete. 2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): found 0/1 locked buffers 2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): used space: 936/134217728 2019-02-02 22:52:22 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.50.216; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT 2019-02-02 22:52:22 0 [Note] WSREP: GCache history reset: 6e8ceebd-277d-11e9-9b83-13112382c60b:2 -> 00000000-0000-0000-0000-000000000000:-1 2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): full reset 2019-02-02 22:52:22 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1 2019-02-02 22:52:22 0 [Note] WSREP: wsrep_sst_grab() 2019-02-02 22:52:22 0 [Note] WSREP: Start replication 2019-02-02 22:52:22 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1 2019-02-02 22:52:22 0 [Note] WSREP: protonet asio version 0 2019-02-02 22:52:22 0 [Note] WSREP: Using CRC-32C for message checksums. 2019-02-02 22:52:22 0 [Note] WSREP: backend: asio 2019-02-02 22:52:22 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 2019-02-02 22:52:22 0 [Note] WSREP: restore pc from disk successfully 2019-02-02 22:52:22 0 [Note] WSREP: GMCast version 0 2019-02-02 22:52:22 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567 2019-02-02 22:52:22 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') multicast: , ttl: 1 2019-02-02 22:52:22 0 [Note] WSREP: EVS version 0 2019-02-02 22:52:22 0 [Note] WSREP: gcomm: connecting to group '9jr DB Cluster', peer '192.168.50.30:,192.168.50.65:,192.168.50.213:,192.168.50.214:,192.168.50.225:,192.168.50.55:' 2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.213:4567 timed out, no messages seen in PT3S 2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.225:4567 timed out, no messages seen in PT3S 2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.55:4567 timed out, no messages seen in PT3S 2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.65:4567 timed out, no messages seen in PT3S 2019-02-02 22:52:25 0 [Warning] WSREP: no nodes coming from prim view, prim not possible 2019-02-02 22:52:25 0 [Note] WSREP: view(view_id(NON_PRIM,f766f12e,4) memb { f766f12e,0 } joined { } left { } partitioned { }) 2019-02-02 22:52:25 0 [Note] WSREP: gcomm: connected 2019-02-02 22:52:25 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636 2019-02-02 22:52:25 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0) 2019-02-02 22:52:25 0 [Note] WSREP: Opened channel '9jr DB Cluster' 2019-02-02 22:52:25 0 [Note] WSREP: Waiting for SST to complete. 2019-02-02 22:52:25 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1 2019-02-02 22:52:25 0 [Note] WSREP: Flow-control interval: [16, 16] 2019-02-02 22:52:25 0 [Note] WSREP: Trying to continue unpaused monitor 2019-02-02 22:52:25 0 [Note] WSREP: Received NON-PRIMARY. 2019-02-02 22:52:25 2 [Note] WSREP: New cluster view: global state: :-1, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version -1 2019-02-02 22:52:25 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. 2019-02-02 22:52:26 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50278S), skipping check 2019-02-02 22:52:29 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.213:4567 timed out, no messages seen in PT3S 2019-02-02 22:52:32 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection established to 119c0d79 tcp://192.168.50.30:4567 2019-02-02 22:52:32 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: 2019-02-02 22:52:32 0 [Note] WSREP: declaring 119c0d79 at tcp://192.168.50.30:4567 stable 2019-02-02 22:52:32 0 [Warning] WSREP: no nodes coming from prim view, prim not possible 2019-02-02 22:52:32 0 [Note] WSREP: view(view_id(NON_PRIM,119c0d79,5) memb { 119c0d79,0 f766f12e,0 } joined { } left { } partitioned { }) 2019-02-02 22:52:32 0 [Note] WSREP: promote to primary component 2019-02-02 22:52:32 0 [Note] WSREP: view(view_id(PRIM,119c0d79,5) memb { 119c0d79,0 f766f12e,0 } joined { } left { } partitioned { }) 2019-02-02 22:52:32 0 [Note] WSREP: save pc into disk 2019-02-02 22:52:32 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 1, memb_num = 2 2019-02-02 22:52:32 0 [Note] WSREP: Flow-control interval: [23, 23] 2019-02-02 22:52:32 0 [Note] WSREP: Trying to continue unpaused monitor 2019-02-02 22:52:32 0 [Note] WSREP: Received NON-PRIMARY. 2019-02-02 22:52:32 2 [Note] WSREP: New cluster view: global state: :-1, view# -1: non-Primary, number of nodes: 2, my index: 1, protocol version -1 2019-02-02 22:52:32 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.213:4567 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.214:4567 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.225:4567 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr proto entry 0x7f1070001240 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.55:4567 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.65:4567 2019-02-02 22:52:32 0 [Note] WSREP: clear restored view 2019-02-02 22:52:32 0 [Warning] WSREP: non weight changing install in S_PRIM: pcmsg{ type=INSTALL, seq=0, flags= 2, node_map { 119c0d79,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,119c0d79,2),to_seq=-1,weight=1,segment=0 f766f12e,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,119c0d79,2),to_seq=-1,weight=1,segment=0 }} 2019-02-02 22:52:32 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = yes, my_idx = 1, memb_num = 2 2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID. 2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 47e64989-2780-11e9-a986-ef5ad4566067 2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: got state msg: 47e64989-2780-11e9-a986-ef5ad4566067 from 0 (kvm.amar.com) 2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: got state msg: 47e64989-2780-11e9-a986-ef5ad4566067 from 1 (db.9jr.com) 2019-02-02 22:52:32 0 [Warning] WSREP: Quorum: No node with complete state: Version : 4 Flags : 0x5 Protocols : 0 / 9 / 3 State : NON-PRIMARY Desync count : 0 Prim state : NON-PRIMARY Prim UUID : 00000000-0000-0000-0000-000000000000 Prim seqno : -1 First seqno : -1 Last seqno : -1 Prim JOINED : 0 State UUID : 47e64989-2780-11e9-a986-ef5ad4566067 Group UUID : 00000000-0000-0000-0000-000000000000 Name : 'kvm.amar.com' Incoming addr: '192.168.50.30:3306' Version : 4 Flags : 0x4 Protocols : 0 / 9 / 3 State : NON-PRIMARY Desync count : 0 Prim state : NON-PRIMARY Prim UUID : 00000000-0000-0000-0000-000000000000 Prim seqno : -1 First seqno : -1 Last seqno : -1 Prim JOINED : 0 State UUID : 47e64989-2780-11e9-a986-ef5ad4566067 Group UUID : 00000000-0000-0000-0000-000000000000 Name : 'db.9jr.com' Incoming addr: '192.168.50.216:3306' 2019-02-02 22:52:32 0 [Warning] WSREP: No re-merged primary component found. 2019-02-02 22:52:32 0 [Note] WSREP: Bootstrapped primary 00000000-0000-0000-0000-000000000000 found: 2. 2019-02-02 22:52:32 0 [Note] WSREP: Quorum results: version = 4, component = PRIMARY, conf_id = -1, members = 2/2 (joined/total), act_id = -1, last_appl. = -1, protocols = 0/9/3 (gcs/repl/appl), group UUID = 00000000-0000-0000-0000-000000000000 2019-02-02 22:52:32 0 [Note] WSREP: Flow-control interval: [23, 23] 2019-02-02 22:52:32 0 [Note] WSREP: Trying to continue unpaused monitor 2019-02-02 22:52:32 0 [Note] WSREP: Restored state OPEN -> JOINED (-1) 2019-02-02 22:52:32 2 [Note] WSREP: New cluster view: global state: :-1, view# 0: Primary, number of nodes: 2, my index: 1, protocol version 3 2019-02-02 22:52:32 0 [ERROR] WSREP: SST failed: 1 (Operation not permitted) 2019-02-02 22:52:32 0 [ERROR] Aborting Error in my_thread_global_end(): 1 threads didn't exit |
While trying to ALTER a very large table (23M records), The server , and all galera linked servers crash. The peer server can never recover and and I have to do a manual rsync to get them to restart.
{noformat} 2019-02-02 22:52:09 0x7f4960194700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.3.12/storage/innobase/btr/btr0pcur.cc line 461 InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == btr_pcur_get_block(cursor)->page.id.page_no() InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/ InnoDB: about forcing recovery. 190202 22:52:09 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see https://mariadb.com/kb/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 10.3.12-MariaDB-log key_buffer_size=134217728 read_buffer_size=131072 max_used_connections=2 max_threads=153 thread_count=10 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467389 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x7f491c000ac8 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7f4960193d30 thread_stack 0x49000 /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55d09923dd1e] /usr/sbin/mysqld(handle_fatal_signal+0x357)[0x55d098cded97] sigaction.c:0(__restore_rt)[0x7f4975ac55d0] :0(__GI_raise)[0x7f4973d99207] :0(__GI_abort)[0x7f4973d9a8f8] ut/ut0rbt.cc:218(rbt_tree_add_child(ib_rbt_t const*, ib_rbt_bound_t*, ib_rbt_node_t*) [clone .isra.4] [clone .part.5])[0x55d098a29946] btr/btr0pcur.cc:459(btr_pcur_move_to_next_page(btr_pcur_t*, mtr_t*))[0x55d09902b152] include/btr0pcur.ic:340(btr_pcur_move_to_next(btr_pcur_t*, mtr_t*))[0x55d098a27c6b] row/row0sel.cc:5589(row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long))[0x55d098f94201] handler/ha_innodb.cc:9537(ha_innobase::general_fetch(unsigned char*, unsigned int, unsigned int))[0x55d098ebced8] sql/handler.cc:2765(handler::ha_rnd_next(unsigned char*))[0x55d098ce3bc7] sql/records.cc:481(rr_sequential(READ_RECORD*))[0x55d098e0785c] sql/records.h:73(READ_RECORD::read_record())[0x55d098a1cdd0] sql/sql_table.cc:9881(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool))[0x55d098b9aaac] sql/sql_alter.cc:497(Sql_cmd_alter_table::execute(THD*))[0x55d098be7db3] sql/sql_parse.cc:6285(mysql_execute_command(THD*))[0x55d098b09b05] sql/sql_parse.cc:8093(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55d098b1182b] sql/sql_parse.cc:7887(wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55d098b121c1] sql/sql_parse.cc:1915(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55d098b13a43] sql/sql_parse.cc:1398(do_command(THD*))[0x55d098b1532e] sql/sql_connect.cc:1402(do_handle_one_connection(CONNECT*))[0x55d098be512a] sql/sql_connect.cc:1310(handle_one_connection)[0x55d098be524d] pthread_create.c:0(start_thread)[0x7f4975abddd5] /lib64/libc.so.6(clone+0x6d)[0x7f4973e60ead] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x7f491c011730): ALTER TABLE contents ADD publisher_id INT UNSIGNED, ADD cr_holder_id INT UNSIGNED, ADD INDEX IDX_content_publishers(publisher_id), ADD INDEX IDX_content_copyright(cr_holder_id), ADD CONSTRAINT Refpublishers1691 FOREIGN KEY (publisher_id) REFERENCES publishers(publisher_id), ADD CONSTRAINT Refcopyright_holders1701 FOREIGN KEY (cr_holder_id) REFERENCES copyright_holders(cr_holder_id) Connection ID (thread ID): 11 Status: NOT_KILLED Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash. 2019-02-02 22:52:22 0 [Note] WSREP: Read nil XID from storage engines, skipping position init 2019-02-02 22:52:22 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so' 2019-02-02 22:52:22 0 [Note] WSREP: wsrep_load(): Galera 25.3.25(r3836) by Codership Oy <info@codership.com> loaded successfully. 2019-02-02 22:52:22 0 [Note] WSREP: CRC-32C: using hardware acceleration. 2019-02-02 22:52:22 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0 2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: 6e8ceebd-277d-11e9-9b83-13112382c60b, offset: -1 2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer initial scan... 0.0% ( 0/134217768 bytes) complete. 2019-02-02 22:52:22 0 [Note] WSREP: Attempt to reuse the same seqno: 1. New ptr = 0x7f1087734ce8, seqno_g: 1, seqno_d: 0, size: 920, ctx: 0x56110d49a548, flags: 1. store: 1, cs: 00000000 00000000 00000000 00000000, previous ptr = 0x7f107fb99528, seqno_g: 1, seqno_d: 0, size: 936, ctx: 0x56110d49a548, flags: 1. store: 1, cs: 00000000 00000000 00000000 00000000 2019-02-02 22:52:22 0 [Note] WSREP: Contents differ. Discarding both. 2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (134217768/134217768 bytes) complete. 2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: found gapless sequence 2-2 2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: discarding seqnos 1-1 2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer unused buffers scan... 0.0% ( 0/936 bytes) complete. 2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer unused buffers scan...100.0% (936/936 bytes) complete. 2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): found 0/1 locked buffers 2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): used space: 936/134217728 2019-02-02 22:52:22 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.50.216; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT 2019-02-02 22:52:22 0 [Note] WSREP: GCache history reset: 6e8ceebd-277d-11e9-9b83-13112382c60b:2 -> 00000000-0000-0000-0000-000000000000:-1 2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): full reset 2019-02-02 22:52:22 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1 2019-02-02 22:52:22 0 [Note] WSREP: wsrep_sst_grab() 2019-02-02 22:52:22 0 [Note] WSREP: Start replication 2019-02-02 22:52:22 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1 2019-02-02 22:52:22 0 [Note] WSREP: protonet asio version 0 2019-02-02 22:52:22 0 [Note] WSREP: Using CRC-32C for message checksums. 2019-02-02 22:52:22 0 [Note] WSREP: backend: asio 2019-02-02 22:52:22 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 2019-02-02 22:52:22 0 [Note] WSREP: restore pc from disk successfully 2019-02-02 22:52:22 0 [Note] WSREP: GMCast version 0 2019-02-02 22:52:22 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567 2019-02-02 22:52:22 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') multicast: , ttl: 1 2019-02-02 22:52:22 0 [Note] WSREP: EVS version 0 2019-02-02 22:52:22 0 [Note] WSREP: gcomm: connecting to group '9jr DB Cluster', peer '192.168.50.30:,192.168.50.65:,192.168.50.213:,192.168.50.214:,192.168.50.225:,192.168.50.55:' 2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.213:4567 timed out, no messages seen in PT3S 2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.225:4567 timed out, no messages seen in PT3S 2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.55:4567 timed out, no messages seen in PT3S 2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.65:4567 timed out, no messages seen in PT3S 2019-02-02 22:52:25 0 [Warning] WSREP: no nodes coming from prim view, prim not possible 2019-02-02 22:52:25 0 [Note] WSREP: view(view_id(NON_PRIM,f766f12e,4) memb { f766f12e,0 } joined { } left { } partitioned { }) 2019-02-02 22:52:25 0 [Note] WSREP: gcomm: connected 2019-02-02 22:52:25 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636 2019-02-02 22:52:25 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0) 2019-02-02 22:52:25 0 [Note] WSREP: Opened channel '9jr DB Cluster' 2019-02-02 22:52:25 0 [Note] WSREP: Waiting for SST to complete. 2019-02-02 22:52:25 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1 2019-02-02 22:52:25 0 [Note] WSREP: Flow-control interval: [16, 16] 2019-02-02 22:52:25 0 [Note] WSREP: Trying to continue unpaused monitor 2019-02-02 22:52:25 0 [Note] WSREP: Received NON-PRIMARY. 2019-02-02 22:52:25 2 [Note] WSREP: New cluster view: global state: :-1, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version -1 2019-02-02 22:52:25 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. 2019-02-02 22:52:26 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50278S), skipping check 2019-02-02 22:52:29 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.213:4567 timed out, no messages seen in PT3S 2019-02-02 22:52:32 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection established to 119c0d79 tcp://192.168.50.30:4567 2019-02-02 22:52:32 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: 2019-02-02 22:52:32 0 [Note] WSREP: declaring 119c0d79 at tcp://192.168.50.30:4567 stable 2019-02-02 22:52:32 0 [Warning] WSREP: no nodes coming from prim view, prim not possible 2019-02-02 22:52:32 0 [Note] WSREP: view(view_id(NON_PRIM,119c0d79,5) memb { 119c0d79,0 f766f12e,0 } joined { } left { } partitioned { }) 2019-02-02 22:52:32 0 [Note] WSREP: promote to primary component 2019-02-02 22:52:32 0 [Note] WSREP: view(view_id(PRIM,119c0d79,5) memb { 119c0d79,0 f766f12e,0 } joined { } left { } partitioned { }) 2019-02-02 22:52:32 0 [Note] WSREP: save pc into disk 2019-02-02 22:52:32 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 1, memb_num = 2 2019-02-02 22:52:32 0 [Note] WSREP: Flow-control interval: [23, 23] 2019-02-02 22:52:32 0 [Note] WSREP: Trying to continue unpaused monitor 2019-02-02 22:52:32 0 [Note] WSREP: Received NON-PRIMARY. 2019-02-02 22:52:32 2 [Note] WSREP: New cluster view: global state: :-1, view# -1: non-Primary, number of nodes: 2, my index: 1, protocol version -1 2019-02-02 22:52:32 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.213:4567 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.214:4567 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.225:4567 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr proto entry 0x7f1070001240 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.55:4567 2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.65:4567 2019-02-02 22:52:32 0 [Note] WSREP: clear restored view 2019-02-02 22:52:32 0 [Warning] WSREP: non weight changing install in S_PRIM: pcmsg{ type=INSTALL, seq=0, flags= 2, node_map { 119c0d79,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,119c0d79,2),to_seq=-1,weight=1,segment=0 f766f12e,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,119c0d79,2),to_seq=-1,weight=1,segment=0 }} 2019-02-02 22:52:32 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = yes, my_idx = 1, memb_num = 2 2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID. 2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 47e64989-2780-11e9-a986-ef5ad4566067 2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: got state msg: 47e64989-2780-11e9-a986-ef5ad4566067 from 0 (kvm.amar.com) 2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: got state msg: 47e64989-2780-11e9-a986-ef5ad4566067 from 1 (db.9jr.com) 2019-02-02 22:52:32 0 [Warning] WSREP: Quorum: No node with complete state: Version : 4 Flags : 0x5 Protocols : 0 / 9 / 3 State : NON-PRIMARY Desync count : 0 Prim state : NON-PRIMARY Prim UUID : 00000000-0000-0000-0000-000000000000 Prim seqno : -1 First seqno : -1 Last seqno : -1 Prim JOINED : 0 State UUID : 47e64989-2780-11e9-a986-ef5ad4566067 Group UUID : 00000000-0000-0000-0000-000000000000 Name : 'kvm.amar.com' Incoming addr: '192.168.50.30:3306' Version : 4 Flags : 0x4 Protocols : 0 / 9 / 3 State : NON-PRIMARY Desync count : 0 Prim state : NON-PRIMARY Prim UUID : 00000000-0000-0000-0000-000000000000 Prim seqno : -1 First seqno : -1 Last seqno : -1 Prim JOINED : 0 State UUID : 47e64989-2780-11e9-a986-ef5ad4566067 Group UUID : 00000000-0000-0000-0000-000000000000 Name : 'db.9jr.com' Incoming addr: '192.168.50.216:3306' 2019-02-02 22:52:32 0 [Warning] WSREP: No re-merged primary component found. 2019-02-02 22:52:32 0 [Note] WSREP: Bootstrapped primary 00000000-0000-0000-0000-000000000000 found: 2. 2019-02-02 22:52:32 0 [Note] WSREP: Quorum results: version = 4, component = PRIMARY, conf_id = -1, members = 2/2 (joined/total), act_id = -1, last_appl. = -1, protocols = 0/9/3 (gcs/repl/appl), group UUID = 00000000-0000-0000-0000-000000000000 2019-02-02 22:52:32 0 [Note] WSREP: Flow-control interval: [23, 23] 2019-02-02 22:52:32 0 [Note] WSREP: Trying to continue unpaused monitor 2019-02-02 22:52:32 0 [Note] WSREP: Restored state OPEN -> JOINED (-1) 2019-02-02 22:52:32 2 [Note] WSREP: New cluster view: global state: :-1, view# 0: Primary, number of nodes: 2, my index: 1, protocol version 3 2019-02-02 22:52:32 0 [ERROR] WSREP: SST failed: 1 (Operation not permitted) 2019-02-02 22:52:32 0 [ERROR] Aborting Error in my_thread_global_end(): 1 threads didn't exit {noformat} |
Labels | need_feedback |
Component/s | Storage Engine - InnoDB [ 10129 ] |
Summary | Crash On Table structure Change on very large table | Crash On Table structure Change on very large table - rendering the table (and server) unusable |
Labels | need_feedback |
Priority | Blocker [ 1 ] | Major [ 3 ] |
Assignee | Marko Mäkelä [ marko ] |
Summary | Crash On Table structure Change on very large table - rendering the table (and server) unusable | Crash On Table structure Change on very large table - With filesystem not having enough space, rendering the table (and server) unusable |
Fix Version/s | 10.3 [ 22126 ] |
Fix Version/s | 10.4 [ 22408 ] |
Workflow | MariaDB v3 [ 92232 ] | MariaDB v4 [ 141137 ] |
Fix Version/s | 10.3 [ 22126 ] |
Fix Version/s | 10.6 [ 24028 ] | |
Fix Version/s | 10.11 [ 27614 ] | |
Fix Version/s | 11.4 [ 29301 ] | |
Fix Version/s | 11.8 [ 29921 ] | |
Fix Version/s | 10.4(EOL) [ 22408 ] | |
Assignee | Marko Mäkelä [ marko ] | Thirunarayanan Balathandayuthapani [ thiru ] |
Priority | Major [ 3 ] | Critical [ 2 ] |
Link |
This issue relates to |
Link |
This issue relates to |
Fix Version/s | N/A [ 14700 ] | |
Fix Version/s | 10.6 [ 24028 ] | |
Fix Version/s | 10.11 [ 27614 ] | |
Fix Version/s | 11.4 [ 29301 ] | |
Fix Version/s | 11.8 [ 29921 ] | |
Resolution | Duplicate [ 3 ] | |
Status | Open [ 1 ] | Closed [ 6 ] |
The command that crashes and corrupts the full cluster every time: