Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Incomplete
Affects Version: 10.3.36
Environment:
Ubuntu 18.04.6 LTS
Physical Machine
RAM: 64G, CPU: 16, SSD - LVM
Description
It's a Galera cluster of 6 members (configured with pc.weight).
The Sphinx plugin is loaded and in production use.
Each time the checksum (pt-table-checksum) is executed on the main writer (1 writer at a time), it crashes the node. The node then resyncs and rejoins the cluster (in the log below IST was unavailable, so the resync was a full mariabackup SST).
The complete error log is as follows:
2023-01-03 5:15:55 2 [ERROR] WSREP: wsrep_abort_slave_trx: BF Aborter BF thread: 2 seqno: 871572484 query_state: executing conflict_state: no conflict exec mode applier query: REPLACE INTO `exploitation`.`checksums_20230103T051001` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT /* 99997*/ 'service.auth', 'oauth_refresh_tokens', '8', 'PRIMARY', 'l4qxLOlUVYIMgVYbnvgAJranElthjE3yTqR2lPcqE6c3pdhLFo7Jya8uBChgBLqCp2Umg5DRU0bJ0fxA', 'pcE2bYvm2UpqP8ZaH9wfv2QoJi12ghJkPD6wKcSxCebyS8pI8e3l8c7T0yzBFZSexM7tjAsngvNbduyJ', COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', convert(`id` using utf8mb4), convert(`access_token_id` using utf8mb4), `revoked`, `expires_at`, CONCAT(ISNULL(`expires_at`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `service.auth`.`oauth_refresh_tokens` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= 'l4qxLOlUVYIMgVYbnvgAJranElthjE3yTqR2lPcqE6c3pdhLFo7Jya8uBChgBLqCp2Umg5DRU0bJ0fxA')) AND ((`id` <= 'pcE2bYvm2UpqP8ZaH9wfv2QoJi12ghJkPD6wKcSxCebyS8pI8e3l8c7T0yzBFZSexM7tjAsngvNbduyJ')) / |
2023-01-03 5:15:55 2 [ERROR] WSREP: wsrep_abort_slave_trx: Victim BF thread: 15 seqno: 871572485 query_state: idle conflict_state: must abort exec mode applier query: NULL |
2023-01-03 5:15:55 2 [ERROR] WSREP: Trx 871572484 tries to abort slave trx 871572485. This could be caused by:
1) unsupported configuration options combination, please check documentation.
2) a bug in the code.
3) a database corruption.
Node consistency compromized, need to abort. Restart the node to resync with cluster.

230103 5:15:55 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.3.36-MariaDB-1:10.3.36+maria~ubu1804-log
key_buffer_size=402653184
read_buffer_size=2097152
max_used_connections=265
max_threads=2002
thread_count=143
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 8638539 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f7a3ec12008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f7a5214bdc8 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5619976a728e] |
/usr/sbin/mysqld(handle_fatal_signal+0x515)[0x561997139fe5] |
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12980)[0x7f7a4fd3f980] |
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7f7a4f97ae87] |
/lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7f7a4f97c7f1] |
/usr/sbin/mysqld(+0x4bd12b)[0x561996e7112b] |
/usr/sbin/mysqld(+0x903536)[0x5619972b7536] |
/usr/sbin/mysqld(+0x9539b5)[0x5619973079b5] |
/usr/sbin/mysqld(+0x958a81)[0x56199730ca81] |
/usr/sbin/mysqld(+0x9f0217)[0x5619973a4217] |
/usr/sbin/mysqld(+0x9f6467)[0x5619973aa467] |
/usr/sbin/mysqld(+0x912cdc)[0x5619972c6cdc] |
/usr/sbin/mysqld(_ZN7handler13ha_index_nextEPh+0xef)[0x56199713faff] |
/usr/sbin/mysqld(_ZN7handler15read_range_nextEv+0x20)[0x561997143c60] |
/usr/sbin/mysqld(_ZN7handler21multi_range_read_nextEPPv+0xb2)[0x561997060e22] |
/usr/sbin/mysqld(_ZN23Mrr_simple_index_reader8get_nextEPPv+0x4a)[0x561997060eaa] |
/usr/sbin/mysqld(_ZN10DsMrr_impl10dsmrr_nextEPPv+0x42)[0x561997062202] |
/usr/sbin/mysqld(_ZN18QUICK_RANGE_SELECT8get_nextEv+0x99)[0x5619972360b9] |
/usr/sbin/mysqld(+0x89dfff)[0x561997251fff] |
/usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x1b0)[0x561996f8a970] |
/usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0x932)[0x561996fa9fe2] |
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x33)[0x561996faa433] |
/usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_jP8st_orderS9_S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xf7)[0x561996faa587] |
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x14d)[0x561996faaf1d] |
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x6c29)[0x561996f583e9] |
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x225)[0x561996f59375] |
/usr/sbin/mysqld(_ZN15Query_log_event14do_apply_eventEP14rpl_group_infoPKcj+0x94b)[0x561997224abb] |
/usr/sbin/mysqld(wsrep_apply_cb+0x512)[0x5619970b6c52] |
/usr/lib/galera/libgalera_smm.so(+0x327ee)[0x7f7a4c7f97ee] |
src/trx_handle.cpp:312(galera::TrxHandle::apply(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_trx_meta const&) const)[0x7f7a4c8081a4] |
src/replicator_smm.cpp:92(apply_trx_ws(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_cb_status (*)(void*, unsigned int, wsrep_trx_meta const*, bool*, bool), galera::TrxHandle const&, wsrep_trx_meta const&))[0x7f7a4c80b5d8] |
src/replicator_smm.cpp:1259(galera::ReplicatorSMM::process_trx(void*, galera::TrxHandle*))[0x7f7a4c80e29e] |
src/gcs_action_source.cpp:116(galera::GcsActionSource::dispatch(void*, gcs_action const&, bool&))[0x7f7a4c8448a8] |
src/gcs_action_source.cpp:28(galera::GcsActionSource::process(void*, bool&))[0x7f7a4c845780] |
src/replicator_smm.cpp:363(galera::ReplicatorSMM::async_recv(void*))[0x7f7a4c80eabd] |
src/wsrep_provider.cpp:271(galera_recv)[0x7f7a4c7e38eb] |
/usr/sbin/mysqld(+0x703aa5)[0x5619970b7aa5] |
/usr/sbin/mysqld(start_wsrep_THD+0x3ce)[0x5619970a8b6e] |
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f7a4fd346db] |
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f7a4fa5d61f] |
 |
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f745edbf91b): REPLACE INTO `exploitation`.`checksums_20230103T051001` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT /* 99997*/ 'service.auth', 'oauth_refresh_tokens', '8', 'PRIMARY', 'l4qxLOlUVYIMgVYbnvgAJranElthjE3yTqR2lPcqE6c3pdhLFo7Jya8uBChgBLqCp2Umg5DRU0bJ0fxA', 'pcE2bYvm2UpqP8ZaH9wfv2QoJi12ghJkPD6wKcSxCebyS8pI8e3l8c7T0yzBFZSexM7tjAsngvNbduyJ', COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', convert(`id` using utf8mb4), convert(`access_token_id` using utf8mb4), `revoked`, `expires_at`, CONCAT(ISNULL(`expires_at`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `service.auth`.`oauth_refresh_tokens` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= 'l4qxLOlUVYIMgVYbnvgAJranElthjE3yTqR2lPcqE6c3pdhLFo7Jya8uBChgBLqCp2Umg5DRU0bJ0fxA')) AND ((`id` <= 'pcE2bYvm2UpqP8ZaH9wfv2QoJi12ghJkPD6wKcSxCebyS8pI8e3l8c7T0yzBFZSexM7tjAsngvNbduyJ')) /*checksum chunk*/ |
 |
Connection ID (thread ID): 2
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=off,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=off,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on |
 |
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             256598               256598               processes
Max open files            65535                65535                files
Max locked memory         67108864             67108864             bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       256598               256598               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: core

Kernel version: Linux version 4.15.0-192-generic (buildd@lcy02-amd64-029) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #203-Ubuntu SMP Wed Aug 10 17:40:03 UTC 2022
 |
2023-01-03 5:16:15 0 [Note] WSREP: Read nil XID from storage engines, skipping position init |
2023-01-03 5:16:15 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so' |
2023-01-03 5:16:15 0 [Note] WSREP: wsrep_load(): Galera 25.3.37(rd0a7bd74) by Codership Oy <info@codership.com> loaded successfully. |
2023-01-03 5:16:15 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration. |
2023-01-03 5:16:15 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0 |
2023-01-03 5:16:15 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 10.1.1.7; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc. |
2023-01-03 5:16:15 0 [Note] WSREP: GCache history reset: 535c1425-9ea9-11e7-ab58-8bcd0ee945f3:0 -> 00000000-0000-0000-0000-000000000000:-1 |
2023-01-03 5:16:15 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1 |
2023-01-03 5:16:15 0 [Note] WSREP: wsrep_sst_grab() |
2023-01-03 5:16:15 0 [Note] WSREP: Start replication |
2023-01-03 5:16:15 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1 |
2023-01-03 5:16:15 0 [Note] WSREP: protonet asio version 0 |
2023-01-03 5:16:15 0 [Note] WSREP: Using CRC-32C for message checksums. |
2023-01-03 5:16:15 0 [Note] WSREP: backend: asio |
2023-01-03 5:16:15 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 |
2023-01-03 5:16:15 0 [Note] WSREP: restore pc from disk successfully |
2023-01-03 5:16:15 0 [Note] WSREP: GMCast version 0 |
2023-01-03 5:16:15 0 [Note] WSREP: (817d0f4f, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567 |
2023-01-03 5:16:15 0 [Note] WSREP: (817d0f4f, 'tcp://0.0.0.0:4567') multicast: , ttl: 1 |
2023-01-03 5:16:15 0 [Note] WSREP: EVS version 0 |
2023-01-03 5:16:15 0 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '10.1.11.14:,10.1.11.15:,10.1.11.28:,10.1.1.6:,10.1.1.7:,10.1.1.8:' |
2023-01-03 5:16:15 0 [Note] WSREP: (817d0f4f, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://10.1.1.7:4567 |
2023-01-03 5:16:15 0 [Note] WSREP: (817d0f4f, 'tcp://0.0.0.0:4567') connection established to c667e1df tcp://10.1.1.6:4567 |
2023-01-03 5:16:15 0 [Note] WSREP: (817d0f4f, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: |
2023-01-03 5:16:15 0 [Note] WSREP: (817d0f4f, 'tcp://0.0.0.0:4567') connection established to bfd000c6 tcp://10.1.11.14:4567 |
2023-01-03 5:16:15 0 [Note] WSREP: (817d0f4f, 'tcp://0.0.0.0:4567') connection established to 5c52b67f tcp://10.1.11.15:4567 |
2023-01-03 5:16:15 0 [Note] WSREP: (817d0f4f, 'tcp://0.0.0.0:4567') connection established to 193c628e tcp://10.1.1.8:4567 |
2023-01-03 5:16:15 0 [Note] WSREP: (817d0f4f, 'tcp://0.0.0.0:4567') connection established to c6633185 tcp://10.1.11.28:4567 |
2023-01-03 5:16:15 0 [Note] WSREP: gcomm: connected |
2023-01-03 5:16:15 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636 |
2023-01-03 5:16:15 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0) |
2023-01-03 5:16:15 0 [Note] WSREP: Opened channel 'my_wsrep_cluster' |
2023-01-03 5:16:15 0 [Note] WSREP: Waiting for SST to complete. |
2023-01-03 5:16:16 0 [Note] WSREP: declaring 193c628e at tcp://10.1.1.8:4567 stable |
2023-01-03 5:16:16 0 [Note] WSREP: declaring 5c52b67f at tcp://10.1.11.15:4567 stable |
2023-01-03 5:16:16 0 [Note] WSREP: declaring bfd000c6 at tcp://10.1.11.14:4567 stable |
2023-01-03 5:16:16 0 [Note] WSREP: declaring c6633185 at tcp://10.1.11.28:4567 stable |
2023-01-03 5:16:16 0 [Note] WSREP: declaring c667e1df at tcp://10.1.1.6:4567 stable |
2023-01-03 5:16:16 0 [Note] WSREP: Node 193c628e state prim |
2023-01-03 5:16:16 0 [Note] WSREP: view(view_id(PRIM,193c628e,2574) memb {
    193c628e,0
    5c52b67f,0
    817d0f4f,0
    bfd000c6,0
    c6633185,0
    c667e1df,0
} joined {
} left {
} partitioned {
})
2023-01-03 5:16:16 0 [Note] WSREP: save pc into disk |
2023-01-03 5:16:16 0 [Note] WSREP: clear restored view |
2023-01-03 5:16:16 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 6 |
2023-01-03 5:16:16 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID. |
2023-01-03 5:16:16 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 5e0ccc78-8b1d-11ed-a20f-db17d096b8bf |
2023-01-03 5:16:16 0 [Note] WSREP: STATE EXCHANGE: got state msg: 5e0ccc78-8b1d-11ed-a20f-db17d096b8bf from 0 (prddb303) |
2023-01-03 5:16:16 0 [Note] WSREP: STATE EXCHANGE: got state msg: 5e0ccc78-8b1d-11ed-a20f-db17d096b8bf from 1 (prddb202) |
2023-01-03 5:16:16 0 [Note] WSREP: STATE EXCHANGE: got state msg: 5e0ccc78-8b1d-11ed-a20f-db17d096b8bf from 2 (prddb302) |
2023-01-03 5:16:16 0 [Note] WSREP: STATE EXCHANGE: got state msg: 5e0ccc78-8b1d-11ed-a20f-db17d096b8bf from 3 (prddb201) |
2023-01-03 5:16:16 0 [Note] WSREP: STATE EXCHANGE: got state msg: 5e0ccc78-8b1d-11ed-a20f-db17d096b8bf from 4 (prddb203) |
2023-01-03 5:16:16 0 [Note] WSREP: STATE EXCHANGE: got state msg: 5e0ccc78-8b1d-11ed-a20f-db17d096b8bf from 5 (prddb301) |
2023-01-03 5:16:16 0 [Note] WSREP: Quorum results:
    version    = 6,
    component  = PRIMARY,
    conf_id    = 2313,
    members    = 5/6 (joined/total),
    act_id     = 871572585,
    last_appl. = -1,
    protocols  = 0/9/3 (gcs/repl/appl),
    group UUID = 535c1425-9ea9-11e7-ab58-8bcd0ee945f3
2023-01-03 5:16:16 0 [Note] WSREP: Flow-control interval: [39, 39] |
2023-01-03 5:16:16 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 871572585) |
2023-01-03 5:16:16 2 [Note] WSREP: State transfer required: |
Group state: 535c1425-9ea9-11e7-ab58-8bcd0ee945f3:871572585 |
Local state: 00000000-0000-0000-0000-000000000000:-1 |
2023-01-03 5:16:16 2 [Note] WSREP: REPL Protocols: 9 (4, 2) |
2023-01-03 5:16:16 2 [Note] WSREP: New cluster view: global state: 535c1425-9ea9-11e7-ab58-8bcd0ee945f3:871572585, view# 2314: Primary, number of nodes: 6, my index: 2, protocol version 3 |
2023-01-03 5:16:16 2 [Warning] WSREP: Gap in state sequence. Need state transfer. |
2023-01-03 5:16:16 0 [Note] WSREP: Running: 'wsrep_sst_mariabackup --role 'joiner' --address '10.1.1.7' --datadir '/var/lib/mysql/' --parent 13291 --progress 0 --binlog '/var/lib/mysql/log-bin' --mysqld-args --wsrep_start_position=535c1425-9ea9-11e7-ab58-8bcd0ee945f3:871572483' |
WSREP_SST: [INFO] mariabackup SST started on joiner (20230103 05:16:16.705) |
WSREP_SST: [INFO] SSL configuration: CA='', CAPATH='', CERT='', KEY='', MODE='DISABLED', encrypt='0' (20230103 05:16:16.769) |
WSREP_SST: [INFO] Progress reporting tool pv not found in path: /usr//bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin:/sbin:/bin (20230103 05:16:16.931) |
WSREP_SST: [INFO] Disabling all progress/rate-limiting (20230103 05:16:16.935) |
WSREP_SST: [INFO] Logging all stderr of SST/mariabackup to syslog (20230103 05:16:16.997) |
2023-01-03 5:16:17 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. |
2023-01-03 5:16:17 2 [Note] WSREP: Assign initial position for certification: 871572585, protocol version: 4 |
2023-01-03 5:16:17 0 [Note] WSREP: Service thread queue flushed. |
2023-01-03 5:16:17 2 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (535c1425-9ea9-11e7-ab58-8bcd0ee945f3): 1 (Operation not permitted) |
at /home/buildbot/buildbot/build/galera/src/replicator_str.cpp:prepare_for_IST():467. IST will be unavailable. |
2023-01-03 5:16:17 0 [Note] WSREP: Member 2.0 (prddb302) requested state transfer from '*any*'. Selected 0.0 (prddb303)(SYNCED) as donor. |
2023-01-03 5:16:17 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 871572588) |
2023-01-03 5:16:17 2 [Note] WSREP: Requesting state transfer: success, donor: 0 |
2023-01-03 5:16:17 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 535c1425-9ea9-11e7-ab58-8bcd0ee945f3:871572585 |
2023-01-03 5:16:18 0 [Note] WSREP: (817d0f4f, 'tcp://0.0.0.0:4567') turning message relay requesting off |
2023-01-03 5:30:45 0 [Note] WSREP: 0.0 (prddb303): State transfer to 2.0 (prddb302) complete. |
2023-01-03 5:30:45 0 [Note] WSREP: Member 0.0 (prddb303) synced with group. |
2023-01-03 5:30:49 0 [Note] WSREP: SST complete, seqno: 871574138 |
2023-01-03 5:30:49 0 [Warning] Plugin 'SPHINX' is of maturity level gamma while the server is stable |
2023-01-03 5:30:49 0 [ERROR] mysqld: Plugin 'sphinx' already installed |
2023-01-03 5:30:49 0 [Note] InnoDB: Using Linux native AIO |
2023-01-03 5:30:49 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins |
2023-01-03 5:30:49 0 [Note] InnoDB: Uses event mutexes |
2023-01-03 5:30:49 0 [Note] InnoDB: Compressed tables use zlib 1.2.11 |
2023-01-03 5:30:49 0 [Note] InnoDB: Number of pools: 1 |
2023-01-03 5:30:49 0 [Note] InnoDB: Using SSE2 crc32 instructions |
2023-01-03 5:30:49 0 [Note] InnoDB: Initializing buffer pool, total size = 20G, instances = 8, chunk size = 128M |
2023-01-03 5:30:50 0 [Note] InnoDB: Completed initialization of buffer pool |
2023-01-03 5:30:50 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority(). |
2023-01-03 5:30:50 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 268435456 bytes |
2023-01-03 5:30:50 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 268435456 bytes |
2023-01-03 5:30:50 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0 |
2023-01-03 5:30:50 0 [Note] InnoDB: New log files created, LSN=6198746160942 |
2023-01-03 5:30:50 0 [Note] InnoDB: 128 out of 128 rollback segments are active. |
2023-01-03 5:30:50 0 [Note] InnoDB: Creating shared tablespace for temporary tables |
2023-01-03 5:30:50 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... |
2023-01-03 5:30:50 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB. |
2023-01-03 5:30:50 0 [Note] InnoDB: Waiting for purge to start |
2023-01-03 5:30:50 0 [Note] InnoDB: 10.3.36 started; log sequence number 6198746161164; transaction id 1878424456 |
2023-01-03 5:30:50 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool |
2023-01-03 5:30:50 0 [Note] InnoDB: Cannot open '/var/lib/mysql/ib_buffer_pool' for reading: No such file or directory |
2023-01-03 5:30:50 0 [Note] Plugin 'FEEDBACK' is disabled. |
2023-01-03 5:30:50 0 [Note] Server socket created on IP: '0.0.0.0'. |
2023-01-03 5:30:50 0 [Note] WSREP: Signalling provider to continue. |
2023-01-03 5:30:50 0 [Note] WSREP: SST received: 535c1425-9ea9-11e7-ab58-8bcd0ee945f3:871574138 |
2023-01-03 5:30:50 0 [Note] Reading of all Master_info entries succeeded |
2023-01-03 5:30:50 0 [Note] Added new Master_info '' to hash table |
2023-01-03 5:30:50 0 [Note] /usr/sbin/mysqld: ready for connections. |
Version: '10.3.36-MariaDB-1:10.3.36+maria~ubu1804-log' socket: '/var/run/mysqld/mysqld.sock' port: 3306 mariadb.org binary distribution |
2023-01-03 5:30:50 0 [Note] WSREP: 2.0 (prddb302): State transfer from 0.0 (prddb303) complete. |
2023-01-03 5:30:50 0 [Note] WSREP: Shifting JOINER -> JOINED (TO: 871574152) |
2023-01-03 5:30:50 0 [Note] WSREP: Member 2.0 (prddb302) synced with group. |
2023-01-03 5:30:50 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 871574152) |
2023-01-03 5:30:50 12 [Note] WSREP: Synchronized with group, ready for connections |
2023-01-03 5:30:50 12 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. |
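To make the conflicting pair easier to spot in future crashes, the two wsrep_abort_slave_trx lines at the top of the log can be pulled apart mechanically. This is only a rough sketch: the regular expression is derived from the two lines in this report and may not match other server versions, the function name parse_bf_abort is made up for illustration, and the query text in the sample is shortened.

```python
import re

# Pattern based only on the two wsrep_abort_slave_trx lines in this report;
# other server versions may format these lines differently (assumption).
BF_LINE = re.compile(
    r"wsrep_abort_slave_trx: (?P<role>BF Aborter|Victim) BF thread: (?P<thread>\d+) "
    r"seqno: (?P<seqno>\d+) query_state: (?P<query_state>.+?) "
    r"conflict_state: (?P<conflict_state>.+?) exec mode (?P<exec_mode>\S+)"
)


def parse_bf_abort(lines):
    """Return a dict with 'aborter' and/or 'victim' entries for matching lines."""
    found = {}
    for line in lines:
        m = BF_LINE.search(line)
        if not m:
            continue
        key = "aborter" if m.group("role") == "BF Aborter" else "victim"
        info = m.groupdict()
        info["thread"] = int(info["thread"])
        info["seqno"] = int(info["seqno"])
        found[key] = info
    return found


# The two lines from the log above, with the long REPLACE statement shortened.
sample = [
    "2023-01-03 5:15:55 2 [ERROR] WSREP: wsrep_abort_slave_trx: BF Aborter BF thread: 2 "
    "seqno: 871572484 query_state: executing conflict_state: no conflict "
    "exec mode applier query: REPLACE INTO ...",
    "2023-01-03 5:15:55 2 [ERROR] WSREP: wsrep_abort_slave_trx: Victim BF thread: 15 "
    "seqno: 871572485 query_state: idle conflict_state: must abort "
    "exec mode applier query: NULL",
]

pair = parse_bf_abort(sample)
both_appliers = all(p["exec_mode"] == "applier" for p in pair.values())
print(pair["aborter"]["seqno"], pair["victim"]["seqno"], both_appliers)
# prints: 871572484 871572485 True
```

In this crash both sides are applier threads with consecutive seqnos: thread 2 (the replicated checksum REPLACE) tried to abort thread 15, which was applying the very next write set. That fits our suspicion about parallel applying, but it does not prove it.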
How to reproduce:
- Run the checksum (pt-table-checksum) together with write queries on the main writer, with wsrep_slave_threads=8 (this setup crashes)
Solutions tried:
- Disable checksum -> No crash
Next solution to try:
- Set wsrep_slave_threads=1
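For the wsrep_slave_threads=1 test, the change can be made at runtime without restarting the node. Below is a minimal sketch of how we would build the command; the helper names are hypothetical, and it assumes the stock mysql client is on the PATH and picks up credentials from its usual option files.

```python
def applier_threads_sql(n):
    """SQL that changes wsrep_slave_threads at runtime and reads the value back."""
    if n < 1:
        raise ValueError("wsrep_slave_threads must be at least 1")
    return (
        f"SET GLOBAL wsrep_slave_threads = {int(n)}; "
        "SHOW GLOBAL VARIABLES LIKE 'wsrep_slave_threads';"
    )


def mysql_command(sql):
    """argv for the mysql client; credentials are assumed to come from option files."""
    return ["mysql", f"--execute={sql}"]


cmd = mysql_command(applier_threads_sql(1))
print(cmd)
# To apply it on a node (not run here):
#   import subprocess; subprocess.run(cmd, check=True)
```

SET GLOBAL only lasts until the next restart, so if the crash stops with a single applier thread, the same value would also need to go into the server configuration file.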
We suspect it's linked to https://jira.mariadb.org/browse/MDEV-21910 (mutex deadlock) and https://jira.mariadb.org/browse/MDEV-25835 (write sets applied by parallel threads in Galera replication).
Is this more of a Galera issue, or a race condition in MariaDB?