MDEV-24989 (MariaDB Server)

Galera assertion at lock0lock.cc line 655

Details

    Description

      I was experiencing the issue detailed in MDEV-23851, but saw that it was fixed in 10.4 and 10.3. I was on Debian Buster, which provides 10.3. I decided to upgrade to 10.5 to solve this problem, as it was causing production outages on a regular basis.

      I now have a galera-4 cluster, 3 nodes, all with 10.5.8, and I just experienced the same issue:

      Feb 25 11:18:02 pochard mariadbd[2405]: 2021-02-25 11:18:02 10 [ERROR] InnoDB: Conflicting lock on table: `roundcube`.`contactgroupmembers` index: PRIMARY that has lock
      Feb 25 11:18:02 pochard mariadbd[2405]: RECORD LOCKS space id 95 page no 1870 n bits 440 index PRIMARY of table `roundcube`.`contactgroupmembers` trx id 1744395763 lock_mode X locks rec but not gap
      Feb 25 11:18:02 pochard mariadbd[2405]: Record lock, heap no 221 PHYSICAL RECORD: n_fields 5; compact format; info bits 32
      Feb 25 11:18:02 pochard mariadbd[2405]:  0: len 4; hex 00032f40; asc   /@;;
      Feb 25 11:18:02 pochard mariadbd[2405]:  1: len 4; hex 0132c73d; asc  2 =;;
      Feb 25 11:18:02 pochard mariadbd[2405]:  2: len 6; hex 000067f95df3; asc   g ] ;;
      Feb 25 11:18:02 pochard mariadbd[2405]:  3: len 7; hex 52000002533cc7; asc R   S< ;;
      Feb 25 11:18:02 pochard mariadbd[2405]:  4: len 5; hex 99a8e2cbad; asc      ;;
      [snip]
      Feb 25 11:18:02 pochard mariadbd[2405]: 2021-02-25 11:18:02 10 [ERROR] InnoDB: WSREP state:
      Feb 25 11:18:02 pochard mariadbd[2405]: 2021-02-25 11:18:02 10 [ERROR] WSREP: Thread BF trx_id: 1744395764 thread: 10 seqno: 65554111 client_state: exec client_mode: high priority transaction_mode: executing applier: 1 toi: 0 local: 0 query: DELETE FROM `contactgroups` WHERE `del` = 1 AND `changed` < '2021-02-18 00:00:00'��7`#023#004
      Feb 25 11:18:02 pochard mariadbd[2405]: 2021-02-25 11:18:02 10 [ERROR] WSREP: Thread BF trx_id: 1744395763 thread: 8 seqno: 65554110 client_state: exec client_mode: high priority transaction_mode: ordered_commit applier: 1 toi: 0 local: 0 query: NULL
      Feb 25 11:18:02 pochard mariadbd[2405]: 2021-02-25 11:18:02 0x7f1085ffb700  InnoDB: Assertion failure in file /build/mariadb-10.5-mnI6vJ/mariadb-10.5-10.5.8/storage/innobase/lock/lock0lock.cc line 655
      Feb 25 11:18:02 pochard mariadbd[2405]: InnoDB: We intentionally generate a memory trap.
      Feb 25 11:18:02 pochard mariadbd[2405]: InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      Feb 25 11:18:02 pochard mariadbd[2405]: InnoDB: If you get repeated assertion failures or crashes, even
      Feb 25 11:18:02 pochard mariadbd[2405]: InnoDB: immediately after the mysqld startup, there may be
      Feb 25 11:18:02 pochard mariadbd[2405]: InnoDB: corruption in the InnoDB tablespace. Please refer to
      Feb 25 11:18:02 pochard mariadbd[2405]: InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
      Feb 25 11:18:02 pochard mariadbd[2405]: InnoDB: about forcing recovery.
      Feb 25 11:18:02 pochard mariadbd[2405]: 210225 11:18:02 [ERROR] mysqld got signal 6 ;
      Feb 25 11:18:02 pochard mariadbd[2405]: This could be because you hit a bug. It is also possible that this binary
      Feb 25 11:18:02 pochard mariadbd[2405]: or one of the libraries it was linked against is corrupt, improperly built,
      Feb 25 11:18:02 pochard mariadbd[2405]: or misconfigured. This error can also be caused by malfunctioning hardware.
      Feb 25 11:18:02 pochard mariadbd[2405]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
      Feb 25 11:18:02 pochard mariadbd[2405]: We will try our best to scrape up some info that will hopefully help
      Feb 25 11:18:02 pochard mariadbd[2405]: diagnose the problem, but since we have already crashed,
      Feb 25 11:18:02 pochard mariadbd[2405]: something is definitely wrong and this may fail.
      Feb 25 11:18:02 pochard mariadbd[2405]: Server version: 10.5.8-MariaDB-3-log
      Feb 25 11:18:02 pochard mariadbd[2405]: key_buffer_size=536870912
      Feb 25 11:18:02 pochard mariadbd[2405]: read_buffer_size=786432
      Feb 25 11:18:02 pochard mariadbd[2405]: max_used_connections=2
      Feb 25 11:18:02 pochard mariadbd[2405]: max_threads=2002
      Feb 25 11:18:02 pochard mariadbd[2405]: thread_count=17
      Feb 25 11:18:02 pochard mariadbd[2405]: It is possible that mysqld could use up to
      Feb 25 11:18:02 pochard mariadbd[2405]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3650182 K  bytes of memory
      Feb 25 11:18:02 pochard mariadbd[2405]: Hope that's ok; if not, decrease some variables in the equation.
      Feb 25 11:18:02 pochard mariadbd[2405]: Thread pointer: 0x7f11f8009fd8
      Feb 25 11:18:02 pochard mariadbd[2405]: Attempting backtrace. You can use the following information to find out
      Feb 25 11:18:02 pochard mariadbd[2405]: where mysqld died. If you see no messages after this, something went
      Feb 25 11:18:02 pochard mariadbd[2405]: terribly wrong...
      Feb 25 11:18:02 pochard mariadbd[2405]: stack_bottom = 0x7f1085ffad98 thread_stack 0x30000
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(my_print_stacktrace)[0x561bdef4947e]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(handle_fatal_signal)[0x561bdea5a2d5]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(__restore_rt)[0x7f121f7a2140]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(gsignal)[0x7f121f2ebce1]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(abort)[0x7f121f2d5537]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(Wsrep_server_service::log_dummy_write_set(wsrep::client_state&, wsrep::ws_meta const&))[0x561bde742b91]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(Wsrep_server_service::log_dummy_write_set(wsrep::client_state&, wsrep::ws_meta const&))[0x561bde720517]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0x561bded6f75b]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0x561bded74d45]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(std::unique_lock<std::mutex>::unlock())[0x561bdedbbefa]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(std::unique_lock<std::mutex>::unlock())[0x561bdedbdb7f]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(std::unique_lock<std::mutex>::unlock())[0x561bdeded63f]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(std::unique_lock<std::mutex>::unlock())[0x561bdedf13d9]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(std::unique_lock<std::mutex>::unlock())[0x561bdedf24ae]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(std::unique_lock<std::mutex>::unlock())[0x561bdedd0dcb]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0x561bded2c041]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(handler::ha_delete_row(unsigned char const*))[0x561bdea67460]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(Delete_rows_log_event::do_exec_row(rpl_group_info*))[0x561bdeb79164]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(Rows_log_event::do_apply_event(rpl_group_info*))[0x561bdeb6d3ef]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(wsrep_apply_events(THD*, Relay_log_info*, void const*, unsigned long))[0x561bded04cc9]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(Wsrep_high_priority_service::remove_fragments(wsrep::ws_meta const&))[0x561bdecedeb0]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(Wsrep_applier_service::apply_write_set(wsrep::ws_meta const&, wsrep::const_buffer const&, wsrep::mutable_buffer&))[0x561bdeceecc6]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(wsrep::server_state::start_streaming_applier(wsrep::id const&, wsrep::transaction_id const&, wsrep::high_priority_service*))[0x561bdefb8037]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(wsrep::wsrep_provider_v26::options[abi:cxx11]() const)[0x561bdefc80be]
      Feb 25 11:18:02 pochard mariadbd[2405]: /usr/lib/galera/libgalera_smm.so(+0x1afba1)[0x7f121eef9ba1]
      Feb 25 11:18:02 pochard mariadbd[2405]: /usr/lib/galera/libgalera_smm.so(+0x1f29d2)[0x7f121ef3c9d2]
      Feb 25 11:18:02 pochard mariadbd[2405]: /usr/lib/galera/libgalera_smm.so(+0x1f642c)[0x7f121ef4042c]
      Feb 25 11:18:02 pochard mariadbd[2405]: /usr/lib/galera/libgalera_smm.so(+0x1cd80c)[0x7f121ef1780c]
      Feb 25 11:18:02 pochard mariadbd[2405]: /usr/lib/galera/libgalera_smm.so(+0x1ce452)[0x7f121ef18452]
      Feb 25 11:18:02 pochard mariadbd[2405]: /usr/lib/galera/libgalera_smm.so(+0x1f6908)[0x7f121ef40908]
      Feb 25 11:18:02 pochard mariadbd[2405]: /usr/lib/galera/libgalera_smm.so(+0x21236d)[0x7f121ef5c36d]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(wsrep::wsrep_provider_v26::run_applier(wsrep::high_priority_service*))[0x561bdefc868e]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(wsrep_bf_abort(THD const*, THD*))[0x561bded06f73]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(start_wsrep_THD(void*))[0x561bdecf91a3]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(MyCTX_nopad::finish(unsigned char*, unsigned int*))[0x561bdec91ee2]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(start_thread)[0x7f121f796ea7]
      Feb 25 11:18:02 pochard mariadbd[2405]: ??:0(clone)[0x7f121f3addef]
      Feb 25 11:18:02 pochard mariadbd[2405]: Trying to get some variables.
      Feb 25 11:18:02 pochard mariadbd[2405]: Some pointers may be invalid and cause the dump to abort.
      Feb 25 11:18:02 pochard mariadbd[2405]: Query (0x7f1212558cab): DELETE FROM `contactgroups` WHERE `del` = 1 AND `changed` < '2021-02-18 00:00:00'
      Feb 25 11:18:02 pochard mariadbd[2405]: Connection ID (thread ID): 10
      Feb 25 11:18:02 pochard mariadbd[2405]: Status: NOT_KILLED
      Feb 25 11:18:02 pochard mariadbd[2405]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
      Feb 25 11:18:02 pochard mariadbd[2405]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
      Feb 25 11:18:02 pochard mariadbd[2405]: information that should help you find out what is causing the crash.
      Feb 25 11:18:02 pochard mariadbd[2405]: Writing a core file...
      Feb 25 11:18:02 pochard mariadbd[2405]: Working directory at /var/lib/mysql
      Feb 25 11:18:02 pochard mariadbd[2405]: Resource Limits:
      Feb 25 11:18:02 pochard mariadbd[2405]: Limit                     Soft Limit           Hard Limit           Units
      Feb 25 11:18:02 pochard mariadbd[2405]: Max cpu time              unlimited            unlimited            seconds
      Feb 25 11:18:02 pochard mariadbd[2405]: Max file size             unlimited            unlimited            bytes
      Feb 25 11:18:02 pochard mariadbd[2405]: Max data size             unlimited            unlimited            bytes
      Feb 25 11:18:02 pochard mariadbd[2405]: Max stack size            8388608              unlimited            bytes
      Feb 25 11:18:02 pochard mariadbd[2405]: Max core file size        0                    unlimited            bytes
      Feb 25 11:18:02 pochard mariadbd[2405]: Max resident set          unlimited            unlimited            bytes
      Feb 25 11:18:02 pochard mariadbd[2405]: Max processes             47792                47792                processes
      Feb 25 11:18:02 pochard mariadbd[2405]: Max open files            16384                16384                files
      Feb 25 11:18:02 pochard mariadbd[2405]: Max locked memory         65536                65536                bytes
      Feb 25 11:18:02 pochard mariadbd[2405]: Max address space         unlimited            unlimited            bytes
      Feb 25 11:18:02 pochard mariadbd[2405]: Max file locks            unlimited            unlimited            locks
      Feb 25 11:18:02 pochard mariadbd[2405]: Max pending signals       47792                47792                signals
      Feb 25 11:18:02 pochard mariadbd[2405]: Max msgqueue size         819200               819200               bytes
      Feb 25 11:18:02 pochard mariadbd[2405]: Max nice priority         0                    0
      Feb 25 11:18:02 pochard mariadbd[2405]: Max realtime priority     0                    0
      Feb 25 11:18:02 pochard mariadbd[2405]: Max realtime timeout      unlimited            unlimited            us
      Feb 25 11:18:02 pochard mariadbd[2405]: Core pattern: core
      

      The node simply dies.
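
      The two `WSREP: Thread BF` lines in the log above carry the key evidence: both transactions are high-priority appliers (`applier: 1`, `local: 0`) with consecutive seqnos, and the holder of the record lock (trx 1744395763) is in `ordered_commit`. What follows is a minimal Python sketch for pulling those fields out of such a log, assuming the line layout shown above. The helper name `parse_bf_threads` and the regular expression are illustrative and not part of any MariaDB tooling.

```python
import re

# Two lines copied from the log above, with the syslog prefix and the
# trailing query text trimmed for brevity.
LOG = """\
[ERROR] WSREP: Thread BF trx_id: 1744395764 thread: 10 seqno: 65554111 client_state: exec client_mode: high priority transaction_mode: executing applier: 1 toi: 0 local: 0 query: DELETE FROM `contactgroups`
[ERROR] WSREP: Thread BF trx_id: 1744395763 thread: 8 seqno: 65554110 client_state: exec client_mode: high priority transaction_mode: ordered_commit applier: 1 toi: 0 local: 0 query: NULL
"""

# Assumed layout of a "Thread BF" line, based only on the two samples above.
BF_LINE = re.compile(
    r"WSREP: Thread BF trx_id: (?P<trx_id>\d+) thread: (?P<thread>\d+) "
    r"seqno: (?P<seqno>\d+) .*?transaction_mode: (?P<mode>\w+) "
    r"applier: (?P<applier>\d) toi: \d local: (?P<local>\d)"
)


def parse_bf_threads(text):
    """Return one dict per 'Thread BF' line found in the text."""
    rows = []
    for match in BF_LINE.finditer(text):
        row = match.groupdict()
        for key in ("trx_id", "thread", "seqno", "applier", "local"):
            row[key] = int(row[key])
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in parse_bf_threads(LOG):
        print(row)
```

      Run over a full error log, the same function would list every pair of applier transactions reported in a lock conflict, which makes it easier to check whether the seqnos are always adjacent.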

      If I attempt to restart it, it does this:

      Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] /usr/sbin/mariadbd (mysqld 10.5.8-MariaDB-3-log) starting as process 112552 ...
      Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Warning] Could not increase number of max_open_files to more than 16384 (request: 18416)
      Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: Loading provider /usr/lib/galera/libgalera_smm.so initial position: c962afef-9de9-11ea-a9b2-2fa479531940:65554108
      Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
      Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: wsrep_load(): Galera 4.6(r323e509) by Codership Oy <info@codership.com> loaded successfully.
      Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
      Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: Found saved state: c962afef-9de9-11ea-a9b2-2fa479531940:-1, safe_to_bootstrap: 0
      Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: GCache DEBUG: opened preamble:
      Feb 25 11:18:12 pochard mariadbd[112552]: Version: 2
      Feb 25 11:18:12 pochard mariadbd[112552]: UUID: c962afef-9de9-11ea-a9b2-2fa479531940
      Feb 25 11:18:12 pochard mariadbd[112552]: Seqno: -1 - -1
      Feb 25 11:18:12 pochard mariadbd[112552]: Offset: -1
      Feb 25 11:18:12 pochard mariadbd[112552]: Synced: 0
      Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: c962afef-9de9-11ea-a9b2-2fa479531940, offset: -1
      Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: GCache::RingBuffer initial scan...  0.0% (        0/134217752 bytes) complete.
      Feb 25 11:18:13 pochard mariadbd[112552]: 2021-02-25 11:18:13 0 [ERROR] WSREP: std::bad_alloc
      Feb 25 11:18:13 pochard mariadbd[112552]: 2021-02-25 11:18:13 0 [ERROR] WSREP: Failed to create a new provider '/usr/lib/galera/libgalera_smm.so' with options 'gcs.fc_factor=0.8;gcs.fc_limit=60;gmcast.listen_addr=0.0.0.0': Failed to initialize wsrep provider
      Feb 25 11:18:13 pochard mariadbd[112552]: 2021-02-25 11:18:13 0 [ERROR] WSREP: Failed to load provider
      Feb 25 11:18:13 pochard mariadbd[112552]: 2021-02-25 11:18:13 0 [ERROR] Aborting
      Feb 25 11:18:13 pochard mariadbd[112552]: Warning: Memory not freed: 56
      

      I tried truncating the galera.cache file (truncate /var/lib/mysql/galera.cache -s 0) because it was obviously corrupt, but that didn't work. I had to destroy the MySQL datadir and do an SST from scratch.

      This problem is reproducible: it always happens with the same query, which comes from a cron job that cleans the database. The cleanup code is PHP, from the Roundcube project:

          public static function db_clean($days)
          {
              // mapping for table name => primary key
              $primary_keys = array(
                  'contacts'      => 'contact_id',
                  'contactgroups' => 'contactgroup_id',
              );
       
              $db = self::db();
       
              $threshold = date('Y-m-d 00:00:00', time() - $days * 86400);
       
              foreach (array('contacts','contactgroups','identities') as $table) {
                  $sqltable = $db->table_name($table, true);
       
                  // also delete linked records
                  // could be skipped for databases which respect foreign key constraints
                  if ($db->db_provider == 'sqlite' && ($table == 'contacts' || $table == 'contactgroups')) {
                      $pk           = $primary_keys[$table];
                      $memberstable = $db->table_name('contactgroupmembers');
       
                      $db->query(
                          "DELETE FROM " . $db->quote_identifier($memberstable)
                          . " WHERE `$pk` IN ("
                              . "SELECT `$pk` FROM $sqltable"
                              . " WHERE `del` = 1 AND `changed` < ?"
                          . ")",
                          $threshold);
       
                      echo $db->affected_rows() . " records deleted from '$memberstable'\n";
                  }
       
                  // delete outdated records
                  $db->query("DELETE FROM $sqltable WHERE `del` = 1 AND `changed` < ?", $threshold);
       
                  echo $db->affected_rows() . " records deleted from '$table'\n";
              }
          }
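
      To make the effect of this cron job on a MariaDB backend easier to see, here is a minimal Python sketch that mirrors the branch structure of the `db_clean()` excerpt above and lists the statements it would issue. The function name `statements_for`, the un-prefixed table names, and the 7-day retention are assumptions for illustration, not Roundcube API. Because the explicit `contactgroupmembers` DELETE is guarded by the `sqlite` check, a MySQL/MariaDB provider only receives the three plain DELETEs. Rows in `contactgroupmembers` are then presumably removed through foreign key cascades, which would be consistent with the conflicting lock being reported on that table.

```python
from datetime import datetime, timedelta

# Hypothetical mirror of the mapping in db_clean(); table names are used
# without Roundcube's optional table prefix.
PRIMARY_KEYS = {"contacts": "contact_id", "contactgroups": "contactgroup_id"}


def statements_for(provider, days, now):
    """Return the SQL statements the cleanup loop would issue, in order."""
    threshold = (now - timedelta(days=days)).strftime("%Y-%m-%d 00:00:00")
    statements = []
    for table in ("contacts", "contactgroups", "identities"):
        # Only sqlite gets the explicit delete of linked member rows.
        if provider == "sqlite" and table in PRIMARY_KEYS:
            pk = PRIMARY_KEYS[table]
            statements.append(
                f"DELETE FROM `contactgroupmembers` WHERE `{pk}` IN ("
                f"SELECT `{pk}` FROM `{table}` "
                f"WHERE `del` = 1 AND `changed` < '{threshold}')"
            )
        # Every provider gets the delete of outdated records.
        statements.append(
            f"DELETE FROM `{table}` WHERE `del` = 1 AND `changed` < '{threshold}'"
        )
    return statements


if __name__ == "__main__":
    # Assumed: a 7-day retention, run at the time of the crash, which yields
    # the '2021-02-18 00:00:00' threshold seen in the log.
    for sql in statements_for("mysql", 7, datetime(2021, 2, 25, 11, 18, 2)):
        print(sql)
```

      With these assumed inputs, the second statement printed is the same `DELETE FROM contactgroups` query that appears in the crash log.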
      

          Activity

            Micah created issue
            Feb 25 11:18:02 pochard mariadbd[2405]: Max file size unlimited unlimited bytes
            Feb 25 11:18:02 pochard mariadbd[2405]: Max data size unlimited unlimited bytes
            Feb 25 11:18:02 pochard mariadbd[2405]: Max stack size 8388608 unlimited bytes
            Feb 25 11:18:02 pochard mariadbd[2405]: Max core file size 0 unlimited bytes
            Feb 25 11:18:02 pochard mariadbd[2405]: Max resident set unlimited unlimited bytes
            Feb 25 11:18:02 pochard mariadbd[2405]: Max processes 47792 47792 processes
            Feb 25 11:18:02 pochard mariadbd[2405]: Max open files 16384 16384 files
            Feb 25 11:18:02 pochard mariadbd[2405]: Max locked memory 65536 65536 bytes
            Feb 25 11:18:02 pochard mariadbd[2405]: Max address space unlimited unlimited bytes
            Feb 25 11:18:02 pochard mariadbd[2405]: Max file locks unlimited unlimited locks
            Feb 25 11:18:02 pochard mariadbd[2405]: Max pending signals 47792 47792 signals
            Feb 25 11:18:02 pochard mariadbd[2405]: Max msgqueue size 819200 819200 bytes
            Feb 25 11:18:02 pochard mariadbd[2405]: Max nice priority 0 0
            Feb 25 11:18:02 pochard mariadbd[2405]: Max realtime priority 0 0
            Feb 25 11:18:02 pochard mariadbd[2405]: Max realtime timeout unlimited unlimited us
            Feb 25 11:18:02 pochard mariadbd[2405]: Core pattern: core
            {noformat}
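
One detail in the resource-limit table of the crash log is easy to miss: the server prints "Writing a core file...", yet the soft limit for "Max core file size" is 0. On Linux, with a plain-file core pattern such as the `core` shown, a soft limit of 0 normally means no core file is produced. The Python sketch below pulls that fact out of a log line of this shape. The helper names `parse_limit` and `core_dump_allowed` are hypothetical and belong to no MariaDB tool or library; the parser only assumes the line layout seen in this log.

```python
import re
from typing import NamedTuple, Optional


class Limit(NamedTuple):
    name: str
    soft: Optional[int]   # None stands for "unlimited"
    hard: Optional[int]   # None stands for "unlimited"
    units: Optional[str]  # some rows (e.g. "Max nice priority") carry no unit


# Name, soft value, hard value, optional unit, anchored at the end of the line
# so that the journal prefix in front of "Max ..." is ignored.
_LIMIT = re.compile(
    r"(Max [a-z ]+?)\s+(unlimited|\d+)\s+(unlimited|\d+)(?:\s+([a-z]+))?\s*$"
)


def _value(text: str) -> Optional[int]:
    return None if text == "unlimited" else int(text)


def parse_limit(line: str) -> Optional[Limit]:
    """Parse one row of the 'Resource Limits' table; return None for other lines."""
    match = _LIMIT.search(line)
    if match is None:
        return None
    return Limit(match.group(1), _value(match.group(2)),
                 _value(match.group(3)), match.group(4))


def core_dump_allowed(limit: Limit) -> bool:
    """A core file can only be written if the soft limit is non-zero."""
    return limit.soft is None or limit.soft > 0


if __name__ == "__main__":
    row = "Feb 25 11:18:02 pochard mariadbd[2405]: Max core file size 0 unlimited bytes"
    limit = parse_limit(row)
    print(limit, core_dump_allowed(limit))
```

If a core file is wanted for a follow-up report, the soft limit would have to be raised for the service before the next crash; how to do that depends on the init system and is not covered here.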

            The node simply dies.

            If I attempt to restart it, it does this:

            {noformat}
            Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] /usr/sbin/mariadbd (mysqld 10.5.8-MariaDB-3-log) starting as process 112552 ...
            Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Warning] Could not increase number of max_open_files to more than 16384 (request: 18416)
            Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: Loading provider /usr/lib/galera/libgalera_smm.so initial position: c962afef-9de9-11ea-a9b2-2fa479531940:65554108
            Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
            Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: wsrep_load(): Galera 4.6(r323e509) by Codership Oy <info@codership.com> loaded successfully.
            Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
            Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: Found saved state: c962afef-9de9-11ea-a9b2-2fa479531940:-1, safe_to_bootstrap: 0
            Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: GCache DEBUG: opened preamble:
            Feb 25 11:18:12 pochard mariadbd[112552]: Version: 2
            Feb 25 11:18:12 pochard mariadbd[112552]: UUID: c962afef-9de9-11ea-a9b2-2fa479531940
            Feb 25 11:18:12 pochard mariadbd[112552]: Seqno: -1 - -1
            Feb 25 11:18:12 pochard mariadbd[112552]: Offset: -1
            Feb 25 11:18:12 pochard mariadbd[112552]: Synced: 0
            Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: c962afef-9de9-11ea-a9b2-2fa479531940, offset: -1
            Feb 25 11:18:12 pochard mariadbd[112552]: 2021-02-25 11:18:12 0 [Note] WSREP: GCache::RingBuffer initial scan... 0.0% ( 0/134217752 bytes) complete.
            Feb 25 11:18:13 pochard mariadbd[112552]: 2021-02-25 11:18:13 0 [ERROR] WSREP: std::bad_alloc
            Feb 25 11:18:13 pochard mariadbd[112552]: 2021-02-25 11:18:13 0 [ERROR] WSREP: Failed to create a new provider '/usr/lib/galera/libgalera_smm.so' with options 'gcs.fc_factor=0.8;gcs.fc_limit=60;gmcast.listen_addr=0.0.0.0': Failed to initialize wsrep provider
            Feb 25 11:18:13 pochard mariadbd[112552]: 2021-02-25 11:18:13 0 [ERROR] WSREP: Failed to load provider
            Feb 25 11:18:13 pochard mariadbd[112552]: 2021-02-25 11:18:13 0 [ERROR] Aborting
            Feb 25 11:18:13 pochard mariadbd[112552]: Warning: Memory not freed: 56
            {noformat}
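
The restart log above reports a saved state with seqno -1 and safe_to_bootstrap: 0. In Galera, a saved seqno of -1 generally means the node did not record its position at a clean shutdown, which fits a crash. The sketch below, in Python, extracts those fields from the "Found saved state" message so they can be compared across nodes. `parse_saved_state` is a hypothetical helper name; it only matches the message format seen in this log and is not a Galera API.

```python
import re
from typing import Optional

# Matches e.g. "Found saved state: <uuid>:<seqno>, safe_to_bootstrap: <0|1>"
_SAVED_STATE = re.compile(
    r"Found saved state: (?P<uuid>[0-9a-f-]{36}):(?P<seqno>-?\d+), "
    r"safe_to_bootstrap: (?P<stb>[01])"
)


def parse_saved_state(line: str) -> Optional[dict]:
    """Return uuid, seqno and safe_to_bootstrap from a log line, or None."""
    match = _SAVED_STATE.search(line)
    if match is None:
        return None
    return {
        "uuid": match.group("uuid"),
        "seqno": int(match.group("seqno")),
        "safe_to_bootstrap": match.group("stb") == "1",
    }


if __name__ == "__main__":
    line = (
        "2021-02-25 11:18:12 0 [Note] WSREP: Found saved state: "
        "c962afef-9de9-11ea-a9b2-2fa479531940:-1, safe_to_bootstrap: 0"
    )
    state = parse_saved_state(line)
    # A seqno of -1 suggests the position was not written at shutdown.
    print(state, "unclean shutdown suspected:", state["seqno"] == -1)
```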

            I tried truncating the galera.cache file because it was obviously corrupt (truncate /var/lib/mysql/galera.cache -s 0), but that didn't work. I had to destroy the MySQL datadir and do an SST from scratch.

            This is a reproducible problem: it always happens with the same query, which comes from a cronjob that cleans the database. The cleaning code is PHP, from the Roundcube project:

            {noformat}
                public static function db_clean($days)
                {
                    // mapping for table name => primary key
                    $primary_keys = array(
                        'contacts' => 'contact_id',
                        'contactgroups' => 'contactgroup_id',
                    );

                    $db = self::db();

                    $threshold = date('Y-m-d 00:00:00', time() - $days * 86400);

                    foreach (array('contacts','contactgroups','identities') as $table) {
                        $sqltable = $db->table_name($table, true);

                        // also delete linked records
                        // could be skipped for databases which respect foreign key constraints
                        if ($db->db_provider == 'sqlite' && ($table == 'contacts' || $table == 'contactgroups')) {
                            $pk = $primary_keys[$table];
                            $memberstable = $db->table_name('contactgroupmembers');

                            $db->query(
                                "DELETE FROM " . $db->quote_identifier($memberstable)
                                . " WHERE `$pk` IN ("
                                    . "SELECT `$pk` FROM $sqltable"
                                    . " WHERE `del` = 1 AND `changed` < ?"
                                . ")",
                                $threshold);

                            echo $db->affected_rows() . " records deleted from '$memberstable'\n";
                        }

                        // delete outdated records
                        $db->query("DELETE FROM $sqltable WHERE `del` = 1 AND `changed` < ?", $threshold);

                        echo $db->affected_rows() . " records deleted from '$table'\n";
                    }
                }
            {noformat}
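
To see which statements this routine actually sends to MariaDB, here is a rough Python mirror of its control flow. It only builds the statement list and never touches a database. Several things are assumptions rather than facts from the report: the provider string `mysql`, a retention of 7 days (chosen because it reproduces the `2021-02-18 00:00:00` cutoff for a run on 2021-02-25), unprefixed table names, and the helper names `cutoff` and `planned_statements`.

```python
from datetime import datetime, timedelta

# Tables db_clean() walks, in order, and the primary keys it maps.
TABLES = ("contacts", "contactgroups", "identities")
PRIMARY_KEYS = {"contacts": "contact_id", "contactgroups": "contactgroup_id"}


def cutoff(now: datetime, days: int) -> str:
    """Rough equivalent of date('Y-m-d 00:00:00', time() - $days * 86400)."""
    return (now - timedelta(days=days)).strftime("%Y-%m-%d 00:00:00")


def planned_statements(provider: str, now: datetime, days: int):
    """List the (sql, threshold) pairs db_clean() would issue, without running them."""
    threshold = cutoff(now, days)
    plan = []
    for table in TABLES:
        # Linked member rows are deleted explicitly only for sqlite.
        if provider == "sqlite" and table in PRIMARY_KEYS:
            pk = PRIMARY_KEYS[table]
            plan.append((
                f"DELETE FROM `contactgroupmembers` WHERE `{pk}` IN ("
                f"SELECT `{pk}` FROM `{table}` WHERE `del` = 1 AND `changed` < ?)",
                threshold,
            ))
        # Outdated rows of the table itself are always deleted.
        plan.append(
            (f"DELETE FROM `{table}` WHERE `del` = 1 AND `changed` < ?", threshold)
        )
    return plan


if __name__ == "__main__":
    crash_time = datetime(2021, 2, 25, 11, 18, 2)  # timestamp of the crash log
    for sql, param in planned_statements("mysql", crash_time, days=7):
        print(sql, "<-", param)
```

With a provider other than sqlite the explicit `contactgroupmembers` delete is skipped, so only the three plain DELETE statements are issued. Any rows removed from `contactgroupmembers` would then presumably come from the foreign key handling that the code comment alludes to, which would fit the lock conflict being reported on that table while `DELETE FROM contactgroups` was applied. Note that `timedelta` arithmetic on a naive datetime can differ from PHP's epoch subtraction across a daylight-saving change.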

            elenst Elena Stepanova made changes -
            Fix Version/s 10.5 [ 23123 ]
            Assignee Jan Lindström [ jplindst ]
            Priority Blocker [ 1 ] Critical [ 2 ]
            jplindst Jan Lindström (Inactive) made changes -
            issue.field.resolutiondate 2021-03-30 07:01:34.0 2021-03-30 07:01:34.569
            jplindst Jan Lindström (Inactive) made changes -
            Fix Version/s 10.5.10 [ 25204 ]
            Fix Version/s 10.5 [ 23123 ]
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 119538 ] MariaDB v4 [ 158964 ]

            People

              jplindst Jan Lindström (Inactive)
              micah Micah