Details
-
Bug
-
Status: Closed (View Workflow)
-
Blocker
-
Resolution: Duplicate
-
10.4.17
-
None
Description
After a manual SST with mariabackup the IST fails with a assertion failure at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.4.17/storage/innobase/lock/lock0lock.cc line 687.
We tried the manual SST + IST twice and the WSREP Error before the assertion concerned both times the same table and index with the conflicting lock and the same insert (with different values).
Attachments
Issue Links
- includes
-
MDEV-23851 Galera assertion at lock0lock.cc line 655 because of BF-BF lock wait
-
- Closed
-
- is duplicated by
-
MDEV-24406 Galera crashes when deleting records
-
- Closed
-
-
MDEV-24437 Galera 4 read node crashes after DML statement from writer node
-
- Closed
-
- relates to
-
MDEV-23851 Galera assertion at lock0lock.cc line 655 because of BF-BF lock wait
-
- Closed
-
-
MDEV-24989 Galera assertion at lock0lock.cc line 655
-
- Closed
-
Activity
We did some further testing and this error does not only happen during the IST. We stopped the Tool that creates the INSERT Statements which are causing the problem during backup and IST. The second node now can join. Then we started the tool again and after some time the second node crashed again with the same error.
2020-11-18 21:40:04 49 [ERROR] InnoDB: Conflicting lock on table: `web`.`locations` index: unique_locations that has lock |
RECORD LOCKS space id 32369 page no 177144 n bits 824 index unique_locations of table `web`.`locations` trx id 34438968344 lock mode S |
Record lock, heap no 380 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 |
0: len 4; hex 0000413f; asc A?;; |
1: len 5; hex 99a12e5ca7; asc .\ ;; |
2: len 4; hex 006fe223; asc o #;; |
|
Record lock, heap no 752 PHYSICAL RECORD: n_fields 3; compact format; info bits 32 |
0: len 4; hex 0000413d; asc A=;; |
1: len 5; hex 99a7e4ea66; asc f;; |
2: len 4; hex 06094458; asc DX;; |
|
2020-11-18 21:40:04 49 [ERROR] InnoDB: WSREP state: |
2020-11-18 21:40:04 49 [ERROR] WSREP: Thread BF trx_id: 34438968341 thread: 49 seqno: 85724971 client_state: exec client_mode: high priority transaction_mode: executing applier: 1 toi: 0 local: 0 query: INSERT INTO locations ( pointer_id, km_offset, timestamp_from, timestamp_to, lat, lon, car_id, uicstation_id, position_count, utilized, km, delta_km) VALUES (4492,(SELECT current_km_offset FROM cars WHERE id=16701),'2020-11-18 00:20:47.0','2020-11-18 05:15:44.0','51.3959065','11.933976000000001',16701,'44657','2',1,40527,GREATEST(0,(SELECT ROUND(40527) - IFNULL(MAX(km),0) FROM (SELECT km FROM locations WHERE timestamp_from < '2020-11-18 00:20:47.0' AND car_id=16701 ORDER BY timestamp_from DESC LIMIT 1) x)))£†µ_ |
2020-11-18 21:40:04 49 [ERROR] WSREP: Thread BF trx_id: 34438968344 thread: 33 seqno: 85724992 client_state: exec client_mode: high priority transaction_mode: executing applier: 1 toi: 0 local: 0 query: INSERT INTO locations ( pointer_id, km_offset, timestamp_from, timestamp_to, lat, lon, car_id, uicstation_id, position_count, utilized, km, delta_km) VALUES (4492,(SELECT current_km_offset FROM cars WHERE id=16701),'2020-11-18 14:41:38.0',null,'51.39018214285715','11.945860714285715',16701,'44646','7',1,40530,GREATEST(0,(SELECT ROUND(40530) - IFNULL(MAX(km),0) FROM (SELECT km FROM locations WHERE timestamp_from < '2020-11-18 14:41:38.0' AND car_id=16701 ORDER BY timestamp_from DESC LIMIT 1) x)))£†µ_ |
2020-11-18 21:40:04 0x7fc069fb0700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.4.17/storage/innobase/lock/lock0lock.cc line 687 |
InnoDB: We intentionally generate a memory trap.
|
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ |
InnoDB: If you get repeated assertion failures or crashes, even
|
InnoDB: immediately after the mysqld startup, there may be
|
InnoDB: corruption in the InnoDB tablespace. Please refer to
|
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/ |
InnoDB: about forcing recovery.
|
201118 21:40:04 [ERROR] mysqld got signal 6 ; |
This could be because you hit a bug. It is also possible that this binary |
or one of the libraries it was linked against is corrupt, improperly built,
|
or misconfigured. This error can also be caused by malfunctioning hardware.
|
|
To report this bug, see https://mariadb.com/kb/en/reporting-bugs |
|
We will try our best to scrape up some info that will hopefully help |
diagnose the problem, but since we have already crashed,
|
something is definitely wrong and this may fail. |
|
Server version: 10.4.17-MariaDB-log |
key_buffer_size=134217728 |
read_buffer_size=131072 |
max_used_connections=2 |
max_threads=302 |
thread_count=49 |
It is possible that mysqld could use up to
|
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 795659 K bytes of memory |
Hope that's ok; if not, decrease some variables in the equation. |
|
Thread pointer: 0x7fbf300009a8 |
Attempting backtrace. You can use the following information to find out
|
where mysqld died. If you see no messages after this, something went |
terribly wrong...
|
stack_bottom = 0x7fc069fafd20 thread_stack 0x49000 |
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x562997bf3ffe] |
/usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x5629976859ef] |
/lib64/libpthread.so.0(+0xf6d0)[0x7fd2537876d0] |
:0(__GI_raise)[0x7fd251a5b277] |
:0(__GI_abort)[0x7fd251a5c968] |
/usr/sbin/mysqld(+0x5c2de8)[0x56299737dde8] |
/usr/sbin/mysqld(+0xaba93a)[0x56299787593a] |
/usr/sbin/mysqld(+0xabc144)[0x562997877144] |
/usr/sbin/mysqld(+0xac30fb)[0x56299787e0fb] |
/usr/sbin/mysqld(+0xbe5638)[0x5629979a0638] |
/usr/sbin/mysqld(+0xb2ffde)[0x5629978eafde] |
/usr/sbin/mysqld(+0xb30cd3)[0x5629978ebcd3] |
/usr/sbin/mysqld(+0xb311a3)[0x5629978ec1a3] |
/usr/sbin/mysqld(+0xb4370f)[0x5629978fe70f] |
/usr/sbin/mysqld(+0xa83235)[0x56299783e235] |
/usr/sbin/mysqld(_ZN7handler12ha_write_rowEPKh+0x2d0)[0x5629976912f0] |
/usr/sbin/mysqld(_ZN14Rows_log_event9write_rowEP14rpl_group_infob+0x235)[0x56299778d8c5] |
/usr/sbin/mysqld(_ZN20Write_rows_log_event11do_exec_rowEP14rpl_group_info+0x7d)[0x56299778dd2d] |
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEP14rpl_group_info+0x2dc)[0x562997781cfc] |
/usr/sbin/mysqld(_Z18wsrep_apply_eventsP3THDP14Relay_log_infoPKvm+0x23d)[0x5629975f7edd] |
/usr/sbin/mysqld(_ZN21Wsrep_applier_service15apply_write_setERKN5wsrep7ws_metaERKNS0_12const_bufferERNS0_14mutable_bufferE+0x9d)[0x5629975e375d] |
/usr/sbin/mysqld(+0xecf3cf)[0x562997c8a3cf] |
/usr/sbin/mysqld(_ZN5wsrep12server_state8on_applyERNS_21high_priority_serviceERKNS_9ws_handleERKNS_7ws_metaERKNS_12const_bufferE+0xf5)[0x562997c8aef5] |
/usr/sbin/mysqld(+0xee6de8)[0x562997ca1de8] |
src/trx_handle.cpp:387(galera::TrxHandleSlave::apply(void*, wsrep_cb_status (*)(void*, wsrep_ws_handle const*, unsigned int, wsrep_buf const*, wsrep_trx_meta const*, bool*), wsrep_trx_meta const&, bool&))[0x7fd24dd6b560] |
src/replicator_smm.cpp:504(galera::ReplicatorSMM::apply_trx(void*, galera::TrxHandleSlave&))[0x7fd24dda349a] |
src/replicator_smm.cpp:2145(galera::ReplicatorSMM::process_trx(void*, boost::shared_ptr<galera::TrxHandleSlave> const&))[0x7fd24dda8fd1] |
src/gcs_action_source.cpp:63(galera::GcsActionSource::process_writeset(void*, gcs_action const&, bool&))[0x7fd24dd825f9] |
src/gcs_action_source.cpp:110(galera::GcsActionSource::dispatch(void*, gcs_action const&, bool&))[0x7fd24dd82c3a] |
src/gcs_action_source.cpp:29(~Release)[0x7fd24dd82e6e] |
src/replicator_smm.cpp:390(galera::ReplicatorSMM::async_recv(void*))[0x7fd24dda94bb] |
src/wsrep_provider.cpp:263(galera_recv)[0x7fd24ddbbf38] |
/usr/sbin/mysqld(_ZN5wsrep18wsrep_provider_v2611run_applierEPNS_21high_priority_serviceE+0xe)[0x562997ca2c8e] |
/usr/sbin/mysqld(+0x83e9f2)[0x5629975f99f2] |
/usr/sbin/mysqld(_Z15start_wsrep_THDPv+0x35d)[0x5629975edd3d] |
pthread_create.c:0(start_thread)[0x7fd25377fe25] |
/lib64/libc.so.6(clone+0x6d)[0x7fd251b23bad] |
|
Trying to get some variables.
|
Some pointers may be invalid and cause the dump to abort.
|
Query (0x7fd10f9147db): INSERT INTO locations ( pointer_id, km_offset, timestamp_from, timestamp_to, lat, lon, car_id, uicstation_id, position_count, utilized, km, delta_km) VALUES (4492,(SELECT current_km_offset FROM cars WHERE id=16701),'2020-11-18 00:20:47.0','2020-11-18 05:15:44.0','51.3959065','11.933976000000001',16701,'44657','2',1,40527,GREATEST(0,(SELECT ROUND(40527) - IFNULL(MAX(km),0) FROM (SELECT km FROM locations WHERE timestamp_from < '2020-11-18 00:20:47.0' AND car_id=16701 ORDER BY timestamp_from DESC LIMIT 1) x))) |
|
Connection ID (thread ID): 49 |
Status: NOT_KILLED
|
|
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on |
|
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains |
information that should help you find out what is causing the crash.
|
Writing a core file...
|
Working directory at /data/mysql
|
Resource Limits:
|
Limit Soft Limit Hard Limit Units
|
Max cpu time unlimited unlimited seconds
|
Max file size unlimited unlimited bytes
|
Max data size unlimited unlimited bytes
|
Max stack size 8388608 unlimited bytes |
Max core file size 0 unlimited bytes |
Max resident set unlimited unlimited bytes
|
Max processes 514253 514253 processes |
Max open files 300000 300000 files |
Max locked memory 65536 65536 bytes |
Max address space unlimited unlimited bytes
|
Max file locks unlimited unlimited locks
|
Max pending signals 514253 514253 signals |
Max msgqueue size 819200 819200 bytes |
Max nice priority 0 0 |
Max realtime priority 0 0 |
Max realtime timeout unlimited unlimited us
|
Core pattern: |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h
|
We have now rewritten the INSERT Statement so that it does not use the destination table in the VALUES part. The node still crashes on when using this query. The definition for index "unique_locations" is UNIQUE INDEX `unique_locations` (`car_id`, `timestamp_from`) USING BTREE
2020-11-20 13:56:55 31 [ERROR] InnoDB: Conflicting lock on table: `web`.`locations` index: unique_locations that has lock |
RECORD LOCKS space id 32369 page no 514311 n bits 952 index unique_locations of table `web`.`locations` trx id 34536873708 lock mode S |
Record lock, heap no 80 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 |
0: len 4; hex 00017fca; asc ;; |
1: len 5; hex 99a6f6598a; asc Y ;; |
2: len 4; hex 0471a87f; asc q ;; |
|
Record lock, heap no 560 PHYSICAL RECORD: n_fields 3; compact format; info bits 32 |
0: len 4; hex 00017fc6; asc ;; |
1: len 5; hex 99a7e85a49; asc ZI;; |
2: len 4; hex 060c1142; asc B;; |
|
2020-11-20 13:56:55 31 [ERROR] InnoDB: WSREP state: |
2020-11-20 13:56:55 31 [ERROR] WSREP: Thread BF trx_id: 34536873707 thread: 31 seqno: 92889532 client_state: exec client_mode: high priority transaction_mode: executing applier: 1 toi: 0 local: 0 query: INSERT INTO locations ( pointer_id, km_offset, timestamp_from, timestamp_to, lat, lon, car_id, uicstation_id, position_count, utilized, km, delta_km) VALUES (115597,(SELECT current_km_offset FROM cars WHERE id=98246),'2020-11-20 07:18:40.0',null,'51.399487','12.039409',98246,'44646','1',1,4595,2)n·_ |
2020-11-20 13:56:55 31 [ERROR] WSREP: Thread BF trx_id: 34536873708 thread: 12 seqno: 92889530 client_state: exec client_mode: high priority transaction_mode: committing applier: 1 toi: 0 local: 0 query: NULL |
2020-11-20 13:56:55 0x7fcc9c357700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.4.17/storage/innobase/lock/lock0lock.cc line 687 |
InnoDB: We intentionally generate a memory trap.
|
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ |
InnoDB: If you get repeated assertion failures or crashes, even
|
InnoDB: immediately after the mysqld startup, there may be
|
InnoDB: corruption in the InnoDB tablespace. Please refer to
|
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/ |
InnoDB: about forcing recovery.
|
201120 13:56:55 [ERROR] mysqld got signal 6 ; |
Is there a way to prevent a complete loss of the cluster when something this bad happens? Every node will fail hard at the same time and then the originating node will be the only one online, but won't accept queries because it has no quorum. Bugs like these makes creating a cluster useless.
I know that things can go wrong all the time and I appreciate a free product like this, but how can I prevent such a situation from happening again? I can sustain loss of nodes for redundancy (even a single node can run my app, I've created a cluster for redundancy not for scaling). Do I need a read-only replica for hot-spare to became master? Could that fail too? How are you handling this situation?
Thanks everyone for the hard work
We have a similar error with the same version on galera cluster:
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 12 [Note] InnoDB: *** Priority TRANSACTION: |
Nov 22 23:01:25 server01.domain mysqld[59821]: TRANSACTION 69096951, ACTIVE 0 sec starting index read |
Nov 22 23:01:25 server01.domain mysqld[59821]: mysql tables in use 2, locked 2 |
Nov 22 23:01:25 server01.domain mysqld[59821]: 1 lock struct(s), heap size 1128, 0 row lock(s) |
Nov 22 23:01:25 server01.domain mysqld[59821]: MySQL thread id 12, OS thread handle 140242091112192, query id 7942930 Update_rows_log_event::find_row(3 |
Nov 22 23:01:25 server01.domain mysqld[59821]: INSERT INTO tab01 ( pk_userid, fk_orderinfoid, field01, homedir, clientip, logincount ) VALUES ( 'u |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 12 [Note] InnoDB: *** Victim TRANSACTION: |
Nov 22 23:01:25 server01.domain mysqld[59821]: TRANSACTION 69096950, ACTIVE 0 sec |
Nov 22 23:01:25 server01.domain mysqld[59821]: mysql tables in use 2, locked 2 |
Nov 22 23:01:25 server01.domain mysqld[59821]: 6 lock struct(s), heap size 1128, 3 row lock(s), undo log entries 1 |
Nov 22 23:01:25 server01.domain mysqld[59821]: MySQL thread id 54640, OS thread handle 140210684311296, query id 7942928 appserver01.domain |
Nov 22 23:01:25 server01.domain mysqld[59821]: INSERT IGNORE INTO tab02 ( fk_userid, fk_netareaid ) VALUES ( 'value01', ( SELECT pk_ne |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 12 [Note] InnoDB: *** WAITING FOR THIS LOCK TO BE GRANTED: |
Nov 22 23:01:25 server01.domain mysqld[59821]: RECORD LOCKS space id 257 page no 12 n bits 136 index PRIMARY of table `db01`.`tab01` trx id 690969 |
Nov 22 23:01:25 server01.domain mysqld[59821]: Record lock, heap no 64 PHYSICAL RECORD: n_fields 11; compact format; info bits 0 |
Nov 22 23:01:25 server01.domain mysqld[59821]: 0: len 10; hex 75746131383035303033; asc value01;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 1: len 6; hex 000000000000; asc ;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2: len 7; hex 80000000000000; asc ;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 3: len 4; hex 0052f5be; asc R ;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 4: len 30; hex 243624726f756e64733d35303030244b595931736d623345734767734d56; asc $6$rounds=5000$KYY1smb3 |
Nov 22 23:01:25 server01.domain mysqld[59821]: 5: len 30; hex 2f646174612f72656c65617365642f3230313132322d3233303032362d75; asc /data/released/201122-2 |
Nov 22 23:01:25 server01.domain mysqld[59821]: 6: SQL NULL; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 7: len 3; hex 000000; asc ;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 8: len 4; hex 00014370; asc Cp;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 9: len 2; hex 0000; asc ;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 10: len 4; hex 00014370; asc Cp;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: [788B blob data] |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 12 [Note] InnoDB: SQL2: INSERT IGNORE INTO tab02 ( fk_userid, fk_neta |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 12 [Note] WSREP: cluster conflict due to high priority abort for threads: |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 12 [Note] WSREP: Winning thread: |
Nov 22 23:01:25 server01.domain mysqld[59821]: THD: 12, mode: high priority, state: exec, conflict: executing, seqno: 34231195 |
Nov 22 23:01:25 server01.domain mysqld[59821]: [748B blob data] |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 12 [Note] WSREP: Victim thread: |
Nov 22 23:01:25 server01.domain mysqld[59821]: THD: 54640, mode: local, state: exec, conflict: certifying, seqno: -1 |
Nov 22 23:01:25 server01.domain mysqld[59821]: SQL: INSERT IGNORE INTO tab02 ( fk_userid, fk_netareaid ) VALUES ( 'value01', ( SELECT |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 12 [Note] WSREP: context: /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.4.17/storage/innobase/lock/lock0lock.cc line 687 |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 12 [Warning] Aborted connection 54640 to db: db01 |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 1 [ERROR] InnoDB: Conflicting lock on table: `db01`.`tab01` index: PRIMARY that |
Nov 22 23:01:25 server01.domain mysqld[59821]: RECORD LOCKS space id 257 page no 12 n bits 136 index PRIMARY of table `db01`.`tab01` trx id 690969 |
Nov 22 23:01:25 server01.domain mysqld[59821]: Record lock, heap no 64 PHYSICAL RECORD: n_fields 11; compact format; info bits 0 |
Nov 22 23:01:25 server01.domain mysqld[59821]: 0: len 10; hex 75746131383035303033; asc value01;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 1: len 6; hex 000000000000; asc ;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2: len 7; hex 80000000000000; asc ;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 3: len 4; hex 0052f5be; asc R ;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 4: len 30; hex 243624726f756e64733d35303030244b595931736d623345734767734d56; asc $6$rounds=5000$KYY1smb3 |
Nov 22 23:01:25 server01.domain mysqld[59821]: 5: len 30; hex 2f646174612f72656c65617365642f3230313132322d3233303032362d75; asc /data/released/201122-2 |
Nov 22 23:01:25 server01.domain mysqld[59821]: 6: SQL NULL; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 7: len 3; hex 000000; asc ;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 8: len 4; hex 00014370; asc Cp;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 9: len 2; hex 0000; asc ;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 10: len 4; hex 00014370; asc Cp;; |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 1 [ERROR] InnoDB: WSREP state: |
Nov 22 23:01:25 server01.domain mysqld[59821]: [362B blob data] |
Nov 22 23:01:25 server01.domain mysqld[59821]: [942B blob data] |
Nov 22 23:01:25 server01.domain mysqld[59821]: 2020-11-22 23:01:25 0x7f8558083700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_fo |
Nov 22 23:01:25 server01.domain mysqld[59821]: InnoDB: We intentionally generate a memory trap. |
Nov 22 23:01:25 server01.domain mysqld[59821]: InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ |
Nov 22 23:01:25 server01.domain mysqld[59821]: InnoDB: If you get repeated assertion failures or crashes, even |
Nov 22 23:01:25 server01.domain mysqld[59821]: InnoDB: immediately after the mysqld startup, there may be |
Nov 22 23:01:25 server01.domain mysqld[59821]: InnoDB: corruption in the InnoDB tablespace. Please refer to |
Nov 22 23:01:25 server01.domain mysqld[59821]: InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/ |
Nov 22 23:01:25 server01.domain mysqld[59821]: InnoDB: about forcing recovery. |
Nov 22 23:01:25 server01.domain mysqld[59821]: 201122 23:01:25 [ERROR] mysqld got signal 6 ; |
Nov 22 23:01:25 server01.domain mysqld[59821]: This could be because you hit a bug. It is also possible that this binary |
Nov 22 23:01:25 server01.domain mysqld[59821]: or one of the libraries it was linked against is corrupt, improperly built, |
Nov 22 23:01:25 server01.domain mysqld[59821]: or misconfigured. This error can also be caused by malfunctioning hardware. |
Nov 22 23:01:25 server01.domain mysqld[59821]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs |
Nov 22 23:01:25 server01.domain mysqld[59821]: We will try our best to scrape up some info that will hopefully help |
Nov 22 23:01:25 server01.domain mysqld[59821]: diagnose the problem, but since we have already crashed, |
Nov 22 23:01:25 server01.domain mysqld[59821]: something is definitely wrong and this may fail. |
Nov 22 23:01:25 server01.domain mysqld[59821]: Server version: 10.4.17-MariaDB-log |
Nov 22 23:01:25 server01.domain mysqld[59821]: key_buffer_size=134217728 |
Nov 22 23:01:25 server01.domain mysqld[59821]: read_buffer_size=262144 |
Nov 22 23:01:25 server01.domain mysqld[59821]: max_used_connections=20 |
Nov 22 23:01:25 server01.domain mysqld[59821]: max_threads=153 |
Nov 22 23:01:25 server01.domain mysqld[59821]: thread_count=37 |
Nov 22 23:01:25 server01.domain mysqld[59821]: It is possible that mysqld could use up to |
Nov 22 23:01:25 server01.domain mysqld[59821]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 487338 K bytes of memory |
Nov 22 23:01:25 server01.domain mysqld[59821]: Hope that's ok; if not, decrease some variables in the equation. |
Nov 22 23:01:25 server01.domain mysqld[59821]: Thread pointer: 0x7f8c80056148 |
Nov 22 23:01:25 server01.domain mysqld[59821]: Attempting backtrace. You can use the following information to find out |
Nov 22 23:01:25 server01.domain mysqld[59821]: where mysqld died. If you see no messages after this, something went |
Nov 22 23:01:25 server01.domain mysqld[59821]: terribly wrong... |
Nov 22 23:01:25 server01.domain mysqld[59821]: stack_bottom = 0x7f8558082cf0 thread_stack 0x49000 |
Nov 22 23:01:26 server01.domain mysqld[59821]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55fb71cefffe] |
Nov 22 23:01:26 server01.domain mysqld[59821]: /usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x55fb717819ef] |
Nov 22 23:01:26 server01.domain mysqld[59821]: /lib64/libpthread.so.0(+0xf630)[0x7f8cd8875630] |
Nov 22 23:01:27 server01.domain mysqld[59821]: :0(__GI_raise)[0x7f8cd6b45387] |
Nov 22 23:01:27 server01.domain mysqld[59821]: :0(__GI_abort)[0x7f8cd6b46a78] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0x5c2de8)[0x55fb71479de8] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xaba93a)[0x55fb7197193a] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xabc144)[0x55fb71973144] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xac3a30)[0x55fb7197aa30] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xb2b196)[0x55fb719e2196] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xb2cf93)[0x55fb719e3f93] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xb30da1)[0x55fb719e7da1] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xb311a3)[0x55fb719e81a3] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xb4370f)[0x55fb719fa70f] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xa83235)[0x55fb7193a235] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN7handler12ha_write_rowEPKh+0x2d0)[0x55fb7178d2f0] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN14Rows_log_event9write_rowEP14rpl_group_infob+0x235)[0x55fb718898c5] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN20Write_rows_log_event11do_exec_rowEP14rpl_group_info+0x7d)[0x55fb71889d2d] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEP14rpl_group_info+0x2dc)[0x55fb7187dcfc] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(_Z18wsrep_apply_eventsP3THDP14Relay_log_infoPKvm+0x23d)[0x55fb716f3edd] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN22Wsrep_replayer_service15apply_write_setERKN5wsrep7ws_metaERKNS0_12const_bufferERNS |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xecf3cf)[0x55fb71d863cf] |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN5wsrep12server_state8on_applyERNS_21high_priority_serviceERKNS_9ws_handleERKNS_7ws_m |
Nov 22 23:01:27 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0xee6de8)[0x55fb71d9dde8] |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/lib64/galera-4/libgalera_smm.so(+0x167560)[0x7f8cd34f4560] |
Nov 22 23:01:28 server01.domain mysqld[59821]: src/trx_handle.cpp:387(galera::TrxHandleSlave::apply(void*, wsrep_cb_status (*)(void*, wsrep_ws_handle c |
Nov 22 23:01:28 server01.domain mysqld[59821]: src/trx_handle.hpp:826(galera::TrxHandleMaster::lock())[0x7f8cd3545c3b] |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN5wsrep18wsrep_provider_v266replayERKNS_9ws_handleEPNS_21high_priority_serviceE+0x29) |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN20Wsrep_client_service6replayEv+0xf7)[0x55fb716df097] |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN5wsrep11transaction6replayERNS_11unique_lockINS_5mutexEEE+0x58)[0x55fb71d98f88] |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN5wsrep11transaction15after_statementEv+0x1d4)[0x55fb71d9a4d4] |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/sbin/mysqld(_ZN5wsrep12client_state15after_statementEv+0x4d)[0x55fb71d757bd] |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/sbin/mysqld(+0x5adea0)[0x55fb71464ea0] |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x2908)[0x55fb7157d3e8] |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/sbin/mysqld(_Z10do_commandP3THD+0x109)[0x55fb7157d9a9] |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x221)[0x55fb7165dae1] |
Nov 22 23:01:28 server01.domain mysqld[59821]: /usr/sbin/mysqld(handle_one_connection+0x3d)[0x55fb7165db9d] |
Nov 22 23:01:28 server01.domain mysqld[59821]: pthread_create.c:0(start_thread)[0x7f8cd886dea5] |
Nov 22 23:01:29 server01.domain mysqld[59821]: /lib64/libc.so.6(clone+0x6d)[0x7f8cd6c0d96d] |
Nov 22 23:01:29 server01.domain mysqld[59821]: Trying to get some variables. |
Nov 22 23:01:29 server01.domain mysqld[59821]: Some pointers may be invalid and cause the dump to abort. |
Nov 22 23:01:29 server01.domain mysqld[59821]: Query (0x7f8cca895ff3): INSERT IGNORE INTO tab02 ( fk_userid, fk_netareaid ) VALUES ( 'db01 |
Nov 22 23:01:29 server01.domain mysqld[59821]: Connection ID (thread ID): 1 |
Nov 22 23:01:30 server01.domain systemd[1]: mariadb.service: main process exited, code=killed, status=6/ABRT |
Nov 22 23:01:30 server01.domain systemd[1]: Unit mariadb.service entered failed state. |
Nov 22 23:01:30 server01.domain systemd[1]: mariadb.service failed. |
Nov 22 23:01:35 server01.domain systemd[1]: mariadb.service holdoff time over, scheduling restart. |
Nov 22 23:01:35 server01.domain systemd[1]: Stopped MariaDB 10.4.17 database server. |
Nov 22 23:01:35 server01.domain systemd[1]: Starting MariaDB 10.4.17 database server... |
Nov 22 23:01:41 server01.domain sh[70368]: WSREP: Recovered position d62b1247-40e7-11e9-a393-137f199c8420:34231194 |
...
|
Nov 22 23:01:47 server01.domain mysqld[70635]: 2020-11-22 23:01:47 1 [Note] WSREP: Synchronized with group, ready for connections |
The s**t hit the fan also with 10.3.26 after we upgraded from 10.3.23:
2020-11-30 7:23:59 1 [Note] WSREP: Receiving IST: 55559 writesets, seqnos 338735304-338790863
|
2020-11-30 7:23:59 0 [Note] WSREP: Receiving IST... 0.0% ( 0/55559 events) complete.
|
2020-11-30 7:23:59 0 [Note] Reading of all Master_info entries succeeded
|
2020-11-30 7:23:59 0 [Note] Added new Master_info '' to hash table
|
2020-11-30 7:23:59 0 [Note] /usr/sbin/mysqld: ready for connections.
|
Version: '10.3.26-MariaDB-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
|
2020-11-30 7:24:02 89 [ERROR] InnoDB: Conflicting lock on table: `webdb`.`ordered_contracts_orderrequest` index: PRIMARY that has lock
|
RECORD LOCKS space id 35143 page no 6345655 n bits 72 index PRIMARY of table `webdb`.`ordered_contracts_orderrequest` trx id 7193044632 lock_mode X
|
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
|
0: len 8; hex 73757072656d756d; asc supremum;;
|
|
2020-11-30 7:24:02 89 [ERROR] InnoDB: WSREP state:
|
2020-11-30 7:24:02 89 [ERROR] WSREP: Thread BF trx_id: 7193044625 thread: 89 seqno: 338782886 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: INSERT INTO `ordered_contracts_orderrequest` (`contract_id`, `external_id`, `requested_at`, `finished_at`, `type`, `status`, `data`, `flow_data`) VALUES (xxxxxxxx, 'xxxxxxxxxxxxxxxxxxxxx', '2020-xx-xx xx:xx:32', NULL, 'create_contract', 'ongoing', '{\"tasks\": {\"nodes\": [{\"tasks\": [{\"task_type\": \"xxxxxxx\", \"data\": {\"pk\": xxxxxxxxxxxx, \"changes\": {\"confirmation_code\": \"xxxxxxxxxxxx\", \"offer\": \"C_ACQ_PRE_POST_B2C_ALL\", \"customer\": xxxxxxxxxxx, \"xxxxxxx\": {\"xxxxxxxx\": \"xxxxxxxxxxxx\", \"reseller\": \"xxxxxxxx\", \"chain\": \"xxxx\", \"channel\": \"xxxx\", \"sales_id\": \"xxxxxxxx\", \"sales_type\": \"acquisition\", \"xxxxx\": \"xxxxxxxxxxxx\", \"xxxxxxxxx\": null, \"xxxxxxxx\": null, \"extra_information\": {}, \"xxxxxxxxxx\": null}, \"products\": [{\"product_type\": \"xxxxxxxxx\",
|
2020-11-30 7:24:02 89 [ERROR] WSREP: Thread BF trx_id: 7193044632 thread: 1 seqno: 338782905 query_state: idle conf_state: no conflict exec_mode: applier applier: 1 query: NULL
|
2020-11-30 07:24:02 0x7eb44f94b700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.3.26/storage/innobase/lock/lock0lock.cc line 687
|
InnoDB: We intentionally generate a memory trap.
|
Note: Automatic IST with rsync.
We also experienced this after an upgrade from 10.5.5 to 10.5.8:
2020-12-04 8:58:21 0 [Note] WSREP: Receiving IST... 0.0% ( 0/160 events) complete.
|
2020-12-04 8:58:21 0 [Note] WSREP: IST preload starting at 11365559
|
2020-12-04 8:58:21 0 [Note] WSREP: Service thread queue flushed.
|
2020-12-04 8:58:21 0 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:11365558, protocol version: 5
|
2020-12-04 8:58:21 10 [ERROR] InnoDB: Conflicting lock on table: `app-scheduler`.`jobs` index: fk_schedule_id that has lock
|
RECORD LOCKS space id 3701 page no 1590 n bits 272 index fk_schedule_id of table `app-scheduler`.`jobs` trx id 24901901 lock_mode X locks rec but not gap
|
Record lock, heap no 121
|
2020-12-04 8:58:21 10 [ERROR] InnoDB: WSREP state:
|
2020-12-04 8:58:21 10 [ERROR] WSREP: Thread BF trx_id: 24901902 thread: 10 seqno: 11365642 client_state: exec client_mode: high priority transaction_mode: executing applier: 1 toi: 0 local: 0 query: /* delete net.iponly.app_scheduler.entities.Schedule */ delete from schedules where id='99576e11-b63b-44a8-a5b1-26cc583fe20d'<A3><EB><C9>_
|
^S^A
|
2020-12-04 8:58:21 10 [ERROR] WSREP: Thread BF trx_id: 24901901 thread: 6 seqno: 11365641 client_state: exec client_mode: high priority transaction_mode: executing applier: 1 toi: 0 local: 0 query: /* delete net.iponly.app_scheduler.entities.Job */ delete from jobs where id='11397042-f4c7-4243-8bfd-ad38c3cb8767'<A3><EB><C9>_^S^A
|
2020-12-04 08:58:21 0x7f93a80bf700 InnoDB: Assertion failure in file /home/buildbot/buildbot/build/mariadb-10.5.8/storage/innobase/lock/lock0lock.cc line 655
|
We're using mariabackup for SST.
Same issue over here, currently downgrading to 10.3.25 (oldest available in the repos) from 10.3.27 at the moment to see if that resolves our issues.
Upgraded a few weeks ago from 10.3.18 which did not present any issues, but the secondary cluster node now crashes multiple times per week. We're running a master/master setup (with garb as witness)
Our cluster successfully does SST or IST, but just crashes at random when some locks conflict. This happened every now and then already in the past but never resulted in a crash of the cluster
Submitted two pull requests to fix this issue and MDEV-23851, which appear to be duplicates. Reason for the crashes is recently added too strict assertion, which fires in safe execution. This is fixed by the pull request: https://github.com/MariaDB/server/pull/1723.
There is also a real bug resulting from a more rare scenario, where unique secondary key value is modified in certain way (often caused by REPLACE statement usage), resulting in secondary index gap locking and there happens simultaneous INSERT replicated from other node and inserting into the locked gap. This issue has a fix in pull request: https://github.com/MariaDB/server/pull/1724
two pull requests submitted for this:
https://github.com/MariaDB/server/pull/1723
https://github.com/MariaDB/server/pull/1724
Thanks for your work Seppo, much appreciated.
I'll keep an eye on future releases to see if your fix has been included, so we can update the cluster again to the latest release.
Faced absolutely same issue. Nodes can't perform neither IST nor SST, regardless of how it was started. As a result, cluster members can't start as they can't sync after restart. Crash always refer same line:
Another node:
Running mariadb 10.4.17 with galera-26.4.6 on Gentoo.