Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Duplicate
- Affects Version/s: 10.2.35
- Environment: CentOS 7.9
Description
Hello,
Edit: This may be related to MDEV-23851, but we would like confirmation from your side.
We upgraded MariaDB from 10.2.24 to 10.2.35, and the nodes in the cluster started crashing one day after the update. It seems to happen when there is a conflicting lock during a DELETE.
It is a three-node cluster. Any single node may crash a couple of times a day. The whole cluster has also gone down a few times during the last two weeks.
log-node1
2020-12-29 8:40:08 140172980565760 [ERROR] InnoDB: Conflicting lock on table: `$DB`.`$TABLE1` index: GEN_CLUST_INDEX that has lock
RECORD LOCKS space id 945 page no 3 n bits 168 index GEN_CLUST_INDEX of table `$DB`.`$TABLE1` trx id 275420837 lock_mode X locks rec but not gap
Record lock, heap no 2
Record lock, heap no 98
2020-12-29 8:40:08 140172980565760 [ERROR] InnoDB: WSREP state:
2020-12-29 8:40:08 140172980565760 [ERROR] WSREP: Thread BF trx_id: 275420838 thread: 2 seqno: 87923500 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: DELETE FROM process_id
WHERE process_name = 'my-process'
AND process_host = 'my-app.example.com'XÝê_
2020-12-29 8:40:08 140172980565760 [ERROR] WSREP: Thread BF trx_id: 275420837 thread: 10 seqno: 87923499 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: DELETE FROM process_id
WHERE process_name = 'my-process-2'
AND process_host = 'my-app.example.com'XÝê_
2020-12-29 08:40:08 0x7f7c90b6b700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
Stack:
Server version: 10.2.35-MariaDB-log
key_buffer_size=268435456
read_buffer_size=2097152
max_used_connections=15
max_threads=502
thread_count=26
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2329025 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f7c780009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f7c90b6ad20 thread_stack 0x49000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55587dc621ee]
/usr/sbin/mysqld(handle_fatal_signal+0x30d)[0x55587d6ff04d]
/lib64/libpthread.so.0(+0xf630)[0x7f7c9b2c0630]
:0(__GI_raise)[0x7f7c99590387]
:0(__GI_abort)[0x7f7c99591a78]
/usr/sbin/mysqld(+0x44918e)[0x55587d4a418e]
/usr/sbin/mysqld(+0x87cd6d)[0x55587d8d7d6d]
/usr/sbin/mysqld(+0x87d9b4)[0x55587d8d89b4]
/usr/sbin/mysqld(+0x884145)[0x55587d8df145]
/usr/sbin/mysqld(+0x884b2a)[0x55587d8dfb2a]
/usr/sbin/mysqld(+0x91a1ba)[0x55587d9751ba]
/usr/sbin/mysqld(+0x91d48f)[0x55587d97848f]
/usr/sbin/mysqld(+0x849855)[0x55587d8a4855]
/usr/sbin/mysqld(+0x82cbb7)[0x55587d887bb7]
/usr/sbin/mysqld(+0x841cc9)[0x55587d89ccc9]
/usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x1c7)[0x55587d703c37]
/usr/sbin/mysqld(_ZN14Rows_log_event8find_rowEP14rpl_group_info+0x50e)[0x55587d800efe]
/usr/sbin/mysqld(_ZN21Delete_rows_log_event11do_exec_rowEP14rpl_group_info+0x8e)[0x55587d80100e]
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEP14rpl_group_info+0x2fd)[0x55587d7f3e8d]
/usr/sbin/mysqld(wsrep_apply_cb+0x482)[0x55587d6a48c2]
src/trx_handle.cpp:312(galera::TrxHandle::apply(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_trx_meta const&) const)[0x7f7c93d47ef8]
src/replicator_smm.cpp:92(apply_trx_ws(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_cb_status (*)(void*, unsigned int, wsrep_trx_meta const*, bool*, bool), galera::TrxHandle const&, wsrep_trx_meta const&))[0x7f7c93d856f3]
src/replicator_smm.cpp:458(galera::ReplicatorSMM::apply_trx(void*, galera::TrxHandle*))[0x7f7c93d8877c]
src/replicator_smm.cpp:1258(galera::ReplicatorSMM::process_trx(void*, galera::TrxHandle*))[0x7f7c93d8b99e]
src/gcs_action_source.cpp:116(galera::GcsActionSource::dispatch(void*, gcs_action const&, bool&))[0x7f7c93d67078]
src/gcs_action_source.cpp:28(~Release)[0x7f7c93d6876c]
src/replicator_smm.cpp:362(galera::ReplicatorSMM::async_recv(void*))[0x7f7c93d8bf7b]
src/wsrep_provider.cpp:271(galera_recv)[0x7f7c93d99f38]
/usr/sbin/mysqld(+0x64a976)[0x55587d6a5976]
/usr/sbin/mysqld(start_wsrep_THD+0x3eb)[0x55587d698c5b]
pthread_create.c:0(start_thread)[0x7f7c9b2b8ea5]
/lib64/libc.so.6(clone+0x6d)[0x7f7c9965896d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f7c85a75fcb): DELETE FROM process_id
WHERE process_name = 'my-process'
AND process_host = 'my-app.example.com'

Connection ID (thread ID): 2
Status: NOT_KILLED
The following packages are installed on the servers:
galera-25.3.31-1.el7.centos.x86_64
MariaDB-client-10.2.36-1.el7.centos.x86_64
MariaDB-compat-10.2.36-1.el7.centos.x86_64
MariaDB-common-10.2.36-1.el7.centos.x86_64
MariaDB-server-10.2.35-1.el7.centos.x86_64
Taking 29.12.2020 as an example, the monitoring system raised an alarm several times for node1 and node2 with the following message:
wsrep_cluster_status: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock'
On December 29th, the alarm fired for node1 at:
- 08:40am
- 09:40am
- 03:50pm
On the same day, it fired for node2 at:
- 12:40am
- 12:50am
MariaDB crashed more often than the alarms indicate, though, as the error logs show.
20201229-node1-mariadb.err
2020-12-29 08:40:08 0x7f7c90b6b700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 08:40:19 0x7fb11796e700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 09:40:12 0x7f937c61b700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 15:50:19 0x7f06184cf700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 16:30:02 0x7f80c7722700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 21:50:07 0x7f7d9062e700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694

20201229-node2-mariadb.err
2020-12-29 00:40:12 0x7f19f84c6700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 00:50:13 0x7f77084b1700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 00:50:26 0x7febec0bc700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 00:50:38 0x7f43207d5700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 00:50:52 0x7fcc8c395700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 00:51:03 0x7faa98c43700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 09:00:09 0x7f80983b7700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 14:10:11 0x7f2890261700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 22:40:10 0x7f296120d700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694

20201229-node3-mariadb.err
2020-12-29 08:10:06 0x7faa6473d700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 16:40:03 0x7f3d78084700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
2020-12-29 19:50:18 0x7f8944225700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
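
For anyone who wants to tally these crashes from their own error logs, here is a minimal Python sketch. It is not part of the original report. It assumes the assertion lines have the same layout as the excerpts above (date, time, hexadecimal thread id, then "InnoDB: Assertion failure in file ... line N"). The names assertion_failures and count_per_node are hypothetical helpers made up for this illustration.

```python
import re
import sys

# Assumed layout, taken from the excerpts in this report:
# 2020-12-29 08:40:08 0x7f7c90b6b700 InnoDB: Assertion failure in file <path> line 694
ASSERT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2})\s+(\d{1,2}:\d{2}:\d{2})\s+0x[0-9a-f]+\s+"
    r"InnoDB: Assertion failure in file (\S+) line (\d+)"
)


def assertion_failures(lines):
    """Yield (date, time, source_file, line_no) for every assertion line."""
    for line in lines:
        match = ASSERT_RE.match(line.strip())
        if match:
            date, time, path, line_no = match.groups()
            # Pad "8:40:08" to "08:40:08" and keep only the file name.
            yield date, time.zfill(8), path.rsplit("/", 1)[-1], int(line_no)


def count_per_node(logs):
    """Map each node name to its number of assertion failures.

    `logs` maps a node name (hypothetical label) to an iterable of log lines.
    """
    return {
        node: sum(1 for _ in assertion_failures(lines))
        for node, lines in logs.items()
    }


if __name__ == "__main__":
    # Usage: python count_asserts.py node1.err node2.err node3.err
    for path in sys.argv[1:]:
        # errors="replace" tolerates stray non-UTF-8 bytes in the log.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for date, time, src, line_no in assertion_failures(handle):
                print(f"{path}: {date} {time} {src}:{line_no}")
```

The regular expression only matches lines carrying the hexadecimal thread id, so the "[ERROR] InnoDB: Conflicting lock" lines that precede each crash are not counted twice.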
The obfuscated logs from 29.12.2020 for all three nodes are attached.
Is there any known workaround to avoid further crashes? I couldn't find any.
Many thanks in advance.
Issue Links
- relates to MDEV-23851 Galera assertion at lock0lock.cc line 655 because of BF-BF lock wait (Closed)