Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Incomplete
Affects Version/s: 10.2.29
Fix Version/s: None
Environment:
DB: MariaDB-server-10.2.29-1.el7.centos.x86_64
Galera: galera-25.3.28-1.rhel7.el7.centos.x86_64
OS: CentOS Linux release 7.7.1908 3.10.0-1062.9.1.el7.x86_64
proxysql-2.0.8-1.x86_64
Two-node Galera multi-master cluster with Garb and ProxySQL for write split. Reads go to both DB nodes; writes go to the first DB node.
Description
A Galera node crashes after a single DELETE operation.
The query as recorded in the binary log on DB1:
#200110 13:26:47 server id 1 end_log_pos 61604233 CRC32 0x4802bc88 Annotate_rows:
#Q> DELETE FROM `users` WHERE `id`=579611
#200110 13:26:47 server id 1 end_log_pos 61604314 CRC32 0x8385669b Table_map: `XXX`.`users` mapped to number 93706
# at 61604314
#200110 13:26:47 server id 1 end_log_pos 61604395 CRC32 0xa8239d1b Table_map: `XXX`.`users_profiles` mapped to number 93729
# at 61604395
#200110 13:26:47 server id 1 end_log_pos 61604462 CRC32 0x9f3cacc3 Table_map: `XXX`.`documents` mapped to number 93711
# at 61604462
#200110 13:26:47 server id 1 end_log_pos 61604545 CRC32 0x62acaf45 Table_map: `XXX`.`shop_characteristics` mapped to number 93763
# at 61604545
#200110 13:26:47 server id 1 end_log_pos 61604626 CRC32 0xa235fd75 Table_map: `XXX`.`shop_reviews` mapped to number 93747
# at 61604626
#200110 13:26:47 server id 1 end_log_pos 61604709 CRC32 0x1f8ea18f Table_map: `XXX`.`youtube_videos` mapped to number 93694
# at 61604709
#200110 13:26:47 server id 1 end_log_pos 61604791 CRC32 0xa6bf9d19 Table_map: `XXX`.`shop_categories_characteristics` mapped to number 93764
# at 61604791
#200110 13:26:47 server id 1 end_log_pos 61604869 CRC32 0x32269d26 Table_map: `XXX`.`shop_characteristics_values` mapped to number 93765
# at 61604869
#200110 13:26:47 server id 1 end_log_pos 61604938 CRC32 0xf8ccacea Table_map: `XXX`.`shop_products_values` mapped to number 93746
# at 61604938
#200110 13:26:47 server id 1 end_log_pos 61605124 CRC32 0xe107f6a0 Delete_rows: table id 93706 flags: STMT_END_F
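The binlog excerpt maps nine tables, but its only rows event is a Delete_rows on table id 93706. As a reading aid, here is a minimal Python sketch that extracts the Table_map entries and the rows events from mysqlbinlog-style text. The sample is shortened to two Table_map lines from the excerpt; the regular expressions and the helper name `summarize` are illustrative assumptions based only on the line format shown here. The extra mapped tables may be related to `users` through foreign keys or triggers, but the report does not confirm that.

```python
import re

# Lines copied from the binlog excerpt in this report (shortened to two tables).
BINLOG_EXCERPT = """\
#200110 13:26:47 server id 1 end_log_pos 61604314 CRC32 0x8385669b Table_map: `XXX`.`users` mapped to number 93706
# at 61604314
#200110 13:26:47 server id 1 end_log_pos 61604395 CRC32 0xa8239d1b Table_map: `XXX`.`users_profiles` mapped to number 93729
# at 61604938
#200110 13:26:47 server id 1 end_log_pos 61605124 CRC32 0xe107f6a0 Delete_rows: table id 93706 flags: STMT_END_F
"""

# Patterns are assumptions derived from the line format visible in the excerpt.
TABLE_MAP = re.compile(r"Table_map: `([^`]+)`\.`([^`]+)` mapped to number (\d+)")
ROWS_EVENT = re.compile(r"(Write|Update|Delete)_rows: table id (\d+)")


def summarize(text):
    """Return ({table_id: 'schema.table'}, [(event_kind, table_id), ...])."""
    tables = {}
    touched = []
    for line in text.splitlines():
        match = TABLE_MAP.search(line)
        if match:
            tables[int(match.group(3))] = f"{match.group(1)}.{match.group(2)}"
            continue
        match = ROWS_EVENT.search(line)
        if match:
            touched.append((match.group(1), int(match.group(2))))
    return tables, touched


tables, touched = summarize(BINLOG_EXCERPT)
for kind, table_id in touched:
    # Report which mapped table each rows event modifies.
    print(kind, "on", tables.get(table_id, f"unknown table id {table_id}"))
```

Run against the full excerpt, this would list all nine mapped tables while reporting a single Delete on `XXX.users`.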
While the second DB node is applying this query, mysqld on DB2 crashes with signal 11:
mysql error log
200110 13:51:22 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.2.29-MariaDB-log
key_buffer_size=268435456
read_buffer_size=524288
max_used_connections=0
max_threads=4002
thread_count=15
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4445557 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f16f00009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f18407da910 thread_stack 0x49000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55671cadcf7e]
/usr/sbin/mysqld(handle_fatal_signal+0x30d)[0x55671c56210d]
sigaction.c:0(__restore_rt)[0x7f1846fc25f0]
:0(__memset_sse2)[0x7f18452eca0e]
/usr/sbin/mysqld(+0x8f01b1)[0x55671c7ce1b1]
/usr/sbin/mysqld(+0x832dc2)[0x55671c710dc2]
/usr/sbin/mysqld(+0x9258db)[0x55671c8038db]
/usr/sbin/mysqld(+0x926376)[0x55671c804376]
/usr/sbin/mysqld(+0x9280a9)[0x55671c8060a9]
/usr/sbin/mysqld(+0x8f2c9c)[0x55671c7d0c9c]
/usr/sbin/mysqld(+0x8de02d)[0x55671c7bc02d]
/usr/sbin/mysqld(+0x921b2e)[0x55671c7ffb2e]
/usr/sbin/mysqld(+0x926897)[0x55671c804897]
/usr/sbin/mysqld(+0x9280a9)[0x55671c8060a9]
/usr/sbin/mysqld(+0x8f288b)[0x55671c7d088b]
/usr/sbin/mysqld(+0x82c11c)[0x55671c70a11c]
/usr/sbin/mysqld(_ZN7handler13ha_delete_rowEPKh+0x44b)[0x55671c56d94b]
/usr/sbin/mysqld(_ZN21Delete_rows_log_event11do_exec_rowEP14rpl_group_info+0x15b)[0x55671c66460b]
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEP14rpl_group_info+0x31c)[0x55671c65609c]
/usr/sbin/mysqld(wsrep_apply_cb+0x502)[0x55671c506982]
src/trx_handle.cpp:312(galera::TrxHandle::apply(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_trx_meta const&) const)[0x7f1841a30ee8]
src/replicator_smm.cpp:92(apply_trx_ws(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_cb_status (*)(void*, unsigned int, wsrep_trx_meta const*, bool*, bool), galera::TrxHandle const&, wsrep_trx_meta const&))[0x7f1841a6e063]
src/replicator_smm.cpp:450(galera::ReplicatorSMM::apply_trx(void*, galera::TrxHandle*))[0x7f1841a711bc]
src/gu_mutex.hpp:38(gu::Mutex::unlock() const)[0x7f1841a7ed1e]
src/replicator_str.cpp:751(galera::ReplicatorSMM::request_state_transfer(void*, wsrep_uuid const&, long, void const*, long))[0x7f1841a8025d]
src/replicator_smm.cpp:1483(galera::ReplicatorSMM::process_conf_change(void*, wsrep_view_info const&, int, galera::Replicator::State, long))[0x7f1841a755d6]
src/gcs_action_source.cpp:139(galera::GcsActionSource::dispatch(void*, gcs_action const&, bool&))[0x7f1841a4fbdc]
src/gcs_action_source.cpp:28(~Release)[0x7f1841a50f7c]
src/replicator_smm.cpp:362(galera::ReplicatorSMM::async_recv(void*))[0x7f1841a7497b]
src/wsrep_provider.cpp:271(galera_recv)[0x7f1841a82978]
/usr/sbin/mysqld(+0x629b97)[0x55671c507b97]
/usr/sbin/mysqld(start_wsrep_THD+0x4fb)[0x55671c4f6eeb]
pthread_create.c:0(start_thread)[0x7f1846fbae65]
/lib64/libc.so.6(clone+0x6d)[0x7f184535b88d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f16e001300b): DELETE FROM `users` WHERE `id`=579611
Connection ID (thread ID): 2
Status: NOT_KILLED
mysqld tries to restart but always fails with the same error.
/var/log/messages
Jan 10 13:26:51 cls-db2 kernel: mysqld[17419]: segfault at 42e ip 00007fb934efea0e sp 00007fb6248a52d8 error 6 in libc-2.17.so[7fb934e6f000+1c3000]
Jan 10 13:26:52 cls-db2 systemd: mariadb.service: main process exited, code=killed, status=11/SEGV
Jan 10 13:26:52 cls-db2 systemd: Unit mariadb.service entered failed state.
Jan 10 13:26:52 cls-db2 systemd: mariadb.service failed.
Jan 10 13:26:57 cls-db2 systemd: mariadb.service holdoff time over, scheduling restart.
Jan 10 13:26:57 cls-db2 systemd: Stopped MariaDB 10.2.29 database server.
Jan 10 13:26:57 cls-db2 systemd: Starting MariaDB 10.2.29 database server...
Jan 10 13:27:03 cls-db2 sh: WSREP: Recovered position 4dcee1ec-1f7f-11ea-acd7-e66ec9f0752e:90361997
Jan 10 13:27:03 cls-db2 mysqld: 2020-01-10 13:27:03 140420010002624 [Note] /usr/sbin/mysqld (mysqld 10.2.29-MariaDB-log) starting as process 25738 ...
Jan 10 13:27:06 cls-db2 systemd: Started MariaDB 10.2.29 database server.
Jan 10 13:27:07 cls-db2 kernel: mysqld[25745]: segfault at 42e ip 00007fb612c1fa0e sp 00007fb60c1027e8 error 6 in libc-2.17.so[7fb612b90000+1c3000]
Jan 10 13:27:07 cls-db2 systemd: mariadb.service: main process exited, code=killed, status=11/SEGV
Jan 10 13:27:07 cls-db2 systemd: Unit mariadb.service entered failed state.
Jan 10 13:27:07 cls-db2 systemd: mariadb.service failed.
Jan 10 13:27:12 cls-db2 systemd: mariadb.service holdoff time over, scheduling restart.
Jan 10 13:27:12 cls-db2 systemd: Stopped MariaDB 10.2.29 database server.
Jan 10 13:27:12 cls-db2 systemd: Starting MariaDB 10.2.29 database server...
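Both kernel segfault lines in /var/log/messages report the same faulting address (42e) inside libc-2.17.so. One way to check whether the two crashes hit the same instruction is to subtract the library's load base from the instruction pointer. The Python sketch below does that for the two lines copied from the log; the regular expression and the helper name `library_offset` are assumptions made for illustration, not part of any MariaDB tooling.

```python
import re

# The two kernel lines copied from /var/log/messages in this report.
KERNEL_LINES = [
    "Jan 10 13:26:51 cls-db2 kernel: mysqld[17419]: segfault at 42e ip 00007fb934efea0e "
    "sp 00007fb6248a52d8 error 6 in libc-2.17.so[7fb934e6f000+1c3000]",
    "Jan 10 13:27:07 cls-db2 kernel: mysqld[25745]: segfault at 42e ip 00007fb612c1fa0e "
    "sp 00007fb60c1027e8 error 6 in libc-2.17.so[7fb612b90000+1c3000]",
]

# Pattern is an assumption based on the format of the lines above.
SEGFAULT = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp [0-9a-f]+ "
    r"error (?P<err>\d+) in (?P<lib>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def library_offset(line):
    """Return (library name, offset of the faulting instruction inside it)."""
    match = SEGFAULT.search(line)
    if match is None:
        return None
    offset = int(match.group("ip"), 16) - int(match.group("base"), 16)
    return match.group("lib"), offset


offsets = [library_offset(line) for line in KERNEL_LINES]
for lib, offset in offsets:
    print(lib, hex(offset))
```

Both lines resolve to offset 0x8fa0e in libc-2.17.so, so the crash before and after the restart is at the same libc instruction. The low 12 bits (a0e) also match the `__memset_sse2` frame at 0x7f18452eca0e in the backtrace, which is consistent with the same memset call failing each time.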
mysql config
[mysqld]
server-id=1
bind-address=XXXX
datadir=/var/lib/mysql
user = mysql
socket = /var/lib/mysql/mysql.sock
pid-file = /var/lib/mysql/mysql.pid
collation-server = utf8_unicode_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8
key-buffer-size = 256M
join_buffer_size = 512K
read_buffer_size = 512K
read_rnd_buffer_size = 512K
sort_buffer_size = 512K
myisam-recover-options = FORCE,BACKUP
skip-host-cache
skip-name-resolve
max_connections = 4000
max_allowed_packet = 512M
max_binlog_size = 100M
sysdate-is-now = 1
innodb-autoinc-lock-mode = 2
innodb_autoinc_lock_mode = 2
innodb-doublewrite = 1
innodb_flush_log_at_trx_commit = 0
datadir = /var/lib/mysql
tmp-table-size = 32M
max-heap-table-size = 32M
query_cache_type = 1
query_cache_size = 1M
query_cache_limit = 16M
thread-cache-size = 32
open-files-limit = 65535
table-definition-cache = 4096
table-open-cache = 4096
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 256M
innodb-flush-log-at-trx-commit = 2
innodb-file-per-table = 1
innodb-buffer-pool-size = 6G
innodb_io_capacity = 2000
log-queries-not-using-indexes = 1
slow-query-log = 1
slow_query_log_file = /var/log/mysql/mysql_slow.log
binlog_format = ROW
skip_log_bin
expire_logs_days = 1
log_error = /var/log/mysql/mysql_error.log

[galera]
wsrep_cluster_address = gcomm://XXXX,XXXX
default_storage_engine=InnoDB
wsrep_on=ON
wsrep_node_address = XXX
wsrep_provider = /usr/lib64/galera/libgalera_smm.so
wsrep_slave_threads = 8
wsrep_node_name = XXX
wsrep_sst_method = mariabackup
wsrep_sst_auth = "XXXX"
wsrep_cluster_name = XXX
wsrep_log_conflicts = 1
wsrep_provider_options="gcache.size = 5G"
We are not using MyISAM or the query cache.
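The posted configuration sets several options more than once, in one case with different values (`innodb_flush_log_at_trx_commit = 0` and later `innodb-flush-log-at-trx-commit = 2`). The sketch below is a small Python check for such repeats. It assumes two things about MariaDB option files: that dashes and underscores in option names are interchangeable, and that the last occurrence of an option takes effect. The helper name `find_repeated_options` is hypothetical, and the sample contains only a few lines copied from the configuration. Nothing here implies that the repeats are related to the crash.

```python
from collections import defaultdict

# A few lines copied from the [mysqld] section of the posted configuration.
CONFIG_EXCERPT = """\
[mysqld]
datadir=/var/lib/mysql
innodb-autoinc-lock-mode = 2
innodb_autoinc_lock_mode = 2
innodb_flush_log_at_trx_commit = 0
datadir = /var/lib/mysql
innodb-flush-log-at-trx-commit = 2
skip-name-resolve
"""


def find_repeated_options(text):
    """Map (section, normalized option name) to its values, for repeated options only."""
    seen = defaultdict(list)
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        name, _, value = line.partition("=")
        # Assumption: '-' and '_' are equivalent in option names.
        key = name.strip().replace("-", "_")
        seen[(section, key)].append(value.strip())
    return {key: values for key, values in seen.items() if len(values) > 1}


repeated = find_repeated_options(CONFIG_EXCERPT)
for (section, key), values in repeated.items():
    note = "conflicting values" if len(set(values)) > 1 else "same value"
    print(f"[{section}] {key}: {values} ({note})")
```

Under the last-occurrence assumption, the effective value of `innodb_flush_log_at_trx_commit` on this node would be 2; running `SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit'` on the server would confirm it.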