[MDEV-16510] wsrep_thd.cc:446: void wsrep_create_appliers(long int): Assertion `0' failed, mysqld got signal 6

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Won't Fix
Affects Version/s: 10.3.7
Fix Version/s: N/A
Component/s: Galera
Labels:
None
Environment:
3 Master-Master Galera Nodes, CentOS 7.4

Description

wsrep_thd.cc:446: void wsrep_create_appliers(long int): Assertion `0' failed, mysqld got signal 6

Three master-master Galera nodes were running, synced, and on standby.
After the cluster was restarted, mysqld crashed first on Node 2, and
mysqld then failed to start on Node 1 because of a timeout, although neither SST nor IST was
required or initiated.

Note: logs from all nodes are attached.
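
The statement that no state transfer was needed can be cross-checked against the "Recovered position" lines in the system log excerpts of this report. The following is a minimal sketch, not part of the original report: the helper name `recovered_position` and the regular expression are illustrative assumptions, and the Node2 line comes from its automatic restart after the crash rather than from the first start attempt. Identical positions are consistent with the claim that neither SST nor IST was required, but they do not prove it.

```python
import re

# "Recovered position" lines copied from the system log excerpts in this report.
LOG_LINES = {
    "Node1": "Jun 16 02:27:32 localhost.localdomain sh[874]: WSREP: Recovered position 3c15149f-5766-11e8-9a99-22bc53d40581:26995",
    "Node2": "Jun 16 02:29:18 localhost sh: WSREP: Recovered position 3c15149f-5766-11e8-9a99-22bc53d40581:26995",
    "Node3": "Jun 16 02:29:04 localhost sh: WSREP: Recovered position 3c15149f-5766-11e8-9a99-22bc53d40581:26995",
}

# Hypothetical pattern for illustration: a 36-character UUID, a colon, then a seqno.
POSITION_RE = re.compile(r"Recovered position ([0-9a-f-]{36}):(-?\d+)")


def recovered_position(line):
    """Return (cluster_uuid, seqno) parsed from a log line, or None if absent."""
    match = POSITION_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


positions = {node: recovered_position(line) for node, line in LOG_LINES.items()}
all_equal = len(set(positions.values())) == 1

for node, pos in sorted(positions.items()):
    print(node, pos)
print("identical positions:", all_equal)
```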

Node2

2018-06-16  2:29:06 0 [ERROR] WSREP: Trying to launch slave threads before creating connection at 'gcomm://192.168.104.193,192.168.104.195,192.168.104.196'

mysqld: /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.3.7/sql/wsrep_thd.cc:446: void wsrep_create_appliers(long int): Assertion `0' failed.

180616  2:29:06 [ERROR] mysqld got signal 6 ;

This could be because you hit a bug. It is also possible that this binary

or one of the libraries it was linked against is corrupt, improperly built,

or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help

diagnose the problem, but since we have already crashed,

something is definitely wrong and this may fail.

Server version: 10.3.7-MariaDB

key_buffer_size=134217728

read_buffer_size=131072

max_used_connections=0

max_threads=153

thread_count=7

It is possible that mysqld could use up to

key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467389 K  bytes of memory

Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

stack_bottom = 0x0 thread_stack 0x49000

2018-06-16  2:29:06 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

2018-06-16  2:29:06 2 [Note] WSREP: REPL Protocols: 7 (3, 2)

2018-06-16  2:29:06 2 [Note] WSREP: Assign initial position for certification: 26995, protocol version: 3

2018-06-16  2:29:06 0 [Note] WSREP: Service thread queue flushed.

2018-06-16  2:29:06 2 [Note] WSREP: Synchronized with group, ready for connections

2018-06-16  2:29:06 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

2018-06-16  2:29:06 2 [Note] WSREP: New cluster view: global state: 3c15149f-5766-11e8-9a99-22bc53d40581:26995, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3

2018-06-16  2:29:06 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

2018-06-16  2:29:06 2 [Note] WSREP: New cluster view: global state: 3c15149f-5766-11e8-9a99-22bc53d40581:26995, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 3

2018-06-16  2:29:06 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

2018-06-16  2:29:06 2 [Note] WSREP: applier thread exiting (code:0)

/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55bc6b93d60e]

/usr/sbin/mysqld(handle_fatal_signal+0x357)[0x55bc6b3dc117]

2018-06-16  2:29:08 1 [Note] WSREP: rollbacker thread exiting

sigaction.c:0(__restore_rt)[0x7f05956205e0]

/lib64/libc.so.6(gsignal+0x37)[0x7f0593b2d1f7]

:0(__GI_raise)[0x7f0593b2e8e8]

:0(__GI_abort)[0x7f0593b26266]

:0(__assert_fail_base)[0x7f0593b26312]

/usr/sbin/mysqld(+0x6fb10f)[0x55bc6b35910f]

/usr/sbin/mysqld(_Z11mysqld_mainiPPc+0xd3a)[0x55bc6b153c6a]

/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f0593b19c05]

/usr/sbin/mysqld(+0x4e8e7d)[0x55bc6b146e7d]

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains

information that should help you find out what is causing the crash.

system logs

Node1

Jun 16 02:29:02 localhost systemd: mariadb.service start operation timed out. Terminating.

Jun 16 02:29:15 localhost systemd: Failed to start MariaDB 10.3.7 database server.

Jun 16 02:29:15 localhost systemd: Unit mariadb.service entered failed state.

Jun 16 02:29:15 localhost systemd: mariadb.service failed.

Jun 16 02:29:15 localhost systemd: Reached target Multi-User System.

Jun 16 02:29:15 localhost systemd: Starting Multi-User System.

Jun 16 02:29:15 localhost systemd: Starting Update UTMP about System Runlevel Changes...

Jun 16 02:29:15 localhost systemd: Started Stop Read-Ahead Data Collection 10s After Completed Startup.

Jun 16 02:29:15 localhost systemd: Starting Stop Read-Ahead Data Collection 10s After Completed Startup.

Jun 16 02:29:15 localhost systemd: Started Update UTMP about System Runlevel Changes.

Jun 16 02:29:15 localhost systemd: Startup finished in 941ms (kernel) + 3.558s (initrd) + 2min 32.125s (userspace) = 2min 36.624s.

Node2

Jun 16 02:28:05 localhost systemd: Started Network Manager Script Dispatcher Service.

Jun 16 02:28:05 localhost nm-dispatcher: req:1 'up' [enp3s0]: new request (3 scripts)

Jun 16 02:28:05 localhost nm-dispatcher: req:1 'up' [enp3s0]: start running ordered scripts...

Jun 16 02:28:05 localhost nm-dispatcher: req:2 'hostname': new request (3 scripts)

Jun 16 02:28:05 localhost nm-dispatcher: req:3 'connectivity-change': new request (3 scripts)

Jun 16 02:28:05 localhost nm-dispatcher: req:2 'hostname': start running ordered scripts...

Jun 16 02:28:05 localhost nm-dispatcher: req:3 'connectivity-change': start running ordered scripts...

Jun 16 02:28:10 localhost chronyd[523]: Selected source 87.120.166.8

Jun 16 02:28:57 localhost systemd: mariadb.service start operation timed out. Terminating.

Jun 16 02:29:08 localhost abrt-hook-ccpp: Process 973 (mysqld) of user 996 killed by SIGABRT - dumping core

Jun 16 02:29:10 localhost systemd: mariadb.service: main process exited, code=dumped, status=6/ABRT

Jun 16 02:29:10 localhost systemd: Failed to start MariaDB 10.3.7 database server.

Jun 16 02:29:10 localhost systemd: Unit mariadb.service entered failed state.

Jun 16 02:29:10 localhost systemd: mariadb.service failed.

Jun 16 02:29:10 localhost systemd: Reached target Multi-User System.

Jun 16 02:29:10 localhost systemd: Starting Multi-User System.

Jun 16 02:29:10 localhost systemd: Starting Update UTMP about System Runlevel Changes...

Jun 16 02:29:10 localhost systemd: Started Stop Read-Ahead Data Collection 10s After Completed Startup.

Jun 16 02:29:10 localhost systemd: Starting Stop Read-Ahead Data Collection 10s After Completed Startup.

Jun 16 02:29:10 localhost systemd: Started Update UTMP about System Runlevel Changes.

Jun 16 02:29:11 localhost systemd: Startup finished in 921ms (kernel) + 4.505s (initrd) + 2min 26.990s (userspace) = 2min 32.417s.

Jun 16 02:29:11 localhost abrt-server: Package 'MariaDB-server' isn't signed with proper key

Jun 16 02:29:11 localhost abrt-server: 'post-create' on '/var/spool/abrt/ccpp-2018-06-16-02:29:08-973' exited with 1

Jun 16 02:29:11 localhost abrt-server: Deleting problem directory '/var/spool/abrt/ccpp-2018-06-16-02:29:08-973'

Jun 16 02:29:16 localhost systemd: mariadb.service holdoff time over, scheduling restart.

Jun 16 02:29:16 localhost systemd: Starting MariaDB 10.3.7 database server...

Jun 16 02:29:18 localhost sh: WSREP: Recovered position 3c15149f-5766-11e8-9a99-22bc53d40581:26995

Jun 16 02:29:18 localhost mysqld: 2018-06-16  2:29:18 0 [Note] /usr/sbin/mysqld (mysqld 10.3.7-MariaDB) starting as process 1197 ...

Jun 16 02:29:19 localhost systemd: Started MariaDB 10.3.7 database server.

Node3

Jun 16 02:29:04 localhost sh: WSREP: Recovered position 3c15149f-5766-11e8-9a99-22bc53d40581:26995

Jun 16 02:29:04 localhost mysqld: 2018-06-16  2:29:04 0 [Note] /usr/sbin/mysqld (mysqld 10.3.7-MariaDB) starting as process 1038 ...

Jun 16 02:29:06 localhost systemd: Started MariaDB 10.3.7 database server.

Node1

[root@t4w3 ~]# systemctl status mariadb.service

● mariadb.service - MariaDB 10.3.7 database server

   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)

  Drop-In: /etc/systemd/system/mariadb.service.d

           └─migrated-from-my.cnf-settings.conf

   Active: failed (Result: timeout) since Sat 2018-06-16 02:29:15 EEST; 2 days ago

     Docs: man:mysqld(8)

           https://mariadb.com/kb/en/library/systemd/

  Process: 985 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=0/SUCCESS)

  Process: 874 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)

  Process: 871 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)

 Main PID: 985 (code=exited, status=0/SUCCESS)

   Status: "MariaDB server is down"

Jun 16 02:27:27 localhost.localdomain systemd[1]: Starting MariaDB 10.3.7 database server...

Jun 16 02:27:32 localhost.localdomain sh[874]: WSREP: Recovered position 3c15149f-5766-11e8-9a99-22bc53d40581:26995

Jun 16 02:27:32 localhost.localdomain mysqld[985]: 2018-06-16  2:27:32 0 [Note] /usr/sbin/mysqld (mysqld 10.3.7-MariaDB) starting as process 985 ...

Jun 16 02:29:02 t4w3.xentio.lan systemd[1]: mariadb.service start operation timed out. Terminating.

Jun 16 02:29:15 t4w3.xentio.lan systemd[1]: Failed to start MariaDB 10.3.7 database server.

Jun 16 02:29:15 t4w3.xentio.lan systemd[1]: Unit mariadb.service entered failed state.

Jun 16 02:29:15 t4w3.xentio.lan systemd[1]: mariadb.service failed.
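
To make the Node1 timeout easier to read, the intervals between the journal entries above can be computed directly. This is a small sketch added for illustration: the event labels and the helper `seconds_between` are assumptions, the year is taken from the mysqld log lines, and the unit's configured start timeout is not shown in the report, so the result only describes what the timestamps say.

```python
from datetime import datetime

# Timestamps copied from the Node1 journal excerpt; the year comes from the mysqld lines.
FMT = "%Y %b %d %H:%M:%S"
EVENTS = [
    ("unit start requested", "2018 Jun 16 02:27:27"),
    ("mysqld process starting", "2018 Jun 16 02:27:32"),
    ("start operation timed out", "2018 Jun 16 02:29:02"),
    ("unit marked failed", "2018 Jun 16 02:29:15"),
]


def seconds_between(start, end):
    """Whole seconds elapsed between two timestamps in FMT."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


times = dict(EVENTS)
request_to_timeout = seconds_between(
    times["unit start requested"], times["start operation timed out"]
)
startup_to_timeout = seconds_between(
    times["mysqld process starting"], times["start operation timed out"]
)
timeout_to_failed = seconds_between(
    times["start operation timed out"], times["unit marked failed"]
)

print("start request -> timeout:", request_to_timeout, "s")
print("mysqld starting -> timeout:", startup_to_timeout, "s")
print("timeout -> failed state:", timeout_to_failed, "s")
```

The interval between the mysqld start line and the timeout message is the figure to compare against whatever start timeout is configured for the unit on the affected host.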

Attachments


logs_Node_1.txt
22 kB
2018-06-18 11:34
logs_Node_2.txt
27 kB
2018-06-18 11:34
logs_Node_3.txt
20 kB
2018-06-18 11:34
