[MDEV-9510] Segmentation fault in binlog thread causes crash Created: 2016-02-03  Updated: 2020-08-25  Resolved: 2017-11-15

Status: Closed
Project: MariaDB Server
Component/s: Galera, Replication
Affects Version/s: 10.1.10, 10.1.20, 10.1.23, 10.2
Fix Version/s: 10.1.30, 10.2.11

Type: Bug Priority: Critical
Reporter: Geoff Montee (Inactive) Assignee: Andrei Elkin
Resolution: Fixed Votes: 14
Labels: binlog, replication

Issue Links:
Duplicate
duplicates MDEV-14395 171114 7:17:09 [ERROR] mysqld got si... Closed
is duplicated by MDEV-9901 Mariadb crashes Closed
is duplicated by MDEV-13058 MariaDB crashing Closed
is duplicated by MDEV-14521 [ERROR] mysqld got signal 11 Closed
is duplicated by MDEV-15307 MariaDB 10.1 crashing on binlog switc... Closed
PartOf
is part of MDEV-9960 Binary log not rotating Closed
Relates
relates to MDEV-9498 MariaDB server with Galera replicatio... Closed
relates to MDEV-10301 Signal 11 crash at random times Closed
relates to MDEV-10501 MariaDB crash Closed
relates to MDEV-13747 MariaDB 10.1.21 server sudden crash Closed
relates to MDEV-10196 Binlog checkpoint segfault Closed
relates to MDEV-10276 Crash signal 11 during logrotate Closed
relates to MDEV-10601 Mariadb 10.1.16 got signal 11 Closed
relates to MDEV-10645 Crash signal 11 during SST on DONOR node Closed
relates to MDEV-10656 Got Signal 11 on one node randomly ev... Closed
relates to MDEV-10809 Logrotates flush-logs process never ends Closed
relates to MDEV-11053 [ERROR] mysqld got signal 11 ; Closed
relates to MDEV-11595 MariaDB server cluster regularly stop... Closed
relates to MDEV-11860 Signal 11 - Buffer overflow detected Closed
Sprint: 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3, 10.3.1-2, 10.2.10, 10.1.29, 10.2.11

 Description   

A user saw this crash in MariaDB 10.1.10 on Ubuntu:

160201 21:22:07 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.1.10-MariaDB-1~trusty-log
key_buffer_size=134217728
read_buffer_size=2097152
max_used_connections=30
max_threads=102
thread_count=15
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 759823 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7f1818883008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f182b2dce10 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f182acf3a6e]
/usr/sbin/mysqld(handle_fatal_signal+0x38d)[0x7f182a82216d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7f1828d7c340]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0xc7)[0x7f182a8e2cb7]
/usr/sbin/mysqld(binlog_background_thread+0x2b5)[0x7f182a8e3275]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f1828d74182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f182849747d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): is an invalid pointer
Connection ID (thread ID): 3
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

I will try to get more information.



 Comments   
Comment by Geoff Montee (Inactive) [ 2016-02-03 ]

The user also saw this crash that may be related:

160203  7:48:06 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.1.10-MariaDB-1~trusty-log
key_buffer_size=134217728
read_buffer_size=2097152
max_used_connections=29
max_threads=102
thread_count=8
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 759823 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7f2f030dd008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f2f3bd98df0 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f2f49924a6e]
/usr/sbin/mysqld(handle_fatal_signal+0x38d)[0x7f2f4945316d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7f2f479ad340]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG21do_checkpoint_requestEm+0x9d)[0x7f2f49513edd]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG20checkpoint_and_purgeEm+0x11)[0x7f2f49513f71]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG16rotate_and_purgeEb+0xc2)[0x7f2f495164c2]
/usr/sbin/mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x12c)[0x7f2f493bc7dc]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1327)[0x7f2f492d3e57]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x20d)[0x7f2f492dc1cd]
/usr/sbin/mysqld(+0x4369d9)[0x7f2f492dc9d9]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1e1b)[0x7f2f492deecb]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x169)[0x7f2f492dfca9]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x18a)[0x7f2f493a292a]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x7f2f493a2b00]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f2f479a5182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f2f470c847d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f2f04938020): is an invalid pointer
Connection ID (thread ID): 1283552
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

Comment by Elena Stepanova [ 2016-02-03 ]

support issue

Comment by Lixun Peng [ 2016-05-11 ]

Hi!

If these two crashes are related, I think the problem may be that binlog_xid_count_list contains a NULL or otherwise invalid entry.
I checked MYSQL_BIN_LOG::do_checkpoint_request(); I think only the "entry->binlog_id" access can trigger this crash, and TC_LOG_BINLOG::mark_xid_done() also accesses "b->binlog_id".
So I suspect there is an invalid xid_count_per_binlog pointer in binlog_xid_count_list.
But since we have no core file, I can't verify it.
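With a core file, the suspected invalid xid_count_per_binlog entry could be checked directly in gdb. A hypothetical session sketch (it assumes debug symbols are installed; the global instance name mysql_bin_log and the member name binlog_xid_count_list are taken from the 10.1 sources and may differ between versions):

```text
gdb /usr/sbin/mysqld /path/to/core
(gdb) bt                                    # frame where mark_xid_done dereferences the entry
(gdb) thread apply all bt                   # look for a concurrent checkpoint/rotate thread
(gdb) print mysql_bin_log.binlog_xid_count_list
```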

Comment by Elena Stepanova [ 2016-06-09 ]

The crash was actually happening on a Galera cluster node; and we got another report with the stack trace identical to above, MDEV-10196, which also affects a Galera cluster. So, the suspicion is Galera might be an important factor here.

Reassigning to nirbhay_c to take a look.

Comment by Elena Stepanova [ 2016-06-23 ]

Yet another one, MDEV-10276.

Comment by Elena Stepanova [ 2016-06-29 ]

And another one, MDEV-10301

Comment by YuanPeng [ 2016-07-29 ]

This is a very serious issue!!! Last month, we upgraded Galera from 10.0.x to 10.1.14 in our production environment, and it has already crashed 3 times so far, which never happened on 10.0.x. Every crash is an inexplicable signal 11 error. There are several related issues that look the same. We want to wait for a fix version, but why is this serious issue still "open and unresolved" after this long?

Comment by Nirbhay Choubey (Inactive) [ 2016-07-29 ]

ypnuaa037 That's because we haven't been able to reproduce this issue on our end.
Would it be possible for you to run a debug build and report a better stacktrace
if the crash happens again?
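For reporters wanting to capture the requested diagnostics, one common way to make mysqld write a core dump is a config fragment along these lines (the file name is hypothetical, and the kernel's core_pattern must point somewhere writable by the mysql user):

```ini
# /etc/mysql/conf.d/coredump.cnf  (hypothetical file name)
[mysqld]
core-file

[mysqld_safe]
core-file-size = unlimited
```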

Comment by Florian Engelmann [ 2016-12-07 ]

Ubuntu 14.04.4
libc6:amd64 2.19-0ubuntu6.9
mariadb-server-10.1 10.1.14+maria-1~trusty
mariadb-server-core-10.1 10.1.14+maria-1~trusty

5-node MariaDB Galera cluster; it sometimes crashes while log rotation runs

syslog:

Dec 7 02:16:55 galera04 mysqld: 161207 2:16:55 [ERROR] mysqld got signal 11 ;
Dec 7 02:16:55 galera04 mysqld: This could be because you hit a bug. It is also possible that this binary
Dec 7 02:16:55 galera04 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
Dec 7 02:16:55 galera04 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
Dec 7 02:16:55 galera04 mysqld:
Dec 7 02:16:55 galera04 mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Dec 7 02:16:55 galera04 mysqld:
Dec 7 02:16:55 galera04 mysqld: We will try our best to scrape up some info that will hopefully help
Dec 7 02:16:55 galera04 mysqld: diagnose the problem, but since we have already crashed,
Dec 7 02:16:55 galera04 mysqld: something is definitely wrong and this may fail.
Dec 7 02:16:55 galera04 mysqld:
Dec 7 02:16:55 galera04 mysqld: Server version: 10.1.14-MariaDB-1~trusty
Dec 7 02:16:55 galera04 mysqld: key_buffer_size=134217728
Dec 7 02:16:55 galera04 mysqld: read_buffer_size=2097152
Dec 7 02:16:55 galera04 mysqld: max_used_connections=14
Dec 7 02:16:55 galera04 mysqld: max_threads=262
Dec 7 02:16:55 galera04 mysqld: thread_count=19
Dec 7 02:16:55 galera04 mysqld: It is possible that mysqld could use up to
Dec 7 02:16:55 galera04 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2819298 K bytes of memory
Dec 7 02:16:55 galera04 mysqld: Hope that's ok; if not, decrease some variables in the equation.

caused by logrotation:
/var/log/mysql.log /var/log/mysql/mysql.log /var/log/mysql/mariadb-slow.log

{
        daily
        rotate 7
        missingok
        create 640 mysql adm
        compress
        sharedscripts
        postrotate
                test -x /usr/bin/mysqladmin || exit 0
                if [ -f `my_print_defaults --mysqld | grep -oP "pid-file=\K[^$]+"` ]; then
                        # If this fails, check debian.conf!
                        mysqladmin --defaults-file=/etc/mysql/debian.cnf flush-logs
                fi
        endscript
}

Comment by Vincent Enjalbert [ 2016-12-19 ]

Ubuntu 14.04 LTS
Mariadb 10.1.19
3 nodes
Main app running : Magento1.9.3.1

Dec 18 13:48:22 mariadb1 mysqld: 161218 13:48:22 [ERROR] mysqld got signal 11 ;
Dec 18 13:48:22 mariadb1 mysqld: This could be because you hit a bug. It is also possible that this binary
Dec 18 13:48:22 mariadb1 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
Dec 18 13:48:22 mariadb1 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
Dec 18 13:48:22 mariadb1 mysqld:
Dec 18 13:48:22 mariadb1 mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Dec 18 13:48:22 mariadb1 mysqld:
Dec 18 13:48:22 mariadb1 mysqld: We will try our best to scrape up some info that will hopefully help
Dec 18 13:48:22 mariadb1 mysqld: diagnose the problem, but since we have already crashed,
Dec 18 13:48:22 mariadb1 mysqld: something is definitely wrong and this may fail.
Dec 18 13:48:22 mariadb1 mysqld:
Dec 18 13:48:22 mariadb1 mysqld: Server version: 10.1.19-MariaDB-1~trusty
Dec 18 13:48:22 mariadb1 mysqld: key_buffer_size=134217728
Dec 18 13:48:22 mariadb1 mysqld: read_buffer_size=2097152
Dec 18 13:48:22 mariadb1 mysqld: max_used_connections=14
Dec 18 13:48:22 mariadb1 mysqld: max_threads=302
Dec 18 13:48:22 mariadb1 mysqld: thread_count=11
Dec 18 13:48:22 mariadb1 mysqld: It is possible that mysqld could use up to
Dec 18 13:48:22 mariadb1 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1992742 K  bytes of memory
Dec 18 13:48:22 mariadb1 mysqld: Hope that's ok; if not, decrease some variables in the equation.
Dec 18 13:48:22 mariadb1 mysqld:
Dec 18 13:48:22 mariadb1 mysqld: Thread pointer: 0x0x7efd58812008
Dec 18 13:48:22 mariadb1 mysqld: Attempting backtrace. You can use the following information to find out
Dec 18 13:48:22 mariadb1 mysqld: where mysqld died. If you see no messages after this, something went
Dec 18 13:48:22 mariadb1 mysqld: terribly wrong...
Dec 18 13:48:22 mariadb1 mysqld: stack_bottom = 0x7eff93b6c210 thread_stack 0x48400
Dec 18 13:48:22 mariadb1 mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7eff946dd67e]
Dec 18 13:48:22 mariadb1 mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x2d5)[0x7eff94200ba5]
Dec 18 13:48:22 mariadb1 mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7eff92755330]
Dec 18 13:48:22 mariadb1 mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0xe8)[0x7eff942c2e88]
Dec 18 13:48:22 mariadb1 mysqld: /usr/sbin/mysqld(binlog_background_thread+0x2c5)[0x7eff942c3525]
Dec 18 13:48:22 mariadb1 mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7eff9274d184]
Dec 18 13:48:22 mariadb1 mysqld: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7eff91c6c37d]
Dec 18 13:48:22 mariadb1 mysqld:
Dec 18 13:48:22 mariadb1 mysqld: Trying to get some variables.
Dec 18 13:48:22 mariadb1 mysqld: Some pointers may be invalid and cause the dump to abort.
Dec 18 13:48:22 mariadb1 mysqld: Query (0x0): is an invalid pointer
Dec 18 13:48:22 mariadb1 mysqld: Connection ID (thread ID): 3
Dec 18 13:48:22 mariadb1 mysqld: Status: NOT_KILLED
Dec 18 13:48:22 mariadb1 mysqld:
Dec 18 13:48:22 mariadb1 mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
Dec 18 13:48:22 mariadb1 mysqld:
Dec 18 13:48:22 mariadb1 mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Dec 18 13:48:22 mariadb1 mysqld: information that should help you find out what is causing the crash.
Dec 18 13:48:22 mariadb1 mysqld:
Dec 18 13:48:22 mariadb1 mysqld: We think the query pointer is invalid, but we will try to print it anyway.
Dec 18 13:48:22 mariadb1 mysqld: Query:
Dec 18 13:48:22 mariadb1 mysqld:
Dec 18 13:48:22 mariadb1 mysqld_safe: Number of processes running now: 0
Dec 18 13:48:22 mariadb1 mysqld_safe: WSREP: not restarting wsrep node automatically
Dec 18 13:48:22 mariadb1 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended

I don't know how to find the cause; it seems quite random. How can I help you fix this?

Comment by Florian Engelmann [ 2017-01-25 ]

I uploaded a core dump to MDEV-10809, which seems to be related to MDEV-9510.

Has anybody solved this issue?

Comment by Florian Engelmann [ 2017-01-25 ]

Affected versions are at least (tested by us) 10.1.10, 10.1.14, 10.1.19 and 10.1.20.

Comment by Florian Engelmann [ 2017-01-26 ]

@eVincent, GeoffMontee, plinux

Do you use HAproxy to do loadbalancing to the galera cluster? How is your healthcheck implemented?

Comment by Benjamin MALYNOVYTCH [ 2017-01-26 ]

@engel75: I manage the infrastructure mentioned by eVincent.
We use a simple "mysql-check" with a sort of static one-to-one binding between web servers and MariaDB servers, like so:

  • web1 => mariadb1 with failover on mariadb2 & 3
  • web2 => mariadb2 with failover on mariadb1 & 3
  • web3 => mariadb3 with failover on mariadb1 & 2

Here is the HAProxy snippet:

haproxy.cfg

listen mariadb_cluster
  bind *:3307
  balance source
  mode tcp
  option tcpka
  option mysql-check user haproxy
  server mariadb1 mariadb1:3306 check
  server mariadb2 mariadb2:3306 check backup
  server mariadb3 mariadb3:3306 check backup

Benjamin

PS: downgrading to MariaDB 5.5 + Galera Cluster seemed to fix the issue (no crashes since the last startup of the cluster, more than 24h ago)

Comment by Florian Engelmann [ 2017-01-26 ]

Thank you Benjamin. I asked because after disabling the HAProxy health checks on one of our nodes, that node did not have a stuck "flush logs" this morning. I will test this with two nodes until Monday to prove it. As mysql-check does not check whether a node is "Synced", we use a shell script invoked by xinetd to check the status:

show global status where variable_name='wsrep_local_state';

We use 3 HAProxy load balancers (clustered via keepalived) to route traffic to the cluster. That's a lot of incoming checks. We might put a Perl daemon in between, as a kind of man in the middle, to handle those requests.
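A minimal sketch of the kind of xinetd-invoked health check described above: HAProxy should only route to nodes where wsrep_local_state is 4 (Synced). This is a hypothetical illustration, not the script from this ticket; the mysql call is shown commented out so the logic can be exercised without a running server.

```shell
#!/bin/sh
# Hypothetical Galera health check: report "synced" only when
# wsrep_local_state is 4 (1=Joining, 2=Donor/Desynced, 3=Joined, 4=Synced).

is_synced() {
  [ "$1" = "4" ]
}

# In production, something like:
#   state=$(mysql -N -B -e \
#     "SHOW GLOBAL STATUS WHERE variable_name='wsrep_local_state'" \
#     | awk '{print $2}')
state="${1:-4}"

if is_synced "$state"; then
  echo "synced"
else
  echo "not synced"
fi
```

A donor node (state 2) would then be taken out of rotation automatically, which also avoids routing traffic to a node busy with an SST.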

Comment by Florian Engelmann [ 2017-01-30 ]

The stuck "flush logs" problem is not caused by the health checks themselves. It seems to be load-related: only the nodes that receive traffic (because their HAProxy health checks are enabled) get stuck in "flush logs".

Comment by Florian Engelmann [ 2017-02-01 ]

Is anybody working on this one? We are still forced to "clean up" the production Galera cluster every morning.

Since "service mysql stop" gets stuck as well, we have to use "kill -9":

echo disable haproxy health check
sed -i -e "s/^#echo disabled/echo disabled/" -e "s/^#exit 0/exit 0/" /usr/local/bin/mysqlchk.sh
 
echo wait 10s
sleep 10
 
echo stop mysql
timeout 5s service mysql stop
 
echo send TERM
kill $(pidof mysqld)
 
echo wait 5s
sleep 5
 
echo send KILL
kill -9 $(pidof mysqld)
 
sleep 1
 
echo check mysql is gone
ps faux | grep mysql
 
 
sleep 2
 
echo start mysql
service mysql start
 
sleep 6s
 
echo check galera is synced
mysql -e "SHOW STATUS WHERE variable_name ='wsrep_local_state_comment';"
 
echo enable haproxy health check
sed -i -e "s/^echo disabled/#echo disabled/" -e "s/^exit 0/#exit 0/" /usr/local/bin/mysqlchk.sh

We are considering moving to Percona XtraDB Cluster if this issue is not treated with the highest priority. Could somebody please let everyone know what's going on?

Comment by Benjamin MALYNOVYTCH [ 2017-02-01 ]

As a reminder, if downgrading to MariaDB 5.5 is an option, it is stable.

Comment by Nirbhay Choubey (Inactive) [ 2017-02-01 ]

engel75 yes, I am looking into it. Unfortunately, the core dump that you submitted did
not help, and so far I have not been able to repeat this issue on my end.

Comment by Florian Engelmann [ 2017-02-06 ]

Dear nirbhay_c,

I am glad you are working on this issue. Today xtrabackup crashed one node of that cluster with an error that might be related to the same issue:

Feb  6 09:50:23 galera01-customerA-prod mysqld: 2017-02-06  9:50:23 140112419612416 [Note] WSREP: Member 3.0 (galera01-customerA-prod-back) desyncs itself from group
Feb  6 09:50:23 galera01-customerA-prod mysqld: 2017-02-06  9:50:23 140112419612416 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 123190807)
Feb  6 09:50:23 galera01-customerA-prod mysqld: 2017-02-06  9:50:23 140112468671232 [Note] WSREP: Provider paused at 202dc465-47c0-11e4-b99e-9b1f0ffcf741:123190807 (440)
Feb  6 09:50:23 galera01-customerA-prod mysqld: 170206  9:50:23 [ERROR] mysqld got signal 11 ;
Feb  6 09:50:23 galera01-customerA-prod mysqld: This could be because you hit a bug. It is also possible that this binary
Feb  6 09:50:23 galera01-customerA-prod mysqld: or one of the libraries it was linked against is corrupt, improperly built,
Feb  6 09:50:23 galera01-customerA-prod mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
Feb  6 09:50:23 galera01-customerA-prod mysqld: 
Feb  6 09:50:23 galera01-customerA-prod mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Feb  6 09:50:23 galera01-customerA-prod mysqld: 
Feb  6 09:50:23 galera01-customerA-prod mysqld: We will try our best to scrape up some info that will hopefully help
Feb  6 09:50:23 galera01-customerA-prod mysqld: diagnose the problem, but since we have already crashed, 
Feb  6 09:50:23 galera01-customerA-prod mysqld: something is definitely wrong and this may fail.
Feb  6 09:50:23 galera01-customerA-prod mysqld: 
Feb  6 09:50:23 galera01-customerA-prod mysqld: Server version: 10.1.20-MariaDB-1~trusty
Feb  6 09:50:23 galera01-customerA-prod mysqld: key_buffer_size=134217728
Feb  6 09:50:23 galera01-customerA-prod mysqld: read_buffer_size=2097152
Feb  6 09:50:23 galera01-customerA-prod mysqld: max_used_connections=13
Feb  6 09:50:23 galera01-customerA-prod mysqld: max_threads=262
Feb  6 09:50:23 galera01-customerA-prod mysqld: thread_count=21
Feb  6 09:50:23 galera01-customerA-prod mysqld: It is possible that mysqld could use up to 
Feb  6 09:50:23 galera01-customerA-prod mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2819312 K  bytes of memory
Feb  6 09:50:23 galera01-customerA-prod mysqld: Hope that's ok; if not, decrease some variables in the equation.
Feb  6 09:50:23 galera01-customerA-prod mysqld: 
Feb  6 09:50:23 galera01-customerA-prod mysqld: Thread pointer: 0x0x7f6c47ad8008
Feb  6 09:50:23 galera01-customerA-prod mysqld: Attempting backtrace. You can use the following information to find out
Feb  6 09:50:23 galera01-customerA-prod mysqld: where mysqld died. If you see no messages after this, something went
Feb  6 09:50:23 galera01-customerA-prod mysqld: terribly wrong...
Feb  6 09:50:23 galera01-customerA-prod mysqld: stack_bottom = 0x7f6e79ec71f0 thread_stack 0x48400
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f6fbf90bc2e]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x305)[0x7f6fbf42ea95]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f6fbd779330]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG21do_checkpoint_requestEm+0x9d)[0x7f6fbf4f0aad]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG20checkpoint_and_purgeEm+0x11)[0x7f6fbf4f0b41]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG16rotate_and_purgeEb+0xc2)[0x7f6fbf4f31b2]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x130)[0x7f6fbf396f10]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1314)[0x7f6fbf2a9cd4]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x331)[0x7f6fbf2b22a1]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(+0x439ac9)[0x7f6fbf2b2ac9]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1e2b)[0x7f6fbf2b4fcb]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z10do_commandP3THD+0x169)[0x7f6fbf2b5dc9]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x18a)[0x7f6fbf37cefa]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(handle_one_connection+0x40)[0x7f6fbf37d0d0]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7f6fbd771184]
Feb  6 09:50:23 galera01-customerA-prod mysqld: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f6fbcc9037d]
Feb  6 09:50:23 galera01-customerA-prod mysqld: 
Feb  6 09:50:23 galera01-customerA-prod mysqld: Trying to get some variables.
Feb  6 09:50:23 galera01-customerA-prod mysqld: Some pointers may be invalid and cause the dump to abort.
Feb  6 09:50:23 galera01-customerA-prod mysqld: Query (0x7f6c33021020): FLUSH BINARY LOGS
Feb  6 09:50:23 galera01-customerA-prod mysqld: Connection ID (thread ID): 284
Feb  6 09:50:23 galera01-customerA-prod mysqld: Status: NOT_KILLED
Feb  6 09:50:23 galera01-customerA-prod mysqld: 
Feb  6 09:50:23 galera01-customerA-prod mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
Feb  6 09:50:23 galera01-customerA-prod mysqld: 
Feb  6 09:50:23 galera01-customerA-prod mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Feb  6 09:50:23 galera01-customerA-prod mysqld: information that should help you find out what is causing the crash.
Feb  6 09:50:23 galera01-customerA-prod mysqld: Writing a core file

Without disabling the node's HAProxy health check I was not able to finish the backup successfully.
I have two core dumps related to the issue:

-rw------- 1 mysql mysql 9.6G Feb 6 09:46 core.mysqld.49551.11.1486370801
-rw------- 1 mysql mysql 9.5G Feb 6 09:50 core.mysqld.9877.11.1486371023

Should I upload both?

Comment by Nathan Madpsy [ 2017-02-24 ]

I was hit by this last night - 1 out of 3 nodes in a Galera cluster.

At that time the cluster was not doing anything unusual and in fact should have been relatively quiescent.

Version 10.1.18 on Ubuntu 16.04

There is nothing else in the logs before this time.

Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: 170224  1:26:33 [ERROR] mysqld got signal 11 ;
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: This could be because you hit a bug. It is also possible that this binary
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: or one of the libraries it was linked against is corrupt, improperly built,
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: or misconfigured. This error can also be caused by malfunctioning hardware.
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: We will try our best to scrape up some info that will hopefully help
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: diagnose the problem, but since we have already crashed,
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: something is definitely wrong and this may fail.
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Server version: 10.1.18-MariaDB-1~xenial
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: key_buffer_size=134217728
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: read_buffer_size=2097152
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: max_used_connections=13
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: max_threads=202
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: thread_count=7
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: It is possible that mysqld could use up to
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1376288 K  bytes of memory
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Hope that's ok; if not, decrease some variables in the equation.
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Thread pointer: 0x0x7f58c3849008
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Attempting backtrace. You can use the following information to find out
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: where mysqld died. If you see no messages after this, something went
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: terribly wrong...
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: stack_bottom = 0x7f595dc5a0a8 thread_stack 0x48400
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55638521562e]
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /usr/sbin/mysqld(handle_fatal_signal+0x2d5)[0x556384d643d5]
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x113d0)[0x7f595c3cc3d0]
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0x8b)[0x556384e1931b]
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /usr/sbin/mysqld(binlog_background_thread+0x2f3)[0x556384e198f3]
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /usr/sbin/mysqld(+0x705fb1)[0x556384ed4fb1]
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76fa)[0x7f595c3c26fa]
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f595ba6db5d]
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Trying to get some variables.
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Some pointers may be invalid and cause the dump to abort.
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Query (0x0):
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Connection ID (thread ID): 3
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Status: NOT_KILLED
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: information that should help you find out what is causing the crash.
Feb 24 01:26:33 ams-prod1-db01 systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
Feb 24 01:26:33 ams-prod1-db01 systemd[1]: mariadb.service: Unit entered failed state.
Feb 24 01:26:33 ams-prod1-db01 systemd[1]: mariadb.service: Failed with result 'exit-code'.

Comment by Nathan Madpsy [ 2017-03-02 ]

This has recurred twice since my last comment. To restart the process automatically, I have resorted to setting the following in MariaDB's systemd config:

Restart=on-failure

Originally it is set to 'on-abort', but it seems that with a signal 11, which is what this bug raises, the default value will not restart the process. It's not pretty and it's not advised, but at least it provides an easy way to ensure the daemon is restarted after hitting this bug.
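One way to apply that setting without editing the packaged unit file in place is a systemd drop-in; the file path below is the conventional override location, not something from this ticket:

```ini
# /etc/systemd/system/mariadb.service.d/restart.conf
[Service]
Restart=on-failure
RestartSec=5s
```

followed by "systemctl daemon-reload" so systemd picks up the override.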

Comment by Daniel Black [ 2017-03-15 ]

madpsy, FYI: the systemd configuration related to restarts/aborts has been fixed in MDEV-10382 (10.1.20).

Comment by Nathan Madpsy [ 2017-03-22 ]

Reading some of the other related bugs, there are comments that disabling binary logs prevents the crashes (which makes sense). I thought binary logging was somewhat essential for Galera, so the simple question is:

Can binary logging be disabled, and if so, what does that mean for the cluster?

Thanks

Comment by Florian Engelmann [ 2017-03-22 ]

See https://www.percona.com/blog/2015/07/16/bypassing-sst-pxc-binary-logs/

Binary logs are not needed for Galera to operate, but disabling them carries some risk...

Comment by Nathan Madpsy [ 2017-03-22 ]

Thanks. It sounds to me like binary logs are only required for point-in-time backups, and are needed neither for Galera to operate nor for the xtrabackup SST method.

Assuming one has backups of individual DBs, is there still any risk involved in disabling the binary log? What I don't want is for an SST to be triggered just because one of the Galera members restarts.
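To make the workaround discussed above concrete, here is a hedged my.cnf sketch for running a Galera node without the binary log (paths and values are illustrative, not from this thread; Galera replicates through its own writesets, so only point-in-time recovery and async replication slaves depend on log_bin):

```ini
[mysqld]
# Comment out (or omit) log_bin to disable binary logging entirely:
# log_bin = /var/log/mysql/mariadb-bin
wsrep_on = ON
binlog_format = ROW   # Galera still requires row-based format even without log_bin
```

Note that disabling log_bin does not by itself trigger an SST; an SST happens when a joining node's state is too far behind what the donor's gcache can serve via IST.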

Comment by Stefan Midjich [ 2017-03-23 ]

I'm one of the people who disabled binary logs in their MariaDB 10.x cluster to work around this bug.

It has worked fine for me; I use xtrabackup to take incremental backups on another server. I've done restore tests of these backups and I'm satisfied.

Comment by Nathan Madpsy [ 2017-03-24 ]

Thanks for the info. I've now disabled binary logging and will monitor. Of course that's a workaround which happens to be satisfactory for some, but this bug really needs to be fixed (it's over a year old!).

Comment by Stijn [ 2017-06-26 ]

We also hit this bug twice in production on a 10.1.22 server (no replication/cluster). The mysqld service did not restart automatically. The second crash caused a table to become corrupted.

170626 10:22:06 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.1.22-MariaDB
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=227
max_threads=1502
thread_count=133
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3430253 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x40000
*** buffer overflow detected ***: /usr/sbin/mysqld terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7fee0c5e4047]
/lib64/libc.so.6(+0x10d200)[0x7fee0c5e2200]
/lib64/libc.so.6(+0x10efb7)[0x7fee0c5e3fb7]
/usr/sbin/mysqld(my_addr_resolve+0x48)[0x7fee0f08ac28]
/usr/sbin/mysqld(my_print_stacktrace+0x1c2)[0x7fee0f0772a2]
/usr/sbin/mysqld(handle_fatal_signal+0x305)[0x7fee0eb9a9d5]
/lib64/libpthread.so.0(+0xf370)[0x7fee0e1b5370]
/usr/sbin/mysqld(+0x88b37c)[0x7fee0ee6f37c]
/usr/sbin/mysqld(+0x8911c3)[0x7fee0ee751c3]
/usr/sbin/mysqld(+0x8ebbe8)[0x7fee0eecfbe8]
/usr/sbin/mysqld(+0x817d70)[0x7fee0edfbd70]
/lib64/libpthread.so.0(+0x7dc5)[0x7fee0e1addc5]
/lib64/libc.so.6(clone+0x6d)[0x7fee0c5cc73d]
======= Memory map: ========

Do you have any idea when this issue will be fixed?

Comment by Sachin Setiya (Inactive) [ 2017-06-27 ]

Hi stijn

Unfortunately we are not able to reproduce the bug on our side. If it is possible, can you upload a core dump from a debug build?

Comment by Tim White [ 2017-07-12 ]

We are also now having this issue on Ubuntu 16.04, package version 10.1.25+maria-1~xenial from the MariaDB repositories, on a 3 node cluster.

Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: 170713  8:16:35 [ERROR] mysqld got signal 11 ;
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: This could be because you hit a bug. It is also possible that this binary
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: or one of the libraries it was linked against is corrupt, improperly built,
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: or misconfigured. This error can also be caused by malfunctioning hardware.
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: We will try our best to scrape up some info that will hopefully help
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: diagnose the problem, but since we have already crashed,
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: something is definitely wrong and this may fail.
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Server version: 10.1.25-MariaDB-1~xenial
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: key_buffer_size=268435456
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: read_buffer_size=2097152
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: max_used_connections=23
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: max_threads=258
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: thread_count=22
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: It is possible that mysqld could use up to
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1852580 K  bytes of memory
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Hope that's ok; if not, decrease some variables in the equation.
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Thread pointer: 0x7fa017c12008
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Attempting backtrace. You can use the following information to find out
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: where mysqld died. If you see no messages after this, something went
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: terribly wrong...
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: stack_bottom = 0x7fa3052bf0d8 thread_stack 0x48400
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55bb69dc6bde]
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /usr/sbin/mysqld(handle_fatal_signal+0x2f5)[0x55bb69912415]
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fa303a30390]
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0x8b)[0x55bb699c779b]
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /usr/sbin/mysqld(binlog_background_thread+0x2f3)[0x55bb699c7dd3]
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fa303a266ba]
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fa3030d13dd]
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Trying to get some variables.
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Some pointers may be invalid and cause the dump to abort.
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Query (0x0):
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Connection ID (thread ID): 3
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Status: NOT_KILLED
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: information that should help you find out what is causing the crash.
Jul 13 08:16:35 suq-mariadb-2-2 systemd[1]: mariadb.service: Main process exited, code=dumped, status=11/SEGV
Jul 13 08:16:35 suq-mariadb-2-2 systemd[1]: mariadb.service: Unit entered failed state.
Jul 13 08:16:35 suq-mariadb-2-2 systemd[1]: mariadb.service: Failed with result 'core-dump'.
Jul 13 08:16:40 suq-mariadb-2-2 systemd[1]: mariadb.service: Service hold-off time over, scheduling restart.
Jul 13 08:16:40 suq-mariadb-2-2 systemd[1]: Stopped MariaDB database server.
Jul 13 08:16:40 suq-mariadb-2-2 systemd[1]: Starting MariaDB database server...
Jul 13 08:16:41 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:41 140694126172416 [Note] /usr/sbin/mysqld (mysqld 10.1.25-MariaDB-1~xenial) starting as process 38882 ...
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: Segmentation fault (core dumped)
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: WSREP: Failed to start mysqld for wsrep recovery: '2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Using mutexes to ref count buffer pool pages
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: The InnoDB memory heap is disabled
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Compressed tables use zlib 1.2.8
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Using Linux native AIO
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Using SSE crc32 instructions
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Initializing buffer pool, size = 10.0G
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Completed initialization of buffer pool
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Highest supported file format is Barracuda.
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Starting crash recovery from checkpoint LSN=12274681738
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:49 140694126172416 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: InnoDB: Transaction 3771194 was in the XA prepared state.
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: InnoDB: 1 transaction(s) which must be rolled back or cleaned up
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: InnoDB: in total 0 row operations to undo
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: InnoDB: Trx id counter is 3771648
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:51 140694126172416 [Note] InnoDB: Starting final batch to recover 17 pages from redo log
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: InnoDB: Last MySQL binlog file position 0 104878519, file name /var/log/mysql/mariadb-bin.000201
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] InnoDB: 128 rollback segment(s) are active.
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140681746048768 [Note] InnoDB: Starting in background the rollback of recovered transactions
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140681746048768 [Note] InnoDB: Rollback of non-prepared transactions completed
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] InnoDB: Waiting for purge to start
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.36-82.0 started; log sequence number 12274782858
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] Plugin 'FEEDBACK' is disabled.
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] Starting crash recovery...
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13 08:16:55 7ff5e7699900  InnoDB: Starting recovery for XA transactions...
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13 08:16:55 7ff5e7699900  InnoDB: Transaction 3771194 in prepared state after recovery
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13 08:16:55 7ff5e7699900  InnoDB: Transaction contains changes to 1 rows
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13 08:16:55 7ff5e7699900  InnoDB: 1 transactions in prepared state after recovery
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] Found 1 prepared transaction(s) in InnoDB
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 170713  8:16:55 [ERROR] mysqld got signal 11 ;
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: This could be because you hit a bug. It is also possible that this binary
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: or one of the libraries it was linked against is corrupt, improperly built,
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: or misconfigured. This error can also be caused by malfunctioning hardware.
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: We will try our best to scrape up some info that will hopefully help
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: diagnose the problem, but since we have already crashed,
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: something is definitely wrong and this may fail.
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: Server version: 10.1.25-MariaDB-1~xenial
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: key_buffer_size=268435456
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: read_buffer_size=2097152
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: max_used_connections=0
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: max_threads=258
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: thread_count=0
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: It is possible that mysqld could use up to
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1852580 K  bytes of memory
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: Hope that's ok; if not, decrease some variables in the equation.
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: Thread pointer: 0x0
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: Attempting backtrace. You can use the following information to find out
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: where mysqld died. If you see no messages after this, something went
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: terribly wrong...
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: stack_bottom = 0x0 thread_stack 0x48400
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5604d1131bde]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(handle_fatal_signal+0x2f5)[0x5604d0c7d415]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7ff5e5d3c390]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(wsrep_on+0x18)[0x5604d0c17988]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(+0x805792)[0x5604d0eea792]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(+0x805dc1)[0x5604d0eeadc1]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(+0x599bff)[0x5604d0c7ebff]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_Z24plugin_foreach_with_maskP3THDPFcS0_P13st_plugin_intPvEijS3_+0x2cf)[0x5604d0afcdff]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_Z10ha_recoverP7st_hash+0xc6)[0x5604d0c80b06]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG7recoverEP11st_log_infoPKcP11st_io_cacheP28Format_description_log_eventb+0x58f)[0x5604d0d3359f]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG18do_binlog_recoveryEPKcb+0x372)[0x5604d0d33ba2]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG4openEPKc+0x30)[0x5604d0d36da0]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(+0x376eb5)[0x5604d0a5beb5]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_Z11mysqld_mainiPPc+0x1af9)[0x5604d0a5f149]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7ff5e52f6830]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_start+0x29)[0x5604d0a532a9]
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: information that should help you find out what is causing the crash.'
Jul 13 08:17:40 suq-mariadb-2-2 systemd[1]: mariadb.service: Control process exited, code=exited status=1
Jul 13 08:17:40 suq-mariadb-2-2 systemd[1]: Failed to start MariaDB database server.
Jul 13 08:17:40 suq-mariadb-2-2 systemd[1]: mariadb.service: Unit entered failed state.
Jul 13 08:17:40 suq-mariadb-2-2 systemd[1]: mariadb.service: Failed with result 'exit-code'.

Comment by Andrei Elkin [ 2017-07-13 ]

Hello, Tim.

Technically speaking, your report must be different from the original one, which mentions
binlog_background_thread. At the same time, yours should be easier to fix. It would be great to have
a copy of your server environment to explore what causes this crash at server recovery. It feels like
this crash might be repeatable; that is, it would show up again should you restart the server after the first crash.

Cheers,

Andrei

Comment by Brendan P [ 2017-07-13 ]

See my comment here: https://jira.mariadb.org/browse/MDEV-11053?focusedCommentId=97602&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-97602

This is affecting production badly, as we cannot alter a table with pt-online-schema-change without having it randomly crash on us.

Comment by Brendan P [ 2017-10-15 ]

Happened again on two nodes after trying to alter a 10GB table:

Node1 roughly 5 minutes after starting pt-osc:
171014 21:44:55 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.1.28-MariaDB-1~xenial
key_buffer_size=16777216
read_buffer_size=524288
max_used_connections=2049
max_threads=4097
thread_count=314
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 18937620 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f7aefcd9008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fc8bde3fff8 thread_stack 0x48400
2017-10-14 21:44:57 140500174632704 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.57534S), skipping check
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5622aabd254e]
/usr/sbin/mysqld(handle_fatal_signal+0x2f5)[0x5622aa71b005]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fc9c649c390]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG21do_checkpoint_requestEm+0x92)[0x5622aa7d1332]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG20checkpoint_and_purgeEm+0x11)[0x5622aa7d13c1]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG23trx_group_commit_leaderEPNS_18group_commit_entryE+0x4ec)[0x5622aa7d421c]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG34write_transaction_to_binlog_eventsEPNS_18group_commit_entryE+0x133)[0x5622aa7d4593]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG27write_transaction_to_binlogEP3THDP17binlog_cache_mngrP9Log_eventbbb+0xd0)[0x5622aa7d4840]
/usr/sbin/mysqld(+0x659a06)[0x5622aa7d4a06]
/usr/sbin/mysqld(+0x65a582)[0x5622aa7d5582]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13log_and_orderEP3THDybbb+0x44)[0x5622aa7d56e4]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDb+0x7ab)[0x5622aa71ebfb]
/usr/sbin/mysqld(_Z17trans_commit_stmtP3THD+0x57)[0x5622aa66eb37]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0xa9c)[0x5622aa58526c]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x311)[0x5622aa58dce1]
/usr/sbin/mysqld(+0x413635)[0x5622aa58e635]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1ea5)[0x5622aa590d35]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x139)[0x5622aa591c09]
/usr/sbin/mysqld(_Z26threadpool_process_requestP3THD+0xff)[0x5622aa6a0c9f]
/usr/sbin/mysqld(+0x577805)[0x5622aa6f2805]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fc9c64926ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc9c5b3d3dd]

Node1 again shortly after recovering, deleting old triggers and restarting pt-osc alter:
171014 21:59:46 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.1.28-MariaDB-1~xenial
key_buffer_size=16777216
read_buffer_size=524288
max_used_connections=938
max_threads=4097
thread_count=315
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 18937620 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f314dc12008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f66911c10d8 thread_stack 0x48400
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55f6a68af54e]
/usr/sbin/mysqld(handle_fatal_signal+0x2f5)[0x55f6a63f8005]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f66941ef390]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0x8b)[0x55f6a64ae08b]
/usr/sbin/mysqld(binlog_background_thread+0x35b)[0x55f6a64ae72b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f66941e56ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f66938903dd]

Node2 with no alter running but merely during SST to Node1 since Node1 wouldn't recover after 2nd crash:
WSREP_SST: [INFO] Rate-limiting SST to 60M (20171014 22:11:31.060)
WSREP_SST: [INFO] Sleeping before data transfer for SST (20171014 22:11:31.062)
2017-10-14 22:11:32 140016479106816 [Note] WSREP: (06100a81, 'tcp://0.0.0.0:4567') turning message relay requesting off
WSREP_SST: [INFO] Streaming the backup to joiner at 10.0.0.16 4444 (20171014 22:11:41.066)
WSREP_SST: [INFO] Evaluating innobackupex --no-version-check $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | snzip | pv -q -L $rlimit | socat -u stdio TCP:10.0.0.16:4444; RC=( ${PIPESTATUS[@]} ) (20171014 22:11:41.068)
171014 22:25:06 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.1.28-MariaDB-1~xenial
key_buffer_size=16777216
read_buffer_size=524288
max_used_connections=1501
max_threads=4097
thread_count=375
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 18926318 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f400c812008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f5924bbd0d8 thread_stack 0x48400
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55e5180db54e]
/usr/sbin/mysqld(handle_fatal_signal+0x2f5)[0x55e517c24005]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f5927c04390]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0x8b)[0x55e517cda08b]
/usr/sbin/mysqld(binlog_background_thread+0x35b)[0x55e517cda72b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f5927bfa6ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f59272a53dd]

Comment by Andrei Elkin [ 2017-10-15 ]

Good news. One cause of the crashes has been identified, and the patch should be out in a couple of days.

Comment by Andrei Elkin [ 2017-10-16 ]

The patch is now being tested by the buildbot (BB):

git push https://github.com/MariaDB/server bb-10.1-mdev9510-andrei:bb-10.1-mdev9510-andrei

Comment by Brendan P [ 2017-10-17 ]

That is great news. I've gone over the patch and indeed it makes sense. I am curious, though: I don't believe any of our applications actually disable sql_log_bin at the session level directly. Would this be something that Galera makes use of internally to isolate queries to specific nodes, or could it also be triggered by another condition? I know many of the crashes have been triggered by pt-online-schema-change, but I've gone over that too and don't see where sql_log_bin=0 would be getting set. Thank you!

Comment by Brendan P [ 2017-10-17 ]

I've also determined that the "random" interval before the crash is triggered after the alter or SST begins certainly coincides with the binary logs being rotated upon reaching their configured 1GB size, which can take anywhere from 3-5 to 10-15 minutes depending on write load...
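The correlation with rotation can be tested directly rather than waiting for the size limit to be reached, since a rotation can be forced by hand. A hedged SQL sketch (standard MariaDB statements; on an affected build the rotation drives the checkpoint path seen in the backtraces above, so try this on a test node only):

```sql
-- Confirm the configured rotation size (1GB per the comment above)
SHOW VARIABLES LIKE 'max_binlog_size';

-- Force an immediate binlog rotation; on affected builds this exercises
-- MYSQL_BIN_LOG::checkpoint_and_purge() and, later, mark_xid_done() in
-- the binlog background thread, matching the backtraces in this report.
FLUSH BINARY LOGS;

-- Verify that a new binlog file was created
SHOW BINARY LOGS;
```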

Comment by Brendan P [ 2017-10-19 ]

Had two crashes today and decided to run with the patch. So far it has been running fine without incident; I will keep you updated.

Comment by Andrei Elkin [ 2017-10-23 ]

Brendan, hello.

I read the lack of news from you these past days as good news.
Anyway, I am submitting the patch for review very shortly.

Thanks for helping!

Andrei

Comment by Brendan P [ 2017-10-23 ]

Thank you, indeed no news = good. This is promising so far.

I've done a few table alters over the weekend, restarted some nodes due to some innodb tunings, haven't had any crashes since the patch, I will however continue to run this build for now to make sure all is good. Thanks again!

Comment by Andrei Elkin [ 2017-10-25 ]

The patch has been polished a bit, and a regression test was made by reusing an existing one.
Here is the pull request:
https://github.com/MariaDB/server/pull/473
I hope sachin.setiya.007 will review it promptly once back to the office.

Comment by Andrei Elkin [ 2017-10-25 ]

Hello, Sachin.

I see you're back at work on Thursday, Nov 2nd. I wonder if you could review the MDEV-9510 patch during that week (week 44), which would leave practically Thursday and Friday for it. The patch is for 10.1, and it's rather small.

Cheers,

Andrei

Comment by Sachin Setiya (Inactive) [ 2017-11-03 ]

Okay to push. Sorry for the delay.

Regards
sachin

Comment by Andrei Elkin [ 2017-11-06 ]

Changing the post-review status accordingly.

Comment by Andrei Elkin [ 2017-11-15 ]

Pushed c7e38076f33.

Comment by Brendan P [ 2017-11-23 ]

Thanks, everyone. I am still running the same build with the original patch, and there has not been a single crash since. 100% fixed for sure.
