MariaDB Server / MDEV-9510

Segmentation fault in binlog thread causes crash

Details

    • 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3, 10.3.1-2, 10.2.10, 10.1.29, 10.2.11

    Description

      A user saw this crash in MariaDB 10.1.10 on Ubuntu:

      160201 21:22:07 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see http://kb.askmonty.org/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.
       
      Server version: 10.1.10-MariaDB-1~trusty-log
      key_buffer_size=134217728
      read_buffer_size=2097152
      max_used_connections=30
      max_threads=102
      thread_count=15
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 759823 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0x7f1818883008
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f182b2dce10 thread_stack 0x48000
      /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f182acf3a6e]
      /usr/sbin/mysqld(handle_fatal_signal+0x38d)[0x7f182a82216d]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7f1828d7c340]
      /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0xc7)[0x7f182a8e2cb7]
      /usr/sbin/mysqld(binlog_background_thread+0x2b5)[0x7f182a8e3275]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f1828d74182]
      /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f182849747d]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x0): is an invalid pointer
      Connection ID (thread ID): 3
      Status: NOT_KILLED
       
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
       
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      

      I will try to get more information.

      Attachments

        Issue Links

          Activity

            GeoffMontee Geoff Montee (Inactive) created issue -
            GeoffMontee Geoff Montee (Inactive) added a comment -

            The user also saw this crash that may be related:

            160203  7:48:06 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.
             
            To report this bug, see http://kb.askmonty.org/en/reporting-bugs
             
            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed, 
            something is definitely wrong and this may fail.
             
            Server version: 10.1.10-MariaDB-1~trusty-log
            key_buffer_size=134217728
            read_buffer_size=2097152
            max_used_connections=29
            max_threads=102
            thread_count=8
            It is possible that mysqld could use up to 
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 759823 K  bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.
             
            Thread pointer: 0x0x7f2f030dd008
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x7f2f3bd98df0 thread_stack 0x48000
            /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f2f49924a6e]
            /usr/sbin/mysqld(handle_fatal_signal+0x38d)[0x7f2f4945316d]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7f2f479ad340]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG21do_checkpoint_requestEm+0x9d)[0x7f2f49513edd]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG20checkpoint_and_purgeEm+0x11)[0x7f2f49513f71]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG16rotate_and_purgeEb+0xc2)[0x7f2f495164c2]
            /usr/sbin/mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x12c)[0x7f2f493bc7dc]
            /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1327)[0x7f2f492d3e57]
            /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x20d)[0x7f2f492dc1cd]
            /usr/sbin/mysqld(+0x4369d9)[0x7f2f492dc9d9]
            /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1e1b)[0x7f2f492deecb]
            /usr/sbin/mysqld(_Z10do_commandP3THD+0x169)[0x7f2f492dfca9]
            /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x18a)[0x7f2f493a292a]
            /usr/sbin/mysqld(handle_one_connection+0x40)[0x7f2f493a2b00]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f2f479a5182]
            /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f2f470c847d]
             
            Trying to get some variables.
            Some pointers may be invalid and cause the dump to abort.
            Query (0x7f2f04938020): is an invalid pointer
            Connection ID (thread ID): 1283552
            Status: NOT_KILLED
             
            Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
             
            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            information that should help you find out what is causing the crash.

            elenst Elena Stepanova added a comment - - edited

            support issue

            plinux Lixun Peng added a comment -

            Hi!

            If these two crashes are related, I think the problem may be that binlog_xid_count_list contains a NULL or otherwise invalid entry.
            I checked MYSQL_BIN_LOG::do_checkpoint_request(); I think only the "entry->binlog_id" access can trigger this crash, and TC_LOG_BINLOG::mark_xid_done() also accesses "b->binlog_id".
            So I suspect there is an invalid xid_count_per_binlog pointer in binlog_xid_count_list.
            But since we have no core file, I can't verify it.
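
            For illustration only: if a core file does become available, a gdb session along the lines of the sketch below could help check whether the list really does contain a bad entry. The member names follow the comment above; the expression for the global binlog object (mysql_bin_log) and the core file path are assumptions and may need adjusting to the actual build.

            #!/bin/bash
            # Hypothetical core-dump inspection; requires debug symbols matching the exact mysqld binary.
            CORE=/var/lib/mysql/core.mysqld.12345   # placeholder path to a captured core file
            gdb /usr/sbin/mysqld "$CORE" -batch \
                -ex 'thread apply all bt' \
                -ex 'print mysql_bin_log.binlog_xid_count_list'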

            elenst Elena Stepanova added a comment -

            The crash was actually happening on a Galera cluster node, and we got another report with a stack trace identical to the one above, MDEV-10196, which also affects a Galera cluster. So the suspicion is that Galera might be an important factor here.

            Reassigning to nirbhay_c to take a look.

            elenst Elena Stepanova added a comment -

            Yet another one, MDEV-10276.

            elenst Elena Stepanova added a comment -

            And another one, MDEV-10301

            ypnuaa037 YuanPeng added a comment -

            This is a very serious issue! Last month we upgraded Galera from 10.0.x to 10.1.14 in our production environment, and it has already crashed 3 times so far, which we never saw on 10.0.x. All crashes are inexplicable signal 11 errors. There are several related issues that look the same. We want to wait for a fixed version, but why is this serious issue still "open and unresolved" after this long?


            nirbhay_c Nirbhay Choubey (Inactive) added a comment -

            ypnuaa037 That's because we haven't been able to reproduce this issue on our end.
            Would it be possible for you to run a debug node and report a better stack trace in case
            the crash happens again?
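
            For other reporters hitting this, a rough sketch of how a node could be prepared so that the next crash yields a core file and a fully symbolized backtrace is shown below. The debug-symbol package names and the core file location vary between distributions and MariaDB repositories, so treat them as placeholders.

            #!/bin/bash
            # Sketch only: allow mysqld to write a core file and symbolize it after the next crash.
            # mysqld_safe passes --core-file-size to "ulimit -c"; started directly here for brevity.
            mysqld_safe --core-file-size=unlimited &

            # After the next signal 11, resolve the full backtrace from the core
            # (core location depends on kernel.core_pattern; /var/lib/mysql/core is a placeholder):
            gdb /usr/sbin/mysqld /var/lib/mysql/core -batch \
                -ex 'thread apply all bt full' > mysqld_backtrace.txt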

            engel75 Florian Engelmann added a comment -

            Ubuntu 14.04.4
            libc6:amd64 2.19-0ubuntu6.9
            mariadb-server-10.1 10.1.14+maria-1~trusty
            mariadb-server-core-10.1 10.1.14+maria-1~trusty

            5-node MariaDB Galera cluster - crashes sometimes while log rotation is done

            syslog:

            Dec 7 02:16:55 galera04 mysqld: 161207 2:16:55 [ERROR] mysqld got signal 11 ;
            Dec 7 02:16:55 galera04 mysqld: This could be because you hit a bug. It is also possible that this binary
            Dec 7 02:16:55 galera04 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
            Dec 7 02:16:55 galera04 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
            Dec 7 02:16:55 galera04 mysqld:
            Dec 7 02:16:55 galera04 mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
            Dec 7 02:16:55 galera04 mysqld:
            Dec 7 02:16:55 galera04 mysqld: We will try our best to scrape up some info that will hopefully help
            Dec 7 02:16:55 galera04 mysqld: diagnose the problem, but since we have already crashed,
            Dec 7 02:16:55 galera04 mysqld: something is definitely wrong and this may fail.
            Dec 7 02:16:55 galera04 mysqld:
            Dec 7 02:16:55 galera04 mysqld: Server version: 10.1.14-MariaDB-1~trusty
            Dec 7 02:16:55 galera04 mysqld: key_buffer_size=134217728
            Dec 7 02:16:55 galera04 mysqld: read_buffer_size=2097152
            Dec 7 02:16:55 galera04 mysqld: max_used_connections=14
            Dec 7 02:16:55 galera04 mysqld: max_threads=262
            Dec 7 02:16:55 galera04 mysqld: thread_count=19
            Dec 7 02:16:55 galera04 mysqld: It is possible that mysqld could use up to
            Dec 7 02:16:55 galera04 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2819298 K bytes of memory
            Dec 7 02:16:55 galera04 mysqld: Hope that's ok; if not, decrease some variables in the equation.

            caused by logrotation:
            /var/log/mysql.log /var/log/mysql/mysql.log /var/log/mysql/mariadb-slow.log {
                    daily
                    rotate 7
                    missingok
                    create 640 mysql adm
                    compress
                    sharedscripts
                    postrotate
                            test -x /usr/bin/mysqladmin || exit 0
                            if [ -f `my_print_defaults --mysqld | grep -oP "pid-file=\K[^$]+"` ]; then
                                    # If this fails, check debian.conf!
                                    mysqladmin --defaults-file=/etc/mysql/debian.cnf flush-logs
                            fi
                    endscript
            }
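
            Since the crashes coincide with log rotation, one crude way to stress the same code path is to repeat what the postrotate step above does in a loop on an otherwise busy Galera node. This is only a sketch of a reproduction attempt, not a confirmed recipe:

            #!/bin/bash
            # Repeatedly flush the logs, mimicking logrotate's postrotate step from the config above.
            for i in $(seq 1 500); do
                mysqladmin --defaults-file=/etc/mysql/debian.cnf flush-logs
                sleep 1
            done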
            eVincent Vincent Enjalbert added a comment - - edited

            Ubuntu 14.04 LTS
            Mariadb 10.1.19
            3 nodes
            Main app running: Magento 1.9.3.1

            Dec 18 13:48:22 mariadb1 mysqld: 161218 13:48:22 [ERROR] mysqld got signal 11 ;
            Dec 18 13:48:22 mariadb1 mysqld: This could be because you hit a bug. It is also possible that this binary
            Dec 18 13:48:22 mariadb1 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
            Dec 18 13:48:22 mariadb1 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
            Dec 18 13:48:22 mariadb1 mysqld:
            Dec 18 13:48:22 mariadb1 mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
            Dec 18 13:48:22 mariadb1 mysqld:
            Dec 18 13:48:22 mariadb1 mysqld: We will try our best to scrape up some info that will hopefully help
            Dec 18 13:48:22 mariadb1 mysqld: diagnose the problem, but since we have already crashed,
            Dec 18 13:48:22 mariadb1 mysqld: something is definitely wrong and this may fail.
            Dec 18 13:48:22 mariadb1 mysqld:
            Dec 18 13:48:22 mariadb1 mysqld: Server version: 10.1.19-MariaDB-1~trusty
            Dec 18 13:48:22 mariadb1 mysqld: key_buffer_size=134217728
            Dec 18 13:48:22 mariadb1 mysqld: read_buffer_size=2097152
            Dec 18 13:48:22 mariadb1 mysqld: max_used_connections=14
            Dec 18 13:48:22 mariadb1 mysqld: max_threads=302
            Dec 18 13:48:22 mariadb1 mysqld: thread_count=11
            Dec 18 13:48:22 mariadb1 mysqld: It is possible that mysqld could use up to
            Dec 18 13:48:22 mariadb1 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1992742 K  bytes of memory
            Dec 18 13:48:22 mariadb1 mysqld: Hope that's ok; if not, decrease some variables in the equation.
            Dec 18 13:48:22 mariadb1 mysqld:
            Dec 18 13:48:22 mariadb1 mysqld: Thread pointer: 0x0x7efd58812008
            Dec 18 13:48:22 mariadb1 mysqld: Attempting backtrace. You can use the following information to find out
            Dec 18 13:48:22 mariadb1 mysqld: where mysqld died. If you see no messages after this, something went
            Dec 18 13:48:22 mariadb1 mysqld: terribly wrong...
            Dec 18 13:48:22 mariadb1 mysqld: stack_bottom = 0x7eff93b6c210 thread_stack 0x48400
            Dec 18 13:48:22 mariadb1 mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7eff946dd67e]
            Dec 18 13:48:22 mariadb1 mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x2d5)[0x7eff94200ba5]
            Dec 18 13:48:22 mariadb1 mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7eff92755330]
            Dec 18 13:48:22 mariadb1 mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0xe8)[0x7eff942c2e88]
            Dec 18 13:48:22 mariadb1 mysqld: /usr/sbin/mysqld(binlog_background_thread+0x2c5)[0x7eff942c3525]
            Dec 18 13:48:22 mariadb1 mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7eff9274d184]
            Dec 18 13:48:22 mariadb1 mysqld: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7eff91c6c37d]
            Dec 18 13:48:22 mariadb1 mysqld:
            Dec 18 13:48:22 mariadb1 mysqld: Trying to get some variables.
            Dec 18 13:48:22 mariadb1 mysqld: Some pointers may be invalid and cause the dump to abort.
            Dec 18 13:48:22 mariadb1 mysqld: Query (0x0): is an invalid pointer
            Dec 18 13:48:22 mariadb1 mysqld: Connection ID (thread ID): 3
            Dec 18 13:48:22 mariadb1 mysqld: Status: NOT_KILLED
            Dec 18 13:48:22 mariadb1 mysqld:
            Dec 18 13:48:22 mariadb1 mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
            Dec 18 13:48:22 mariadb1 mysqld:
            Dec 18 13:48:22 mariadb1 mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            Dec 18 13:48:22 mariadb1 mysqld: information that should help you find out what is causing the crash.
            Dec 18 13:48:22 mariadb1 mysqld:
            Dec 18 13:48:22 mariadb1 mysqld: We think the query pointer is invalid, but we will try to print it anyway.
            Dec 18 13:48:22 mariadb1 mysqld: Query:
            Dec 18 13:48:22 mariadb1 mysqld:
            Dec 18 13:48:22 mariadb1 mysqld_safe: Number of processes running now: 0
            Dec 18 13:48:22 mariadb1 mysqld_safe: WSREP: not restarting wsrep node automatically
            Dec 18 13:48:22 mariadb1 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended
            

            I don't know how to find the cause; it seems quite random. How can I help you fix this?

            engel75 Florian Engelmann added a comment -

            I did upload a core dump to MDEV-10809 which seems to be related to MDEV-9510

            Has anybody solved this issue?

            engel75 Florian Engelmann added a comment -

            Affected versions are at least (tested by us) 10.1.10, 10.1.14, 10.1.19 and 10.1.20.

            engel75 Florian Engelmann added a comment -

            @eVincent, GeoffMontee, plinux

            Do you use HAproxy to do loadbalancing to the galera cluster? How is your healthcheck implemented?

            bmalynovytch Benjamin MALYNOVYTCH added a comment - - edited

            @engel75: I manage the infrastructure mentioned by eVincent.
            We use a simple "mysql-check", with a sort of static "one to one" binding between web servers and MariaDB servers, like so:

            • web1 => mariadb1 with failover on mariadb2 & 3
            • web2 => mariadb2 with failover on mariadb1 & 3
            • web3 => mariadb3 with failover on mariadb1 & 2

            Here the HAProxy snippet:

            haproxy.cfg

            listen mariadb_cluster
              bind *:3307
              balance source
              mode tcp
              option tcpka
              option mysql-check user haproxy
              server mariadb1 mariadb1:3306 check
              server mariadb2 mariadb2:3306 check backup
              server mariadb3 mariadb3:3306 check backup
            

            Benjamin

            PS: downgrading to MariaDB 5.5 + Galera Cluster seemed to fix the issue (no crashes since the last startup of the cluster, more than 24h ago)

            engel75 Florian Engelmann added a comment -

            Thank you Benjamin. I asked because after disabling the HAProxy health checks on one of our nodes, that node did not have a stuck "flush logs" this morning. I will test this with two nodes until Monday to prove it. As mysql-check does not check whether a node is "Synced", we use a shell script initiated by xinetd to check the status:

            show global status where variable_name='wsrep_local_state';

            We use 3 HAProxy load balancers (clustered via keepalived) to route traffic to the cluster. That's a lot of incoming checks. We might use a Perl daemon to handle those requests as a kind of "man in the middle".
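
            For context, an xinetd-driven check of the kind described here typically looks something like the sketch below. The actual /usr/local/bin/mysqlchk.sh is not attached to this issue, so the HTTP wrapper and the exact comparison are assumptions (wsrep_local_state = 4 means Synced):

            #!/bin/bash
            # Rough sketch of a wsrep health check served via xinetd; not the reporter's actual script.
            STATE=$(mysql -N -e "show global status where variable_name='wsrep_local_state';" | awk '{print $2}')
            if [ "$STATE" = "4" ]; then
                echo -e "HTTP/1.1 200 OK\r\n\r\nSynced"
            else
                echo -e "HTTP/1.1 503 Service Unavailable\r\n\r\nNot synced"
            fi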

            engel75 Florian Engelmann added a comment -

            The problem with flush logs does not seem to be caused by the health checks themselves; it seems to be load related, as only the nodes with HAProxy health checks enabled (and therefore receiving traffic) get stuck on flush logs.

            engel75 Florian Engelmann added a comment - - edited

            Is anybody working on this one? We are still forced to "clean up" the production Galera cluster every morning.

            As "service mysql stop" gets stuck as well, we have to use "kill -9":

            echo disable haproxy health check
            sed -i -e "s/^#echo disabled/echo disabled/" -e "s/^#exit 0/exit 0/" /usr/local/bin/mysqlchk.sh
             
            echo wait 10s
            sleep 10
             
            echo stop mysql
            timeout 5s service mysql stop
             
            echo send TERM
            kill $(pidof mysqld)
             
            echo wait 5s
            sleep 5
             
            echo send KILL
            kill -9 $(pidof mysqld)
             
            sleep 1
             
            echo check mysql is gone
            ps faux | grep mysql
             
             
            sleep 2
             
            echo start mysql
            service mysql start
             
            sleep 6s
             
            echo check galera is synced
            mysql -e "SHOW STATUS WHERE variable_name ='wsrep_local_state_comment';"
             
            echo enable haproxy health check
            sed -i -e "s/^echo disabled/#echo disabled/" -e "s/^exit 0/#exit 0/" /usr/local/bin/mysqlchk.sh
            

            We are considering switching to Percona XtraDB Cluster if this issue is not treated with the highest priority. Could anybody please let everyone know what's going on?

            bmalynovytch Benjamin MALYNOVYTCH added a comment -

            As a reminder, if downgrading to MariaDB 5.5 is an option, it is stable.

            nirbhay_c Nirbhay Choubey (Inactive) added a comment -

            engel75 yes, I am looking into it. Unfortunately, the core dump that you submitted did
            not help, and so far I have not been able to repeat this issue on my end.

            engel75 Florian Engelmann added a comment - - edited

            Dear nirbhay_c,

            I am glad you are working on this issue. Today xtrabackup crashed one node of that cluster with an error that might be related to the same issue:

            Feb  6 09:50:23 galera01-customerA-prod mysqld: 2017-02-06  9:50:23 140112419612416 [Note] WSREP: Member 3.0 (galera01-customerA-prod-back) desyncs itself from group
            Feb  6 09:50:23 galera01-customerA-prod mysqld: 2017-02-06  9:50:23 140112419612416 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 123190807)
            Feb  6 09:50:23 galera01-customerA-prod mysqld: 2017-02-06  9:50:23 140112468671232 [Note] WSREP: Provider paused at 202dc465-47c0-11e4-b99e-9b1f0ffcf741:123190807 (440)
            Feb  6 09:50:23 galera01-customerA-prod mysqld: 170206  9:50:23 [ERROR] mysqld got signal 11 ;
            Feb  6 09:50:23 galera01-customerA-prod mysqld: This could be because you hit a bug. It is also possible that this binary
            Feb  6 09:50:23 galera01-customerA-prod mysqld: or one of the libraries it was linked against is corrupt, improperly built,
            Feb  6 09:50:23 galera01-customerA-prod mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
            Feb  6 09:50:23 galera01-customerA-prod mysqld: 
            Feb  6 09:50:23 galera01-customerA-prod mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
            Feb  6 09:50:23 galera01-customerA-prod mysqld: 
            Feb  6 09:50:23 galera01-customerA-prod mysqld: We will try our best to scrape up some info that will hopefully help
            Feb  6 09:50:23 galera01-customerA-prod mysqld: diagnose the problem, but since we have already crashed, 
            Feb  6 09:50:23 galera01-customerA-prod mysqld: something is definitely wrong and this may fail.
            Feb  6 09:50:23 galera01-customerA-prod mysqld: 
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Server version: 10.1.20-MariaDB-1~trusty
            Feb  6 09:50:23 galera01-customerA-prod mysqld: key_buffer_size=134217728
            Feb  6 09:50:23 galera01-customerA-prod mysqld: read_buffer_size=2097152
            Feb  6 09:50:23 galera01-customerA-prod mysqld: max_used_connections=13
            Feb  6 09:50:23 galera01-customerA-prod mysqld: max_threads=262
            Feb  6 09:50:23 galera01-customerA-prod mysqld: thread_count=21
            Feb  6 09:50:23 galera01-customerA-prod mysqld: It is possible that mysqld could use up to 
            Feb  6 09:50:23 galera01-customerA-prod mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2819312 K  bytes of memory
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Hope that's ok; if not, decrease some variables in the equation.
            Feb  6 09:50:23 galera01-customerA-prod mysqld: 
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Thread pointer: 0x0x7f6c47ad8008
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Attempting backtrace. You can use the following information to find out
            Feb  6 09:50:23 galera01-customerA-prod mysqld: where mysqld died. If you see no messages after this, something went
            Feb  6 09:50:23 galera01-customerA-prod mysqld: terribly wrong...
            Feb  6 09:50:23 galera01-customerA-prod mysqld: stack_bottom = 0x7f6e79ec71f0 thread_stack 0x48400
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f6fbf90bc2e]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x305)[0x7f6fbf42ea95]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f6fbd779330]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG21do_checkpoint_requestEm+0x9d)[0x7f6fbf4f0aad]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG20checkpoint_and_purgeEm+0x11)[0x7f6fbf4f0b41]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG16rotate_and_purgeEb+0xc2)[0x7f6fbf4f31b2]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x130)[0x7f6fbf396f10]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1314)[0x7f6fbf2a9cd4]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x331)[0x7f6fbf2b22a1]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(+0x439ac9)[0x7f6fbf2b2ac9]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1e2b)[0x7f6fbf2b4fcb]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z10do_commandP3THD+0x169)[0x7f6fbf2b5dc9]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x18a)[0x7f6fbf37cefa]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(handle_one_connection+0x40)[0x7f6fbf37d0d0]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7f6fbd771184]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f6fbcc9037d]
            Feb  6 09:50:23 galera01-customerA-prod mysqld: 
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Trying to get some variables.
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Some pointers may be invalid and cause the dump to abort.
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Query (0x7f6c33021020): FLUSH BINARY LOGS
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Connection ID (thread ID): 284
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Status: NOT_KILLED
            Feb  6 09:50:23 galera01-customerA-prod mysqld: 
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
            Feb  6 09:50:23 galera01-customerA-prod mysqld: 
            Feb  6 09:50:23 galera01-customerA-prod mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            Feb  6 09:50:23 galera01-customerA-prod mysqld: information that should help you find out what is causing the crash.
            Feb  6 09:50:23 galera01-customerA-prod mysqld: Writing a core file
            

            Without disabling the node's HAProxy health check, I was not able to finish the backup successfully.
            I've got two core dumps related to the issue:

            -rw------- 1 mysql mysql 9.6G Feb 6 09:46 core.mysqld.49551.11.1486370801
            -rw------- 1 mysql mysql 9.5G Feb 6 09:50 core.mysqld.9877.11.1486371023

            Should I upload both?

            engel75 Florian Engelmann added a comment - - edited Dear nirbhay_c , I am glad you are working on this issue. Today xtrabackup crashed one node of that cluster with an error that might be related to the same issue: Feb 6 09:50:23 galera01-customerA-prod mysqld: 2017-02-06 9:50:23 140112419612416 [Note] WSREP: Member 3.0 (galera01-customerA-prod-back) desyncs itself from group Feb 6 09:50:23 galera01-customerA-prod mysqld: 2017-02-06 9:50:23 140112419612416 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 123190807) Feb 6 09:50:23 galera01-customerA-prod mysqld: 2017-02-06 9:50:23 140112468671232 [Note] WSREP: Provider paused at 202dc465-47c0-11e4-b99e-9b1f0ffcf741:123190807 (440) Feb 6 09:50:23 galera01-customerA-prod mysqld: 170206 9:50:23 [ERROR] mysqld got signal 11 ; Feb 6 09:50:23 galera01-customerA-prod mysqld: This could be because you hit a bug. It is also possible that this binary Feb 6 09:50:23 galera01-customerA-prod mysqld: or one of the libraries it was linked against is corrupt, improperly built, Feb 6 09:50:23 galera01-customerA-prod mysqld: or misconfigured. This error can also be caused by malfunctioning hardware. Feb 6 09:50:23 galera01-customerA-prod mysqld: Feb 6 09:50:23 galera01-customerA-prod mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs Feb 6 09:50:23 galera01-customerA-prod mysqld: Feb 6 09:50:23 galera01-customerA-prod mysqld: We will try our best to scrape up some info that will hopefully help Feb 6 09:50:23 galera01-customerA-prod mysqld: diagnose the problem, but since we have already crashed, Feb 6 09:50:23 galera01-customerA-prod mysqld: something is definitely wrong and this may fail. Feb 6 09:50:23 galera01-customerA-prod mysqld: Feb 6 09:50:23 galera01-customerA-prod mysqld: Server version: 10.1.20-MariaDB-1~trusty Feb 6 09:50:23 galera01-customerA-prod mysqld: key_buffer_size=134217728 Feb 6 09:50:23 galera01-customerA-prod mysqld: read_buffer_size=2097152 Feb 6 09:50:23 galera01-customerA-prod mysqld: max_used_connections=13 Feb 6 09:50:23 galera01-customerA-prod mysqld: max_threads=262 Feb 6 09:50:23 galera01-customerA-prod mysqld: thread_count=21 Feb 6 09:50:23 galera01-customerA-prod mysqld: It is possible that mysqld could use up to Feb 6 09:50:23 galera01-customerA-prod mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2819312 K bytes of memory Feb 6 09:50:23 galera01-customerA-prod mysqld: Hope that's ok; if not, decrease some variables in the equation. Feb 6 09:50:23 galera01-customerA-prod mysqld: Feb 6 09:50:23 galera01-customerA-prod mysqld: Thread pointer: 0x0x7f6c47ad8008 Feb 6 09:50:23 galera01-customerA-prod mysqld: Attempting backtrace. You can use the following information to find out Feb 6 09:50:23 galera01-customerA-prod mysqld: where mysqld died. If you see no messages after this, something went Feb 6 09:50:23 galera01-customerA-prod mysqld: terribly wrong... 
Feb 6 09:50:23 galera01-customerA-prod mysqld: stack_bottom = 0x7f6e79ec71f0 thread_stack 0x48400 Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f6fbf90bc2e] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x305)[0x7f6fbf42ea95] Feb 6 09:50:23 galera01-customerA-prod mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f6fbd779330] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG21do_checkpoint_requestEm+0x9d)[0x7f6fbf4f0aad] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG20checkpoint_and_purgeEm+0x11)[0x7f6fbf4f0b41] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG16rotate_and_purgeEb+0xc2)[0x7f6fbf4f31b2] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x130)[0x7f6fbf396f10] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1314)[0x7f6fbf2a9cd4] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x331)[0x7f6fbf2b22a1] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(+0x439ac9)[0x7f6fbf2b2ac9] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1e2b)[0x7f6fbf2b4fcb] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z10do_commandP3THD+0x169)[0x7f6fbf2b5dc9] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x18a)[0x7f6fbf37cefa] Feb 6 09:50:23 galera01-customerA-prod mysqld: /usr/sbin/mysqld(handle_one_connection+0x40)[0x7f6fbf37d0d0] Feb 6 09:50:23 galera01-customerA-prod mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7f6fbd771184] Feb 6 09:50:23 galera01-customerA-prod mysqld: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f6fbcc9037d] Feb 6 09:50:23 galera01-customerA-prod mysqld: Feb 6 09:50:23 galera01-customerA-prod mysqld: Trying to get some variables. Feb 6 09:50:23 galera01-customerA-prod mysqld: Some pointers may be invalid and cause the dump to abort. Feb 6 09:50:23 galera01-customerA-prod mysqld: Query (0x7f6c33021020): FLUSH BINARY LOGS Feb 6 09:50:23 galera01-customerA-prod mysqld: Connection ID (thread ID): 284 Feb 6 09:50:23 galera01-customerA-prod mysqld: Status: NOT_KILLED Feb 6 09:50:23 galera01-customerA-prod mysqld: Feb 6 09:50:23 galera01-customerA-prod mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off Feb 6 09:50:23 galera01-customerA-prod mysqld: Feb 6 09:50:23 galera01-customerA-prod mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains Feb 6 09:50:23 galera01-customerA-prod mysqld: information that should help you find out what is causing the crash. 
            Feb 6 09:50:23 galera01-customerA-prod mysqld: Writing a core file

            Without disabling the node's HAProxy health check I was not able to finish the backup successfully. I've got two core dumps related to the issue:

            -rw------- 1 mysql mysql 9.6G Feb 6 09:46 core.mysqld.49551.11.1486370801
            -rw------- 1 mysql mysql 9.5G Feb 6 09:50 core.mysqld.9877.11.1486371023

            Should I upload both?
            madpsy Nathan Madpsy added a comment -

            I was hit by this last night - 1 out of 3 nodes in a Galera cluster.

            At that time the cluster was not doing anything unusual and in fact should have been relatively quiescent.

            Version 10.1.18 on Ubuntu 16.04

            There is nothing else in the logs before this time.

            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: 170224  1:26:33 [ERROR] mysqld got signal 11 ;
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: This could be because you hit a bug. It is also possible that this binary
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: or one of the libraries it was linked against is corrupt, improperly built,
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: or misconfigured. This error can also be caused by malfunctioning hardware.
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: We will try our best to scrape up some info that will hopefully help
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: diagnose the problem, but since we have already crashed,
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: something is definitely wrong and this may fail.
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Server version: 10.1.18-MariaDB-1~xenial
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: key_buffer_size=134217728
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: read_buffer_size=2097152
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: max_used_connections=13
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: max_threads=202
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: thread_count=7
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: It is possible that mysqld could use up to
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1376288 K  bytes of memory
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Hope that's ok; if not, decrease some variables in the equation.
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Thread pointer: 0x0x7f58c3849008
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Attempting backtrace. You can use the following information to find out
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: where mysqld died. If you see no messages after this, something went
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: terribly wrong...
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: stack_bottom = 0x7f595dc5a0a8 thread_stack 0x48400
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55638521562e]
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /usr/sbin/mysqld(handle_fatal_signal+0x2d5)[0x556384d643d5]
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x113d0)[0x7f595c3cc3d0]
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0x8b)[0x556384e1931b]
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /usr/sbin/mysqld(binlog_background_thread+0x2f3)[0x556384e198f3]
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /usr/sbin/mysqld(+0x705fb1)[0x556384ed4fb1]
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76fa)[0x7f595c3c26fa]
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f595ba6db5d]
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Trying to get some variables.
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Some pointers may be invalid and cause the dump to abort.
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Query (0x0):
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Connection ID (thread ID): 3
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Status: NOT_KILLED
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            Feb 24 01:26:33 ams-prod1-db01 mysqld[13237]: information that should help you find out what is causing the crash.
            Feb 24 01:26:33 ams-prod1-db01 systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
            Feb 24 01:26:33 ams-prod1-db01 systemd[1]: mariadb.service: Unit entered failed state.
            Feb 24 01:26:33 ams-prod1-db01 systemd[1]: mariadb.service: Failed with result 'exit-code'.
            

            serg Sergei Golubchik made changes -
            Assignee Nirbhay Choubey [ nirbhay_c ]
            serg Sergei Golubchik made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.19 [ 109 ] 10.1.19, 10.1.22 [ 109, 143 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            madpsy Nathan Madpsy added a comment -

            This has recurred twice since my last comment. As a means of automatically restarting the process I have resorted to setting the following in MariaDB's systemd config:

            Restart=on-failure

            By default it is set to 'on-abort', but it seems that with a signal 11, which is what this bug produces, the default value will not restart the process. It's not pretty and it's not advised, but at least it provides an easy way to ensure the daemon is restarted upon hitting this bug.
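            For anyone wanting to do the same, a minimal sketch of such an override, assuming the stock mariadb.service unit name and an illustrative drop-in path, could look like this:

                # Hypothetical drop-in override, e.g. /etc/systemd/system/mariadb.service.d/restart.conf
                # (the RestartSec value is illustrative).
                [Service]
                # Also restart when mysqld exits non-zero after its SIGSEGV handler runs,
                # not only on an uncaught abort signal.
                Restart=on-failure
                RestartSec=5s

                # Reload systemd so the override is picked up:
                #   systemctl daemon-reload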

            danblack Daniel Black made changes -
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            danblack Daniel Black added a comment -

            madpsy FYI the systemd config related to Restarts/aborts has been fixed in MDEV-10382 (10.1.20)

            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.19, 10.1.22 [ 109, 143 ] 10.1.19, 10.1.22, 10.2.6-1 [ 109, 143, 146 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.19, 10.1.22, 10.2.6-1 [ 109, 143, 146 ] 10.1.19, 10.1.22, 10.2.6-2 [ 109, 143, 148 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            madpsy Nathan Madpsy added a comment -

            Reading some of the other related bugs, there are comments that disabling binary logs prevents the crashes (which makes sense). I thought binary logging was somewhat essential for Galera, so the simple question is:

            Can binary logging be disabled and if so what does that mean for the cluster?

            Thanks

            engel75 Florian Engelmann added a comment - see https://www.percona.com/blog/2015/07/16/bypassing-sst-pxc-binary-logs/ not needed but risky...
            madpsy Nathan Madpsy added a comment -

            Thanks. It sounds to me like binary logs are only required for point-in-time backups and are not needed for Galera to operate, nor for the xtrabackup SST method.

            Assuming one has backups of individual DBs, is there still any risk involved in disabling the binlog? What I don't want to happen is for an SST to be triggered just because one of the Galera members restarts.


            stemid Stefan Midjich added a comment -

            I'm one of the people who disabled binary logs in their MariaDB 10.x cluster to work around this bug.

            It's worked fine for me; I use xtrabackup to take incremental backups on another server. I've done restore tests of these backups and I'm satisfied.

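            For reference, a rough sketch of that kind of incremental backup cycle with the innobackupex wrapper from Percona XtraBackup 2.x; the directories, credentials and backup directory names below are placeholders, not anything prescribed by this issue:

                # Full base backup (e.g. weekly); the /backups/* paths are illustrative.
                innobackupex --user=backup --password=secret /backups/full

                # Incremental backup against the most recent full backup directory.
                innobackupex --user=backup --password=secret \
                    --incremental /backups/inc \
                    --incremental-basedir=/backups/full/<LATEST_FULL>

                # To restore: prepare the base with --redo-only, apply the (last)
                # incremental without it, then run the final prepare.
                innobackupex --apply-log --redo-only /backups/full/<LATEST_FULL>
                innobackupex --apply-log /backups/full/<LATEST_FULL> \
                    --incremental-dir=/backups/inc/<INCREMENTAL_DIR>
                innobackupex --apply-log /backups/full/<LATEST_FULL>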
            madpsy Nathan Madpsy added a comment -

            Thanks for the info. I've now disabled binary logging and will monitor. Of course that's a workaround which happens to be satisfactory for some, but this bug really needs to be fixed (it is over a year old!).
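            For completeness, a minimal sketch of what the workaround amounts to on a typical Debian/Ubuntu install; the file path and option values are illustrative, and as noted above it trades away binlog-based point-in-time recovery:

                # /etc/mysql/my.cnf (or a file under conf.d/) - illustrative only
                [mysqld]
                # Disabling the binary log is done by removing or commenting out the
                # log_bin settings; Galera replication itself does not use the binlog.
                #log_bin            = /var/log/mysql/mariadb-bin
                #log_bin_index      = /var/log/mysql/mariadb-bin.index
                #expire_logs_days   = 10

                # A server restart is required for the change to take effect.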

            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.19, 10.1.22, 10.2.6-2 [ 109, 143, 148 ] 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3 [ 109, 143, 148, 150 ]
            GeoffMontee Geoff Montee (Inactive) made changes -
            Affects Version/s 10.1.20 [ 22112 ]
            danblack Daniel Black made changes -
            stijn Stijn added a comment -

            We also had this bug twice in production for a 10.1.22 server (no replication/cluster). The mysqld service did not restart automatically. The second crash caused a table to become corrupted.

            170626 10:22:06 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.
             
            To report this bug, see https://mariadb.com/kb/en/reporting-bugs
             
            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.
             
            Server version: 10.1.22-MariaDB
            key_buffer_size=134217728
            read_buffer_size=131072
            max_used_connections=227
            max_threads=1502
            thread_count=133
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3430253 K  bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.
             
            Thread pointer: 0x0
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x0 thread_stack 0x40000
            *** buffer overflow detected ***: /usr/sbin/mysqld terminated
            ======= Backtrace: =========
            /lib64/libc.so.6(__fortify_fail+0x37)[0x7fee0c5e4047]
            /lib64/libc.so.6(+0x10d200)[0x7fee0c5e2200]
            /lib64/libc.so.6(+0x10efb7)[0x7fee0c5e3fb7]
            /usr/sbin/mysqld(my_addr_resolve+0x48)[0x7fee0f08ac28]
            /usr/sbin/mysqld(my_print_stacktrace+0x1c2)[0x7fee0f0772a2]
            /usr/sbin/mysqld(handle_fatal_signal+0x305)[0x7fee0eb9a9d5]
            /lib64/libpthread.so.0(+0xf370)[0x7fee0e1b5370]
            /usr/sbin/mysqld(+0x88b37c)[0x7fee0ee6f37c]
            /usr/sbin/mysqld(+0x8911c3)[0x7fee0ee751c3]
            /usr/sbin/mysqld(+0x8ebbe8)[0x7fee0eecfbe8]
            /usr/sbin/mysqld(+0x817d70)[0x7fee0edfbd70]
            /lib64/libpthread.so.0(+0x7dc5)[0x7fee0e1addc5]
            /lib64/libc.so.6(clone+0x6d)[0x7fee0c5cc73d]
            ======= Memory map: ========
            

            Do you have any idea when this issue will be fixed?


            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Hi stijn,

            Unfortunately we are not able to reproduce the bug on our side. If it is possible, can you upload a core dump from a debug build?

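            For anyone able to help with that, a rough sketch of how such a core can be turned into a full backtrace; the paths are placeholders, and the debug symbol package matching the exact server build should be installed first:

                # Open the core file against the exact mysqld binary that produced it.
                gdb /usr/sbin/mysqld /path/to/core.mysqld.<pid>

                # Inside gdb, capture the full trace of every thread to a file:
                (gdb) set pagination off
                (gdb) set logging file mysqld_full_bt.txt
                (gdb) set logging on
                (gdb) thread apply all bt full
                (gdb) set logging off
                (gdb) quit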
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3 [ 109, 143, 148, 150 ] 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3, 10.3.1-2 [ 109, 143, 148, 150, 174 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ] Andrei Elkin [ elkin ]
            timw_suqld Tim White added a comment -

            We are also now having this issue on Ubuntu 16.04, package version 10.1.25+maria-1~xenial from the MariaDB repositories, on a 3 node cluster.

            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: 170713  8:16:35 [ERROR] mysqld got signal 11 ;
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: This could be because you hit a bug. It is also possible that this binary
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: or one of the libraries it was linked against is corrupt, improperly built,
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: or misconfigured. This error can also be caused by malfunctioning hardware.
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: We will try our best to scrape up some info that will hopefully help
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: diagnose the problem, but since we have already crashed,
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: something is definitely wrong and this may fail.
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Server version: 10.1.25-MariaDB-1~xenial
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: key_buffer_size=268435456
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: read_buffer_size=2097152
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: max_used_connections=23
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: max_threads=258
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: thread_count=22
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: It is possible that mysqld could use up to
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1852580 K  bytes of memory
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Hope that's ok; if not, decrease some variables in the equation.
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Thread pointer: 0x7fa017c12008
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Attempting backtrace. You can use the following information to find out
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: where mysqld died. If you see no messages after this, something went
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: terribly wrong...
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: stack_bottom = 0x7fa3052bf0d8 thread_stack 0x48400
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55bb69dc6bde]
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /usr/sbin/mysqld(handle_fatal_signal+0x2f5)[0x55bb69912415]
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fa303a30390]
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0x8b)[0x55bb699c779b]
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /usr/sbin/mysqld(binlog_background_thread+0x2f3)[0x55bb699c7dd3]
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fa303a266ba]
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fa3030d13dd]
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Trying to get some variables.
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Some pointers may be invalid and cause the dump to abort.
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Query (0x0):
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Connection ID (thread ID): 3
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Status: NOT_KILLED
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            Jul 13 08:16:35 suq-mariadb-2-2 mysqld[2400]: information that should help you find out what is causing the crash.
            Jul 13 08:16:35 suq-mariadb-2-2 systemd[1]: mariadb.service: Main process exited, code=dumped, status=11/SEGV
            Jul 13 08:16:35 suq-mariadb-2-2 systemd[1]: mariadb.service: Unit entered failed state.
            Jul 13 08:16:35 suq-mariadb-2-2 systemd[1]: mariadb.service: Failed with result 'core-dump'.
            Jul 13 08:16:40 suq-mariadb-2-2 systemd[1]: mariadb.service: Service hold-off time over, scheduling restart.
            Jul 13 08:16:40 suq-mariadb-2-2 systemd[1]: Stopped MariaDB database server.
            Jul 13 08:16:40 suq-mariadb-2-2 systemd[1]: Starting MariaDB database server...
            Jul 13 08:16:41 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:41 140694126172416 [Note] /usr/sbin/mysqld (mysqld 10.1.25-MariaDB-1~xenial) starting as process 38882 ...
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: Segmentation fault (core dumped)
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: WSREP: Failed to start mysqld for wsrep recovery: '2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Using mutexes to ref count buffer pool pages
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: The InnoDB memory heap is disabled
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Compressed tables use zlib 1.2.8
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Using Linux native AIO
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Using SSE crc32 instructions
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Initializing buffer pool, size = 10.0G
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Completed initialization of buffer pool
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Highest supported file format is Barracuda.
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:42 140694126172416 [Note] InnoDB: Starting crash recovery from checkpoint LSN=12274681738
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:49 140694126172416 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: InnoDB: Transaction 3771194 was in the XA prepared state.
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: InnoDB: 1 transaction(s) which must be rolled back or cleaned up
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: InnoDB: in total 0 row operations to undo
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: InnoDB: Trx id counter is 3771648
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:51 140694126172416 [Note] InnoDB: Starting final batch to recover 17 pages from redo log
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: InnoDB: Last MySQL binlog file position 0 104878519, file name /var/log/mysql/mariadb-bin.000201
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] InnoDB: 128 rollback segment(s) are active.
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140681746048768 [Note] InnoDB: Starting in background the rollback of recovered transactions
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140681746048768 [Note] InnoDB: Rollback of non-prepared transactions completed
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] InnoDB: Waiting for purge to start
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.36-82.0 started; log sequence number 12274782858
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] Plugin 'FEEDBACK' is disabled.
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] Starting crash recovery...
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13 08:16:55 7ff5e7699900  InnoDB: Starting recovery for XA transactions...
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13 08:16:55 7ff5e7699900  InnoDB: Transaction 3771194 in prepared state after recovery
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13 08:16:55 7ff5e7699900  InnoDB: Transaction contains changes to 1 rows
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13 08:16:55 7ff5e7699900  InnoDB: 1 transactions in prepared state after recovery
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 2017-07-13  8:16:55 140694126172416 [Note] Found 1 prepared transaction(s) in InnoDB
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: 170713  8:16:55 [ERROR] mysqld got signal 11 ;
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: This could be because you hit a bug. It is also possible that this binary
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: or one of the libraries it was linked against is corrupt, improperly built,
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: or misconfigured. This error can also be caused by malfunctioning hardware.
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: We will try our best to scrape up some info that will hopefully help
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: diagnose the problem, but since we have already crashed,
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: something is definitely wrong and this may fail.
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: Server version: 10.1.25-MariaDB-1~xenial
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: key_buffer_size=268435456
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: read_buffer_size=2097152
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: max_used_connections=0
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: max_threads=258
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: thread_count=0
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: It is possible that mysqld could use up to
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1852580 K  bytes of memory
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: Hope that's ok; if not, decrease some variables in the equation.
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: Thread pointer: 0x0
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: Attempting backtrace. You can use the following information to find out
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: where mysqld died. If you see no messages after this, something went
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: terribly wrong...
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: stack_bottom = 0x0 thread_stack 0x48400
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5604d1131bde]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(handle_fatal_signal+0x2f5)[0x5604d0c7d415]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7ff5e5d3c390]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(wsrep_on+0x18)[0x5604d0c17988]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(+0x805792)[0x5604d0eea792]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(+0x805dc1)[0x5604d0eeadc1]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(+0x599bff)[0x5604d0c7ebff]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_Z24plugin_foreach_with_maskP3THDPFcS0_P13st_plugin_intPvEijS3_+0x2cf)[0x5604d0afcdff]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_Z10ha_recoverP7st_hash+0xc6)[0x5604d0c80b06]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG7recoverEP11st_log_infoPKcP11st_io_cacheP28Format_description_log_eventb+0x58f)[0x5604d0d3359f]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG18do_binlog_recoveryEPKcb+0x372)[0x5604d0d33ba2]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG4openEPKc+0x30)[0x5604d0d36da0]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(+0x376eb5)[0x5604d0a5beb5]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_Z11mysqld_mainiPPc+0x1af9)[0x5604d0a5f149]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7ff5e52f6830]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: /usr/sbin/mysqld(_start+0x29)[0x5604d0a532a9]
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            Jul 13 08:17:40 suq-mariadb-2-2 mysqld[38628]: information that should help you find out what is causing the crash.'
            Jul 13 08:17:40 suq-mariadb-2-2 systemd[1]: mariadb.service: Control process exited, code=exited status=1
            Jul 13 08:17:40 suq-mariadb-2-2 systemd[1]: Failed to start MariaDB database server.
            Jul 13 08:17:40 suq-mariadb-2-2 systemd[1]: mariadb.service: Unit entered failed state.
            Jul 13 08:17:40 suq-mariadb-2-2 systemd[1]: mariadb.service: Failed with result 'exit-code'.
            

            Elkin Andrei Elkin added a comment -

            Hello, Tim.

            Technically speaking your report must be different from the original one, which mentions
            binlog_background_thread.
            But at the same time, yours should be easier to fix. It would be great to have a copy of your server environment
            to explore what causes this crash at server recovery. It feels like this crash might be repeatable, that is, it would
            show up again should you restart the server after the first crash.

            Cheers,

            Andrei

            spikestabber Brendan P added a comment -

            See my comment here: https://jira.mariadb.org/browse/MDEV-11053?focusedCommentId=97602&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-97602

            This is affecting production badly as we cannot alter a table with pt-online-schema-change without having it randomly crash on us.

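            For context, the alters in question are ordinary pt-online-schema-change runs; a hypothetical invocation of that kind, where the database, table and ALTER clause are made up purely for illustration, would be:

                # Hypothetical pt-online-schema-change run of the sort that hits this
                # crash pattern; D=<database>, t=<table> and the ALTER clause are placeholders.
                pt-online-schema-change \
                    --alter "ADD COLUMN notes TEXT" \
                    D=appdb,t=orders \
                    --execute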
            anikitin Andrii Nikitin (Inactive) made changes -
            GeoffMontee Geoff Montee (Inactive) made changes -
            Affects Version/s 10.1.23 [ 22503 ]
            serg Sergei Golubchik made changes -
            Affects Version/s 10.2 [ 14601 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.2 [ 14601 ]
            serg Sergei Golubchik made changes -
            Sprint 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3, 10.3.1-2 [ 109, 143, 148, 150, 174 ] 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3, 10.3.1-2, 10.2.10 [ 109, 143, 148, 150, 174, 183 ]
            spikestabber Brendan P added a comment -

            Happened again on two nodes after trying to alter a 10GB table:

            Node1 roughly 5 minutes after starting pt-osc:
            171014 21:44:55 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.1.28-MariaDB-1~xenial
            key_buffer_size=16777216
            read_buffer_size=524288
            max_used_connections=2049
            max_threads=4097
            thread_count=314
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 18937620 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x7f7aefcd9008
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x7fc8bde3fff8 thread_stack 0x48400
            2017-10-14 21:44:57 140500174632704 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.57534S), skipping check
            /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5622aabd254e]
            /usr/sbin/mysqld(handle_fatal_signal+0x2f5)[0x5622aa71b005]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fc9c649c390]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG21do_checkpoint_requestEm+0x92)[0x5622aa7d1332]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG20checkpoint_and_purgeEm+0x11)[0x5622aa7d13c1]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG23trx_group_commit_leaderEPNS_18group_commit_entryE+0x4ec)[0x5622aa7d421c]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG34write_transaction_to_binlog_eventsEPNS_18group_commit_entryE+0x133)[0x5622aa7d4593]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG27write_transaction_to_binlogEP3THDP17binlog_cache_mngrP9Log_eventbbb+0xd0)[0x5622aa7d4840]
            /usr/sbin/mysqld(+0x659a06)[0x5622aa7d4a06]
            /usr/sbin/mysqld(+0x65a582)[0x5622aa7d5582]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13log_and_orderEP3THDybbb+0x44)[0x5622aa7d56e4]
            /usr/sbin/mysqld(_Z15ha_commit_transP3THDb+0x7ab)[0x5622aa71ebfb]
            /usr/sbin/mysqld(_Z17trans_commit_stmtP3THD+0x57)[0x5622aa66eb37]
            /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0xa9c)[0x5622aa58526c]
            /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x311)[0x5622aa58dce1]
            /usr/sbin/mysqld(+0x413635)[0x5622aa58e635]
            /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1ea5)[0x5622aa590d35]
            /usr/sbin/mysqld(_Z10do_commandP3THD+0x139)[0x5622aa591c09]
            /usr/sbin/mysqld(_Z26threadpool_process_requestP3THD+0xff)[0x5622aa6a0c9f]
            /usr/sbin/mysqld(+0x577805)[0x5622aa6f2805]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fc9c64926ba]
            /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc9c5b3d3dd]

            Node1 again shortly after recovering, deleting old triggers and restarting pt-osc alter:
            171014 21:59:46 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.1.28-MariaDB-1~xenial
            key_buffer_size=16777216
            read_buffer_size=524288
            max_used_connections=938
            max_threads=4097
            thread_count=315
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 18937620 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x7f314dc12008
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x7f66911c10d8 thread_stack 0x48400
            /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55f6a68af54e]
            /usr/sbin/mysqld(handle_fatal_signal+0x2f5)[0x55f6a63f8005]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f66941ef390]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0x8b)[0x55f6a64ae08b]
            /usr/sbin/mysqld(binlog_background_thread+0x35b)[0x55f6a64ae72b]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f66941e56ba]
            /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f66938903dd]

            Node2, with no alter running, merely during SST to Node1 (since Node1 wouldn't recover after the 2nd crash):
            WSREP_SST: [INFO] Rate-limiting SST to 60M (20171014 22:11:31.060)
            WSREP_SST: [INFO] Sleeping before data transfer for SST (20171014 22:11:31.062)
            2017-10-14 22:11:32 140016479106816 [Note] WSREP: (06100a81, 'tcp://0.0.0.0:4567') turning message relay requesting off
            WSREP_SST: [INFO] Streaming the backup to joiner at 10.0.0.16 4444 (20171014 22:11:41.066)
            WSREP_SST: [INFO] Evaluating innobackupex --no-version-check $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | snzip | pv -q -L $rlimit | socat -u stdio TCP:10.0.0.16:4444; RC=( ${PIPESTATUS[@]} ) (20171014 22:11:41.068)
            171014 22:25:06 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.1.28-MariaDB-1~xenial
            key_buffer_size=16777216
            read_buffer_size=524288
            max_used_connections=1501
            max_threads=4097
            thread_count=375
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 18926318 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x7f400c812008
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x7f5924bbd0d8 thread_stack 0x48400
            /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55e5180db54e]
            /usr/sbin/mysqld(handle_fatal_signal+0x2f5)[0x55e517c24005]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f5927c04390]
            /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0x8b)[0x55e517cda08b]
            /usr/sbin/mysqld(binlog_background_thread+0x35b)[0x55e517cda72b]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f5927bfa6ba]
            /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f59272a53dd]

            Elkin Andrei Elkin added a comment -

            Good news. One cause of the crashes has been identified, and the patch should be out in a couple of days.

            Elkin Andrei Elkin added a comment -

            The patch is now being tried by the buildbot (BB):

            git push https://github.com/MariaDB/server bb-10.1-mdev9510-andrei:bb-10.1-mdev9510-andrei

            spikestabber Brendan P added a comment -

            That is great news. I've gone over the patch and it indeed makes sense. I am curious, though: I don't believe any of our applications directly disable sql_log_bin at the session level, so would this be something that Galera uses internally to isolate queries to specific nodes? Or could this also be triggered by another condition? I know most of the crashes have been triggered by pt-online-schema-change, but I've gone over that as well and don't see where sql_log_bin=0 would be getting set. Thank you!

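            For context, "session-only disabling of sql_log_bin" refers to the switch sketched below. This is only an illustration of the mechanism the patch touches, with a hypothetical table and statement; it is not taken from the reporter's workload or from pt-online-schema-change itself.

            {code:sql}
            -- Illustration only: disable binary logging for the current session,
            -- so the following statements are not written to the binary log.
            SET SESSION sql_log_bin = 0;

            -- Hypothetical local-only maintenance statement:
            DELETE FROM test.scratch_table WHERE created_at < NOW() - INTERVAL 7 DAY;

            -- Re-enable binary logging for the rest of the session.
            SET SESSION sql_log_bin = 1;

            -- The setting is per-session; its current value can be checked with:
            SELECT @@session.sql_log_bin;
            {code}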
            spikestabber Brendan P added a comment -

            I've also determined that the "random" interval between starting the alter or SST and the crash coincides with the binary logs being rotated after reaching their configured 1GB size, which can be anywhere from 3-5 minutes to 10-15 minutes depending on write load.

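            For anyone trying to correlate the crash window with binlog rotation, the rotation threshold and the current log files can be inspected with the generic statements below (a sketch, not specific to this cluster); a rotation can also be forced instead of waiting for the 1GB limit to be reached.

            {code:sql}
            -- The size at which the server rotates to a new binary log file;
            -- a 1GB setting corresponds to max_binlog_size=1073741824.
            SHOW VARIABLES LIKE 'max_binlog_size';

            -- List the current binary log files and their sizes; a new entry
            -- appears after each rotation.
            SHOW BINARY LOGS;

            -- Force a rotation immediately, which may help reproduce the timing
            -- described above without waiting for the size limit.
            FLUSH BINARY LOGS;
            {code}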
            spikestabber Brendan P added a comment -

            Had 2 crashes today and decided to run with the patch. So far it has been running fine without incident; I will keep you updated.

            Elkin Andrei Elkin added a comment -

            Brendan, hello.

            I read your silence over these past days as no bad news.
            Anyway, I am submitting the patch for review very shortly.

            Thanks for helping!

            Andrei

            Elkin Andrei Elkin made changes -
            spikestabber Brendan P added a comment -

            Thank you, indeed no news = good news. This is promising so far.

            I've done a few table alters over the weekend and restarted some nodes due to some InnoDB tunings, and haven't had any crashes since applying the patch. I will, however, continue to run this build for now to make sure all is good. Thanks again!

            Elkin Andrei Elkin made changes -
            Elkin Andrei Elkin made changes -
            Elkin Andrei Elkin added a comment -

            The patch has been polished a bit, and the regression test was made by reusing an existing one.
            Here is the pull request:
            https://github.com/MariaDB/server/pull/473
            I hope sachin.setiya.007 will review it promptly once back in the office.

            Elkin Andrei Elkin added a comment -

            Hello, Sachin.

            I see you're back to work on Thursday, Nov 2nd. I wonder if you could review the MDEV-9510 patch during that week (week 44), which would practically leave Thursday and Friday for it. The patch is for 10.1, and it's rather small.

            Cheers,

            Andrei

            Elkin Andrei Elkin made changes -
            Status In Progress [ 3 ] In Review [ 10002 ]
            serg Sergei Golubchik made changes -
            Sprint 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3, 10.3.1-2, 10.2.10 [ 109, 143, 148, 150, 174, 183 ] 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3, 10.3.1-2, 10.2.10, 10.1.29 [ 109, 143, 148, 150, 174, 183, 202 ]

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Okay to push. Sorry for the delay.

            Regards
            sachin

            Elkin Andrei Elkin added a comment -

            Changing the post-review status accordingly.

            Elkin Andrei Elkin made changes -
            Status In Review [ 10002 ] Stalled [ 10000 ]
            Elkin Andrei Elkin made changes -
            Elkin Andrei Elkin made changes -
            serg Sergei Golubchik made changes -
            Sprint 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3, 10.3.1-2, 10.2.10, 10.1.29 [ 109, 143, 148, 150, 174, 183, 202 ] 10.1.19, 10.1.22, 10.2.6-2, 10.2.6-3, 10.3.1-2, 10.2.10, 10.1.29, 10.2.11 [ 109, 143, 148, 150, 174, 183, 202, 203 ]
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            Elkin Andrei Elkin added a comment -

            Pushed c7e38076f33.

            Elkin Andrei Elkin made changes -
            Fix Version/s 10.1.30 [ 22637 ]
            Fix Version/s 10.2.11 [ 22634 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            spikestabber Brendan P added a comment -

            Thanks everyone. I am still running the same build with the original patch and have not had a single crash since. 100% fixed for sure.

            elenst Elena Stepanova made changes -
            serg Sergei Golubchik made changes -
            danblack Daniel Black made changes -
            ralf.gebhardt Ralf Gebhardt made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 73866 ] MariaDB v4 [ 150073 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 199143 120351 104924 171005 125314 136735 180752

            People

              Elkin Andrei Elkin
              GeoffMontee Geoff Montee (Inactive)
              Votes: 14
              Watchers: 35
