Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.4.11, 10.4.12, 10.4.13
    • Fix Version/s: 10.4.14, 10.5.5
    • Component/s: Galera, Tests

    Description

      Hosts in this cluster are occasionally crashing, with this message each time.

      Here's what was written to our mysql-error.log:

      mysqld: /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.4.11/wsrep-lib/src/client_state.cpp:121: int wsrep::client_state::before_command(): Assertion `server_state_.rollback_mode() == wsrep::server_state::rm_async' failed.
      200211 21:41:48 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.

      To report this bug, see https://mariadb.com/kb/en/reporting-bugs

      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.

      Server version: 10.4.11-MariaDB
      key_buffer_size=33554432
      read_buffer_size=131072
      max_used_connections=278
      max_threads=65541
      thread_count=238
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 142750971 K bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.

      Thread pointer: 0x7f113c0c73b8
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f12fb93ec98 thread_stack 0x49000

            *** stack smashing detected ***: /usr/sbin/mysqld terminated
            ======= Backtrace: =========
            /lib64/libc.so.6(__fortify_fail+0x37)[0x7f1513812877]
            /lib64/libc.so.6(__fortify_fail+0x0)[0x7f1513812840]
            /usr/sbin/mysqld(+0xe3b93c)[0x562eb83c893c]
            /usr/sbin/mysqld(my_print_stacktrace+0x1c6)[0x562eb83b17a6]
            /usr/sbin/mysqld(handle_fatal_signal+0x4b7)[0x562eb7e80557]
            /lib64/libpthread.so.0(+0x3ff8c0f7e0)[0x7f15150e87e0]
            /lib64/libc.so.6(gsignal+0x35)[0x7f15137424f5]
            /lib64/libc.so.6(abort+0x175)[0x7f1513743cd5]
            /lib64/libc.so.6(+0x3ff882b66e)[0x7f151373b66e]
            /lib64/libc.so.6(__assert_perror_fail+0x0)[0x7f151373b730]
            /usr/sbin/mysqld(_ZN5wsrep12client_state14before_commandEv+0x320)[0x562eb8431150]
            /usr/sbin/mysqld(_Z10do_commandP3THD+0x1a4)[0x562eb7c9d1d4]
            /usr/sbin/mysqld(_Z11tp_callbackP13TP_connection+0x58)[0x562eb7e55568]
            /usr/sbin/mysqld(+0xa56dd8)[0x562eb7fe3dd8]
            /usr/sbin/mysqld(+0xa856fd)[0x562eb80126fd]
            /lib64/libpthread.so.0(+0x3ff8c07aa1)[0x7f15150e0aa1]
            /lib64/libc.so.6(clone+0x6d)[0x7f15137f8c4d]
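
The frames above follow the usual glibc layout, `module(symbol+0xoffset)[0xaddress]`. As a rough aid for reading such a trace, here is a minimal Python sketch that splits frames into their parts and checks whether a `tp_callback` frame is present. Under the Itanium C++ name mangling scheme, `_Z11tp_callbackP13TP_connection` corresponds to `tp_callback(TP_connection*)`, which suggests the command arrived through the thread pool. The names `FRAME_RE`, `parse_frames`, `symbols`, and `SAMPLE` are hypothetical and exist only for this illustration.

```python
import re

# One frame per line, in the layout seen above: module(symbol+0xoffset)[0xaddress].
# The symbol may be empty when the binary has no name for that address.
FRAME_RE = re.compile(
    r"^\s*(?P<module>\S+?)"
    r"\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-f]+))?\)"
    r"\[(?P<address>0x[0-9a-f]+)\]\s*$"
)


def parse_frames(text):
    """Return a list of dicts for every line that looks like a stack frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.groupdict())
    return frames


def symbols(frames):
    """Names of the frames that carry a symbol, in stack order."""
    return [f["symbol"] for f in frames if f["symbol"]]


# Three frames copied from the trace in this report.
SAMPLE = """
/usr/sbin/mysqld(_ZN5wsrep12client_state14before_commandEv+0x320)[0x562eb8431150]
/usr/sbin/mysqld(_Z11tp_callbackP13TP_connection+0x58)[0x562eb7e55568]
/usr/sbin/mysqld(+0xa56dd8)[0x562eb7fe3dd8]
"""

frames = parse_frames(SAMPLE)
print(symbols(frames))
# True when the trace passes through the thread pool callback.
print(any("tp_callback" in name for name in symbols(frames)))
```

Lines that are not frames (the banner, the `Backtrace:` separator) are simply skipped, so the whole error log excerpt can be passed in unchanged.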

        Activity

          khoov Kent Hoover created issue -

          elenst Elena Stepanova added a comment -

          khoov, what did you mean by setting the label `buildbot`? Have you seen it happen in MariaDB buildbot? Or in your buildbot? Or in someone else's?

          elenst Elena Stepanova made changes -
          Field: Original Value → New Value
          Component/s: (none) → wsrep [ 11500 ]
          Component/s: Server [ 13907 ] → (none)
          Fix Version/s: (none) → 10.4 [ 22408 ]
          Assignee: (none) → Jan Lindström [ jplindst ]
          khoov Kent Hoover added a comment -

          Hi, Elena:
          I spotted "buildbot" in the pathname of the source file identified in the error log... so I just added it to the labels.

          Thanks,
          Kent

          elenst Elena Stepanova made changes -
          Labels: buildbot crash → crash
          khoov Kent Hoover added a comment -

          Hello, Elena:

          Is there any update or progress on this error? We've upgraded this environment to 10.4.12 and are still encountering this crash.

          Thanks,
          Kent


          robhost Robert Klikics added a comment -

          Hi,

          We're experiencing the same issue with 10.4.13. Any news on this?

          khoov Kent Hoover added a comment -

          I upgraded my site to 10.4.13/galera-4-26.4.4-1 and am still experiencing these occasional crashes.

          Typically, mysqld dies on 2 of my 3 servers, with the WSREP complaint about the failed assertion. The surviving host refuses updates at that point, so recovery is usually not straightforward.

          Any updates?

          Thanks,
          Kent

          khoov Kent Hoover added a comment -

          This Percona bug [https://jira.percona.com/browse/PXC-2935] (recently fixed) looks the same as this one.

          jplindst Jan Lindström (Inactive) made changes -
          Assignee: Jan Lindström [ jplindst ] → Teemu Ollakka [ teemu.ollakka ]
          khoov Kent Hoover added a comment -

          Thanks, Jan and Teemu for digging in...

          Question: If this does match the Percona case, is it reasonable that we could work around this by changing configuration from pool-of-threads to one-thread-per-connection (as long as our application can tolerate the effect of this change)?

          Cheers,
          Kent

          khoov Kent Hoover made changes -
          Affects Version/s: (none) → 10.4.13 [ 24223 ]
          Affects Version/s: (none) → 10.4.12 [ 24019 ]
          teemu.ollakka Teemu Ollakka added a comment -

          Hi,

          It looks like changing to one-thread-per-connection should make this crash go away.

          - Teemu
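
For anyone applying the workaround discussed in this thread, the server option involved is `thread_handling`, which selects between `pool-of-threads` and `one-thread-per-connection`. The option is generally not changeable at runtime, so expect to restart each node after editing the option file. Below is a minimal Python sketch, not an official tool, that reports which value a my.cnf-style text sets in its `[mysqld]` section. The function name `thread_handling` and the `SAMPLE_CNF` contents are hypothetical; the sketch ignores `!include` directives and trailing inline comments.

```python
def thread_handling(cnf_text, default=None):
    """Return the last thread_handling value set in [mysqld], or default."""
    section = None
    value = default
    for raw in cnf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "!")):
            continue  # blank lines, comments, !include directives
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section != "mysqld":
            continue
        key, _, val = line.partition("=")
        # Option names accept '-' and '_' interchangeably.
        if key.strip().lower().replace("-", "_") == "thread_handling":
            value = val.strip()
    return value


# Hypothetical option file showing the workaround setting.
SAMPLE_CNF = """
[mysqld]
# Workaround from this thread: bypass the thread pool.
thread_handling = one-thread-per-connection
"""

print(thread_handling(SAMPLE_CNF))
```

A result of `pool-of-threads` would mean the node is still on the code path that the backtrace in this report goes through (`tp_callback`).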
          khoov Kent Hoover added a comment -

          Thanks for the quick reply, Teemu.
          (We'll still be looking forward to the fix, of course.)

          Kent

          jplindst Jan Lindström (Inactive) made changes -
          Assignee: Teemu Ollakka [ teemu.ollakka ] → Jan Lindström [ jplindst ]
          jplindst Jan Lindström (Inactive) made changes -
          Status: Open [ 1 ] → In Progress [ 3 ]
          jplindst Jan Lindström (Inactive) made changes -
          issue.field.resolutiondate: 2020-07-24 11:02:59.0 → 2020-07-24 11:02:59.853
          jplindst Jan Lindström (Inactive) made changes -
          Component/s: (none) → Galera [ 10124 ]
          Component/s: (none) → Tests [ 10800 ]
          Component/s: wsrep [ 11500 ] → (none)
          Fix Version/s: (none) → 10.4.14 [ 24305 ]
          Fix Version/s: (none) → 10.5.5 [ 24423 ]
          Fix Version/s: 10.4 [ 22408 ] → (none)
          Resolution: (none) → Fixed [ 1 ]
          Status: In Progress [ 3 ] → Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow: MariaDB v3 [ 103913 ] → MariaDB v4 [ 157319 ]

          People

            Assignee: jplindst Jan Lindström (Inactive)
            Reporter: khoov Kent Hoover
            Votes: 0
            Watchers: 4

            Dates

              Resolved: 2020-07-24 11:02:59
