MariaDB Server / MDEV-28697

One Node crashed and did not recover


Details

    Description

      We're running a three-node Galera cluster that had worked fine for a month (since the last update). Last week one node crashed and could not restart itself. The cluster runs on small nodes under very little load (barely any concurrent connections).

      Here are the logs for the whole day.

      May 26 00:30:29 sql-galera-0 mariadbd[28662]: 2022-05-26  0:30:29 3575702 [Warning] Aborted connection 3575702 to db: 'unconnected' user: 'unauthenticated' host: 'scanner-05.ch1.censys-scanner.com' (This connection closed normally without authentication)
      May 26 00:43:55 sql-galera-0 mariadbd[28662]: 2022-05-26  0:43:55 3578279 [Warning] IP address '35.195.93.98' has been resolved to the host name '98.93.195.35.bc.googleusercontent.com', which resembles IPv4-address itself.
      May 26 00:43:55 sql-galera-0 mariadbd[28662]: 2022-05-26  0:43:55 3578279 [Warning] Aborted connection 3578279 to db: 'unconnected' user: 'unauthenticated' host: '35.195.93.98' (This connection closed normally without authentication)
      May 26 00:43:55 sql-galera-0 mariadbd[28662]: 2022-05-26  0:43:55 3578280 [Warning] IP address '35.233.62.116' has been resolved to the host name '116.62.233.35.bc.googleusercontent.com', which resembles IPv4-address itself.
      May 26 00:43:55 sql-galera-0 mariadbd[28662]: 2022-05-26  0:43:55 3578280 [Warning] Access denied for user 'root'@'35.233.62.116'
      May 26 02:23:49 sql-galera-0 mariadbd[28662]: 2022-05-26  2:23:49 3597465 [Warning] Aborted connection 3597465 to db: 'unconnected' user: 'unauthenticated' host: 'scan-57-18.security.ipip.net' (This connection closed normally without authentication)
      May 26 06:31:39 sql-galera-0 mariadbd[28662]: 2022-05-26  6:31:39 3645075 [Warning] Aborted connection 3645075 to db: 'unconnected' user: 'unauthenticated' host: 'scanner-07.ch1.censys-scanner.com' (This connection closed normally without authentication)
      May 26 11:42:28 sql-galera-0 mariadbd[28662]: 2022-05-26 11:42:28 3704988 [Warning] IP address '154.89.5.80' could not be resolved: Name or service not known
      May 26 11:42:28 sql-galera-0 mariadbd[28662]: 2022-05-26 11:42:28 3704988 [Warning] Aborted connection 3704988 to db: 'unconnected' user: 'unauthenticated' host: '154.89.5.80' (This connection closed normally without authentication)
      May 26 11:47:15 sql-galera-0 mariadbd[28662]: 2022-05-26 11:47:15 3705903 [Warning] IP address '165.227.109.30' could not be resolved: Name or service not known
      May 26 11:47:15 sql-galera-0 mariadbd[28662]: 2022-05-26 11:47:15 3705903 [Warning] Aborted connection 3705903 to db: 'unconnected' user: 'unauthenticated' host: '165.227.109.30' (This connection closed normally without authentication)
      May 26 11:47:15 sql-galera-0 mariadbd[28662]: 2022-05-26 11:47:15 3705904 [Warning] Aborted connection 3705904 to db: 'unconnected' user: 'unauthenticated' host: '165.227.109.30' (This connection closed normally without authentication)
      May 26 11:47:15 sql-galera-0 mariadbd[28662]: 2022-05-26 11:47:15 3705905 [Warning] Aborted connection 3705905 to db: 'unconnected' user: 'unauthenticated' host: '165.227.109.30' (This connection closed normally without authentication)
      May 26 11:47:15 sql-galera-0 mariadbd[28662]: 2022-05-26 11:47:15 3705906 [Warning] Aborted connection 3705906 to db: 'unconnected' user: 'unauthenticated' host: '165.227.109.30' (This connection closed normally without authentication)
      May 26 11:47:15 sql-galera-0 mariadbd[28662]: 2022-05-26 11:47:15 3705907 [Warning] Aborted connection 3705907 to db: 'unconnected' user: 'unauthenticated' host: '165.227.109.30' (This connection closed normally without authentication)
      May 26 11:47:16 sql-galera-0 mariadbd[28662]: 2022-05-26 11:47:16 3705908 [Warning] Aborted connection 3705908 to db: 'unconnected' user: 'unauthenticated' host: '165.227.109.30' (This connection closed normally without authentication)
      May 26 11:47:16 sql-galera-0 mariadbd[28662]: 2022-05-26 11:47:16 3705909 [Warning] Aborted connection 3705909 to db: 'unconnected' user: 'unauthenticated' host: '165.227.109.30' (This connection closed normally without authentication)
      May 26 11:47:16 sql-galera-0 mariadbd[28662]: 2022-05-26 11:47:16 3705910 [Warning] Aborted connection 3705910 to db: 'unconnected' user: 'unauthenticated' host: '165.227.109.30' (This connection closed normally without authentication)
      May 26 12:23:12 sql-galera-0 mariadbd[28662]: 2022-05-26 12:23:12 3712876 [Warning] Hostname 'zg-0421e-161.stretchoid.com' does not resolve to '192.241.222.228'.
      May 26 12:23:12 sql-galera-0 mariadbd[28662]: 2022-05-26 12:23:12 3712876 [Note] Hostname 'zg-0421e-161.stretchoid.com' has the following IP addresses:
      May 26 12:23:12 sql-galera-0 mariadbd[28662]: 2022-05-26 12:23:12 3712876 [Note]  - 127.0.0.1
      May 26 12:23:12 sql-galera-0 mariadbd[28662]: 2022-05-26 12:23:12 3712876 [Warning] Aborted connection 3712876 to db: 'unconnected' user: 'unauthenticated' host: '192.241.222.228' (This connection closed normally without authentication)
      May 26 14:26:04 sql-galera-0 mariadbd[28662]: 2022-05-26 14:26:04 0 [Note] WSREP: (9615c121-9136, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr  timed out, no messages seen in PT3S, socket stats: rtt: 145655 rttvar: 54657 rto: 440000 lost: 0 last_data_recv: 3168 cwnd: 10 last_queued_since: 31697088
      May 26 16:48:17 sql-galera-0 mariadbd[28662]: 2022-05-26 16:48:17 3764008 [Warning] Aborted connection 3764008 to db: 'unconnected' user: 'unauthenticated' host: 'scan-11.shadowserver.org' (This connection closed normally without authentication)
      May 26 19:47:53 sql-galera-0 mariadbd[28662]: 2022-05-26 19:47:53 0 [Warning] WSREP: unserialize error invalid protocol version 4: 71 (Protocol error)
      May 26 19:47:53 sql-galera-0 mariadbd[28662]:          at /home/buildbot/buildbot/build/gcomm/src/gcomm/datagram.hpp:unserialize():133
      May 26 20:04:04 sql-galera-0 mariadbd[28662]: 2022-05-26 20:04:04 3801665 [Warning] IP address '205.210.31.150' could not be resolved: Name or service not known
      May 26 20:04:14 sql-galera-0 mariadbd[28662]: 2022-05-26 20:04:14 3801665 [Warning] Aborted connection 3801665 to db: 'unconnected' user: 'unauthenticated' host: '205.210.31.150' (This connection closed normally without authentication)
      May 26 20:49:33 sql-galera-0 mariadbd[28662]: 2022-05-26 20:49:33 3810414 [Warning] Aborted connection 3810414 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (This connection closed normally without authentication)
      May 26 20:49:33 sql-galera-0 mariadbd[28662]: 2022-05-26 20:49:33 3810415 [Warning] IP address '194.165.16.73' could not be resolved: Temporary failure in name resolution
      May 26 20:49:33 sql-galera-0 mariadbd[28662]: 2022-05-26 20:49:33 3810415 [Warning] Aborted connection 3810415 to db: 'unconnected' user: 'unauthenticated' host: '194.165.16.73' (This connection closed normally without authentication)
      May 26 20:49:33 sql-galera-0 mariadbd[28662]: 2022-05-26 20:49:33 3810416 [Warning] IP address '194.165.16.73' could not be resolved: Temporary failure in name resolution
      May 26 20:49:33 sql-galera-0 mariadbd[28662]: 2022-05-26 20:49:33 3810416 [Warning] Aborted connection 3810416 to db: 'unconnected' user: 'unauthenticated' host: '194.165.16.73' (This connection closed normally without authentication)
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::system_error> >'
      May 26 21:46:03 sql-galera-0 mariadbd[28662]:   what():  remote_endpoint: Transport endpoint is not connected
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: 220526 21:46:03 [ERROR] mysqld got signal 6 ;
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: This could be because you hit a bug. It is also possible that this binary
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: or one of the libraries it was linked against is corrupt, improperly built,
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: or misconfigured. This error can also be caused by malfunctioning hardware.
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: We will try our best to scrape up some info that will hopefully help
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: diagnose the problem, but since we have already crashed,
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: something is definitely wrong and this may fail.
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: Server version: 10.6.5-MariaDB-1:10.6.5+maria~bionic-log
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: key_buffer_size=134217728
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: read_buffer_size=131072
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: max_used_connections=152
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: max_threads=153
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: thread_count=153
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: It is possible that mysqld could use up to
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467957 K  bytes of memory
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: Hope that's ok; if not, decrease some variables in the equation.
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: Thread pointer: 0x0
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: Attempting backtrace. You can use the following information to find out
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: where mysqld died. If you see no messages after this, something went
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: terribly wrong...
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: stack_bottom = 0x0 thread_stack 0x49000
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: Printing to addr2line failed
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: /usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x5571ab05a7ce]
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: /usr/sbin/mariadbd(handle_fatal_signal+0x545)[0x5571aaaab345]
      May 26 21:46:03 sql-galera-0 mariadbd[28662]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f9d7b70c890]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7f9d7aa1ce97]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7f9d7aa1e801]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8c957)[0x7f9d7b1f9957]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92ab6)[0x7f9d7b1ffab6]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92af1)[0x7f9d7b1ffaf1]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92d24)[0x7f9d7b1ffd24]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/libgalera_smm.so(+0x1d1206)[0x7f9d79396206]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/libgalera_smm.so(+0x1d1322)[0x7f9d79396322]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/libgalera_smm.so(+0x1dff33)[0x7f9d793a4f33]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/libgalera_smm.so(+0x1e2a05)[0x7f9d793a7a05]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/libgalera_smm.so(+0x1e95f8)[0x7f9d793ae5f8]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/libgalera_smm.so(+0x1da424)[0x7f9d7939f424]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/libgalera_smm.so(+0x1ca036)[0x7f9d7938f036]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/libgalera_smm.so(+0xdfc2a)[0x7f9d792a4c2a]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/libgalera_smm.so(+0xbfd64)[0x7f9d79284d64]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /usr/lib/libgalera_smm.so(+0xc0649)[0x7f9d79285649]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f9d7b7016db]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f9d7aaff88f]
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: information that should help you find out what is causing the crash.
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: Writing a core file...
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: Working directory at /var/lib/mysql
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: Resource Limits:
      May 26 21:46:04 sql-galera-0 mariadbd[28662]: Fatal signal 11 while backtracing
      May 26 21:46:04 sql-galera-0 systemd[1]: mariadb.service: Main process exited, code=dumped, status=11/SEGV
      May 26 21:46:04 sql-galera-0 systemd[1]: mariadb.service: Failed with result 'core-dump'.
      May 26 21:46:09 sql-galera-0 systemd[1]: mariadb.service: Service hold-off time over, scheduling restart.
      May 26 21:46:09 sql-galera-0 systemd[1]: mariadb.service: Scheduled restart job, restart counter is at 9.
      -- Subject: Automatic restarting of a unit has been scheduled
      -- Defined-By: systemd
      -- Support: http://www.ubuntu.com/support
      --
      -- Automatic restarting of the unit mariadb.service has been scheduled, as the result for
      -- the configured Restart= setting for the unit.
      May 26 21:46:09 sql-galera-0 systemd[1]: Stopped MariaDB 10.6.5 database server.
      -- Subject: Unit mariadb.service has finished shutting down
      -- Defined-By: systemd
      -- Support: http://www.ubuntu.com/support
      --
      -- Unit mariadb.service has finished shutting down.
      May 26 21:46:09 sql-galera-0 systemd[1]: Starting MariaDB 10.6.5 database server...
      -- Subject: Unit mariadb.service has begun start-up
      -- Defined-By: systemd
      -- Support: http://www.ubuntu.com/support
      --
      -- Unit mariadb.service has begun starting up.
      May 26 22:01:09 sql-galera-0 systemd[1]: mariadb.service: Start-pre operation timed out. Terminating.
      May 26 22:16:10 sql-galera-0 systemd[1]: mariadb.service: State 'stop-final-sigterm' timed out. Skipping SIGKILL. Entering failed mode.
      May 26 22:16:10 sql-galera-0 systemd[1]: mariadb.service: Failed with result 'timeout'.
      May 26 22:16:10 sql-galera-0 systemd[1]: Failed to start MariaDB 10.6.5 database server.
      -- Subject: Unit mariadb.service has failed
      -- Defined-By: systemd
      -- Support: http://www.ubuntu.com/support
      --
      -- Unit mariadb.service has failed.
      --
      -- The result is RESULT.
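
      Most of the lines above are warnings about unauthenticated connections, which makes the Galera and crash messages easy to miss. The following is a minimal sketch of a filter that keeps only the WSREP, crash, and systemd lines. The pattern list and the function name `significant_lines` are assumptions made for this log, not part of any MariaDB tooling.

```python
import re

# Hypothetical pattern list, picked by reading this particular log:
# Galera (WSREP) messages, the C++ terminate/what() pair, fatal signals,
# [ERROR] entries, and systemd unit state changes.
INTERESTING = re.compile(
    r"WSREP|\[ERROR\]|terminate called|what\(\)|signal \d+|systemd\["
)


def significant_lines(lines):
    """Return the lines that match one of the patterns, without trailing newlines."""
    return [ln.rstrip("\n") for ln in lines if INTERESTING.search(ln)]


if __name__ == "__main__":
    import sys

    # Usage (assumed): journalctl -u mariadb | python3 filter_log.py
    for line in significant_lines(sys.stdin):
        print(line)
```

      Applied to the log above, this should leave the 14:26:04 peer timeout, the 19:47:53 protocol error, the 21:46:03 crash, and the systemd restart sequence.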
      

      When I tried restarting the server manually, I got the message

      Can't lock aria control file '/var/lib/mysql/aria_log_control' for exclusive use
      

      I noticed that some mysqld instances were still running. After I killed those, the server started again.
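
      The report does not say how the leftover processes were found. As a minimal sketch, assuming a Linux host where `/proc/<pid>/comm` holds each process's command name, the check could look like the following. The function name `find_lingering_servers` and the set of names searched for are illustrative assumptions.

```python
import os

# "mysqld" is the name used in the report; "mariadbd" is the binary name
# that appears in the log lines. Both are included as an assumption.
SERVER_NAMES = {"mysqld", "mariadbd"}


def find_lingering_servers(proc_root="/proc"):
    """Return sorted (pid, name) pairs for server processes found under proc_root."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        comm_path = os.path.join(proc_root, entry, "comm")
        try:
            with open(comm_path) as fh:
                name = fh.read().strip()
        except OSError:
            continue  # the process exited meanwhile or cannot be read
        if name in SERVER_NAMES:
            found.append((int(entry), name))
    return sorted(found)


if __name__ == "__main__":
    for pid, name in find_lingering_servers():
        print(pid, name)
```

      Any PID listed while the service is supposed to be stopped is a candidate for holding the lock on `aria_log_control`. The sketch only lists processes and does not terminate anything.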

      I hope this helps.

      Attachments

        1. 60-galera.cnf
          0.7 kB
          Franz Ehrlich
        2. 50-server.cnf
          4 kB
          Franz Ehrlich


            People

              marko Marko Mäkelä
              fehrlich Franz Ehrlich