Details
Description
This happened during a usage spike, ooms occurred on the underlying galera nodes during this time as well.
maxscale 1 lost sight to all db nodes
2023-09-25 13:19:18.614 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:19:32.314 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:23:00.946 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:23:16.987 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server at 'handshake: reading initial communication packet', system error: 110
|
2023-09-25 13:23:45.717 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:23:56.736 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:24:11.889 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:24:20.371 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:24:30.386 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:26:20.262 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:26:31.049 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:27:03.119 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:27:22.378 error : (mxs_mysql_real_connect): Failed to set latin1 character set: Lost connection to server during query
|
2023-09-25 13:27:41.054 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:28:14.823 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:28:47.975 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:28:58.986 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB03' ([10.225.27.183]:3306): Lost connection to server during query
|
2023-09-25 13:29:09.953 error : (mxs_mysql_real_connect): Failed to set latin1 character set: Lost connection to server during query
|
2023-09-25 13:32:34.385 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB04' ([10.225.27.118]:3306): Lost connection to server during query
|
2023-09-25 13:41:44.306 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB04' ([10.225.27.118]:3306): Lost connection to server during query
|
2023-09-25 13:42:05.403 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB04' ([10.225.27.118]:3306): Lost connection to server during query
|
2023-09-25 13:42:19.746 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB04' ([10.225.27.118]:3306): Lost connection to server during query
|
2023-09-25 13:42:30.649 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB04' ([10.225.27.118]:3306): Lost connection to server during query
|
2023-09-25 13:42:44.061 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB04' ([10.225.27.118]:3306): Lost connection to server during query
|
2023-09-25 13:43:09.442 error : (mxs_mysql_real_connect): Failed to set latin1 character set: Lost connection to server during query
|
2023-09-25 13:43:18.010 error : (mxs_mysql_real_connect): Failed to set latin1 character set: Lost connection to server during query
|
2023-09-25 13:43:18.027 notice : (log_state_change): Server changed state: L1LPVERGENTDB04[10.225.27.118:3306]: slave_down. [Slave, Synced, Running] -> [Down]
|
2023-09-25 13:43:18.027 error : (983789) [readwritesplit] (rwsplit-service); (handleError): Server 'L1LPVERGENTDB04' was lost in the middle of a resultset, cannot continue the session: #HY000: Lost connection to backend server: connection closed by peer (L1LPVERGENTDB04: Generated event)
|
2023-09-25 13:43:22.377 notice : (log_state_change): Server changed state: L1LPVERGENTDB04[10.225.27.118:3306]: slave_up. [Down] -> [Slave, Synced, Running]
|
2023-09-25 14:07:29.827 error : (mon_report_query_error): Failed to execute query on server 'L1LPVERGENTDB04' ([10.225.27.118]:3306): Lost connection to server during query
|
2023-09-25 14:08:10.003 notice : (log_state_change): Server changed state: L1LPVERGENTDB04[10.225.27.118:3306]: slave_down. [Slave, Synced, Running] -> [Down]
|
2023-09-25 14:09:06.851 error : (985612) (L1LPVERGENTDB03); (socket_write): Write to Backend DCB 10.225.27.183 in state DCB::State::POLLING failed: 104, Connection reset by peer
|
2023-09-25 14:09:30.579 error : [galeramon] (post_tick): There are no cluster members
|
Maxscales 2 and 3 became unresponsive and hit a signal 6
2023-09-25 13:24:34.499 notice : (log_state_change): Server changed state: L1LPVERGENTDB03[10.225.27.183:3306]: new_slave. [Running] -> [Slave, Synced, Running]
|
2023-09-25 13:25:29.198 warning: (notify_systemd_watchdog): Thread 'Worker-01' has not reported back in 30 seconds.
|
2023-09-25 13:25:59.198 warning: (notify_systemd_watchdog): Thread 'Worker-01' has not reported back in 30 seconds.
|
alert : MaxScale 6.4.10 received fatal signal 6. Commit ID: e21972b152407ccc41a5c835c91774578c64c472 System name: Linux Release string: PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
|
|
|
2023-09-25 13:25:59.268 alert : (sigfatal_handler): MaxScale 6.4.10 received fatal signal 6. Commit ID: e21972b152407ccc41a5c835c91774578c64c472 System name: Linux Release string: PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
|
2023-09-25 13:25:59.268 alert : (sigfatal_handler): Statement currently being classified: none/unknown
|
2023-09-25 13:25:59.268 notice : (sigfatal_handler): For a more detailed stacktrace, install GDB and add 'debug=gdb-stacktrace' under the [maxscale] section.
|
/lib/x86_64-linux-gnu/libc.so.6(epoll_wait+0x56) [0x7f0e0895dd56]: <binutils not installed>
|
/usr/lib/x86_64-linux-gnu/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker15poll_waiteventsEv+0x104) [0x7f0e09412b34]: <binutils not installed>
|
/usr/lib/x86_64-linux-gnu/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker3runEPNS_9SemaphoreE+0x4f) [0x7f0e09412e9f]: <binutils not installed>
|
/usr/bin/maxscale(main+0x1c87) [0x55bb71fb9fe7]: <binutils not installed>
|
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7f0e08885d0a]: <binutils not installed>
|
/usr/bin/maxscale(_start+0x2a) [0x55bb71fba5ea]: <binutils not installed>
|
alert : Writing core dump.
|
|
|
MariaDB MaxScale /var/log/maxscale/maxscale.log Mon Sep 25 14:26:07 2023
|
----------------------------------------------------------------------------
|
2023-09-25 14:26:07 notice : (WatchdogNotifier): The systemd watchdog is Enabled. Internal timeout = 30s
|
We are trying to determine why maxscale would become unresponsive after issues on 1 maxscale node and OOMs on the monitored DBs