Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
6.4.10
-
3 MaxScale nodes behind an ALB, on AWS VMs
Description
A 5-node Galera cluster loses two nodes (NODE03 and NODE04) within a couple of minutes due to OOM events.
The cluster reconfigures and remains healthy with the remaining 3 nodes.
However, MaxScale loses status for ALL nodes, which causes an outage.
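As a rough check of why the three surviving nodes should still form a healthy cluster, the majority rule can be sketched as below. This is a simplified model, not Galera's actual implementation: it assumes every node has the default weight of 1, and `has_quorum` is a hypothetical helper name.

```python
def has_quorum(previous_members: int, surviving_members: int) -> bool:
    """Return True if the survivors are a strict majority of the previous membership.

    Simplifying assumption: every node carries equal weight (weighted
    quorum is not modelled here).
    """
    return 2 * surviving_members > previous_members


if __name__ == "__main__":
    # Both nodes lost at once: 3 of 5 is still a strict majority.
    print(has_quorum(5, 3))
    # Nodes lost one after the other, as in this report: 5 -> 4, then 4 -> 3.
    print(has_quorum(5, 4) and has_quorum(4, 3))
    # Counter-example: an even split (2 of 4) is not a majority.
    print(has_quorum(4, 2))
```

Under this model the cluster keeps quorum in both the simultaneous and the sequential case, which matches the observation that the remaining 3 nodes stayed healthy.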
2023-09-25 14:07:29.827 error : (mon_report_query_error): Failed to execute query on server 'NODE04' ([10.225.27.118]:3306): Lost connection to server during query
2023-09-25 14:08:10.003 notice : (log_state_change): Server changed state: NODE04[10.225.27.118:3306]: slave_down. [Slave, Synced, Running] -> [Down]
2023-09-25 14:09:06.851 error : (985612) (NODE03); (socket_write): Write to Backend DCB 10.225.27.183 in state DCB::State::POLLING failed: 104, Connection reset by peer
2023-09-25 14:09:30.579 error : [galeramon] (post_tick): There are no cluster members
2023-09-25 14:09:30.579 notice : (log_state_change): Server changed state: NODE01[10.225.27.121:3306]: lost_master. [Master, Synced, Running] -> [Running]
2023-09-25 14:09:30.579 notice : (log_state_change): Server changed state: NODE02[10.225.27.156:3306]: lost_slave. [Slave, Synced, Running] -> [Running]
2023-09-25 14:09:30.579 notice : (log_state_change): Server changed state: NODE03[10.225.27.183:3306]: slave_down. [Slave, Synced, Running] -> [Down]
2023-09-25 14:09:30.579 notice : (log_state_change): Server changed state: NODE05[10.225.27.142:3306]: lost_slave. [Slave, Synced, Running] -> [Running]
2023-09-25 14:09:30.579 notice : (log_state_change): Server changed state: NODER02[10.225.27.158:3306]: lost_slave. [Slave, Running] -> [Running]
2023-09-25 14:09:30.579 notice : (log_state_change): Server changed state: NODER03[10.225.27.172:3306]: lost_slave. [Slave, Running] -> [Running]
2023-09-25 14:09:30.579 notice : (log_state_change): Server changed state: NODER04[10.225.27.116:3306]: lost_slave. [Slave, Running] -> [Running]
2023-09-25 14:09:30.594 error : (987213) [readwritesplit] (rwsplit-service); (open_connections): Couldn't find suitable Master from 5 candidates.
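To make the pattern in the log excerpt easier to see, the `log_state_change` lines can be parsed and filtered for servers that stayed `Running` but lost their `Master` or `Slave` role. The following is a minimal sketch: the regular expression is inferred only from the lines quoted in this report, and `StateChange`, `parse_state_changes` and `lost_role_while_running` are hypothetical names, not part of MaxScale.

```python
import re
from typing import Iterable, List, NamedTuple, Tuple


class StateChange(NamedTuple):
    timestamp: str
    server: str
    event: str
    old: Tuple[str, ...]
    new: Tuple[str, ...]


# Pattern inferred from the log lines quoted in this report (an assumption,
# not an official description of the MaxScale log format).
STATE_CHANGE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+notice\s*:\s*"
    r"\(log_state_change\): Server changed state: "
    r"(?P<server>[^\[\s]+)\[(?P<addr>[^\]]+)\]: (?P<event>\w+)\. "
    r"\[(?P<old>[^\]]*)\] -> \[(?P<new>[^\]]*)\]"
)


def _split(states: str) -> Tuple[str, ...]:
    """Split 'Slave, Synced, Running' into ('Slave', 'Synced', 'Running')."""
    return tuple(s.strip() for s in states.split(",") if s.strip())


def parse_state_changes(lines: Iterable[str]) -> List[StateChange]:
    """Return one StateChange per matching line; other lines are skipped."""
    events = []
    for line in lines:
        m = STATE_CHANGE.match(line.strip())
        if m:
            events.append(
                StateChange(m["ts"], m["server"], m["event"],
                            _split(m["old"]), _split(m["new"]))
            )
    return events


def lost_role_while_running(events: Iterable[StateChange]) -> List[str]:
    """Servers that are still Running but no longer Master or Slave."""
    roles = {"Master", "Slave"}
    return [
        e.server for e in events
        if "Running" in e.new and roles & set(e.old) and not roles & set(e.new)
    ]


# Sample lines copied from the excerpt above.
SAMPLE = [
    "2023-09-25 14:08:10.003 notice : (log_state_change): Server changed state: NODE04[10.225.27.118:3306]: slave_down. [Slave, Synced, Running] -> [Down]",
    "2023-09-25 14:09:30.579 error : [galeramon] (post_tick): There are no cluster members",
    "2023-09-25 14:09:30.579 notice : (log_state_change): Server changed state: NODE01[10.225.27.121:3306]: lost_master. [Master, Synced, Running] -> [Running]",
    "2023-09-25 14:09:30.579 notice : (log_state_change): Server changed state: NODE02[10.225.27.156:3306]: lost_slave. [Slave, Synced, Running] -> [Running]",
]

if __name__ == "__main__":
    for name in lost_role_while_running(parse_state_changes(SAMPLE)):
        print(name)
```

Run against the full excerpt, a filter like this would list the healthy nodes that were stripped of their roles at 14:09:30.579, which is the behaviour this report describes.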