Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Incomplete
Affects Version/s: 10.4.20, 10.5.11
Environment: Ubuntu 20.04.2 LTS, Dedicated hosts per node
Description
We have been seeing Galera nodes crash within a few minutes of each other, with days between incidents.
We run 2 clusters of 3 nodes each, one on 10.5.11 and the other on 10.4.20. From the logs, both clusters appear to crash for the same reason:
Oct 16 19:34:41 db1-core mysqld[3629505]: terminate called after throwing an instance of 'boost::wrapexcept<std::system_error>'
Oct 16 19:34:41 db1-core mysqld[3629505]: what(): remote_endpoint: Transport endpoint is not connected
Oct 16 19:34:41 db1-core mysqld[3629505]: 211016 19:34:41 [ERROR] mysqld got signal 6 ;
It appears that when one node crashes, there is a high chance that a second node will crash with the same error a few minutes later, leaving the cluster in need of a bootstrap. At other times, only one node crashes; it restarts automatically and rejoins the cluster 5-10 minutes later. Incidents are typically days apart.
I've attached logs from both clusters and a stack trace from the 10.5.11 node.
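For anyone wanting to check how closely the crashes cluster across nodes, here is a rough Python sketch that scans syslog lines for the "mysqld got signal 6" abort shown above and groups crashes that fall within a short window of each other. The function names, the 10-minute window, and the explicit `year` argument are assumptions made for illustration (syslog timestamps carry no year), and month parsing assumes an English locale.

```python
import re
from datetime import datetime, timedelta

# Matches the abort line from the logs above, e.g.
# "Oct 16 19:34:41 db1-core mysqld[3629505]: 211016 19:34:41 [ERROR] mysqld got signal 6 ;"
CRASH_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"mysqld\[\d+\]: .*mysqld got signal 6"
)


def find_crashes(lines, year):
    """Return sorted (timestamp, host) pairs, one per 'signal 6' line."""
    crashes = []
    for line in lines:
        m = CRASH_RE.match(line)
        if m:
            # Syslog omits the year, so the caller supplies it (assumption).
            ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
            crashes.append((ts, m.group("host")))
    return sorted(crashes)


def group_incidents(crashes, window=timedelta(minutes=10)):
    """Group crashes whose gap to the previous crash is at most `window`."""
    incidents = []
    for ts, host in crashes:
        if incidents and ts - incidents[-1][-1][0] <= window:
            incidents[-1].append((ts, host))
        else:
            incidents.append([(ts, host)])
    return incidents
```

Feeding the combined logs of all three nodes through `find_crashes` and then `group_incidents` gives one list per incident; an incident containing two different hosts corresponds to the case where the cluster needed a bootstrap.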
Issue Links
relates to: MDEV-25068 Node crashes with Transport endpoint is not connected mysqld got signal 6 ; (Closed)