I have been seeing Galera nodes crash within a few minutes of each other, with days between incidents.
I run two clusters of 3 nodes each, one on 10.5.11 and the other on 10.4.20. From the logs, both clusters seem to be crashing for the same reason:
When one node crashes, there is a high chance that a second node will crash with the same error a few minutes later, leaving the cluster in need of a bootstrap. Other times, only one node crashes, then automatically restarts and rejoins the cluster 5-10 minutes later.
I've attached logs from both clusters and a stack trace from the 10.5.11 node.