Details
Bug
Status: Closed
Major
Resolution: Cannot Reproduce
10.0.14-galera
None
Description
This may already be fixed, but I thought I'd file it so that it is written down somewhere.
I was running a 3-node cluster on 3 separate VMs.
I had a bash loop
while :; do mysql -e "insert..."; sleep 1; done
I killed 2 of the VMs (but left the one running that my bash loop connected to). The remaining node formed a new non-primary component, as expected. I then ran SET GLOBAL wsrep_cluster_address='gcomm://' to make the remaining node primary. However, the client started by the bash loop got stuck for quite a long time (10+ minutes?):
| 737 | unauthenticated user | connecting host | NULL | Connect | NULL | login | NULL | 0.000 |
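For anyone trying to reproduce this, the promotion step above could be scripted and checked roughly as follows. This is only a sketch: the function names `promote_to_primary` and `is_primary` are made up for illustration, and it assumes the `mysql` client can connect to the surviving node with default credentials. `wsrep_cluster_status` is the Galera status variable that reports `Primary` or `non-Primary`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: bootstrap a new primary component from this node,
# using the same statement as in the report.
promote_to_primary() {
    mysql -e "SET GLOBAL wsrep_cluster_address='gcomm://'"
}

# Hypothetical helper: reads tab-separated SHOW STATUS output on stdin
# (as produced by `mysql -B -N`) and succeeds only if the node reports Primary.
is_primary() {
    awk -F'\t' '$1 == "wsrep_cluster_status" && $2 == "Primary" { found = 1 } END { exit !found }'
}

# Usage (nothing here runs automatically):
#   promote_to_primary
#   mysql -B -N -e "SHOW STATUS LIKE 'wsrep_cluster_status'" | is_primary && echo "node is primary"
```

Note that if the status query itself hangs the way the looped client did, that would be a useful data point for this report.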
I tried to kill the thread, but it stays in "Killed":
| 737 | unauthenticated user | connecting host | NULL | Killed | NULL | login | NULL | 0.000 |
Even killing the mysql process itself that's trying to connect doesn't make this MySQL thread go away.
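To spot threads in this state without reading the process list by eye, something like the following sketch may help. The function name `stuck_logins` is hypothetical, and the column positions assume the nine-column SHOW PROCESSLIST layout shown in the rows above (Id, User, Host, db, Command, Time, State, Info, Progress).

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads tab-separated SHOW PROCESSLIST rows on stdin
# (as produced by `mysql -B -N`) and prints "Id<TAB>Command" for every
# unauthenticated connection whose State is "login".
stuck_logins() {
    awk -F'\t' '$2 == "unauthenticated user" && $7 == "login" { print $1 "\t" $5 }'
}

# Usage (nothing here runs automatically):
#   mysql -B -N -e "SHOW PROCESSLIST" | stuck_logins
```

Running it a few times a minute apart would show whether the same Id stays in "login", and whether its Command is still "Connect" or has changed to "Killed".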
So it looks like there is a race condition that can cause a client to get stuck, perhaps permanently, in "login" while wsrep is changing the node/cluster state in some way.