The customer created a test case to simulate the production hang that occurs when running a Percona backup.
This script should be run on a 3-node Galera cluster without MaxScale. All nodes should be masters. The script does the following:
0) ./dealock.sh <user> <password>
1) create a test database and 2 tables
2) insert rows into both tables from node1
3) run a large SELECT query from node1 and run FLUSH TABLES there
4) run inserts into both tables from node2 and node3 at the same time
5) run FLUSH TABLES from node3
6) run inserts into both tables again from node2 and node3 at the same time
7) stop the server on node1
8) check the processlist on both node2 and node3
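For readers without access to the customer's script, the steps above can be sketched roughly as follows. This is not the customer's script: the host names (node1..node3), the deadlock_test schema, the SLEEP-based stand-in for the large SELECT, and the ssh/systemctl stop command are all assumptions made for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical sketch of the reproduction steps; names below are assumptions.
DB_USER="${1:-root}"
DB_PASS="${2:-changeme}"
NODE1="${NODE1:-node1}"
NODE2="${NODE2:-node2}"
NODE3="${NODE3:-node3}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands instead of executing them

# Run one SQL string on a node ($1 = host, $2 = SQL), or print it in dry-run mode.
run_sql() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'mysql -h %s -u %s -e "%s"\n' "$1" "$DB_USER" "$2"
    else
        mysql -h "$1" -u "$DB_USER" -p"$DB_PASS" -e "$2"
    fi
}

# Steps 4 and 6: insert into both tables from node2 and node3 at the same time.
concurrent_inserts() {
    for node in "$NODE2" "$NODE3"; do
        run_sql "$node" "INSERT INTO deadlock_test.t1 (v) VALUES (1); INSERT INTO deadlock_test.t2 (v) VALUES (1)" &
    done
}

reproduce() {
    # 1) create the test database and two tables
    run_sql "$NODE1" "CREATE DATABASE IF NOT EXISTS deadlock_test"
    for t in t1 t2; do
        run_sql "$NODE1" "CREATE TABLE IF NOT EXISTS deadlock_test.$t (id INT AUTO_INCREMENT PRIMARY KEY, v INT) ENGINE=InnoDB"
    done
    # 2) insert rows into both tables from node1
    run_sql "$NODE1" "INSERT INTO deadlock_test.t1 (v) VALUES (1),(2),(3)"
    run_sql "$NODE1" "INSERT INTO deadlock_test.t2 (v) VALUES (1),(2),(3)"
    # 3) long-running SELECT (SLEEP per row as a stand-in) plus FLUSH TABLES on node1
    run_sql "$NODE1" "SELECT SLEEP(10) FROM deadlock_test.t1" &
    run_sql "$NODE1" "FLUSH TABLES" &
    # 4) concurrent inserts from node2 and node3
    concurrent_inserts
    # 5) FLUSH TABLES from node3
    run_sql "$NODE3" "FLUSH TABLES" &
    # 6) concurrent inserts again
    concurrent_inserts
    # 7) stop the server on node1 (service name is an assumption)
    if [ "$DRY_RUN" = "1" ]; then
        echo "ssh $NODE1 systemctl stop mariadb"
    else
        ssh "$NODE1" systemctl stop mariadb
    fi
    # 8) check the processlist on node2 and node3
    run_sql "$NODE2" "SHOW PROCESSLIST"
    run_sql "$NODE3" "SHOW PROCESSLIST"
    # In dry-run mode, wait so that all printed lines are flushed in one go.
    if [ "$DRY_RUN" = "1" ]; then
        wait
    fi
}

reproduce
```

Saved under a hypothetical name such as repro_sketch.sh, it could be run as `DRY_RUN=0 sh repro_sketch.sh <user> <password>` against a disposable cluster. In that mode the background clients are deliberately not waited for, since they are expected to hang.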
Once the customer's script finishes, both node2 and node3 are stuck, and nothing can be run until all nodes are restarted.
Interestingly, node2 and node3 do not get stuck if node1 is the node with wsrep_local_index=0. The customer wants to understand why stopping the server on the wsrep_local_index=0 node does not cause any hang.
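To see which node currently has wsrep_local_index=0 before a run, the status variable can be read on each node. The sketch below only prints the client commands rather than running them. The node names and the DB_USER default are assumptions, and `-p` makes the client prompt for the password.

```shell
#!/bin/sh
# Sketch: print the command that reads wsrep_local_index on each node.
# Node names and the default user are assumptions.
NODES="${NODES:-node1 node2 node3}"

# $1 = host, $2 = user; -N suppresses the column header in the output.
index_cmd() {
    printf "mysql -h %s -u %s -p -N -e \"SHOW GLOBAL STATUS LIKE 'wsrep_local_index'\"\n" "$1" "$2"
}

for n in $NODES; do
    index_cmd "$n" "${DB_USER:-root}"
done
```

Recording this value for node1 before each attempt makes it easier to confirm whether the hang on node2 and node3 correlates with node1's index.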