Description
If two nodes have the same my_uuid value in their gvwstate.dat files and both try to join the same cluster, you would expect them to detect the conflict and one of them to raise an error. Instead, they get stuck in an endless loop of connection timeouts.
For example, let's say that two nodes have the following in gvwstate.dat:
my_uuid: 5025de8a-15db-11e9-b571-abb3e219b4d0
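Since the nodes do not report the conflict themselves, a duplicate like this could be caught before startup by comparing the my_uuid lines from each node's gvwstate.dat. The following is only a minimal sketch: the function names and the node labels (node1, node2) are hypothetical, and it assumes the file contents have already been collected from each node as text.

```python
import re
from collections import defaultdict

# Matches the "my_uuid: <value>" line shown in the gvwstate.dat excerpt.
MY_UUID_RE = re.compile(r"^my_uuid:\s*(\S+)\s*$", re.MULTILINE)


def read_my_uuid(gvwstate_text):
    """Return the my_uuid value from gvwstate.dat contents, or None if absent."""
    match = MY_UUID_RE.search(gvwstate_text)
    return match.group(1) if match else None


def find_duplicate_uuids(states):
    """Map each my_uuid used by more than one node to the sorted node labels.

    `states` maps a node label (hypothetical, chosen by the caller) to the
    text of that node's gvwstate.dat.
    """
    nodes_by_uuid = defaultdict(list)
    for node, text in states.items():
        uuid = read_my_uuid(text)
        if uuid is not None:
            nodes_by_uuid[uuid].append(node)
    return {u: sorted(n) for u, n in nodes_by_uuid.items() if len(n) > 1}


if __name__ == "__main__":
    # Both hypothetical nodes carry the my_uuid value from this report.
    states = {
        "node1": "my_uuid: 5025de8a-15db-11e9-b571-abb3e219b4d0\n",
        "node2": "my_uuid: 5025de8a-15db-11e9-b571-abb3e219b4d0\n",
    }
    for uuid, nodes in find_duplicate_uuids(states).items():
        print(f"duplicate my_uuid {uuid} on: {', '.join(nodes)}")
```
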
And then the first node starts up:
2019-01-14 11:33:38 140385718540256 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2019-01-14 11:33:38 140385718540256 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2019-01-14 11:33:38 140385718540256 [Note] WSREP: EVS version 0
2019-01-14 11:33:38 140385718540256 [Note] WSREP: gcomm: bootstrapping new group 'my_wsrep_cluster'
2019-01-14 11:33:38 140385718540256 [Note] WSREP: start_prim is enabled, turn off pc_recovery
2019-01-14 11:33:38 140385718540256 [Note] WSREP: Node 5025de8a state prim
2019-01-14 11:33:38 140385718540256 [Note] WSREP: view(view_id(PRIM,5025de8a,6) memb {
5025de8a,0
} joined {
} left {
} partitioned {
})
...
2019-01-14 11:33:45 140385718540256 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.2.14-MariaDB-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
And then the second node starts up:
2019-01-14 11:36:14 140151256176608 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2019-01-14 11:36:14 140151256176608 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2019-01-14 11:36:14 140151256176608 [Note] WSREP: EVS version 0
2019-01-14 11:36:14 140151256176608 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '10.2.220.17:,10.2.220.18:,10.2.220.19:'
2019-01-14 11:36:14 140151256176608 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection established to 5025de8a tcp://10.2.220.19:4567
2019-01-14 11:36:14 140151256176608 [Warning] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') address 'tcp://10.2.220.19:4567' points to own listening address, blacklisting
2019-01-14 11:36:17 140151256176608 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.19:4567 timed out, no messages seen in PT3S
2019-01-14 11:36:17 140151256176608 [Warning] WSREP: no nodes coming from prim view, prim not possible
2019-01-14 11:36:17 140151256176608 [Note] WSREP: view(view_id(NON_PRIM,5025de8a,6) memb {
5025de8a,0
} joined {
} left {
} partitioned {
})
We can see from the output of each node above that both nodes use the identifier 5025de8a, which is the leading segment of the shared my_uuid value.
Instead of raising an error message, the nodes just seem to get stuck in an endless loop of timeouts:
2019-01-14 11:36:17 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:36:17 140124222371584 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50334S), skipping check
2019-01-14 11:36:22 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:36:27 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:36:31 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:36:36 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:36:41 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:36:46 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:36:51 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:36:55 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:00 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:05 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:09 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:14 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:19 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:23 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:27 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:32 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:37 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:41 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:46 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:50 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:37:55 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
2019-01-14 11:38:00 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, no messages seen in PT3S
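Until the nodes raise an error themselves, the condition can be spotted in an error log by looking for timeout messages in which the peer identifier equals the node's own identifier. The sketch below is illustrative only: the regular expression is derived solely from the log lines quoted in this report, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern taken from the timeout lines in this report:
#   WSREP: (<own id>, '<listen addr>') connection to peer <peer id>
#   with addr <peer addr> timed out, ...
TIMEOUT_RE = re.compile(
    r"WSREP: \((?P<own>[0-9a-f]+), '[^']*'\) "
    r"connection to peer (?P<peer>[0-9a-f]+) "
    r"with addr (?P<addr>\S+) timed out"
)


def count_self_id_timeouts(log_lines):
    """Count timeouts per peer address where the peer id equals our own id."""
    counts = Counter()
    for line in log_lines:
        match = TIMEOUT_RE.search(line)
        if match and match.group("own") == match.group("peer"):
            counts[match.group("addr")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2019-01-14 11:36:17 140124222371584 [Note] WSREP: (5025de8a, 'tcp://0.0.0.0:4567') "
        "connection to peer 5025de8a with addr tcp://10.2.220.17:4567 timed out, "
        "no messages seen in PT3S",
        "2019-01-14 11:36:17 140124222371584 [Warning] WSREP: last inactive check "
        "more than PT1.5S ago (PT3.50334S), skipping check",
    ]
    for addr, n in count_self_id_timeouts(sample).items():
        print(f"{n} timeout(s) to {addr} from a peer sharing our identifier")
```

A repeated non-zero count for the same address is a hint, not proof, that another node is running with the same my_uuid.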
This behavior was seen with Galera 25.3.23.
Issue Links
- is blocked by MDEV-20337 Merge galera release_25.3.27 (Closed)