[MDEV-31617] Galera Cluster could not recover since 2023-07-01 23:55:01 3287386 [Warning] WSREP: gcs_caused() returned -107 (Transport endpoint is not connected) Created: 2023-07-04 Updated: 2023-09-04 Resolved: 2023-09-04 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera |
| Affects Version/s: | 10.5.9 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Min-Jen Chang | Assignee: | Jan Lindström |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | corruption, galera | ||
| Description |
|
Our Galera Cluster consists of 3 nodes.

2023-07-01 23:54:52 0 [Note] WSREP: (63d23c5c-b67b, 'tcp://0.0.0.0:4567') connection to peer 45ebb9d4-a748 with addr tcp://172.24.151.92:4567 timed out, no messages seen in PT3S, socket stats: rtt: 1473 rttvar: 2527 rto: 204000 lost: 0 last_data_recv: 3344 cwnd: 6 last_queued_since: 500032641 last_delivered_since: 3342827068 send_queue_length: 0 send_queue_bytes: 0 segment: 0 messages: 0

But after this message, node-1 and node-2 both started showing this message:

This message kept repeating, and node-1 and node-2 both triggered a status change, from

Then they turned into a Non-primary view:

Since this issue occurred, our Galera Cluster has been inaccessible, because each node's local_state had turned into Initialization. After comparing wsrep_last_committed on node-1 and node-2, we selected node-1 as the node to re-bootstrap (SET WSREP_PROVIDER_OPTIONS = "pc.bootstrap = 1;"), node-1 turned into Primary, and

Was there any reason or trigger that would cause this message:

Did we hit a bug? Thank you. |
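The comparison step described above (choosing the node with the highest wsrep_last_committed before re-bootstrapping) can be sketched roughly as follows. This is only an illustration: the helper name `pick_bootstrap_node` and the sequence numbers are hypothetical, since the report does not state the actual values, and in practice each value would be read from the node's own wsrep_last_committed status.

```python
from typing import Dict


def pick_bootstrap_node(last_committed: Dict[str, int]) -> str:
    """Return the name of the node with the highest wsrep_last_committed.

    Hypothetical helper for illustration; it is not part of MariaDB or Galera.
    """
    if not last_committed:
        raise ValueError("no nodes to compare")
    # The node with the highest committed seqno holds the most recent data.
    # Iterating over sorted names makes ties resolve to the first name
    # alphabetically, so the result is deterministic.
    return max(sorted(last_committed), key=lambda name: last_committed[name])


# Hypothetical values: the report does not give the real sequence numbers.
seqnos = {"node-1": 1500, "node-2": 1498}
print(pick_bootstrap_node(seqnos))
```

If two nodes report the same value, either is an equally valid candidate from the point of view of committed data; the sketch simply picks one deterministically.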
| Comments |
| Comment by Jan Lindström [ 2023-08-07 ] |
|
mjchangk Can you please try with a more recent version of MariaDB Server and the Galera library? If your problem reproduces, please provide the full error log, the output of SHOW PROCESSLIST, and the node configuration. |