[MDEV-30723] Failure during galera_certification_ccc test Created: 2023-02-24  Updated: 2023-04-04  Resolved: 2023-04-04

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.3
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Julius Goryavsky Assignee: Seppo Jaakola
Resolution: Won't Fix Votes: 0
Labels: None


 Description   

Failure on the second node (node 2):

2023-02-24  4:23:09 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2023-02-24  4:23:09 0 [Note] WSREP: wsrep_sst_grab()
2023-02-24  4:23:09 0 [Note] WSREP: Start replication
2023-02-24  4:23:09 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2023-02-24  4:23:09 0 [Note] WSREP: protonet asio version 0
2023-02-24  4:23:09 0 [Note] WSREP: Using CRC-32C for message checksums.
2023-02-24  4:23:09 0 [Note] WSREP: backend: asio
2023-02-24  4:23:09 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 
2023-02-24  4:23:09 0 [Note] WSREP: access file(/dev/shm/var/1/mysqld.2/data//gvwstate.dat) failed(No such file or directory)
2023-02-24  4:23:09 0 [Note] WSREP: restore pc from disk failed
2023-02-24  4:23:09 0 [Note] WSREP: GMCast version 0
2023-02-24  4:23:09 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') listening at tcp://0.0.0.0:16006
2023-02-24  4:23:09 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') multicast: , ttl: 1
2023-02-24  4:23:09 0 [Note] WSREP: EVS version 0
2023-02-24  4:23:09 0 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '127.0.0.1:16003'
2023-02-24  4:23:09 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') connection established to f15f8aaa tcp://127.0.0.1:16003
2023-02-24  4:23:09 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') turning message relay requesting on, nonlive peers: 
2023-02-24  4:23:09 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') connection established to f1653212 tcp://127.0.0.1:16009
2023-02-24  4:23:13 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') turning message relay requesting off
2023-02-24  4:23:20 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
2023-02-24  4:23:20 0 [Note] WSREP: view(view_id(NON_PRIM,f15f8aaa,1) memb {
	f15f8aaa,0
} joined {
} left {
} partitioned {
})
2023-02-24  4:23:21 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') turning message relay requesting on, nonlive peers: tcp://127.0.0.1:16003 tcp://127.0.0.1:16009 
2023-02-24  4:23:22 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') reconnecting to f15f8aaa (tcp://127.0.0.1:16003), attempt 0
2023-02-24  4:23:22 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') reconnecting to f1653212 (tcp://127.0.0.1:16009), attempt 0
2023-02-24  4:23:22 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') connection established to f15f8aaa tcp://127.0.0.1:16003
2023-02-24  4:23:22 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') connection established to f1653212 tcp://127.0.0.1:16009
2023-02-24  4:23:23 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') reconnecting to f15f8aaa (tcp://127.0.0.1:16003), attempt 0
2023-02-24  4:23:23 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') reconnecting to f1653212 (tcp://127.0.0.1:16009), attempt 0
2023-02-24  4:23:23 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') connection established to f15f8aaa tcp://127.0.0.1:16003
2023-02-24  4:23:23 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') connection established to f1653212 tcp://127.0.0.1:16009
2023-02-24  4:23:27 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16006') turning message relay requesting off
2023-02-24  4:23:30 0 [Note] WSREP: evs::proto(f15f8aaa, GATHER, view_id(REG,f15f8aaa,1)) suspecting node: f15f8aaa
2023-02-24  4:23:30 0 [Note] WSREP: evs::proto(f15f8aaa, GATHER, view_id(REG,f15f8aaa,1)) suspected node without join message, declaring inactive
2023-02-24  4:23:30 0 [Note] WSREP: evs::proto(f15f8aaa, GATHER, view_id(REG,f15f8aaa,1)) suspecting node: f1653212
2023-02-24  4:23:30 0 [Note] WSREP: evs::proto(f15f8aaa, GATHER, view_id(REG,f15f8aaa,1)) suspected node without join message, declaring inactive
2023-02-24  4:23:35 0 [Warning] WSREP: evs::proto(f15f8aaa, GATHER, view_id(REG,f15f8aaa,1)) install timer expired
evs::proto(evs::proto(f15f8aaa, GATHER, view_id(REG,f15f8aaa,1)), GATHER) {
current_view=view(view_id(REG,f15f8aaa,1) memb {
	f15f8aaa,0
} joined {
} left {
} partitioned {
}),
input_map=evs::input_map: {aru_seq=0,safe_seq=0,node_index=node: {idx=0,range=[1,0],safe_seq=0} },
fifo_seq=39,
last_sent=0,
known:
f15f8aaa at tcp://127.0.0.1:16003
{o=0,s=1,i=0,fs=65,}
f15f8aaa at 
{o=1,s=0,i=0,fs=-1,jm=
{v=0,t=4,ut=255,o=1,s=0,sr=-1,as=0,f=0,src=f15f8aaa,srcvid=view_id(REG,f15f8aaa,1),insvid=view_id(UNKNOWN,00000000,0),ru=00000000,r=[-1,-1],fs=39,nl=(
	f15f8aaa, {o=0,s=1,e=0,ls=-1,vid=view_id(REG,00000000,0),ss=-1,ir=[-1,-1],}
	f15f8aaa, {o=1,s=0,e=0,ls=-1,vid=view_id(REG,f15f8aaa,1),ss=0,ir=[1,0],}
	f1653212, {o=0,s=1,e=0,ls=-1,vid=view_id(REG,00000000,0),ss=-1,ir=[-1,-1],}
)
},
}
f1653212 at tcp://127.0.0.1:16009
{o=0,s=1,i=0,fs=53,}
 }
2023-02-24  4:23:35 0 [Note] WSREP: no install message received
2023-02-24  4:23:39 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
	 at /tmp/workspace/Galera/Build-Galera-Package/galera-src/gcomm/src/pc.cpp:connect():160
2023-02-24  4:23:39 0 [ERROR] WSREP: /tmp/workspace/Galera/Build-Galera-Package/galera-src/gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open backend connection: -110 (Connection timed out)
2023-02-24  4:23:39 0 [ERROR] WSREP: /tmp/workspace/Galera/Build-Galera-Package/galera-src/gcs/src/gcs.cpp:gcs_open():1476: Failed to open channel 'my_wsrep_cluster' at 'gcomm://127.0.0.1:16003': -110 (Connection timed out)
2023-02-24  4:23:39 0 [ERROR] WSREP: gcs connect failed: Connection timed out
2023-02-24  4:23:39 0 [ERROR] WSREP: wsrep::connect(gcomm://127.0.0.1:16003) failed: 7
2023-02-24  4:23:39 0 [ERROR] Aborting

At the same time, everything is in order on the first node:

WSREP_SST: [INFO] rsync SST started on donor (20230224 04:23:20.865)
2023-02-24  4:23:20 0 [Note] WSREP: Flushing tables for SST...
2023-02-24  4:23:20 0 [Note] WSREP: Provider paused at f15fadd7-b3fa-11ed-b44e-b71ced44d604:0 (5)
2023-02-24  4:23:20 0 [Note] InnoDB: Stopping purge
2023-02-24  4:23:20 0 [Note] WSREP: Tables flushed.
2023-02-24  4:23:20 0 [Note] WSREP: Donor state reached
WSREP_SST: [INFO] Tables flushed (20230224 04:23:21.144)
WSREP_SST: [INFO] Transfer of normal directories done (20230224 04:23:21.149)
WSREP_SST: [INFO] Transfer of InnoDB data files done (20230224 04:23:21.191)
WSREP_SST: [INFO] Transfer of InnoDB log files done (20230224 04:23:21.228)
WSREP_SST: [INFO] Transfer of InnoDB undo logs done (20230224 04:23:21.232)
WSREP_SST: [INFO] Transfer of Aria logs done (20230224 04:23:21.237)
WSREP_SST: [INFO] Transfer of data done (20230224 04:23:21.263)
WSREP_SST: [INFO] Sending continue to donor (20230224 04:23:21.265)
2023-02-24  4:23:21 0 [Note] InnoDB: Resuming purge
2023-02-24  4:23:21 0 [Note] WSREP: resuming provider at 5
2023-02-24  4:23:21 0 [Note] WSREP: Provider resumed.
2023-02-24  4:23:21 0 [Note] WSREP: 0.0 (ip-172-31-46-246): State transfer to 1.0 (ip-172-31-46-246) complete.
2023-02-24  4:23:21 0 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 0)
2023-02-24  4:23:21 0 [Note] WSREP: Member 0.0 (ip-172-31-46-246) synced with group.
2023-02-24  4:23:21 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2023-02-24  4:23:21 2 [Note] WSREP: Synchronized with group, ready for connections
2023-02-24  4:23:21 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
WSREP_SST: [INFO] rsync SST completed on donor (20230224 04:23:21.270)
2023-02-24  4:23:22 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16003') connection established to f15f8aaa tcp://127.0.0.1:16006
2023-02-24  4:23:22 0 [Warning] WSREP: discarding established (time wait) f15f8aaa (tcp://127.0.0.1:16006) 
2023-02-24  4:23:22 0 [Note] WSREP: 1.0 (ip-172-31-46-246): State transfer from 0.0 (ip-172-31-46-246) complete.
2023-02-24  4:23:22 0 [Note] WSREP: Member 1.0 (ip-172-31-46-246) synced with group.
2023-02-24  4:23:23 0 [Note] WSREP:  cleaning up f15f8aaa (tcp://127.0.0.1:16006)
2023-02-24  4:23:23 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16003') connection established to f15f8aaa tcp://127.0.0.1:16006
2023-02-24  4:23:23 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16003') turning message relay requesting on, nonlive peers: 
2023-02-24  4:23:27 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16003') turning message relay requesting off
2023-02-24  4:23:39 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16003') turning message relay requesting on, nonlive peers: tcp://127.0.0.1:16006 
2023-02-24  4:23:40 0 [Note] /tmp/workspace/Test-Package/bin/mysqld (initiated by: unknown): Normal shutdown
2023-02-24  4:23:40 0 [Note] WSREP: Stop replication
2023-02-24  4:23:40 0 [Note] WSREP: Closing send monitor...
2023-02-24  4:23:40 0 [Note] WSREP: Closed send monitor.
2023-02-24  4:23:40 0 [Note] WSREP: gcomm: terminating thread
2023-02-24  4:23:40 0 [Note] WSREP: gcomm: joining thread
2023-02-24  4:23:40 0 [Note] WSREP: gcomm: closing backend
2023-02-24  4:23:41 0 [Note] WSREP: (f15f8aaa, 'tcp://0.0.0.0:16003') reconnecting to f15f8aaa (tcp://127.0.0.1:16006), attempt 0



 Comments   
Comment by Jan Lindström [ 2023-04-04 ]

10.3 will EOL soon and as this is just a test failure not critical customer or security bug, we will not fix it.

Generated at Thu Feb 08 10:18:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.