Details
- Bug
- Status: Closed
- Major
- Resolution: Incomplete
- 10.2.12, 10.3.5
- None
- FreeBSD 11.1, CentOS 7.4
Description
I have just upgraded MariaDB to 10.2.12 and enabled Galera.
When the second node joins, the first (master) node stops working and needs to be restarted to recover.
Here's my error log:
2018-01-17 12:01:33 34426956544 [Note] WSREP: (3153ad69, 'tcp://0.0.0.0:4567') connection established to f61103ed tcp://192.168.62.211:4567
2018-01-17 12:01:33 34426956544 [Note] WSREP: (3153ad69, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2018-01-17 12:01:33 34426956544 [Note] WSREP: (3153ad69, 'tcp://0.0.0.0:4567') connection established to ba2863c0 tcp://192.168.62.201:4567
2018-01-17 12:01:33 34426956544 [Note] WSREP: declaring ba2863c0 at tcp://192.168.62.201:4567 stable
2018-01-17 12:01:33 34426956544 [Note] WSREP: declaring f61103ed at tcp://192.168.62.211:4567 stable
2018-01-17 12:01:33 34426956544 [Warning] WSREP: 3153ad69 conflicting prims: my prim: view_id(PRIM,3153ad69,1) other prim: view_id(PRIM,ba2863c0,16)
2018-01-17 12:01:33 34426956544 [ERROR] WSREP: caught exception in PC, state dump to stderr follows:
pc::Proto{uuid=3153ad69,start_prim=1,npvo=0,ignore_sb=0,ignore_quorum=0,state=1,last_sent_seq=53547,checksum=0,instances=
3153ad69,prim=1,un=0,last_seq=53547,last_prim=view_id(PRIM,3153ad69,1),to_seq=53546,weight=1,segment=0
,state_msgs=
3153ad69,pcmsg{ type=STATE, seq=0, flags= 0, node_map { 3153ad69,prim=1,un=0,last_seq=53547,last_prim=view_id(PRIM,3153ad69,1),to_seq=53546,weight=1,segment=0
}}
,current_view=view(view_id(REG,3153ad69,17) memb {
3153ad69,0
ba2863c0,0
f61103ed,0
} joined {
ba2863c0,0
f61103ed,0
} left {
} partitioned {
}),pc_view=view(view_id(PRIM,3153ad69,1) memb {
3153ad69,0
} joined {
} left {
} partitioned {
}),mtu=32636}
2018-01-17 12:01:33 34426956544 [Note] WSREP: {v=0,t=1,ut=255,o=4,s=0,sr=0,as=-1,f=4,src=ba2863c0,srcvid=view_id(REG,3153ad69,17),insvid=view_id(UNKNOWN,00000000,0),ru=00000000,r=[-1,-1],fs=36492262,nl=(
)
} 64
2018-01-17 12:01:33 34426956544 [ERROR] WSREP: exception caused by message: {v=0,t=3,ut=255,o=1,s=0,sr=-1,as=0,f=4,src=f61103ed,srcvid=view_id(REG,3153ad69,17),insvid=view_id(UNKNOWN,00000000,0),ru=00000000,r=[-1,-1],fs=8,nl=(
)
}
state after handling message: evs::proto(evs::proto(3153ad69, OPERATIONAL, view_id(REG,3153ad69,17)), OPERATIONAL) {
current_view=view(view_id(REG,3153ad69,17) memb {
3153ad69,0
ba2863c0,0
f61103ed,0
} joined {
} left {
} partitioned {
}),
input_map=evs::input_map: {aru_seq=0,safe_seq=0,node_index=node: {idx=0,range=[1,0],safe_seq=0} node: {idx=1,range=[1,0],safe_seq=0} node: {idx=2,range=[1,0],safe_seq=0} },
fifo_seq=56867,
last_sent=0,
known:
3153ad69 at
{o=1,s=0,i=1,fs=-1,}
ba2863c0 at tcp://192.168.62.201:4567
{o=1,s=0,i=1,fs=36492264,}
f61103ed at tcp://192.168.62.211:4567
{o=1,s=0,i=1,fs=8,}
}
2018-01-17 12:01:33 34426956544 [ERROR] WSREP: exception from gcomm, backend must be restarted: 3153ad69 aborting due to conflicting prims: older overrides (FATAL)
at gcomm/src/pc_proto.cpp:handle_state():982
2018-01-17 12:01:33 34426956544 [Note] WSREP: gcomm: terminating thread
2018-01-17 12:01:33 34426956544 [Note] WSREP: gcomm: joining thread
2018-01-17 12:01:33 34426956544 [Note] WSREP: gcomm: closing backend
2018-01-17 12:01:33 34426956544 [Note] WSREP: Forced PC close
2018-01-17 12:01:33 34426956544 [Warning] WSREP: discarding 2 messages from message index
2018-01-17 12:01:33 34426956544 [Note] WSREP: gcomm: closed
2018-01-17 12:01:33 35628642304 [Note] WSREP: Received self-leave message.
2018-01-17 12:01:33 35628642304 [Note] WSREP: comp msg error in core 53
2018-01-17 12:01:33 38097805824 [Warning] WSREP: Send action {0x0, 2338, TORDERED} returned -53 (Software caused connection abort)
2018-01-17 12:01:33 38099474176 [Note] WSREP: applier thread exiting (code:6)
2018-01-17 12:01:33 35628642304 [Note] WSREP: Closing send monitor...
2018-01-17 12:01:33 35628642304 [Note] WSREP: Closed send monitor.
2018-01-17 12:01:33 35628642304 [Note] WSREP: Closing replication queue.
2018-01-17 12:01:33 35628642304 [Note] WSREP: Closing slave action queue.
2018-01-17 12:01:33 38099134208 [Note] WSREP: applier thread exiting (code:6)
2018-01-17 12:01:33 38099472896 [Note] WSREP: applier thread exiting (code:6)
2018-01-17 12:01:33 38099139328 [Note] WSREP: applier thread exiting (code:6)
2018-01-17 12:01:33 38099467776 [Note] WSREP: applier thread exiting (code:6)
2018-01-17 12:01:34 38099475456 [Note] WSREP: applier thread exiting (code:6)
2018-01-17 12:01:34 38083915264 [Note] WSREP: applier thread exiting (code:6)
2018-01-17 12:01:34 38099460096 [Note] WSREP: applier thread exiting (code:6)
2018-01-17 12:01:34 38099654912 [Note] WSREP: applier thread exiting (code:6)
2018-01-17 12:01:34 35628644864 [Note] WSREP: applier thread exiting (code:6)
2018-01-17 12:01:34 38095302656 [Note] WSREP: applier thread exiting (code:6)
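The key line in the log above is the "conflicting prims" warning: node 3153ad69 reports its own primary view as view_id(PRIM,3153ad69,1), while the joining side reports view_id(PRIM,ba2863c0,16). A minimal sketch for pulling those warnings out of a MariaDB error log follows. The function name find_conflicting_prims and the regular expression are illustrative assumptions based only on the line format shown here, not part of Galera or MariaDB.

```python
import re

# Assumed format, taken from the warning line in this report:
#   WSREP: <node> conflicting prims: my prim: view_id(PRIM,<uuid>,<seq>)
#          other prim: view_id(PRIM,<uuid>,<seq>)
CONFLICT_RE = re.compile(
    r"WSREP: (?P<node>[0-9a-f]{8}) conflicting prims: "
    r"my prim: view_id\(PRIM,(?P<my_uuid>[0-9a-f]{8}),(?P<my_seq>\d+)\) "
    r"other prim: view_id\(PRIM,(?P<other_uuid>[0-9a-f]{8}),(?P<other_seq>\d+)\)"
)


def find_conflicting_prims(log_text):
    """Return one dict per 'conflicting prims' warning found in log_text."""
    conflicts = []
    for match in CONFLICT_RE.finditer(log_text):
        info = match.groupdict()
        # Convert the view sequence numbers so they can be compared numerically.
        info["my_seq"] = int(info["my_seq"])
        info["other_seq"] = int(info["other_seq"])
        conflicts.append(info)
    return conflicts


if __name__ == "__main__":
    sample = (
        "2018-01-17 12:01:33 34426956544 [Warning] WSREP: 3153ad69 "
        "conflicting prims: my prim: view_id(PRIM,3153ad69,1) "
        "other prim: view_id(PRIM,ba2863c0,16)\n"
    )
    for conflict in find_conflicting_prims(sample):
        print(
            "node", conflict["node"],
            "has primary view", conflict["my_uuid"], conflict["my_seq"],
            "but peer reports", conflict["other_uuid"], conflict["other_seq"],
        )
```

Running the script against the full error log of each node would show which nodes believe they belong to different primary components. Two distinct primary view UUIDs, as here, may indicate that more than one node was started as a new cluster, but that is an inference from this single log and not something the report confirms.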
Issue Links
- relates to: MDEV-15399 Galera catches exception and terminates, but MariaDB keeps going (Closed)