[MDEV-23195] MariaDB 10.4.12 crash after Streaming replication applied
Created: 2020-07-16  Updated: 2020-08-31  Resolved: 2020-08-31

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4.12
Fix Version/s: N/A

Type: Bug
Priority: Critical
Reporter: Xesh
Assignee: Jan Lindström (Inactive)
Resolution: Incomplete
Votes: 0
Labels: crash, need_feedback
Environment:

MariaDB 10.4.12 26.4.3(r4535)



 Description   

After enabling Streaming Replication with set session wsrep_trx_fragment_unit='row'; and wsrep_trx_fragment_size=5000; the database cluster split. The nodes went into NON-PRIMARY (wsrep_local_state_comment Inconsistent, wsrep_cluster_status Disconnected, wsrep_ready OFF), and only one server kept working.
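
The three status values quoted above can be checked together on each node. The following is a minimal sketch, assuming the values have already been read (for example with SHOW GLOBAL STATUS LIKE 'wsrep_%') into a dict. The function name node_is_usable is hypothetical and is not part of MariaDB or Galera, and treating only "Synced" as healthy is a deliberately strict assumption.

```python
def node_is_usable(status):
    """Return True when a Galera node reports a healthy, writable state.

    `status` maps wsrep status variable names to their string values.
    Assumption: only Primary / ON / Synced counts as usable.
    """
    return (
        status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_ready") == "ON"
        and status.get("wsrep_local_state_comment") == "Synced"
    )


# Values reported in this issue for the failed nodes.
failed_node = {
    "wsrep_cluster_status": "Disconnected",
    "wsrep_ready": "OFF",
    "wsrep_local_state_comment": "Inconsistent",
}

# Hypothetical values for a node that is still serving traffic.
healthy_node = {
    "wsrep_cluster_status": "Primary",
    "wsrep_ready": "ON",
    "wsrep_local_state_comment": "Synced",
}
```

With the values from this report, node_is_usable(failed_node) is false on every count, so a monitoring check of this shape would have flagged the nodes as soon as they left the primary component.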

2020-07-16 14:44:04 66 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table beta.ast; Duplicate entry '269-8014285897-William-300000.0000-1900-01-01' for key 'clcode_2', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 8119, Internal MariaDB error code: 1062
2020-07-16 14:44:04 66 [Warning] WSREP: Event 3 Write_rows_v1 apply failed: 121, seqno 1116471573
2020-07-16 14:44:04 66 [ERROR] WSREP: Failed to apply write set: gtid: 3564d059-6c13-11ea-b4de-56dcde7e0e7f:1116471573 server_id: a5a00a0a-c74c-11ea-804a-af822776a2c5 client_id: 3174408 trx_id: 38093106 flags: 3 (start_transaction | commit)
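
When triaging a failure like this, it helps to pull the identifying fields out of the "Failed to apply write set" line. The following is a sketch only: the regular expression is written against the single line shown in this report, and the names FAILED_WS and parse_failed_write_set are hypothetical. Other server versions may format the line differently.

```python
import re

# Pattern derived from the one log line in this report (an assumption,
# not a documented log format).
FAILED_WS = re.compile(
    r"Failed to apply write set: "
    r"gtid: (?P<uuid>[0-9a-f-]+):(?P<seqno>-?\d+) "
    r"server_id: (?P<server_id>[0-9a-f-]+) "
    r"client_id: (?P<client_id>\d+) "
    r"trx_id: (?P<trx_id>\d+) "
    r"flags: (?P<flags>\d+)"
)


def parse_failed_write_set(line):
    """Return the write set fields as a dict, or None if the line does not match."""
    m = FAILED_WS.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("seqno", "client_id", "trx_id", "flags"):
        fields[key] = int(fields[key])
    return fields


line = (
    "2020-07-16 14:44:04 66 [ERROR] WSREP: Failed to apply write set: "
    "gtid: 3564d059-6c13-11ea-b4de-56dcde7e0e7f:1116471573 "
    "server_id: a5a00a0a-c74c-11ea-804a-af822776a2c5 "
    "client_id: 3174408 trx_id: 38093106 "
    "flags: 3 (start_transaction | commit)"
)
info = parse_failed_write_set(line)
```

Here info["seqno"] matches the seqno in the preceding warning, and the first eight characters of info["server_id"] (a5a00a0a) match one of the entries in the partitioned list of the view printed later in this log.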

2020-07-16 14:44:04 66 [Note] WSREP: Closing send monitor...
2020-07-16 14:44:04 66 [Note] WSREP: Closed send monitor.
2020-07-16 14:44:04 66 [Note] WSREP: gcomm: terminating thread
2020-07-16 14:44:04 66 [Note] WSREP: gcomm: joining thread
2020-07-16 14:44:04 66 [Note] WSREP: gcomm: closing backend
2020-07-16 14:44:04 66 [Note] WSREP: view(view_id(NON_PRIM,5a34cd0c,36) memb {
6dce32ed,2
} joined {
} left {
} partitioned {
5a34cd0c,2 80ea6916,2 8fa2479b,1 a5a00a0a,1 b688a527,1 c5632dbd,1
})
2020-07-16 14:44:04 66 [Note] WSREP: PC protocol downgrade 1 -> 0
2020-07-16 14:44:04 66 [Note] WSREP: view((empty))
2020-07-16 14:44:04 66 [Note] WSREP: gcomm: closed
2020-07-16 14:44:04 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2020-07-16 14:44:04 0 [Note] WSREP: Flow-control interval: [1980, 2000]
2020-07-16 14:44:04 0 [Note] WSREP: Received NON-PRIMARY.
2020-07-16 14:44:04 0 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 1116471575)
2020-07-16 14:44:04 0 [Note] WSREP: New SELF-LEAVE.
2020-07-16 14:44:04 0 [Warning] WSREP: Failed to report last committed 3564d059-6c13-11ea-b4de-56dcde7e0e7f:1116471571, -77 (File descriptor in bad state)
2020-07-16 14:44:04 0 [Note] WSREP: Flow-control interval: [0, 0]
2020-07-16 14:44:04 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2020-07-16 14:44:04 108 [Note] WSREP: ####### processing CC -1, local, ordered
2020-07-16 14:44:04 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: -1)
2020-07-16 14:44:04 0 [Note] WSREP: RECV thread exiting 0: Success
2020-07-16 14:44:04 66 [Note] WSREP: recv_thread() joined.
2020-07-16 14:44:04 66 [Note] WSREP: Closing replication queue.
2020-07-16 14:44:04 66 [Note] WSREP: Closing slave action queue.
2020-07-16 14:44:04 108 [Note] WSREP: ####### My UUID: 6dce32ed-c74c-11ea-946d-0759685dbed4
2020-07-16 14:44:04 108 [Note] WSREP: ####### ST not required
2020-07-16 14:44:04 108 [Note] WSREP: ================================================
View:
id: 3564d059-6c13-11ea-b4de-56dcde7e0e7f:-1
status: non-primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: no
own_index: 0
members(1):
0: 6dce32ed-c74c-11ea-946d-0759685dbed4, Production-2A
=================================================
2020-07-16 14:44:04 108 [Note] WSREP: Non-primary view
2020-07-16 14:44:04 108 [Note] WSREP: Server status change synced -> connected
2020-07-16 14:44:04 108 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2020-07-16 14:44:04 108 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2020-07-16 14:44:04 108 [Note] WSREP: ####### processing CC -1, local, ordered
2020-07-16 14:44:04 108 [Note] WSREP: ####### My UUID: 6dce32ed-c74c-11ea-946d-0759685dbed4
2020-07-16 14:44:04 108 [Note] WSREP: ####### ST not required
2020-07-16 14:44:04 108 [Note] WSREP: ================================================
View:
id: 3564d059-6c13-11ea-b4de-56dcde7e0e7f:-1
status: non-primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: yes
own_index: -1
members(0):
=================================================
2020-07-16 14:44:04 108 [Note] WSREP: Non-primary view
2020-07-16 14:44:04 108 [Note] WSREP: Server status change connected -> disconnected
2020-07-16 14:44:04 108 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2020-07-16 14:44:04 108 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
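
The view(view_id(NON_PRIM,...)) entry in the log above lists which nodes this server could still see and which were partitioned away. The following is a minimal sketch for extracting those lists; the names VIEW and parse_view are hypothetical, the pattern is written against the single entry in this report, and the number after each comma is dropped without interpretation.

```python
import re

# Pattern derived from the one view entry in this report (an assumption).
# Whitespace and line breaks between the braces are tolerated.
VIEW = re.compile(
    r"view\(view_id\((?P<type>\w+),(?P<repr>\w+),(?P<seq>\d+)\)\s*"
    r"memb\s*\{(?P<memb>.*?)\}\s*"
    r"joined\s*\{(?P<joined>.*?)\}\s*"
    r"left\s*\{(?P<left>.*?)\}\s*"
    r"partitioned\s*\{(?P<partitioned>.*?)\}",
    re.DOTALL,
)


def parse_view(text):
    """Return the view type, sequence number and node id lists, or None."""
    m = VIEW.search(text)
    if m is None:
        return None
    view = {"type": m.group("type"), "seq": int(m.group("seq"))}
    for name in ("memb", "joined", "left", "partitioned"):
        # Each entry looks like "6dce32ed,2"; keep only the short node id.
        view[name] = [entry.split(",")[0] for entry in m.group(name).split()]
    return view


text = (
    "WSREP: view(view_id(NON_PRIM,5a34cd0c,36) memb { 6dce32ed,2 } "
    "joined { } left { } partitioned { 5a34cd0c,2 80ea6916,2 8fa2479b,1 "
    "a5a00a0a,1 b688a527,1 c5632dbd,1 })"
)
view = parse_view(text)
```

For this entry, view["memb"] holds one node and view["partitioned"] holds six, so this server saw itself alone out of seven nodes, which is consistent with the NON_PRIM view type and the "Received NON-PRIMARY" message that follows it.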



 Comments   
Comment by Jan Lindström (Inactive) [ 2020-07-27 ]

Can you repeat this assertion? If you can, please provide steps to reproduce it. The current information only shows that one of the applier nodes does not have the expected row.

Comment by Xesh [ 2020-08-03 ]

No, I can't, because it is a production environment. I had to remove streaming replication.
