[MDEV-30193] Data drift between source and replica with sync_binlog=0 and OS crash

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None
Environment: MariaDB 10.4.27 running on Debian 10

Description

The goal of this test was to measure the data drift between the source and its replicas when an OS crash happens on a source that is running with sync_binlog=0 and trx_commit=1.
In my test I saw two things:
1> Replication on the replica is broken. After the crash, binlog information is missing on the source because the OS had not flushed it from memory to the file, and the GTID position is rolled back. The replica looks for its latest position, which no longer exists on the source, and fails to replicate. This is expected; I have no issue with this.
2> Data drift. The last transaction inserted on the source by the sysbench insert was rolled back on the source during crash recovery, but it was committed on the replica. So my replica has extra rows that were rolled back on the source.
My questions:
1> I thought data drift meant that the source has more data than a replica?
2> My understanding of the rollback on the source is that crash recovery did not find the commit for the last transaction in the redo logs, so it rolled the transaction back. If the commit was not there, how can the replica have committed that transaction coming from the source?
Also, how can a replica read a transaction that was not fully committed on the source? My assumption is that the transaction uses an internal XA commit: after the first phase, and before the transaction is committed, it is written to the binlog, and once the binlog has that data the replica can read it. I am not sure this assumption is right.
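
The sequence assumed in question 2 can be sketched as a toy model. This is only an illustration of that assumption, not MariaDB's actual code or recovery logic: the `Source` class, its method names, and the transaction ids are hypothetical. The sketch also assumes that trx_commit=1 makes the redo log records durable, that sync_binlog=0 leaves binlog writes in OS memory until the OS flushes them, and that recovery rolls back a prepared transaction when its event is not in the surviving binlog.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Toy model of a source server; all names here are hypothetical."""
    redo_prepared: set = field(default_factory=set)      # prepare records, assumed durable
    redo_committed: set = field(default_factory=set)     # commit records, assumed durable
    binlog_on_disk: list = field(default_factory=list)   # binlog events the OS has flushed
    binlog_in_cache: list = field(default_factory=list)  # binlog events only in OS memory

    def prepare(self, xid):
        # Phase 1 of the assumed internal two-phase commit.
        self.redo_prepared.add(xid)

    def write_binlog(self, xid):
        # Assumed effect of sync_binlog=0: no fsync, so the event stays in OS memory.
        self.binlog_in_cache.append(xid)

    def commit(self, xid):
        # Phase 2: the engine commit.
        self.redo_committed.add(xid)

    def os_flush(self):
        # The OS writes cached binlog data to the file at a time of its choosing.
        self.binlog_on_disk.extend(self.binlog_in_cache)
        self.binlog_in_cache.clear()

    def readable_binlog(self):
        # A reader on a running server sees flushed and cached events alike.
        return self.binlog_on_disk + self.binlog_in_cache

    def os_crash(self):
        # A hard stop loses whatever was only in OS memory.
        self.binlog_in_cache.clear()

    def recover(self):
        # Prepared transactions without a commit record are in doubt: commit those
        # found in the surviving binlog and roll back the rest.
        survivors = set(self.binlog_on_disk)
        for xid in self.redo_prepared - self.redo_committed:
            if xid in survivors:
                self.redo_committed.add(xid)
        self.redo_prepared &= self.redo_committed
        return sorted(self.redo_committed)


def simulate():
    source, replica_rows = Source(), []

    # Transaction 1 completes and its binlog event reaches the disk.
    source.prepare(1)
    source.write_binlog(1)
    source.commit(1)
    source.os_flush()

    # Transaction 2 is prepared and binlogged, but the crash comes before its commit.
    source.prepare(2)
    source.write_binlog(2)

    # The replica applies every event it can read from the running source.
    replica_rows.extend(source.readable_binlog())

    source.os_crash()
    return source.recover(), replica_rows


if __name__ == "__main__":
    source_rows, replica_rows = simulate()
    print("source:", source_rows, "replica:", replica_rows)
```

In this model the source ends up with only transaction 1 while the replica holds transactions 1 and 2, which matches the extra rows I see on the replica. Whether the real server behaves this way is exactly what I am asking.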

From the error log I can see MariaDB doing redo crash recovery first and then binary log crash recovery.
I have not pasted the error log, my table counts from the source and replicas, or the sysbench command I was running. I can provide them if needed.

How did I cause the OS crash?
My DB server runs on a VM. Through vCenter I did a hard stop of the VM, which caused the crash. I tested this scenario multiple times, and every time I saw data drift: more data on the replicas.

Sysbench: I was only running the prepare step, which creates the tables and inserts data into them. It used 4 parallel threads, inserting into 4 tables at a time.

Server: 4 CPUs and 4 GB RAM, around 1.5 GB InnoDB buffer pool, Debian 10.

Let me know if you need any more information.

Regards
Jaya

People

Assignee: Unassigned

Reporter: Jaya Karthik Karri

Votes: 0

Watchers: 2

Dates

Created: 2022-12-10 07:23

Updated: 2022-12-12 12:21
