[MDEV-9036] slave with row replication loses some data Created: 2015-10-28  Updated: 2015-12-25  Resolved: 2015-12-25

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.0.21
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Alex Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: need_feedback
Environment:

CentOS 6.7



 Description   

Hello,
Lately I've noticed strange slave behavior with row-based replication:

The master server receives around a few thousand queries per second (INSERT DELAYED) and a few hundred qps related to the problematic table.
Since the master is only an entry point, all of its tables use the BLACKHOLE engine. Replication is row-based.
The specific (problematic) table has a primary key on the ID field and may receive multiple inserts with the same ID.
I don't care about duplicates as long as an entry with that ID already exists.
So in the end the slave either adds a new row with a new ID or fails on a duplicate key. I've excluded error 1062 in the config and filtered it from the error log, and it mostly works fine.

The problem is that under certain conditions (as yet unknown to me) the slave sometimes misses new entries. It can receive a dozen "to be duplicated" entries, then a genuinely fresh one, and not apply it.

I can see that the new entry is present in the binary log received from the master, but it is then missing from both the table and the slave's log. I have no idea what happens inside.
The problem is unrelated to parallel replication (I tried with and without it).
Everything works fine with statement replication; however, it's slower with parallel replication under high load, and the slave starts lagging...

There are a few other tables that receive far more entries per second, but every entry written to those tables is new.
There are no problems with those. The problem occurs only with mixed inserts, where an insert may either fail because of a duplicate entry or succeed.

I have around 200-300k new entries per hour and only around 100 entries fail to be inserted.

There are no errors and nothing else I can think of. I couldn't find any dependency on the number or order of the "will fail on duplicate" and "will get inserted" entries.

So for now it's running in statement mode and everything is clean. The slave catches up here and there and everything is quite stable, but it won't work at higher rates.

Engine - MyISAM

Previously that table was MyISAM on the master, so the master dropped all duplicate entries and forwarded only real insert candidates to the slave.
That way it worked fine. Now it's BLACKHOLE and the slave has to deal with every insert...

Please let me know how I can assist you further. I just don't know how to debug the internal processing of row events more deeply.

Thanks!
Alex



 Comments   
Comment by Elena Stepanova [ 2015-11-26 ]

alex_accelerationdb,
Sorry for the delay.

Could you please provide SHOW CREATE TABLE from both master and slave, cnf files from both master and slave, and an example of the binary log which contains the events which were not replicated?

You can either upload the binary log as is, or parse it with mysqlbinlog, but in the latter case please use mysqlbinlog --verbose --base64-output=DECODE-ROWS.
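
For narrowing down which rows were lost, a minimal sketch of how the decoded output could be compared against the slave table might look like the following. It is only an illustration, not an official tool: it assumes the primary key is the first column (@1) and is an integer, that the decoded text uses the "### INSERT INTO" / "###   @N=value" pseudo-SQL lines that mysqlbinlog --verbose prints, and that the set of IDs currently on the slave has been fetched separately. The function names and the sample schema/table names are hypothetical.

```python
import re

# Pseudo-SQL header printed by mysqlbinlog --verbose for each inserted row,
# e.g. "### INSERT INTO `db`.`tbl`".
INSERT_RE = re.compile(r"^### INSERT INTO `([^`]+)`\.`([^`]+)`")
# First column of the row image, e.g. "###   @1=12345".
FIRST_COL_RE = re.compile(r"^###\s+@1=(\S+)")


def inserted_ids(decoded_lines, table):
    """Yield the @1 value of every row inserted into `table`.

    Assumption: the primary key is the first column and is an integer.
    """
    in_target = False
    for line in decoded_lines:
        m = INSERT_RE.match(line)
        if m:
            in_target = (m.group(2) == table)
            continue
        # Any other row event ends the current insert block.
        if line.startswith("### UPDATE") or line.startswith("### DELETE"):
            in_target = False
            continue
        if in_target:
            m = FIRST_COL_RE.match(line)
            if m:
                yield int(m.group(1))


def missing_on_slave(decoded_lines, table, slave_ids):
    """Return IDs present in the binlog inserts but absent from slave_ids."""
    return sorted(set(inserted_ids(decoded_lines, table)) - set(slave_ids))


# Hypothetical decoded fragment: two inserts of ID 101 (the second would
# fail on the duplicate key), one insert into another table, and a fresh 102.
sample = """\
### INSERT INTO `app`.`events`
### SET
###   @1=101
###   @2='a'
### INSERT INTO `app`.`other`
### SET
###   @1=999
### INSERT INTO `app`.`events`
### SET
###   @1=101
###   @2='a'
### INSERT INTO `app`.`events`
### SET
###   @1=102
###   @2='b'
""".splitlines()
```

Given the IDs actually present in the slave table, missing_on_slave(sample, "events", slave_ids) lists the inserts that appear in the binary log but never reached the table, which are the events worth including in the uploaded example.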

Thanks.

Comment by Alex [ 2015-11-26 ]

Hi Elena,
I need to prepare everything, since it's currently running in statement mode and everything is good. It might take a few days until I get to it.

Where can I upload the data for your review? Some things could be sensitive, so I prefer to keep it private.

Comment by Elena Stepanova [ 2015-11-26 ]

You can upload the data to ftp.askmonty.org/private, this way only MariaDB developers will have access to it.

Comment by Elena Stepanova [ 2015-12-25 ]

Please comment to re-open when/if you upload the data.
