[MDEV-10960] Replication broken since migrating to MariaDB 10.1 Created: 2016-10-05  Updated: 2016-10-25  Resolved: 2016-10-06

Status: Closed
Project: MariaDB Server
Component/s: Galera, Replication
Affects Version/s: 10.1.18
Fix Version/s: 10.1.19

Type: Bug
Priority: Major
Reporter: Robin Anil
Assignee: Unassigned
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by MDEV-10944 GALERA log-slave-updates REGRESSION F... Closed

 Description   

Since migrating to 10.1.18, we have been unable to get replication working. The root cause is either that mysqldump is not exporting all the rows or that row-based replication is broken.

We create a mysqldump with --master-data and --single-transaction, copy it to the slave, set the correct master position, and restart.
We also tried using innobackupex to create a snapshot, copying it to the slave, and letting it start.
What we are seeing is that as soon as replication starts, it fails with errors like this:

: Could not execute Update_rows_v1 event on table tock_prod.payment; Can't find record in 'payment', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mariadb-bin.000615, end_log_pos 4326763
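
The fields in this error line (event type, table, error code, master log file, end position) are the ones needed to locate the failing event in the binlog. A minimal sketch of extracting them, assuming the message has exactly the shape quoted above; `parse_slave_error` is a hypothetical helper, not part of any MariaDB tool, and other server versions may word the message differently:

```python
import re

# Pattern modeled on the single error line quoted in this report.
# Assumption: the wording and field order match that line.
ERROR_RE = re.compile(
    r"Could not execute (?P<event>\w+) event on table "
    r"(?P<schema>[^.\s]+)\.(?P<table>[^;\s]+);"
    r".*?Error_code: (?P<code>\d+);"
    r".*?master log (?P<log_file>\S+), end_log_pos (?P<end_log_pos>\d+)"
)


def parse_slave_error(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    m = ERROR_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    # Convert the numeric fields so they can be compared or passed on as ints.
    info["code"] = int(info["code"])
    info["end_log_pos"] = int(info["end_log_pos"])
    return info
```

The resulting `log_file` and `end_log_pos` values could then be used to narrow a mysqlbinlog run, for example with its `--stop-position` option.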

I inspected mariadb-bin.000615 with mysqlbinlog and found the offending event. It refers to a payment row that is being updated. However, the row with primary key 1027178 (shown below) is not in the mysqldump file, although it is present on the master. Note that the master is one of 3 nodes running in a Galera-based replication cluster.

'/*!*/;
### UPDATE `tock_prod`.`payment`
### WHERE
###   @1=1027178
###   @2=123
###   @3=1546072
###   @4=''
###   @5=0
###   @6='2700'
###   @7='Discover'
###   @8=11800
###   @9=0
###   @10=2016-10-05 20:49:39
###   @11=2016-10-05 20:49:41
###   @12=1
###   @13=NULL
### SET
###   @1=1027178
###   @2=123
###   @3=1546072
###   @4=''
###   @5=0
###   @6='2700'
###   @7='Discover'
###   @8=11800
###   @9=0
###   @10=2016-10-05 20:49:39
###   @11=2016-10-05 20:49:41
###   @12=1
###   @13=372
# at 4326763

We are using ROW-based replication. This used to work correctly when we were on versions 10.0.20 through 10.0.25.



 Comments   
Comment by Robin Anil [ 2016-10-05 ]

However, the logs do not show the corresponding insert statement. Note that the created_at timestamp (@10=2016-10-05 20:49:39) was 2 seconds before the updated timestamp. I can't see the insert in the binlog.
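
A check like this can be scripted against the decoded output of `mysqlbinlog --verbose`. The following is a minimal sketch, assuming the `### INSERT INTO` / `### UPDATE` / `### DELETE FROM` pseudo-SQL layout shown in the excerpt above and that the primary key is column `@1`; `row_events_for_key` is a hypothetical helper name:

```python
import re

# Header line of a decoded row event, e.g. "### UPDATE `tock_prod`.`payment`".
STMT_RE = re.compile(r"^### (INSERT INTO|UPDATE|DELETE FROM) `([^`]+)`\.`([^`]+)`")
# Column line of a decoded row event, e.g. "###   @1=1027178".
COL_RE = re.compile(r"^###\s+@(\d+)=(.*)$")


def row_events_for_key(lines, schema, table, key_value, key_column=1):
    """List the row-event kinds (INSERT, UPDATE, DELETE), in binlog order,
    whose key column equals key_value on the given table."""
    found = []
    current = None   # kind of the row event being read, if on the target table
    matched = False  # True once the current event has already been counted
    for line in lines:
        line = line.rstrip("\n")
        m = STMT_RE.match(line)
        if m:
            kind = m.group(1).split()[0]
            on_target = (m.group(2), m.group(3)) == (schema, table)
            current = kind if on_target else None
            matched = False
            continue
        if current is None or matched:
            continue
        c = COL_RE.match(line)
        if c and int(c.group(1)) == key_column:
            # Drop a trailing type comment, if the output carries one.
            value = c.group(2).split(" /* ")[0].strip()
            if value == str(key_value):
                found.append(current)
                matched = True
    return found
```

For the situation described here, a result containing `UPDATE` but no preceding `INSERT` for key 1027178 across the relevant binlog files would be consistent with the insert never having been written to this node's binlog.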

Comment by Robin Anil [ 2016-10-05 ]

Is it possible that the Galera master replication causes some statements to be swallowed and not written to the binlog when they come in as a state transfer?

Comment by Robin Anil [ 2016-10-05 ]

Ah, that's it: asynchronous slave replication no longer works with a single master. This worked in 10.0 and no longer works in 10.1. I couldn't find this change in any of the release notes. If there was an announcement, please let me know. If this is an accidental bug, then let me know about that as well.

Comment by Robin Anil [ 2016-10-05 ]

We have log_slave_updates=1 set on all the Galera masters, but somehow that is not logging the entries from the other masters.
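
To rule out a plain configuration mistake before suspecting the server, the option file on each node can be checked mechanically. This is a minimal sketch, assuming the setting lives in a `[mysqld]` group of a single option file and that `!include` directives are not involved; `log_slave_updates_enabled` is a hypothetical helper, and the value reported by `SHOW VARIABLES LIKE 'log_slave_updates'` on the running server remains the authoritative one:

```python
def log_slave_updates_enabled(cnf_text, groups=("mysqld",)):
    """Return True if log_slave_updates is switched on in the given groups
    of an option file's text. The last matching line wins."""
    enabled = False
    group = None
    for raw in cnf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and !include directives (not followed here).
        if not line or line[0] in "#;!":
            continue
        if line.startswith("[") and line.endswith("]"):
            group = line[1:-1].strip().lower()
            continue
        if group not in groups:
            continue
        name, sep, value = line.partition("=")
        # Dashes and underscores are interchangeable in option names.
        if name.strip().replace("-", "_").lower() != "log_slave_updates":
            continue
        value = value.split("#", 1)[0].strip().strip("'\"").lower()
        # A bare option name with no value counts as enabled.
        enabled = (not sep) or value in ("1", "on", "true")
    return enabled
```

In this report the option was already set on every node, so a check like this would pass; the missing entries came from the server-side regression rather than from the configuration.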

Comment by Elena Stepanova [ 2016-10-06 ]

If it's a log-slave-updates problem, it should already be fixed in the scope of MDEV-10944, to be released in 10.1.19.

Comment by Robin Anil [ 2016-10-06 ]

Considering this is a major regression, would a 10.1.19 release be pushed any time soon?

Comment by Elena Stepanova [ 2016-10-06 ]

Currently the 10.1.19 release is scheduled for October 27.
nirbhay_c, I don't see the fix pushed in the 10.1 tree; when will it happen?
robinanil, after the fix is pushed into the tree, if you don't want to wait for the official release, you can build from source or use our intermediate binaries.

Comment by Robin Anil [ 2016-10-07 ]

It's only pushed into 10.2, as far as I can see:
https://github.com/MariaDB/server/commit/326a6729ec4475cac3fc509ed9746dfb2288ae12

Comment by Elena Stepanova [ 2016-10-07 ]

It's not 10.2, it's the development tree bb-10.1-nirbhay.

Comment by Nirbhay Choubey (Inactive) [ 2016-10-07 ]

elenst Will merge it to the main branch before the next 10.1 release.
