[MDEV-4255] MIXED replication: Unique key updates are not marked unsafe, replication fails Created: 2013-03-08  Updated: 2016-12-02

Status: Stalled
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.0.1, 5.5.29, 5.5, 10.0
Fix Version/s: 10.1

Type: Bug Priority: Minor
Reporter: Elena Stepanova Assignee: Sergei Golubchik
Resolution: Unresolved Votes: 0
Labels: upstream-wontfix

Issue Links:
Relates

 Description   

The problem is reproducible on MySQL 5.5, 5.6 and filed as http://bugs.mysql.com/bug.php?id=68609

An update such as UPDATE t1 SET pk = 5, where pk is a unique key, is not marked as an unsafe statement, so with binlog-format=MIXED it is logged in the statement format. If the update attempts to modify more than one row on a non-transactional table, it will be interrupted due to the duplicate key error and written to the binary log with an error code. However, since the update does not contain an ORDER BY condition, it might happen that it ends up updating a different row on master and slave, which will cause data inconsistency, and eventually a replication failure.

There are various ways to achieve this; the test case in 'How to repeat' section simulates a real-life scenario: it executes some SQL on a server, then initializes a slave from a dump and starts replication. This way we get the same data on master and slave, but written in different order.

Below is a partial output:

connection master;
CREATE TABLE t1 (pk INT PRIMARY KEY) ENGINE=MyISAM;
INSERT INTO t1 VALUES (3),(2),(1);
#-------------------------------------------------
# Prepare a dump for initializing a slave
# (run mysqldump --order-by-primary --master-data test)
#-------------------------------------------------
connection slave;
#-------------------------------------------------
# Restore the dump on the slave
#-------------------------------------------------
START SLAVE;
include/wait_for_slave_to_start.inc
connection master;
#-------------------------------------------------
# Since the rows can get updated in an arbitrary order
# and stop in the middle due to the duplicate key error, 
# the statement is actually unsafe, but is not recognized
# as such and is logged in statement format
#-------------------------------------------------
UPDATE t1 SET pk = 5;
ERROR 23000: Duplicate entry '5' for key 'PRIMARY'
SHOW BINLOG EVENTS;
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
...
master-bin.000001	510	Query	1	589	BEGIN
master-bin.000001	589	Query	1	683	use `test`; UPDATE t1 SET pk = 5
master-bin.000001	683	Query	1	763	COMMIT
#-------------------------------------------------
# We have different data on master and slave
#-------------------------------------------------
SELECT * FROM t1;
pk
1
2
5
connection slave;
SELECT * FROM t1;
pk
2
3
5
#----------------------------------
# Now it's easy to break replication completely
#----------------------------------
connection master;
DELETE FROM t1 LIMIT 3;
 
...
 
=== SHOW SLAVE STATUS ===
---- 1. ----
...
Last_Errno	1032
Last_Error	Could not execute Delete_rows event on table test.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; 
...

--source include/master-slave.inc
--source include/have_binlog_format_mixed.inc
 
--connection slave
STOP SLAVE;
--source include/wait_for_slave_to_stop.inc
 
--enable_connect_log
 
--connection master
CREATE TABLE t1 (pk INT PRIMARY KEY) ENGINE=MyISAM;
INSERT INTO t1 VALUES (3),(2),(1);
 
--echo #-------------------------------------------------
--echo # Prepare a dump for initializing a slave
--echo # (run mysqldump --order-by-primary --master-data test)
--echo #-------------------------------------------------
 
--exec $MYSQL_DUMP --order-by-primary --master-data test > $MYSQLTEST_VARDIR/tmp/d.dump
 
--connection slave
--echo #-------------------------------------------------
--echo # Restore the dump on the slave
--echo #-------------------------------------------------
 
--exec $MYSQL_SLAVE test < $MYSQLTEST_VARDIR/tmp/d.dump
 
START SLAVE;
--disable_connect_log
--source include/wait_for_slave_to_start.inc
--enable_connect_log
 
--connection master
 
--echo #-------------------------------------------------
--echo # Since the rows can get updated in an arbitrary order
--echo # and stop in the middle due to the duplicate key error, 
--echo # the statement is actually unsafe, but is not recognized
--echo # as such and is logged in statement format
--echo #-------------------------------------------------
 
--error ER_DUP_ENTRY
UPDATE t1 SET pk = 5;
SHOW BINLOG EVENTS;
 
--echo #-------------------------------------------------
--echo # We have different data on master and slave
--echo #-------------------------------------------------
 
SELECT * FROM t1;
--sync_slave_with_master
SELECT * FROM t1;
 
--echo #----------------------------------
--echo # Now it's easy to break replication completely
--echo #----------------------------------
 
--connection master
DELETE FROM t1 LIMIT 3;
SHOW BINLOG EVENTS;
 
--sync_slave_with_master



 Comments   
Comment by Elena Stepanova [ 2014-11-09 ]

Raised priority not to emphasize its importance, but to get some motion on this bug. The upstream bug was closed as "won't fix", with the helpful suggestion to use InnoDB and RBR instead of MyISAM and SBR. So, there is no point waiting for the upstream fix any longer.

If we also want to leave it as is, please close the report. Otherwise (if we want to fix it some day), please kick it back to minor.

Comment by Sergei Golubchik [ 2014-11-15 ]

Ok, so the condition is, as far as I can see, "SBR, non-transactional table, update of a UNIQUE KEY columns"

Comment by Elena Stepanova [ 2014-11-15 ]

Well, the initial bug was about MBR (rather than SBR), because with SBR all we can do is issue the warning which is currently missing; it won't protect replication though.
With MBR it is more important, because if the statement was properly recognized as unsafe, binlog format would have switched to row-based, and replication would continue just fine.

But technically yes, it's about SBR.

Comment by Sergei Golubchik [ 2014-11-15 ]

knielsen, could you please review this patch?

I don't particularly like that it explicitly invokes thd->decide_logging_format() . But there's no way around it, first time logging format is decided in lock_tables(), basically when tables are just opened. At that point we don't know what fields will be updated yet.

Comment by Kristian Nielsen [ 2014-11-18 ]

Review done:

https://lists.launchpad.net/maria-developers/msg07906.html

Comment by Sergei Golubchik [ 2014-11-18 ]

It introduces too many warnings for previously silent queries. We cannot do that in 5.5 or 10.0 anymore.

Generated at Thu Feb 08 06:55:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.