MDEV-36290: ALTER TABLE with multi-master can cause data loss

Details

    • Type: Bug
    • Status: In Progress
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Versions: 10.5, 10.6
    • Component: Replication
    • Labels: None

    Description

      --binlog_row_metadata=full stores column names and other metadata in the binary log.
      However, this metadata is not used by the replication threads, which can cause data loss with multi-master if one adds or drops fields with ALTER TABLE when binlog_format=row or binlog_format=statement is used.

      ALTER TABLE will work if columns are added at the end of the table. However, deleting a column, or adding a column in the middle of the table, will cause all future DML to update the wrong fields.

      The fix is that if --binlog_row_metadata=full is used, we should not use field positions but instead use field names for inserts and updates when applying row events.
      mariadb-binlog should also be updated to show field names instead of field positions when field names are available.

      I will add a simple test case that one can use to verify the issue.
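
      A minimal sketch of such a test (this is only a sketch of the simplest case, where the slave adds a column in the middle of the table; it is not necessarily identical to the attached skr.test):

      --source include/have_innodb.inc
      --source include/have_binlog_format_row.inc
      --source include/master-slave.inc

      CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=InnoDB;
      INSERT INTO t1 VALUES (1, 10);

      --sync_slave_with_master

      # Add a column in the middle of the table on the slave only
      connection slave;
      ALTER TABLE t1 ADD COLUMN extra INT AFTER a;

      connection master;
      # On the master this sets b=20, but the slave applies the row event by
      # column position, so the value ends up in `extra` instead of `b`
      UPDATE t1 SET b = 20 WHERE a = 1;

      --sync_slave_with_master
      SELECT * FROM t1;

      connection master;
      --source include/rpl_end.inc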

      We should probably also add an option --slave_type_conversions=IGNORE_MISSING_COLUMNS to allow one to drop columns with ALTER TABLE.
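
      For reference, slave_type_conversions is an existing dynamic variable; the IGNORE_MISSING_COLUMNS value is only a proposal made here and is not implemented, so the statement below is purely hypothetical:

      # Hypothetical: IGNORE_MISSING_COLUMNS is only proposed in this issue and
      # does not exist yet; ALL_NON_LOSSY is an existing value
      SET GLOBAL slave_type_conversions = 'ALL_NON_LOSSY,IGNORE_MISSING_COLUMNS';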

      Attachments

        1. skr.opt
          0.1 kB
        2. skr.test
          0.5 kB

        Issue Links

          Activity

            bnestere Brandon Nesterenko added a comment - edited

            monty Enums are also broken. That is, if the slave has modified its table's ENUM to have a different value ordering than the master, the enum values are picked based on index rather than value. Modifying your skr.test to instead test enums shows this:

            --source include/have_innodb.inc
            --source include/have_binlog_format_row.inc
            --source include/master-slave.inc
             
            CREATE TABLE t1 (a INT PRIMARY KEY, en ENUM('a','b','c')) ENGINE = innodb;
            INSERT INTO t1 VALUES(100,'a'), (200,'b');
             
            --sync_slave_with_master
             
            --connection slave
            alter table t1 modify column en enum ('a','d', 'b', 'c');
             
            connection master;
             
            INSERT INTO t1 VALUES(300, 'b');
             
            --sync_slave_with_master
            select * from t1;
             
            connection master;
             
            update t1 set en='b' where A=100;
             
            --sync_slave_with_master
            select * from t1;
             
            connection master;
            --source include/rpl_end.inc
            

            The slave's result (rows with a=100 and a=300 should both actually have value b, but show d because of the re-arrangement):

            a	en
            100	d
            200	b
            300	d
            

            Note too that binlog_row_metadata=FULL also prints the master's ENUM definition, so we can use it to look up the desired enum value by name.

            #250313 14:49:07 server id 1  end_log_pos 944 CRC32 0x2167fc1b 	Table_map: `test`.`t1` mapped to number 33
            # Columns(`a` INT NOT NULL,
            #         `en` ENUM('a','b','c') CHARSET latin1 COLLATE latin1_swedish_ci)
            # Primary Key(a)
            


            bnestere Brandon Nesterenko added a comment

            serg, elenst:

            I discussed this with Monty, and here is the summary of our chat:

            The use case is limited to multi-master, with a distributed dataset where different masters update the same table but work on distinct ranges of data. So if one node runs an ALTER TABLE which re-arranges the column order so it is inconsistent with its master, then when that node applies insert/update row events from its master that target the altered table, there will indeed be data loss.

            This should be fixed in 10.6, as we have many users/customers running this topology on 10.6, and the fix should be relatively easy with low risk. That is, to populate the fields of table->record[0] during unpacking using column names rather than column indices. The patch applied to 10.6 should be relatively small; any complex edge cases (e.g. enums) should just report errors in 10.6, with fixes to be handled in newer versions. The patch in 10.6 can leverage ideas from the ONLINE ALTER work, and then in 11.4+ the code can instead leverage the actual ONLINE ALTER code.

            As for the risk: this work will break functionality if a slave renames columns relative to its master. Currently, that use case works as long as the column stays at the same index. With this patch, we will not be able to find the column in the slave-side table to write to. Monty says this is not a practical case, though, and that users don't/wouldn't actually do this.
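
            A minimal sketch of that rename use case, assuming strict name-based matching with no positional fallback (table and names are illustrative only):

            --source include/have_innodb.inc
            --source include/have_binlog_format_row.inc
            --source include/master-slave.inc

            CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=InnoDB;

            --sync_slave_with_master

            # Rename a column on the slave, keeping it at the same position
            connection slave;
            ALTER TABLE t1 CHANGE COLUMN b b_renamed INT;

            connection master;
            # Today this row event is applied by column position and works;
            # with strict name-based matching the slave would no longer find
            # a column named `b` to write to
            INSERT INTO t1 VALUES (1, 10);

            --sync_slave_with_master
            SELECT * FROM t1;

            connection master;
            --source include/rpl_end.inc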


            People

              Assignee: bnestere Brandon Nesterenko
              Reporter: monty Michael Widenius
              Votes: 0
              Watchers: 6

              Dates

                Created:
                Updated:
