MDEV-36290: ALTER TABLE with multi-master can cause data loss

Details

    • Type: Bug
    • Status: In Progress
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Versions: 10.5, 10.6
    • Component: Replication
    • Labels: None

    Description

      --binlog_row_metadata=full stores column names and other metadata in the binary log.
      However, this metadata is not used by the replication threads, which can cause data loss with multi-master if one adds or drops fields with ALTER TABLE when binlog_format=row or binlog_format=statement is used.

      ALTER TABLE will work if columns are added at the end of the table. However, deleting a column, or adding a column in the middle of the table, will cause all future DML to update the wrong fields.

      The fix is that if --binlog_row_metadata=full is used, we should not use field positions but instead use field names for inserts and updates when applying row events.
      mariadb-binlog should also be updated to show field names instead of field positions when field names are available.

      I will add a simple test case that one can use to verify the issue.
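
      A minimal sketch of such a test (this is only a sketch of the simplest case, where the slave adds a column in the middle of the table; it is not necessarily identical to the attached skr.test):

      --source include/have_innodb.inc
      --source include/have_binlog_format_row.inc
      --source include/master-slave.inc

      CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=InnoDB;
      INSERT INTO t1 VALUES (1, 10);

      --sync_slave_with_master

      # Add a column in the middle of the table on the slave only
      connection slave;
      ALTER TABLE t1 ADD COLUMN extra INT AFTER a;

      connection master;
      # On the master this sets b=20, but the slave applies the row event by
      # column position, so the value ends up in `extra` instead of `b`
      UPDATE t1 SET b = 20 WHERE a = 1;

      --sync_slave_with_master
      SELECT * FROM t1;

      connection master;
      --source include/rpl_end.inc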

      We should probably also add an option --slave_type_conversions=IGNORE_MISSING_COLUMNS to allow one to drop columns with ALTER TABLE.
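
      For reference, slave_type_conversions is an existing dynamic variable; the IGNORE_MISSING_COLUMNS value is only a proposal made here and is not implemented, so the statement below is purely hypothetical:

      # Hypothetical: IGNORE_MISSING_COLUMNS is only proposed in this issue and
      # does not exist yet; ALL_NON_LOSSY is an existing value
      SET GLOBAL slave_type_conversions = 'ALL_NON_LOSSY,IGNORE_MISSING_COLUMNS';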

      Attachments

        1. skr.opt
          0.1 kB
        2. skr.test
          0.5 kB

        Issue Links

          Activity

            bnestere Brandon Nesterenko added a comment - edited

            monty Enums are also broken. That is, if the slave has modified its table's ENUM to have a different value ordering than the master, the enum values are picked based on index rather than value. Modifying your skr.test to instead test enums shows this:

            --source include/have_innodb.inc
            --source include/have_binlog_format_row.inc
            --source include/master-slave.inc
             
            CREATE TABLE t1 (a INT PRIMARY KEY, en ENUM('a','b','c')) ENGINE = innodb;
            INSERT INTO t1 VALUES(100,'a'), (200,'b');
             
            --sync_slave_with_master
             
            --connection slave
            alter table t1 modify column en enum ('a','d', 'b', 'c');
             
            connection master;
             
            INSERT INTO t1 VALUES(300, 'b');
             
            --sync_slave_with_master
            select * from t1;
             
            connection master;
             
            update t1 set en='b' where A=100;
             
            --sync_slave_with_master
            select * from t1;
             
            connection master;
            --source include/rpl_end.inc
            

            The slave's result (rows with a=100 and a=300 should both actually have value b, but show d because of the re-arrangement):

            a	en
            100	d
            200	b
            300	d
            

            Note too that binlog_row_metadata=FULL also prints the master's ENUM definition, so we can use it to look up the desired enum value by name.

            #250313 14:49:07 server id 1  end_log_pos 944 CRC32 0x2167fc1b 	Table_map: `test`.`t1` mapped to number 33
            # Columns(`a` INT NOT NULL,
            #         `en` ENUM('a','b','c') CHARSET latin1 COLLATE latin1_swedish_ci)
            # Primary Key(a)
            


            bnestere Brandon Nesterenko added a comment

            serg, elenst:

            I discussed this with Monty, and here is the summary of our chat:

            The use case is limited to multi-master, with a distributed dataset where different masters update the same table but work on distinct ranges of data. So if one node runs an ALTER TABLE which re-arranges the column order so it is inconsistent with its master, then when that node applies insert/update row events from its master that target the altered table, there will indeed be data loss.

            This should be fixed in 10.6, as we have many users/customers running this topology on 10.6, and the fix should be relatively easy with low risk. That is, to populate the fields of table->record[0] during unpacking using column names rather than column indices. The patch applied to 10.6 should be relatively small; any complex edge cases (e.g. enums) should just report errors in 10.6, with fixes to be handled in newer versions. The patch in 10.6 can leverage ideas from the ONLINE ALTER work, and then in 11.4+ the code can instead leverage the actual ONLINE ALTER code.

            As for the risk: this work will break functionality if a slave renames columns relative to its master. Currently, that use case works as long as the column stays at the same index. With this patch, we will not be able to find the column in the slave-side table to write to. Monty says this is not a practical case, though, and that users don't/wouldn't actually do this.
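
            A minimal sketch of that rename use case, assuming strict name-based matching with no positional fallback (table and names are illustrative only):

            --source include/have_innodb.inc
            --source include/have_binlog_format_row.inc
            --source include/master-slave.inc

            CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=InnoDB;

            --sync_slave_with_master

            # Rename a column on the slave, keeping it at the same position
            connection slave;
            ALTER TABLE t1 CHANGE COLUMN b b_renamed INT;

            connection master;
            # Today this row event is applied by column position and works;
            # with strict name-based matching the slave would no longer find
            # a column named `b` to write to
            INSERT INTO t1 VALUES (1, 10);

            --sync_slave_with_master
            SELECT * FROM t1;

            connection master;
            --source include/rpl_end.inc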


            People

              Assignee: bnestere Brandon Nesterenko
              Reporter: monty Michael Widenius
              Votes: 0
              Watchers: 6

              Dates

                Created:
                Updated:
