[MDEV-8915] Replication ignoring column which is missing on slave Created: 2015-10-07 Updated: 2023-06-06 Resolved: 2023-06-06 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, Replication |
| Affects Version/s: | 10.0.21 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Tim Soderstrom | Assignee: | Andrei Elkin |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | columns, galera, replication | ||
| Environment: |
Ubuntu 12.04.3 |
||
| Description |
|
Replication appears to fail silently on a table that exists on the slave but is missing a column that is present on the master. The master is part of a 3-node Galera setup. The column was created during a rolling schema update, which makes me wonder whether that is the reason the column never made it to the slave. Row-based replication is of course being used here. The relay log on the slave includes the column when examined with 'mysqlbinlog -v -v'. The column is the last column of the table. The table is getting updated on the slave, minus the column. So it looks like the slave is silently ignoring the missing column from the replication stream but writing the rest of the data? slave_skip_errors is empty and slave_sql_verify_checksum is enabled. |
| Comments |
| Comment by Elena Stepanova [ 2015-10-17 ] |
|
The description of the problem is quite unclear: e.g. you say that replication fails, but silently, and then you say that (some?) columns are updated; or, first you say that a column is missing on the slave, but then that it's present as the last column... Could you please provide a specific example of what appears to be the problem? E.g. SHOW CREATE TABLE on the master and slave, the event that is produced by the master, and what you see on the slave. Please also attach config files from the master and slave. Thanks. |
| Comment by Tim Soderstrom [ 2015-10-19 ] |
|
Valid points. Here is the table we are using: CREATE TABLE `test1` ( We have a 3-node Galera cluster with one of the nodes replicating to a conventional MariaDB slave (node1 in our case). The MariaDB slave is what was silently failing. By silently failing, I mean that the added column did not exist on the slave, but table updates were still being processed by the slave. Looking at the binary log, I confirmed that data for the missing column was in there. Replication should have failed due to the missing column on the slave, but instead MariaDB was ignoring that column and inserting the others into the table. The sequence of events that caused us to discover this was: 1. We performed a Rolling Schema Update on the Galera cluster, with one of the nodes serving as a replication master to a regular MariaDB slave. We were able to fix this by manually adding the column and then using pt-table-sync to sync the missing data. As we have corrected this, we do not have an immediate way to reproduce the bug at the moment. We should have more time to do this later this year if need be. For now, our workaround is to run pt-table-checksum and compare the results from time to time. |
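The situation described above (a slave table whose columns are a leading subset of the master's) can be detected with a simple column-list comparison. Row-based replication in MySQL and MariaDB is, as far as the documentation on differing table definitions goes, tolerant of a slave table with fewer columns when the extra master columns come last, which would be consistent with the behavior reported here. The following is a minimal sketch only: `diff_columns`, `SchemaDiff`, and the sample column names are hypothetical, and the real `test1` definition is not shown in this report. It assumes the column lists were already fetched in table order, for example from `information_schema.COLUMNS` ordered by `ORDINAL_POSITION`.

```python
from typing import List, NamedTuple


class SchemaDiff(NamedTuple):
    missing_on_slave: List[str]  # columns present only on the master
    extra_on_slave: List[str]    # columns present only on the slave
    slave_is_prefix: bool        # slave columns equal the leading master columns


def diff_columns(master_cols: List[str], slave_cols: List[str]) -> SchemaDiff:
    """Compare ordered column lists from a master and a slave table."""
    missing = [c for c in master_cols if c not in slave_cols]
    extra = [c for c in slave_cols if c not in master_cols]
    # True when the slave has fewer columns and they match the master's
    # leading columns in order, i.e. only trailing columns are missing.
    is_prefix = (
        len(slave_cols) < len(master_cols)
        and master_cols[: len(slave_cols)] == slave_cols
    )
    return SchemaDiff(missing, extra, is_prefix)


# Hypothetical column lists, for illustration only.
master = ["id", "name", "added_col"]
slave = ["id", "name"]

result = diff_columns(master, slave)
if result.slave_is_prefix:
    print("Slave is missing trailing column(s):", result.missing_on_slave)
elif result.missing_on_slave or result.extra_on_slave:
    print("Schemas differ:", result)
else:
    print("Column lists match")
```

A check like this only compares column names and order, so it complements rather than replaces a data comparison with pt-table-checksum.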
| Comment by Elena Stepanova [ 2019-04-06 ] |
|
Elkin, |
| Comment by Jan Lindström [ 2023-06-06 ] |
|
10.0 and 10.1 are EOL. |