[MDEV-33291] Slave SQL: Could not execute Delete_rows_v1/Update_rows_v1 event; Can't find record, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND or HA_ERR_NO_PARTITION_FOUND Created: 2024-01-22 Updated: 2024-02-08 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Replication, Storage Engine - InnoDB |
| Affects Version/s: | 10.6, 10.11, 11.0, 11.1, 11.2, 11.3, 11.4 |
| Fix Version/s: | 10.6, 10.11, 11.0, 11.1, 11.2, 11.3 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Roel Van de Paar | Assignee: | Andrei Elkin |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | affects-tests, regression-10.6 | ||
| Issue Links: |
|
||||||||||||||||||||||||
| Description |
|
While the testcase does not immediately look significant (we're deleting from mysql.innodb_table_stats) note that this is
Leads to (on slave):
Present in 10.6+, opt+dbg builds |
| Comments |
| Comment by Roel Van de Paar [ 2024-01-22 ] | ||||||||||||||||||||||||||||||||
|
MTR Testcase. May be very lightly sporadic.
| ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-01-22 ] | ||||||||||||||||||||||||||||||||
|
Affects tests as the filtering of this issue may mean missing other issues with a similar or identical outcome. | ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-01-24 ] | ||||||||||||||||||||||||||||||||
|
I ran into this again, with different SQL. Attempts to reproduce this version in MTR failed, it reproduces using the pquery client only.
This results in, on the master:
However, the second Start binlog_dump to slave_server line only happens on the DELETE being executed
The retry reconnect looks fine (inc the GTID used), but the Can't find record in 'db' looks incorrect, as there are two entries in mysql.db (on the master and the slave) - the test db's. 1032=ER_KEY_NOT_FOUND. Final MTR attempt (does not reproduce the issue):
| ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-01-30 ] | ||||||||||||||||||||||||||||||||
|
Increased prio on this ticket as this issue significantly and regularly affects replication testing. | ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-01-30 ] | ||||||||||||||||||||||||||||||||
|
Leads to:
Same testcase but with a PK:
Gives instead:
| ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-01-30 ] | ||||||||||||||||||||||||||||||||
|
It is likely that the original issue (inc first comment MTR testcase) is different from the later ones which require RESET MASTER and ROW based replication. | ||||||||||||||||||||||||||||||||
| Comment by Kristian Nielsen [ 2024-02-02 ] | ||||||||||||||||||||||||||||||||
|
1. The issue with mysql.innodb_table_stats happens because that table is updated in the background by InnoDB internals, and the contents not replicated. For some reason I didn't get any data in the table in my quick test. But if at some point it contains different data on master and slave, a row-binlogged update/delete on this table on the master is likely to result in "row not found" on the slave. I think tests that do DML on mysql.* tables can be used to check that the slaves do not crash or deadlock the whole slave server somehow, but the slave failing with error or getting different data must be expected. 2. I checked the code. When RESET MASTER is run while a slave is connected, it looks like the dump thread happily continues with its open filehandle to the old (now deleted) master-bin.000001 file. Then, once sufficient data is written to the new master-bin.000001 to make it longer than the old, the dump thread wakes up, tries to read from its stale file handle, finds EOF, and exits. The slave IO thread gets disconnected, and tries to reconnect with its current binlog position, which is now completely wrong since RESET MASTER was done on master. In (2), I think normally an error like "bogus data in log event" or checksum error will occur when the slave jumps to the invalid position. But if by chance the new position is valid, the slave will skip a bunch of events and can get a "can't find record" error like you saw. It's clear that RESET MASTER will in general require re-configuring or resetting slaves, since it deletes the binlogs on the master and reverts to the starting master-bin.000001 position and GTID 0-1-1. But the current behaviour of leaving the old dump thread reading a stale file handle, and then at some random time later jumping to an invalid position in the new binlog file, seems ... not ideal. I wonder what the correct behaviour should be? Maybe an error should be given if any slaves are connected when RESET MASTER is attempted. (The locking around RESET MASTER is already tricky, and it might be hard to avoid a race if a slave connects at the same time as RESET MASTER is run, but at least any connected slave could cause the RESET MASTER to fail until all slave connections are stopped from the slave servers). | ||||||||||||||||||||||||||||||||
| Comment by Andrei Elkin [ 2024-02-02 ] | ||||||||||||||||||||||||||||||||
|
knielsen, thanks for the analysis. Wrt mysql.innodb_table_stats the issue is more general. Clearly for tables of this kind replication events should be normally ignored on slave (logging alone may have sense). Perhaps a "slave" server option for that needs to be introduced. To follow up your RESET MASTER thinking, another policy would be to prevent the command when there's a connected slave that is a running dump thread that then would mark binlog files (one that is being read and the following ones) busy. bnestere, feel free to comment on. | ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-02-02 ] | ||||||||||||||||||||||||||||||||
|
Thank you for the input and analysis knielsen and Elkin! Any idea what the connection with ROW based replication is? i.e. with MIXED and STATEMENT the RESET MASTER issues do not seem reproducible. Answering my own question after pondering it for a while, I guess due to the offset "mismatch" in the master binlog position. So it would seem not impossible to also see such outcomes with MIXED/STATEMENT setups, though they would be much less frequent (i.e. exact statement start position required?). | ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-02-02 ] | ||||||||||||||||||||||||||||||||
|
Saw one more variation which was not filtered yet. Similar to earlier one, but on Delete_rows_v1 and yielding HA_ERR_END_OF_FILE.
| ||||||||||||||||||||||||||||||||
| Comment by Kristian Nielsen [ 2024-02-03 ] | ||||||||||||||||||||||||||||||||
|
In row-based replication, the slave searches the table for a specific row (compares all fields or at least the primary key), and thus can detect if that row is not present on the slave. In statement-based, there's just a DELETE or UPDATE statement, the statement is run on whatever data is in the table, there's no knowledge of which rows are expected to be there. You would be able to get a duplicate key error in statement-based if the RESET MASTER causes the slave to skip a DELETE, and then later an insert happens with a conflict of the row that was not deleted on the slave. But the root issue here is that the slave diverges from the master. Once that happens, a variety of errors can happen later, it completely depends on the data and query pattern. | ||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2024-02-03 ] | ||||||||||||||||||||||||||||||||
I don't think it's realistic to expect that technical data (statistics, auth data and such) would always be identical on a master and a slave. Even if you completely forbid any direct action on a slave, there can still be legitimate difference. Let's take for example ignore-db and similar parameters, they'll surely lead to some statistics existing on the master but not on the slave. Or different histogram_type will cause EITS differences. Or in case of OM => NS replication system tables on master can have different structure. | ||||||||||||||||||||||||||||||||
| Comment by Brandon Nesterenko [ 2024-02-03 ] | ||||||||||||||||||||||||||||||||
|
To the RESET MASTER problem: Can we kill the active dump threads to force a reconnect (similar to kill_zombie_dump_threads)? And then on reconnect, the binlog dump thread would know if there is an ongoing RESET MASTER because reset_master_pending would be true. So we'd then have a few options, e.g. 1) send some sort of error to the replica and self-kill the binlog dump thread connection, 2) before sending the initial fake rotate event, wait for the RESET MASTER to finish (reset_master_pending would be false) so we'd initialize the replica with the new binlog. And if master-slave consistency is a problem, perhaps we should track if active binlog dump thread connections are "caught up" with the existing binlogs before resetting them. Then we could delay the RESET MASTER until the binlogs have been sent to the replicas, and then RESET MASTER would kill the dump threads. I imagine this would be optional though, perhaps with an extension to the grammar similar to SHUTDOWN, e.g. RESET MASTER [WAIT FOR ALL REPLICAS]. Another thing to note, we can track reset masters using the pattern established in | ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-02-05 ] | ||||||||||||||||||||||||||||||||
|
When not resulting in MDEV-33376, will - sporadically - run into:
This is reproducible with RBR, SBR and MBR. The 'Write_rows_v1' error is seen only when using RBR, otherwise (SBR/MBR) we see:
| ||||||||||||||||||||||||||||||||
| Comment by Andrei Elkin [ 2024-02-05 ] | ||||||||||||||||||||||||||||||||
|
The fact that ROW format is replication-unsafe for the "statistics" system tables can be captured formally in THD::decide_logging_format. | ||||||||||||||||||||||||||||||||
| Comment by Kristian Nielsen [ 2024-02-05 ] | ||||||||||||||||||||||||||||||||
|
Roel, there are two different issues here: 1. Row-based replication, user replicates an update to system tables that differ (here: mysql.innodb_table_stats). This can lead to row not found errors. I think this is "not a bug", as Elena wrote, "user must be aware". 2. RESET MASTER is run on the master while a slave is connected. Again, this is something that cannot really work, RESET MASTER requires stopping and reconfiguring replication. But in this case the way things "do not work" seems poor (the active dump threads remain attached to the deleted binlog file until the new file grows larger than the old one). I think this is a good test find, and I suggest simply giving an error for RESET MASTER with attached slaves is an appropriate fix. But again, once RESET MASTER is run without reconfiguring replication, slaves will be diverged and some random error will occur later; the exact nature of those errors are not meaningful as the root cause is misconfigured replication and master-slave difference. Hope this helps, - Kristian. | ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-02-05 ] | ||||||||||||||||||||||||||||||||
|
knielsen Thank you. Agree on 2nd item. For the first one I hope there can be a better solution in time. I agree the priority is low for that one. As for the various errors observed, while there is indeed a common root cause, the various issues are documented here 1) to show the variety in resulting outcomes - for example the last testcase also reproduces on SBR and MBR, which is significantly different from all the others, and 2) before adding them to a filter list. | ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-02-05 ] | ||||||||||||||||||||||||||||||||
|
Elkin Thank you. Can you clarify a bit further what you have in mind? | ||||||||||||||||||||||||||||||||
| Comment by Andrei Elkin [ 2024-02-06 ] | ||||||||||||||||||||||||||||||||
|
Roel, one way to fix the p.1 is to mirror the STATEMENT format unsafety. That is to refuse to log in ROW format with a new replication unsafe message. MIXED would translate to the actual STMT. | ||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2024-02-06 ] | ||||||||||||||||||||||||||||||||
|
What will then MIXED do if you write delete from mysql.innodb_table_stats limit 1? | ||||||||||||||||||||||||||||||||
| Comment by Andrei Elkin [ 2024-02-06 ] | ||||||||||||||||||||||||||||||||
|
elenst, there won't be any error on slave - and to avoid that is my point of course. | ||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2024-02-06 ] | ||||||||||||||||||||||||||||||||
|
Elkin, | ||||||||||||||||||||||||||||||||
| Comment by Andrei Elkin [ 2024-02-06 ] | ||||||||||||||||||||||||||||||||
|
elenst, sure the STATEMENT format of queries of the likes of mysql.innodb_table_stats would not be declared UNSAFE anymore. It's technically feasible. | ||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2024-02-06 ] | ||||||||||||||||||||||||||||||||
|
I tried to offer a simplified example, but okay.
Will the delete be logged in statement or row format? If it's SBR, it's unsafe at least because of limit. If it's RBR, it would be unsafe because of the stats table, as for RBR it's the same problem as a direct delete from the stats table. My point is, be it SBR or RBR, if you operate on tables with different contents between master and slave, you can always end up with inconsistency leading to replication abort, the only question is how many steps you'll have to make for that. So, I don't believe that making stat tables RBR-unsafe will solve the problem. | ||||||||||||||||||||||||||||||||
| Comment by Andrei Elkin [ 2024-02-06 ] | ||||||||||||||||||||||||||||||||
|
In this case mysql.innodb_table_stats is not the target table which is t. The UNSAFEty applies to the operation over t. This DELETE query must be regarded as UNSAFE, so ROW format for it. On the second thought the data consistency argument is actually serious.
| ||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2024-02-06 ] | ||||||||||||||||||||||||||||||||
|
It doesn't matter whether the table is a target or not, this distinction may only prolong the suffering a little bit. As long as a table with inconsistent contents participates in a query which gives birth to binary log events, it can always cause discrepancy => errors on the slave server side => replication abort in a non-idempotent mode.
It is likely not the shortest possible example, just the first that came to mind. | ||||||||||||||||||||||||||||||||
| Comment by Andrei Elkin [ 2024-02-07 ] | ||||||||||||||||||||||||||||||||
|
Being the target table matters in your example. That is
would be SAFE in ROW format, t would be consistent across master-slave. (Conversely and in contrast mysql.innodb_table_stats as the target table is not SAFE in ROW format).
which must be detected as UNSAFE and binlogged in ROW format which you test demonstrates it does not do. | ||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2024-02-07 ] | ||||||||||||||||||||||||||||||||
That's because these are just examples, they are expected to be extrapolated properly when you think of a common solution. But you are suggesting a solution each time based on one example only, not on a general problem, that's why I provide the next one trying to explain why the suggestion won't work.
Why is that? | ||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2024-02-07 ] | ||||||||||||||||||||||||||||||||
|
Also observed as a result of RESET MASTER:
As well as
Interesting here is that the in both cases master binlog issues were detected. |