[MDEV-25607] Auto-generated DELETE from HEAP table can break replication

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL)
Fix Version/s: 10.5.26, 10.6.19, 10.11.9, 11.1.6, 11.2.5, 11.4.3
Component/s: Replication, Storage Engine - Memory
Labels: None

Description

After a server restart, a DELETE query is written to the binary log for every HEAP table, to reflect that the restart emptied it. The query is written unconditionally, regardless of whether it can actually be executed. If it cannot, replication aborts.

In the test case below, the DELETE fails because the table has a DELETE trigger that refers to a non-existent table.

--source include/master-slave.inc

reset master;

create table t (a int) engine=MEMORY;

create trigger tr after delete on t for each row update t2 set a = 1;

insert into t values (1);

--let $rpl_server_number= 1

--source include/rpl_restart_server.inc

check table t;

--sync_slave_with_master

# Cleanup

--connection master

drop table t;

--source include/rpl_end.inc

10.2 e788738e
master-bin.000002 4 Format_desc 1 256 Server ver: 10.2.38-MariaDB-debug-log, Binlog ver: 4
master-bin.000002 256 Gtid_list 1 299 [0-1-3]
master-bin.000002 299 Binlog_checkpoint 1 343 master-bin.000002
master-bin.000002 343 Gtid 1 385 GTID 0-1-4
master-bin.000002 385 Query 1 474 DELETE FROM `test`.`t`

Last_Errno	1146

Last_Error	Error 'Table 'test.t2' doesn't exist' on query. Default database: 'test'. Query: 'DELETE FROM `test`.`t`'

Issue Links

relates to

MDEV-18803 Memory tables replication (Open)

MDEV-19732 Option binlog_ignore_db ignored on MemoryBuffer database (Open)

MDEV-20106 Restart slave with memory table break replication in strict-mode (Open)

Activity

Sergei Golubchik added a comment - 2024-06-16 16:20

knielsen, do you think this change could cause any problems?

Kristian Nielsen added a comment - 2024-06-18 07:10

I agree with the patch that using TRUNCATE is more appropriate than DELETE. The truncate more closely represents the operation that (implicitly) occurred on the master on the HEAP table due to the restart. DELETE triggers were not run on the master, so they mustn't be run on the slave either. The test case is for an invalid trigger that causes an error, but the more natural case of a "real" trigger would similarly cause the slave to diverge.

I don't think this patch will make HEAP tables work well for replication. Ok, so now a master restart will propagate the loss of table rows to the slave (assuming there isn't another corner case that's overlooked). But if the slave restarts, the replication still diverges.

Anyway, this special case for binlogging row deletion for HEAP table at server start seems to be ancient code, and using TRUNCATE instead of DELETE shouldn't make things worse, at least, so I don't see any problems.

- Kristian.
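
The "real" trigger case mentioned above can be sketched as a variant of the test case from the description. This is an untested sketch, not part of the actual patch or its test suite. The table `audit` and the trigger body are hypothetical names introduced only for illustration, and the script assumes the same mysqltest include files the original test case uses.

```sql
--source include/master-slave.inc

create table t (a int) engine=MEMORY;
# Hypothetical side-effect table; it exists on both master and slave.
create table audit (a int);
# A valid DELETE trigger: every deleted row is recorded in audit.
create trigger tr after delete on t for each row insert into audit values (old.a);
insert into t values (1);
--sync_slave_with_master

--connection master
--let $rpl_server_number= 1
--source include/rpl_restart_server.inc
# First access after the restart binlogs the statement that empties t.
check table t;
--sync_slave_with_master

# Master: the restart emptied t without firing tr.
--connection master
select count(*) from audit;
# Slave: a replayed DELETE fires tr and writes to audit; TRUNCATE does not.
--connection slave
select count(*) from audit;

# Cleanup
--connection master
drop table t, audit;
--source include/rpl_end.inc
```

If the binary log carries DELETE, the two counts would be expected to differ even though replication keeps running, which is the silent divergence described in the comment. With TRUNCATE, no DELETE trigger fires on the slave, so `audit` should stay in step with the master.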

Andrei Elkin added a comment - 2024-06-18 10:20

> if the slave restarts...

A good point.
For that, I'd consider a partial backup from the master for all non-durable tables (except, of course, those matched by replicate_ignore_*).
The partial backup for a memory table could be implemented as INSERT..SELECT ROW-format events, generated by the master and sent to the slave when it connects, to be applied before the first replicated event on top of the backup image arrives.

Andrei Elkin added a comment - 2024-06-18 10:21

Approved.

Brandon Nesterenko added a comment - 2024-07-06 02:18

Pushed into 10.5 as cbc1898e82b.

Merge conflict observed in 10.6 with fix: 10.6-MDEV-25607-mergefix

People

Assignee: Brandon Nesterenko

Reporter: Elena Stepanova

Votes: 1

Watchers: 8

Dates

Created: 2021-05-05 22:52

Updated: 2024-08-12 04:21

Resolved: 2024-07-06 02:18
