[MDEV-30021] FK actions are ignored after tc_purge and COMMIT Created: 2022-11-16  Updated: 2023-12-15

Status: Stalled
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6
Fix Version/s: 10.5, 10.6, 10.11, 11.0

Type: Bug Priority: Critical
Reporter: Nikita Malyavin Assignee: Nikita Malyavin
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Blocks
blocks MDEV-29181 Potential corruption on Foreign key u... In Review
Problem/Incident
causes MDEV-30869 Failing assertion: rec == NULL || !(d... Open
Relates
relates to MDEV-30103 FK constraint fails upon referenced t... Stalled

 Description   

To reproduce, please specify

--mariadbd=--innodb_evict_tables_on_commit_debug

--source include/have_debug_sync.inc
--source include/have_innodb.inc
 
create table t1 (a int, key(a), b int) engine=innodb;
create table t2 (a int key references t1(a), b int) engine=innodb;
insert t1 values (1, 1);
insert t2 values (1, 1);
set autocommit =0;
 
 
--connect purge, localhost, root
BACKUP STAGE start;
--connection default
 
begin;
update t1 set b = b + 1;
# Commenting this makes the test pass
update t2 set b = b + 1;
 
--connect bug, localhost, root
set debug_sync="open_tables_after_open_and_process_table signal open wait_for go";
send
update t1 set a = a + 1;
 
--connection purge
BACKUP STAGE block_commit;
 
BACKUP STAGE END;
 
--connection default
set debug_sync="now wait_for open";
commit;
set debug_sync="now signal go";
--connection bug
--error ER_ROW_IS_REFERENCED_2
reap;
--connection default
 
 
# Cleanup
set debug_sync= reset;
drop table t2, t1;

Actual result, rev. 505da21e3

mysqltest: At line 36: query 'reap' succeeded - should have failed with error ER_ROW_IS_REFERENCED_2 (1451)...

Another suitable alternative to BACKUP STAGE can be FLUSH TABLES.

The table affected by a transaction is removed on COMMIT in lock_release().
Then it is removed from all {{referenced_set}}s in dict_sys_t::remove().
The trick is to make t2 reference count = 0, and leave t1 reference count != 0.
So t1 was opened for write, but t2 is going to be opened soon by prelocking,
so sync point is added right after t1 was opened.



 Comments   
Comment by Marko Mäkelä [ 2022-11-16 ]

I can reproduce this with the following:

--source include/have_debug_sync.inc
--source include/have_innodb.inc
--source include/have_debug.inc
 
set @save_debug = @@GLOBAL.innodb_evict_tables_on_commit_debug;
SET GLOBAL innodb_evict_tables_on_commit_debug=1;
create table t1 (a int, key(a), b int) engine=innodb;
create table t2 (a int key references t1(a), b int) engine=innodb;
insert t1 values (1, 1);
insert t2 values (1, 1);
set autocommit =0;
 
 
--connect purge, localhost, root
BACKUP STAGE start;
--connection default
 
begin;
update t1 set b = b + 1;
# Commenting this makes the test pass
update t2 set b = b + 1;
 
--connect bug, localhost, root
set debug_sync="open_tables_after_open_and_process_table signal open wait_for go";
send
update t1 set a = a + 1;
 
--connection purge
BACKUP STAGE block_commit;
 
BACKUP STAGE END;
 
--connection default
set debug_sync="now wait_for open";
commit;
set debug_sync="now signal go";
--connection bug
--error ER_ROW_IS_REFERENCED_2
reap;
--connection default
 
 
# Cleanup
set debug_sync= reset;
SET GLOBAL innodb_evict_tables_on_commit_debug=@save_debug;
drop table t2, t1;

10.6 ae6ebafd819d48c965d2615fc78f1f950e0fbf40

mysqltest: At line 39: query 'reap' succeeded - should have failed with error ER_ROW_IS_REFERENCED_2 (1451)...

Comment by Nikita Malyavin [ 2022-11-16 ]

This bug is represented as an assertion failure in f3cc51e4 (MDEV-29181), as was found by the latest RQG

Comment by Nikita Malyavin [ 2022-11-16 ]

AS far as I understand the idea of innodb_evict_tables_on_commit_debug, it is used to imitate eviction in a cheap way.
Unfortunately, for eviction it can be even more dramatic: as only half of the tables is normally evicted, we can end up in a situation when
both referenced and refegencing tables have refcount=0, but only one of them can be evicted. A referenced table may just stay in upper half of the LRU cache,
either because it's just in upper half in comparison to a referencing one,.

So I see a few ways of fixing it.
1. Allow safe removal of a single referencing table, just set foreign_table = nullptr for every foreign_set element (which is shared with a referenced table in referenced_set). Unfortunately, this didn't work for free, as only referenced_table could be opened on demand before, so some careful fixing will be required. Also
it would require a careful testing.
2. Do not remove a table whenever it has some still used referenced tables. If no tables are used, evict them from cache altogether. This will change the caching logic related to tables with foreign keys. Though it never worked correctly before. The whole relation graph will have to be checked. LRU seems a very core component, so modifications to it are unlikely and nominally require good testing. However, the changes will be carefully localized, and everything will happen under single dict_sys lock, so it looks less error-prone for me.
3. Load the missing tables on table open. For example, whenever table is going to be modified. Combine with 1. This can have some added cost for delete-only operations, however normally foreign tables should be represented. Reopening can only happen after a combination of FLUSH (or other tc_purge call) and eviction, but iteration over referenced_set is inevitable.

Comment by Nikita Malyavin [ 2022-12-09 ]

Approach #3 wouldn't save us from a parallel FLUSH+evict. What could be done is a reference count could be increased once the tables are loaded, but then DROP-ing a referenced table under FOREIGN_KEYS_CHECK=off wouldn't work.

Approach#2 would required a significant change in eviction logic. Besides, TRUNCATE/DROP will do it without waiting for release.

So I went by approach #1.

There code base already was expecting this possible, but with some issues, see MDEV-30103 .

marko please review commit 4021d847e, branch bb-10.6-MDEV-29181

Comment by Marko Mäkelä [ 2022-12-19 ]

This look OK to me. Please wait for stress test results.

Comment by Marko Mäkelä [ 2023-01-26 ]

This was tested OK, according to a comment in MDEV-29181.

Generated at Thu Feb 08 10:13:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.