Store Foreign Key metadata outside of InnoDB (MDEV-16417)

[MDEV-21652] FK migration from old version Created: 2020-02-04  Updated: 2023-10-30

Status: Stalled
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: 11.5

Type: Technical task Priority: Major
Reporter: Aleksey Midenkov Assignee: Aleksey Midenkov
Resolution: Unresolved Votes: 0
Labels: None
Environment:

bb-10.6-midenok-MDEV-16417-atomic


Issue Links:
Blocks
blocks MDEV-21052 InnoDB foreign key refactoring for TA... Stalled
is blocked by MDEV-17567 Atomic DDL Closed
is blocked by MDEV-25180 Atomic ALTER TABLE Closed
Relates
relates to MDEV-21052 InnoDB foreign key refactoring for TA... Stalled

 Description   

The task is to upgrade table foreign key metadata for the old storage of InnoDB system tables SYS_FOREIGN and SYS_FOREIGN_COLS by reading the foreign key definitions on table open and storing them into table's FRM file. When the system tables SYS_FOREIGN(_COLS) become empty during the process of foreign keys upgrade they are automatically dropped.

The task is required for the database migration from the previous versions of MariaDB. Without this functionality any foreign keys defined in previous versions of MariaDB will not have effect and will become inaccessible.

Foreign key upgrade is transparent operation, i.e. nothing special has to be done to start using the tables. The table is upgraded automatically whenever it is opened. However there is an option to upgrade all tables in all databases at once by issuing mysql_upgrade command (no special options required as long as user tables processed).

fk_check_legacy_storage(), fk_upgrade_legacy_storage()

fk_check_legacy_storage() checks whether upgrade is required for the
given table name by looking SYS_FOREIGN table for corresponding record
existence.

fk_upgrade_legacy_storage() does the upgrade routine which includes
getting the foreign keys from SYS_FOREIGN[_COLS], updating the
foreign/referenced shares as well as their FRM files, deleting the
records from SYS_FOREIGN[_COLS] tables.

Both routines utilize the internal SQL for SYS_FOREIGN[_COLS]
processing.

Upgrade foreign keys via backoff action

When table is opened fk_check_legacy_storage() detects whether upgrade
is required and HA_ERR_FK_UPGRADE is returned to SQL layer which then
handles this error by backoff action from Open_table_context where
table is opened again with HA_OPEN_FOR_REPAIR flag which indicates
that fk_upgrade_legacy_storage() is required. After
fk_upgrade_legacy_storage() is done fk_check_legacy_storage() is
checked again to ensure that SYS_FOREIGN[_COLS] are empty for the
given table.

DB_LEGACY_FK and HA_ERR_FK_UPGRADE error codes

Indicate that table has legacy data in SYS_FOREIGN[_COLS].

Check foreign/referenced indexes existence

fk_upgrade_legacy_storage() via fk_upgrade_push_fk() fails if there
are no indexes in foreign/referenced tables for the given data
acquired from SYS_FOREIGN[_COLS].

Internal SQL: select into both func and vars extension

fk_upgrade_legacy_storage() utilizes syntax extension in internal SQL:

FETCH c INTO fk_upgrade_create_fk() fk_id, unused;

Thus the data is fetched into both fk_upgrade_create_fk() function and
fk_id variable.

Rename table, rename column, drop table, drop column handling

When foreign table is opened it is automatically upgraded by backoff
action. But if the referenced table is altered or dropped first there
is no chance for the foreign table to get the correct data. So the
SYS_FOREIGN_[COLS] must be kept in sync with the above DDL operations
in respect of referenced names. DROP TABLE for the referenced table is
disabled as usual. DROP TABLE, DROP COLUMN relied in 10.5 on
dict_foreign_t cached data for the referenced tables. Now there is no
such possibility for the legacy data so we have to look at
SYS_FOREIGN_[COLS] directly.

Reverted some SYS_FOREIGN(_COLS) routines

Rename table and rename column handling was done in sync with
SYS_FOREIGN_[COLS] in 10.5. To retain the above DDL consistency for
the referenced tables we still use that old synchronization code.

ALGORITHM=COPY handling

Since we cannot faingrain ALGORITHM=COPY in innobase handler it is
disabled for the referenced tables unless the foreign tables are upgraded.

The check is done in create_table_info_t::create_table() and is
equivalent to DROP TABLE check as we are actually dropping the old
table after the copy routine is done.

HA_EXTRA_CHECK_LEGACY_FK

Extra interface to check whether ALGORITHM=COPY is possible. If there
is legacy data in SYS_FOREIGN[_COLS] about foreign tables referring
altered table, ALGORITHM=COPY is not possible.

WITH_INNODB_FOREIGN_UPGRADE macro

Every SYS_FOREIGN_[COLS] handling is wrapped inside
WITH_INNODB_FOREIGN_UPGRADE compilation macro. When this macro is
disabled the foreign key upgrade is not possible. Future versions will
obsolete the upgrade procedure completely and the wrapped code will be
removed.

innodb_eval_sql debug interface

Test cases must fill SYS_FOREIGN_[COLS] with data. This is done with
setting the new innodb_eval_sql debug variable. The value of that
variable is processed by que_eval_sql().

Some syntax error-friendly parser handling

que_eval_sql() was unfriendly on syntax errors: it just failed with
SIGABRT exception. To keep the server alive some frequent syntax
errors are now returned as DB_ERROR from que_eval_sql().

Drop empty SYS_FOREIGN[_COLS] after import

fk_create_legacy_storage debug flag to create SYS_FOREIGN[_COLS] on
innodb_eval_sql.

fk_release_locks() iterates through all indexes and pages and releases
locks on SUPREMUM records placed there previously by SELECT FOR
UPDATE. Also it releases all table locks.

pars_info_t::fatal_syntax_err (debug only)

When false don't abort server on syntax errors while parsing internal
SQL code.



 Comments   
Comment by Ralf Gebhardt [ 2020-02-14 ]

midenok, this task will now be moved to 10.6 as it, and the task MDEV-16417, cannot be finished for 10.5 anymore. This has been agreed with serg

Comment by Marko Mäkelä [ 2020-12-21 ]

I can see one related test failure:

bb-10.6-midenok-MDEV-16417 eb6953068650d6e9d54e7443f2339f2bb3e08f16

innodb.foreign-keys 'innodb'             w4 [ fail ]
        Test ended at 2020-12-08 13:48:14
 
CURRENT_TEST: innodb.foreign-keys
--- D:/winx64-debug/build/src/mysql-test/suite/innodb/r/foreign-keys.result	2020-12-08 13:20:14.123162100 +0000
+++ D:\winx64-debug\build\src\mysql-test\suite\innodb\r\foreign-keys.reject	2020-12-08 13:48:14.475908900 +0000
@@ -530,8 +530,6 @@
 drop tables t4, t1, t3, t2;
 ERROR HY000: Lost connection to MySQL server during query
 # State after crash (before recovery)
-#sqlf-shadow-t2.frm
-#sqlf-shadow-t3.frm
 db.opt
 t1.MYD
 t1.MYI
 
mysqltest: Result length mismatch

This branch includes many commits, also some preceding changes to InnoDB files in MDEV-20865 and MDEV-21052. Therefore, I think that the final review cannot be conducted before those changes have been reviewed and approved.

I see that the change is introducing a cmake compilation option WITH_INNODB_LEGACY_FOREIGN_STORAGE. I am not happy about that, and I cannot accept the idea that this option would be enabled by default, that is, even in non-debug versions, we would keep creating useless system tables that would never be used.

I would find it acceptable to create the tables SYS_FOREIGN and SYS_FOREIGN_COLS in a debug build, controlled by a specific debug-only parameter (which would be specified in the .opt file of the tests of the upgrade logic). That parameter would ideally be debug_dbug (based on DBUG_EXECUTE_IF).

The change is generating pars0grm.cc with an older version of Bison than it used to be created with. Do we really have to change the InnoDB internal SQL parser for this task, only to benefit some debug code for testing upgrades without actually using an older server in the upgrade tests? Yes, we know that the parser is cranky (likes to crash with any syntax error), but I’d rather avoid extending it.

If I understood the changes correctly, the function fk_check_legacy_storage() would crash if the table SYS_FOREIGN did not exist. I think that a properly implemented upgrade must:

  • stop creating the tables SYS_FOREIGN and SYS_FOREIGN_COLS
  • apply "delete if table exists" semantics
  • DROP the old tables when the last legacy FOREIGN KEY has been upgraded or dropped

I also noticed that a debug assertion had been commented out in create_table_info_t::create_table_def() without any explanation. Such changes are not acceptable.

Comment by Aleksey Midenkov [ 2021-01-21 ]

Mandatory changes from Marko's review:

1. Drop SYS_FOREIGN[_COLS] when empty; Fixed
2. debug assertion had been commented out; Fixed, I guess

marko Please complement this list if I missed something.

I can see one related test failure:

Test failure is related to MDEV-21053.

Do we really have to change the InnoDB internal SQL parser for this task, only to benefit some debug code for testing upgrades without actually using an older server in the upgrade tests?

It's not only for testing, it makes syntax better with relatively simple changes which are used in foreign key import code.

I see that the change is introducing a cmake compilation option WITH_INNODB_LEGACY_FOREIGN_STORAGE. I am not happy about that, and I cannot accept the idea that this option would be enabled by default, that is, even in non-debug versions, we would keep creating useless system tables that would never be used.

The purpose of this option is different. It either compiles FK migration code or not. Without option enabled it is impossible to migrate old foreign key data. Now it doesn't create SYS_FOREIGN[_COLS] as per p.1 fix.

The change is generating pars0grm.cc with an older version of Bison

As a bonus cmake-generated InnoDB lex and grm were made.

Comment by Marko Mäkelä [ 2021-02-24 ]

I assume that http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.6-midenok-MDEV-16417 is the latest version of this. For some reason, it fails to build on most platforms, and there is no platform where all tests would have passed.

I think that it would have been cleaner if upgrade had remained an integral part of MDEV-21052. Please address the comments that I made in MDEV-21052.

Comment by Marko Mäkelä [ 2021-02-24 ]

I think that a mandatory part of the task of moving FOREIGN KEY metadata storage out of InnoDB is to stop creating the old metadata tables. After an upgrade, the old metadata must converge towards a point where the old metadata tables become empty and can be dropped.

In a new installation, the old tables SYS_FOREIGN and SYS_FOREIGN_COLS should never exist.

Immediately after an upgrade, those tables will necessarily exist. When a table is opened and its metadata is loaded, and the .frm file version indicates that the table was created in an older server, then we must try to load the FOREIGN KEY metadata from SYS_FOREIGN and SYS_FOREIGN_COLS if those tables still exist.

How to ensure that the old metadata tables will shrink and can eventually be dropped? I think that this is easiest to implement as part of DDL operations: TRUNCATE, OPTIMIZE, ALTER, DROP, RENAME. As long as the old metadata tables exist, all of those operations must attempt to delete the metadata from SYS_FOREIGN and SYS_FOREIGN_COLS. Once the tables become empty, they must be dropped.

As far as I understand, we are almost following these principles. One thing that definitely looks wrong to me is that in innobase_rename_column_try(), as part of renaming columns, we are executing UPDATE SYS_FOREIGN_COLS instead of DELETE FROM SYS_FOREIGN_COLS. I think that on any ALTER TABLE operation, we must try to delete the foreign key metadata from the old tables.

I am not happy to see the undocumented function fk_release_locks(). Is there really no other way to implement fk_cleanup_legacy_storage()? Such as, simply letting row_drop_table_for_mysql() rename the table and invoke row_add_table_to_background_drop_list(), or executing these renames manually in a single transaction? I think that we will want to get rid of both tables atomically, so that there is no need to implement rarely executed code to drop the other table if only one of them exist after the server was killed and restarted. (I do not think that multiple tables can be directly dropped atomically in a single transaction until some undo log format changes have been implemented in MDEV-18518.)

Comment by Aleksey Midenkov [ 2021-02-24 ]

marko Sorry, the branch is bb-10.6-midenok-MDEV-16417-atomic.

Comment by Marko Mäkelä [ 2021-02-24 ]

Thank you. That branch looks better (but would benefit from a merge from, or a rebase to, the latest 10.6).

Comment by Marko Mäkelä [ 2021-02-25 ]

Please address my comment regarding fk_release_locks() and ensure that the code compiles without warnings on GCC and clang. GCC 10.2.1 reported several -Wmaybe-uninitialized for me when I compiled the debug build with -Og.

Also, please ensure that the upgrade works correctly with ALGORITHM=COPY. I tested the following:

diff --git a/mysql-test/suite/innodb/t/foreign_key_legacy.combinations b/mysql-test/suite/innodb/t/foreign_key_legacy.combinations
new file mode 100644
index 00000000000..ba58beb28fd
--- /dev/null
+++ b/mysql-test/suite/innodb/t/foreign_key_legacy.combinations
@@ -0,0 +1,5 @@
+[COPY]
+--alter-algorithm=copy
+
+[DEFAULT]
+--alter-algorithm=default

This triggered a test failure:

bb-10.6-midenok-MDEV-16417-atomic 99cb1fe51e3ead519bc7f62cd6bdaafda2c8e2be

innodb.foreign_key_legacy 'COPY,innodb,lcase_def' w3 [ fail ]
        Test ended at 2021-02-25 08:20:45
 
CURRENT_TEST: innodb.foreign_key_legacy
mysqltest: At line 247: query 'show create table t2' failed: 1034: Index for table 't2' is corrupt; try to repair it

Each combination with ALGORITHM=COPY failed.

Comment by Aleksey Midenkov [ 2021-03-05 ]

Your comments were addressed.

Comment by Marko Mäkelä [ 2021-03-15 ]

Please address my review comments and try to avoid unnecessary changes.

New code must not be formatted in the old InnoDB style. No /*======*/ or /*!< or TABs, please.

The error code DB_LEGACY_FK is not mentioned in the commit message. (Only HA_ERR_FK_UPGRADE is.)

What are the exact implications on the ALGORITHM=COPY restriction (cannot refine something)? The table in MDEV-11424 mentions that adding a stored generated column requires ALGORITHM=COPY. What exactly will happen on such operations when the table includes old-format foreign key metadata? Can you please extend tests to cover that case? How will replication work?

Comment by Aleksey Midenkov [ 2021-03-18 ]

Please address my review comments and try to avoid unnecessary changes.

Addressed.

What are the exact implications on the ALGORITHM=COPY restriction (cannot refinesomething)?

10.5 checked that at SQL layer with fk_check_column_changes(). Now such refine is not possible since there is no foreign keys at SQL layer from legacy InnoDB storage. There is no need to do special InnoDB refine as such problems easily resolved by issuing CHECK TABLE f.ex.

The table in MDEV-11424 mentions that adding a stored generated column requires ALGORITHM=COPY. What exactly will happen on such operations when the table includes old-format foreign key metadata?

The same applies as in my previous paragraph.

Can you please extend tests to cover that case?

Added.

How will replication work?

I hope, perfectly. Added test cases for concurrency, mysql_upgrade, replication.

Comment by Marko Mäkelä [ 2021-03-22 ]

midenok, in my previous review, I understood that some ALTER TABLE operations would be refused if only ALGORITHM=COPY is possible. Can you please clarify what the actual limitation is? The ALGORITHM=COPY implementation is supposed to be the fall-back option when nothing else is available. It must discover and logically preserve the FOREIGN KEY constraints.

I do not think that this can be properly tested before MDEV-25180 and MDEV-17567 have been completed.

Comment by Aleksey Midenkov [ 2021-03-23 ]

The limitation is to run CHECK TABLE or mysql_upgrade before using ALGORITHM=COPY.

Comment by Marko Mäkelä [ 2021-07-01 ]

Before a review can be conducted, these changes will need to be rebased to the 10.7 branch, once that branch has been properly created so that it includes the changes of the 10.6.3 release.

Generated at Thu Feb 08 09:08:47 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.