Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-22180

Planner opens unnecessary tables when updated table is referenced by foreign keys

Details

    Description

      When performing a writing query on a referenced table, the planner opens all the tables referencing it, even if the modified fields are not the referenced ones and even if no record matches the WHERE clause. This behavior started to happen at some version between 10.1.20 and 10.1.40, as it is reproducible in the later and not in the former.
      It is interesting to mention that the problem is triggered even if just the explain is executed which seems to be related to the planner rather than the excution of the query itself.
      It takes a lot of time and resources, mainly when several FKs are involved.
      To reproduce it just follow these steps:
      1 . Create a DB with one table referenced by several tables. Here's a bash example of how to create it (and there is a dump file attached to this issue which you can just restore):

      mysql -uroot -e "DROP DATABASE IF EXISTS fktest; CREATE DATABASE fktest"
       
      mysql -uroot fktest -e "
        DROP TABLE IF EXISTS fkParent; 
        CREATE TABLE fkParent (
            id int(10) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
            textField varchar(10)
        );"
       
      for n in $(seq 12000); do
          [ $(($n % 100)) -eq 0 ] && echo $n
          mysql -uroot fktest -e " 
          CREATE TABLE fkChild$n (
            id int(10) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY, 
            fkField$n int(10) unsigned,
            CONSTRAINT FK_$n FOREIGN KEY (fkField$n) REFERENCES fkParent(id)
          );"
      done
      
      

      Then just access that DB and run this explain statement:

      explain UPDATE fkParent SET textField="test";
      

      It will take quite long and it seems that the planner is opnening all those referencing tables.
      If you run exactly the same test in version 10.1.20 it just goes instantaneous.

      To see it more clearly here are some evident tests:

      In version 10.1.40 (or any later version):

      MariaDB [fktest]> show status like "opened_table%";
      +--------------------------+-------+
      | Variable_name            | Value |
      +--------------------------+-------+
      | Opened_table_definitions | 0     |
      | Opened_tables            | 0     |
      +--------------------------+-------+
      2 rows in set (0.00 sec)
       
      MariaDB [fktest]> explain UPDATE fkParent SET textField="test";
      +------+-------------+----------+-------+---------------+---------+---------+------+------+-------+
      | id   | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra |
      +------+-------------+----------+-------+---------------+---------+---------+------+------+-------+
      |    1 | SIMPLE      | fkParent | index | NULL          | PRIMARY | 4       | NULL |    1 |       |
      +------+-------------+----------+-------+---------------+---------+---------+------+------+-------+
      1 row in set (16.18 sec)
       
      MariaDB [fktest]> show status like "opened_table%";
      +--------------------------+-------+
      | Variable_name            | Value |
      +--------------------------+-------+
      | Opened_table_definitions | 3934  |
      | Opened_tables            | 3934  |
      +--------------------------+-------+
      2 rows in set (0.01 sec)
      
      

      In version 10.1.20 looks quite better and does it in 0 miliseconds

      MariaDB [siemens]> show status like "opened_table%";
      +--------------------------+-------+
      | Variable_name            | Value |
      +--------------------------+-------+
      | Opened_table_definitions | 0     |
      | Opened_tables            | 0     |
      +--------------------------+-------+
      2 rows in set (0.01 sec)
       
      MariaDB [fktest]> explain UPDATE fkParent SET textField="test";
      +------+-------------+----------+-------+---------------+---------+---------+------+------+-------+
      | id   | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra |
      +------+-------------+----------+-------+---------------+---------+---------+------+------+-------+
      |    1 | SIMPLE      | fkParent | index | NULL          | PRIMARY | 4       | NULL |    1 |       |
      +------+-------------+----------+-------+---------------+---------+---------+------+------+-------+
      1 row in set (0.00 sec)
       
      MariaDB [siemens]> show status like "opened_table%";
      +--------------------------+-------+
      | Variable_name            | Value |
      +--------------------------+-------+
      | Opened_table_definitions | 0     |
      | Opened_tables            | 0     |
      +--------------------------+-------+
      2 rows in set (0.00 sec)
      
      

      Additionally, you can see it in the profiling detail for the failing version:

      MariaDB [fktest]> set profiling =1;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [fktest]> explain UPDATE fkParent SET textField="test";
      +------+-------------+----------+-------+---------------+---------+---------+------+------+-------+
      | id   | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra |
      +------+-------------+----------+-------+---------------+---------+---------+------+------+-------+
      |    1 | SIMPLE      | fkParent | index | NULL          | PRIMARY | 4       | NULL |    1 |       |
      +------+-------------+----------+-------+---------------+---------+---------+------+------+-------+
      1 row in set (12.49 sec)
       
      MariaDB [fktest]> show profile for query 1;
      +----------------------+-----------+
      | Status               | Duration  |
      +----------------------+-----------+
      | starting             |  0.000259 |
      | checking permissions |  0.000055 |
      | Opening tables       | 12.409309 |
      | After opening tables |  0.006020 |
      | System lock          |  0.005769 |
      | Table lock           |  0.007041 |
      | init                 |  0.001493 |
      | query end            |  0.000017 |
      | closing tables       |  0.003398 |
      | Unlocking tables     |  0.060393 |
      | freeing items        |  0.000021 |
      | updating status      |  0.000045 |
      | cleaning up          |  0.002870 |
      +----------------------+-----------+
      13 rows in set (0.01 sec)
      
      

      As you can see, all the time is spent opening tables.

      Attachments

        Issue Links

          Activity

            It is also slow on version 10.5.8

            gsmethells Greg Smethells added a comment - It is also slow on version 10.5.8

            gsmethells, that depends on what you mean by "fixed". If you mean that the server still opens referenced tables, and you expected it not to do that just as it was not doing it before — this isn't likely to happen, not doing it was a bug that allowed concurrent DDLs on fk-referenced tables. It could've caused crashes or data corruption.

            The fix implemented here was to reduce the overhead and keep the DDL protection without actually opening tables when possible.

            What is your use case? Perhaps we've missed it. Or perhaps it is not possible to avoid opening of tables. Can you show your CREATE TABLE statements?

            serg Sergei Golubchik added a comment - gsmethells , that depends on what you mean by "fixed". If you mean that the server still opens referenced tables, and you expected it not to do that just as it was not doing it before — this isn't likely to happen, not doing it was a bug that allowed concurrent DDLs on fk-referenced tables. It could've caused crashes or data corruption. The fix implemented here was to reduce the overhead and keep the DDL protection without actually opening tables when possible. What is your use case? Perhaps we've missed it. Or perhaps it is not possible to avoid opening of tables. Can you show your CREATE TABLE statements?
            dsobczak Dan Sobczak added a comment - - edited

            Hello,

            Our use case is similar to the example above. We have a single authentication table as well as ~16k foreign key constraints referencing that table. Our foreign keys appear to be using the RESTRICT scenario, which as you stated above may be able to skip opening tables:

            With RESTRICT we, quite possibly, don't need to open child tables. Either we can ignore them completely (but beware MDEV-21175) or only mdl-lock them without opening.

            Here are some CREATE TABLE statements. What we are seeing is any UPDATE on our authentication table has slowed down considerably from 10.1.32.

            Single authentication table:

            CREATE TABLE `authentication` (
              `userID` int(10) unsigned NOT NULL AUTO_INCREMENT,
              `customerID` int(10) unsigned DEFAULT NULL,
              `email` varchar(50) NOT NULL,
              `verifier` varchar(64) NOT NULL,
              `name` varchar(100) NOT NULL,
              `accountStatus` tinyint(3) unsigned NOT NULL,
              `emailCode` varchar(64) DEFAULT NULL,
              `authChanged` datetime NOT NULL,
              `failures` tinyint(3) unsigned NOT NULL,
              `statusChanged` datetime NOT NULL,
              `homeServer` int(10) unsigned NOT NULL,
              `isHomeSet` tinyint(1) NOT NULL DEFAULT 0,
              `rpcKey` varchar(50) NOT NULL,
              PRIMARY KEY (`userID`),
              UNIQUE KEY `idx_authentication_email` (`email`),
              KEY `fk_authentication_customerID` (`customerID`),
              KEY `fk_authentication_homeServer` (`homeServer`),
              CONSTRAINT `fk_authentication_customerID` FOREIGN KEY (`customerID`) REFERENCES `customer` (`customerID`),
              CONSTRAINT `fk_authentication_homeServer` FOREIGN KEY (`homeServer`) REFERENCES `server` (`serverID`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8
            

            Access table (one of many):

            CREATE TABLE `1_access` (
              `userID` int(10) unsigned NOT NULL,
              `uid` varchar(64) NOT NULL,
              `obj` varchar(8) NOT NULL,
              `granterID` int(10) unsigned NOT NULL,
              `hasEditor` tinyint(1) NOT NULL DEFAULT 0,
              `accessCreated` datetime(6) NOT NULL,
              PRIMARY KEY (`userID`,`uid`,`obj`),
              KEY `fk_1_access_granterID` (`granterID`),
              CONSTRAINT `fk_1_access_granterID` FOREIGN KEY (`granterID`) REFERENCES `authentication` (`userID`),
              CONSTRAINT `fk_1_access_userID` FOREIGN KEY (`userID`) REFERENCES `authentication` (`userID`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8
            

            dsobczak Dan Sobczak added a comment - - edited Hello, Our use case is similar to the example above. We have a single authentication table as well as ~16k foreign key constraints referencing that table. Our foreign keys appear to be using the RESTRICT scenario, which as you stated above may be able to skip opening tables: With RESTRICT we, quite possibly, don't need to open child tables. Either we can ignore them completely (but beware MDEV-21175 ) or only mdl-lock them without opening. Here are some CREATE TABLE statements. What we are seeing is any UPDATE on our authentication table has slowed down considerably from 10.1.32. Single authentication table: CREATE TABLE `authentication` ( `userID` int ( 10 ) unsigned NOT NULL AUTO_INCREMENT, `customerID` int ( 10 ) unsigned DEFAULT NULL, `email` varchar( 50 ) NOT NULL, `verifier` varchar( 64 ) NOT NULL, `name` varchar( 100 ) NOT NULL, `accountStatus` tinyint( 3 ) unsigned NOT NULL, `emailCode` varchar( 64 ) DEFAULT NULL, `authChanged` datetime NOT NULL, `failures` tinyint( 3 ) unsigned NOT NULL, `statusChanged` datetime NOT NULL, `homeServer` int ( 10 ) unsigned NOT NULL, `isHomeSet` tinyint( 1 ) NOT NULL DEFAULT 0 , `rpcKey` varchar( 50 ) NOT NULL, PRIMARY KEY (`userID`), UNIQUE KEY `idx_authentication_email` (`email`), KEY `fk_authentication_customerID` (`customerID`), KEY `fk_authentication_homeServer` (`homeServer`), CONSTRAINT `fk_authentication_customerID` FOREIGN KEY (`customerID`) REFERENCES `customer` (`customerID`), CONSTRAINT `fk_authentication_homeServer` FOREIGN KEY (`homeServer`) REFERENCES `server` (`serverID`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 Access table (one of many): CREATE TABLE `1_access` ( `userID` int ( 10 ) unsigned NOT NULL, `uid` varchar( 64 ) NOT NULL, `obj` varchar( 8 ) NOT NULL, `granterID` int ( 10 ) unsigned NOT NULL, `hasEditor` tinyint( 1 ) NOT NULL DEFAULT 0 , `accessCreated` datetime( 6 ) NOT NULL, PRIMARY KEY (`userID`,`uid`,`obj`), KEY `fk_1_access_granterID` (`granterID`), CONSTRAINT `fk_1_access_granterID` FOREIGN KEY (`granterID`) REFERENCES `authentication` (`userID`), CONSTRAINT `fk_1_access_userID` FOREIGN KEY (`userID`) REFERENCES `authentication` (`userID`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8
            dsobczak Dan Sobczak added a comment -

            Example of a slow query, "failures" is not referenced by any foreign key so this UPDATE being slow is surprising.

            MariaDB [joints]> update authentication set failures=0 where userid=51;
            Query OK, 0 rows affected (3.61 sec)
            Rows matched: 1  Changed: 0  Warnings: 0
             
            MariaDB [joints]> show profile;
            +----------------------+----------+
            | Status               | Duration |
            +----------------------+----------+
            | starting             | 0.000087 |
            | checking permissions | 0.000007 |
            | Opening tables       | 3.575954 |
            | After opening tables | 0.000322 |
            | System lock          | 0.000017 |
            | Table lock           | 0.000266 |
            | init                 | 0.001088 |
            | updating             | 0.004479 |
            | end                  | 0.000031 |
            | query end            | 0.015326 |
            | closing tables       | 0.000014 |
            | Unlocking tables     | 0.008250 |
            | freeing items        | 0.000041 |
            | updating status      | 0.000062 |
            | cleaning up          | 0.000094 |
            +----------------------+----------+
            

            dsobczak Dan Sobczak added a comment - Example of a slow query, "failures" is not referenced by any foreign key so this UPDATE being slow is surprising. MariaDB [joints]> update authentication set failures= 0 where userid= 51 ; Query OK, 0 rows affected ( 3.61 sec) Rows matched: 1 Changed: 0 Warnings: 0   MariaDB [joints]> show profile; +----------------------+----------+ | Status | Duration | +----------------------+----------+ | starting | 0.000087 | | checking permissions | 0.000007 | | Opening tables | 3.575954 | | After opening tables | 0.000322 | | System lock | 0.000017 | | Table lock | 0.000266 | | init | 0.001088 | | updating | 0.004479 | | end | 0.000031 | | query end | 0.015326 | | closing tables | 0.000014 | | Unlocking tables | 0.008250 | | freeing items | 0.000041 | | updating status | 0.000062 | | cleaning up | 0.000094 | +----------------------+----------+
            nikitamalyavin Nikita Malyavin added a comment - - edited

            Hello dsobczak, gschulman and gsmethells!

            You were right, a time to open tables is not just suspicous.

            I've been working on a few bugs around foreign keys, namely MDEV-30103, where I've seen one of the possible solutions in making a complete bidirectional prelocking and pre-opening.

            Your report became an obstacle for that solution. I decided to find out what is taking so much time, and is it really because of ha_innobase::open taking long.
            I have repeated your test setup with 12k tables. Perf reports following:

            As you may see, most of the time spent was in MDL_context::find_ticket and prepare_fk_prelocking_list. I don't know how __divtf3 ended up there, it's probably some symbol missing in my debug libc build. Anyway there are two recognized functions taking 60% of time. find_ticket traverses a list of all MDL tickets, which is at least one ticket per table. prepare_fk_prelocking_list does nothing weighty except calling find_fk_prelocked_table, which also traverses a list, of opened tables this time. This means that open_tables took O(tables^2) time!

            Sounds nonsense for me, but I guess we just didn't expect so many tables to open in one query, it wasn't supposed as a normal use case.
            Making those two functions simply return NULL decreased a time to 0.15 seconds!

            It was no big difference with or without fix on the structure of the report. It's only that find_ticket somehow took more time (according to a number of samples reported by perf) in the reverted fix case.

            I have opened MDEV-30182 to implement an optimization. Such kind of work is usually only done in the major versions.

            Thank you for your alarms!

            nikitamalyavin Nikita Malyavin added a comment - - edited Hello dsobczak , gschulman and gsmethells ! You were right, a time to open tables is not just suspicous. I've been working on a few bugs around foreign keys, namely MDEV-30103 , where I've seen one of the possible solutions in making a complete bidirectional prelocking and pre-opening. Your report became an obstacle for that solution. I decided to find out what is taking so much time, and is it really because of ha_innobase::open taking long. I have repeated your test setup with 12k tables. Perf reports following: As you may see, most of the time spent was in MDL_context::find_ticket and prepare_fk_prelocking_list . I don't know how __divtf3 ended up there, it's probably some symbol missing in my debug libc build. Anyway there are two recognized functions taking 60% of time. find_ticket traverses a list of all MDL tickets, which is at least one ticket per table. prepare_fk_prelocking_list does nothing weighty except calling find_fk_prelocked_table , which also traverses a list, of opened tables this time. This means that open_tables took O(tables^2) time! Sounds nonsense for me, but I guess we just didn't expect so many tables to open in one query, it wasn't supposed as a normal use case. Making those two functions simply return NULL decreased a time to 0.15 seconds! It was no big difference with or without fix on the structure of the report. It's only that find_ticket somehow took more time (according to a number of samples reported by perf) in the reverted fix case. I have opened MDEV-30182 to implement an optimization. Such kind of work is usually only done in the major versions. Thank you for your alarms!

            People

              serg Sergei Golubchik
              gschulman Guillermo Schulman
              Votes:
              4 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.