MariaDB Server / MDEV-32899

InnoDB is holding shared dict_sys.latch while waiting for FOREIGN KEY child table lock on DDL

Details

    Description

      To fix the race conditions MDEV-26217 and MDEV-26554, code was added that makes InnoDB hold a shared dict_sys.latch while waiting for an exclusive lock on tables that are connected by FOREIGN KEY constraints. This is not acceptable, because a lock wait can last a long time (in the worst case indefinitely, if innodb_lock_wait_timeout=100000000). If another thread then tries to acquire an exclusive dict_sys.latch, that pending request will block all other threads from acquiring a shared dict_sys.latch until the table lock wait has been resolved.

      This bug can be fixed by changing lock_table_for_trx() so that whenever the caller is holding a shared dict_sys.latch, it will be released and reacquired around the call to lock_wait(). In this way, the lock object will be created or released while the table is protected by the shared dict_sys.latch. It is safe to temporarily release the dict_sys.latch, because tables on which lock objects exist cannot be evicted or dropped. In the callers, we have to take special care to ensure that dict_table_t::referenced_set is safe to traverse if dict_sys.latch was temporarily released.
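
      The release-and-reacquire pattern described above can be illustrated with a minimal, self-contained C++17 sketch. It is not the InnoDB code: DictLatch, lock_table_sketch() and the two callbacks are hypothetical stand-ins for dict_sys.latch, lock_table_for_trx(), lock object creation, and lock_wait(). The reader counter exists only so that the example can check its own behaviour.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <shared_mutex>

// Hypothetical stand-in for dict_sys.latch. It counts shared holders so that
// the example can observe when the latch is held; InnoDB's latch differs.
class DictLatch {
public:
  void lock_shared() { latch_.lock_shared(); ++readers_; }
  void unlock_shared() {
    assert(readers_.load() > 0);
    --readers_;
    latch_.unlock_shared();
  }
  int readers() const { return readers_.load(); }
private:
  std::shared_mutex latch_;
  std::atomic<int> readers_{0};
};

// Sketch of the idea only (all names are assumptions):
//   create_lock  enqueues the table lock request and returns true if the
//                request has to wait;
//   wait         is the potentially very long blocking wait.
// Returns true if a wait took place.
bool lock_table_sketch(DictLatch &latch, bool caller_holds_shared_latch,
                       const std::function<bool()> &create_lock,
                       const std::function<void()> &wait)
{
  // The lock object is created while the caller's shared latch still
  // protects the table.
  if (!create_lock())
    return false;  // granted immediately, nothing to wait for

  // Do not hold the latch across the wait: release it first ...
  if (caller_holds_shared_latch)
    latch.unlock_shared();

  wait();

  // ... and reacquire it afterwards, so the caller's view is unchanged.
  if (caller_holds_shared_latch)
    latch.lock_shared();
  return true;
}
```

      A caller that holds the shared latch passes caller_holds_shared_latch=true. While wait() runs, no shared holder remains, so a concurrent exclusive request would not be stuck behind the table lock wait. After the call returns, the caller must treat anything it read under the latch before the call (such as a traversal position in referenced_set) as possibly stale.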

          Activity

            marko Marko Mäkelä created issue -
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Thirunarayanan Balathandayuthapani [ thiru ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            thiru Thirunarayanan Balathandayuthapani made changes -
            Assignee Thirunarayanan Balathandayuthapani [ thiru ] Marko Mäkelä [ marko ]
            Status In Review [ 10002 ] Stalled [ 10000 ]
            marko Marko Mäkelä made changes -
            Fix Version/s 10.6.17 [ 29518 ]
            Fix Version/s 10.11.7 [ 29519 ]
            Fix Version/s 11.0.5 [ 29520 ]
            Fix Version/s 11.1.4 [ 29024 ]
            Fix Version/s 11.2.3 [ 29521 ]
            Fix Version/s 11.3.2 [ 29522 ]
            Fix Version/s 10.6 [ 24028 ]
            Fix Version/s 10.11 [ 27614 ]
            Fix Version/s 11.0 [ 28320 ]
            Fix Version/s 11.1 [ 28549 ]
            Fix Version/s 11.3 [ 28565 ]
            Fix Version/s 11.2 [ 28603 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]

            marko Marko Mäkelä added a comment - I reverted this due to the regression MDEV-33104.
            marko Marko Mäkelä made changes -
            Resolution Fixed [ 1 ]
            Status Closed [ 6 ] Stalled [ 10000 ]

            marko Marko Mäkelä added a comment - To avoid reintroducing a bug like MDEV-33104, we must revise lock_table_children() so that it will successfully acquire MDL on each child table before waiting for an InnoDB table lock. The initial (reverted) version of this was holding a table reference while waiting for an InnoDB table lock. Concurrently, a DDL operation might want to drop or rebuild the table while holding an MDL_EXCLUSIVE as well as an InnoDB table lock.
            marko Marko Mäkelä made changes -
            Fix Version/s 10.6 [ 24028 ]
            Fix Version/s 10.11 [ 27614 ]
            Fix Version/s 11.0 [ 28320 ]
            Fix Version/s 11.1 [ 28549 ]
            Fix Version/s 11.2 [ 28603 ]
            Fix Version/s 11.3 [ 28565 ]
            Fix Version/s 11.4 [ 29301 ]
            Fix Version/s 11.1.4 [ 29024 ]
            Fix Version/s 10.6.17 [ 29518 ]
            Fix Version/s 10.11.7 [ 29519 ]
            Fix Version/s 11.0.5 [ 29520 ]
            Fix Version/s 11.2.3 [ 29521 ]
            Fix Version/s 11.3.2 [ 29522 ]
            marko Marko Mäkelä added a comment - A metadata lock can be acquired by invoking dict_acquire_mdl_shared<false>() in lock_table_children() while holding shared dict_sys.latch. Because that function will temporarily release dict_sys.latch while waiting for MDL, we had better rescan table->referenced_set after each call, in case a constraint or a child table had been dropped meanwhile. We will have to keep track of the tables on which dict_acquire_mdl_shared<false>() was already invoked.
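
            The rescan-after-each-call idea in this comment might look roughly like the following self-contained C++ sketch. It is a model only, not the actual lock_table_children(): child tables are represented by names in a std::set (standing in for table->referenced_set), and acquire_mdl is a hypothetical callback standing in for dict_acquire_mdl_shared<false>(), which is assumed to be able to change the set while it runs.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of the child tables of one parent table.
using ChildSet = std::set<std::string>;

// Invokes acquire_mdl once per child table. Because acquire_mdl may
// temporarily give up the protection of `children` (and thus let it change),
// the traversal restarts from the beginning after every call and uses
// `visited` to skip tables that were already handled.
// Returns the names for which acquire_mdl reported success, in call order.
std::vector<std::string> acquire_mdl_on_children(
    const ChildSet &children,
    const std::function<bool(const std::string &)> &acquire_mdl)
{
  std::set<std::string> visited;    // tables already passed to acquire_mdl
  std::vector<std::string> locked;  // tables on which it succeeded

  for (bool rescan = true; rescan;) {
    rescan = false;
    for (const std::string &name : children) {
      if (!visited.insert(name).second)
        continue;  // already handled in an earlier pass

      // Copy the name first: the set element may be erased during the call.
      const std::string child = name;
      if (acquire_mdl(child))
        locked.push_back(child);

      // Iterators into `children` may now be invalid; start over.
      rescan = true;
      break;
    }
  }
  assert(locked.size() <= visited.size());
  return locked;
}
```

            In this model, a child that disappears from the set before its turn is never passed to acquire_mdl, and a child that appears during a call is picked up by a later pass. Restarting the scan costs quadratic time in the number of children in the worst case, which seems acceptable here because FOREIGN KEY fan-out is normally small; that trade-off is an assumption of this sketch and not something stated in the ticket.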
            marko Marko Mäkelä made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Debarun Banerjee [ JIRAUSER54513 ]
            Status In Progress [ 3 ] In Review [ 10002 ]

            mleich Matthias Leich added a comment - origin/10.6-MDEV-32899 c851e172ea043985fc8d3cec46368004a174892d 2024-01-23T17:10:37+02:00 performed well in RQG testing. No new bad effects.

            mleich Matthias Leich added a comment - origin/10.6-MDEV-32899 f50940ee0b81b9c963bd114c54788e515220bc7e 2024-02-01T15:48:46+02:00 performed well in RQG testing. No new problems.
            debarun Debarun Banerjee added a comment - https://github.com/MariaDB/server/pull/3021 looks good to me.
            debarun Debarun Banerjee made changes -
            Status In Review [ 10002 ] Stalled [ 10000 ]
            debarun Debarun Banerjee made changes -
            Assignee Debarun Banerjee [ JIRAUSER54513 ] Marko Mäkelä [ marko ]
            danblack Daniel Black made changes -
            Fix Version/s 10.6.18 [ 29627 ]
            Fix Version/s 10.6 [ 24028 ]
            Fix Version/s 10.11 [ 27614 ]
            Fix Version/s 11.0 [ 28320 ]
            Fix Version/s 11.1 [ 28549 ]
            Fix Version/s 11.3 [ 28565 ]
            Fix Version/s 11.2 [ 28603 ]
            Fix Version/s 11.4 [ 29301 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            JIraAutomate JiraAutomate made changes -
            Fix Version/s 10.11.8 [ 29630 ]
            Fix Version/s 11.0.6 [ 29628 ]
            Fix Version/s 11.1.5 [ 29629 ]
            Fix Version/s 11.2.4 [ 29631 ]
            Fix Version/s 11.3.3 [ 29632 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 11.3.3 [ 29632 ]

            People

              marko Marko Mäkelä
              marko Marko Mäkelä