Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21175

Remove dict_table_t::n_foreign_key_checks_running from InnoDB

Details

    Description

      InnoDB should clean n_foreign_key_checks usage and assume that SQL is already holding
      appropriate MDL. But an Insert into the child table doesn't seem to acquire MDL on the parent. When vice-versa is happening that there is MDL on the child when insert is happening
      for the parent table.

      The following test case could repeat the issue:

      --source include/have_innodb.inc
      create table t1(f1 int not null, primary key(f1))engine=innodb;
      create table t2(f1 int not null primary key,
                      f2 int not null, foreign key(f2) references t1(f1))engine=innodb;
      set DEBUG_SYNC='ib_after_row_insert_step SIGNAL alter WAIT_FOR dml';
      --send insert into t1 values(1)
      connect(con1,localhost,root,,,);
      set DEBUG_SYNC="now WAIT_FOR alter";
      alter table t2 force;
      set DEBUG_SYNC="now SIGNAL dml";
      connection default;
      reap;
      drop table t2, t1;
      disconnect con1;
      

      In above test case, ALTER TABLE t2 FORCE waits for MDL to open the table.

      Another test case:

      --source include/have_innodb.inc
      create table t1(f1 int not null, primary key(f1))engine=innodb;
      create table t2(f1 int not null primary key,
                      f2 int not null, foreign key(f2) references t1(f1))engine=innodb;
      set DEBUG_SYNC='ib_after_row_insert_step SIGNAL alter WAIT_FOR dml';
      --send insert into t2 values(1, 2)
      connect(con1,localhost,root,,,);
      set DEBUG_SYNC="now WAIT_FOR alter";
      alter table t1 force;
      set DEBUG_SYNC="now SIGNAL dml";
      connection default;
      reap;
      drop table t2, t1;
      disconnect con1;
      

      In this case, alter waits at commit_inplace_alter_table(). That too for innodb table lock.

      #6  0x00005555565df591 in lock_wait_suspend_thread (thr=0x7fffb8030768)
          at /home/thiru/mariarepo/10.5/10.5-work/storage/innobase/lock/lock0wait.cc:351
      #7  0x00005555566bb378 in row_mysql_handle_errors (new_err=0x7ffff01c3844,
          trx=0x7ffff0cc3268, thr=0x7fffb8030768, savept=0x0)
          at /home/thiru/mariarepo/10.5/10.5-work/storage/innobase/row/row0mysql.cc:740
      #8  0x00005555565cd2f7 in lock_table_for_trx (table=0x7fffc41317e8,
          trx=0x7ffff0cc3268, mode=LOCK_X)
          at /home/thiru/mariarepo/10.5/10.5-work/storage/innobase/lock/lock0lock.cc:4070
      #9  0x00005555566ad12d in row_merge_lock_table (trx=0x7ffff0cc3268,
          table=0x7fffc41317e8, mode=LOCK_X)
          at /home/thiru/mariarepo/10.5/10.5-work/storage/innobase/row/row0merge.cc:3698
      #10 0x0000555556568d42 in ha_innobase::commit_inplace_alter_table (this=
          0x7fffb8020c08, altered_table=0x7ffff01c4fb0,
          ha_alter_info=0x7ffff01c4f20, commit=true)
          at /home/thiru/mariarepo/10.5/10.5-work/storage/innobase/handler/handler0alter.cc:10699
      #11 0x000055555611bf0a in handler::ha_commit_inplace_alter_table (
          this=0x7fffb8020c08, altered_table=0x7ffff01c4fb0,
          ha_alter_info=0x7ffff01c4f20, commit=true
      

      First case is acquiring MDL on child table (un-necessary locking)
      Second case should do MDL on parent table.

      Attachments

        Issue Links

          Activity

            In my prototype, I forgot to remove some more code, but strangely no tests were failing due to that:

            diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc
            --- a/storage/innobase/row/row0ins.cc
            +++ b/storage/innobase/row/row0ins.cc
            @@ -1395,16 +1395,6 @@ row_ins_foreign_check_on_constraint(
             	err = row_update_cascade_for_mysql(thr, cascade,
             					   foreign->foreign_table);
             
            -	/* Release the data dictionary latch for a while, so that we do not
            -	starve other threads from doing CREATE TABLE etc. if we have a huge
            -	cascaded operation running. */
            -
            -	row_mysql_unfreeze_data_dictionary(thr_get_trx(thr));
            -
            -	DEBUG_SYNC_C("innodb_dml_cascade_dict_unfreeze");
            -
            -	row_mysql_freeze_data_dictionary(thr_get_trx(thr));
            -
             	mtr_start(mtr);
             
             	/* Restore pcur position */
            

            If the tables are sufficiently protected by MDL, there is no need to hold the data dictionary latch to prevent concurrent DDL operations. At the start of this function, my fix already removed an assertion that we must be holding the latch.

            There is no public test (not even in MySQL) that would exercise the DEBUG_SYNC point.

            marko Marko Mäkelä added a comment - In my prototype, I forgot to remove some more code, but strangely no tests were failing due to that: diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc --- a/storage/innobase/row/row0ins.cc +++ b/storage/innobase/row/row0ins.cc @@ -1395,16 +1395,6 @@ row_ins_foreign_check_on_constraint( err = row_update_cascade_for_mysql(thr, cascade, foreign->foreign_table); - /* Release the data dictionary latch for a while, so that we do not - starve other threads from doing CREATE TABLE etc. if we have a huge - cascaded operation running. */ - - row_mysql_unfreeze_data_dictionary(thr_get_trx(thr)); - - DEBUG_SYNC_C("innodb_dml_cascade_dict_unfreeze"); - - row_mysql_freeze_data_dictionary(thr_get_trx(thr)); - mtr_start(mtr); /* Restore pcur position */ If the tables are sufficiently protected by MDL, there is no need to hold the data dictionary latch to prevent concurrent DDL operations. At the start of this function, my fix already removed an assertion that we must be holding the latch. There is no public test (not even in MySQL) that would exercise the DEBUG_SYNC point.
            serg Sergei Golubchik added a comment - - edited

            marko, I don't understand. Tests that you've disabled and that crash in your prototype branch are tests like

            CREATE TABLE t1 (a int, b int) engine=myisam;
            INSERT INTO t1 values (1,1),(1,2);
            --error ER_DUP_ENTRY
            CREATE TABLE t2 (primary key (a)) engine=innodb select * from t1;
            

            How is that related to the problem at hand, to MDL or foreign key referenced tables?

            And the test from the bug description, that shows that an fk-referenced table does not have a proper MDL, this does does not crash in your prototype branch.

            serg Sergei Golubchik added a comment - - edited marko , I don't understand. Tests that you've disabled and that crash in your prototype branch are tests like CREATE TABLE t1 (a int , b int ) engine=myisam; INSERT INTO t1 values (1,1),(1,2); --error ER_DUP_ENTRY CREATE TABLE t2 ( primary key (a)) engine=innodb select * from t1; How is that related to the problem at hand, to MDL or foreign key referenced tables? And the test from the bug description, that shows that an fk-referenced table does not have a proper MDL, this does does not crash in your prototype branch.

            Related to what I wrote earlier, MDEV-23484 removed unnecessary operations on dict_operation_lock on ROLLBACK.

            serg, my earlier comment in this ticket tried to explain that also MDEV-21602 is breaking some rules. Did you read the commit message that it linked to?

            MDEV-21602 CREATE TABLE…PRIMARY KEY…SELECT workaround
            causes DROP TABLE to ignore locks

            The error handling of CREATE…SELECT would invoke
            handler::delete_table() while still holding locks
            on the table, due to not having invoked handlerton::rollback
            first. InnoDB used to work around this as well.
            In MDEV-742 this was worked around further by breaking
            MDL, causing MDEV-22733.

            Unfortunately, thiru did not include enough details to reproduce the originally reported problem. At least, we would have to know the source code revision. Based on the filing date, I would guess that it is somewhere around 5a00792c69065140a34315e3c5aec00bdbef8447.

            marko Marko Mäkelä added a comment - Related to what I wrote earlier, MDEV-23484 removed unnecessary operations on dict_operation_lock on ROLLBACK . serg , my earlier comment in this ticket tried to explain that also MDEV-21602 is breaking some rules. Did you read the commit message that it linked to? MDEV-21602 CREATE TABLE…PRIMARY KEY…SELECT workaround causes DROP TABLE to ignore locks The error handling of CREATE…SELECT would invoke handler::delete_table() while still holding locks on the table, due to not having invoked handlerton::rollback first. InnoDB used to work around this as well. In MDEV-742 this was worked around further by breaking MDL, causing MDEV-22733 . Unfortunately, thiru did not include enough details to reproduce the originally reported problem. At least, we would have to know the source code revision. Based on the filing date, I would guess that it is somewhere around 5a00792c69065140a34315e3c5aec00bdbef8447.

            I think that the field can be safely removed in the third part of the MDEV-25506 fix (rewriting DROP operations).

            marko Marko Mäkelä added a comment - I think that the field can be safely removed in the third part of the MDEV-25506 fix (rewriting DROP operations).

            This was fixed in MDEV-25506 part 3.

            marko Marko Mäkelä added a comment - This was fixed in MDEV-25506 part 3.

            People

              marko Marko Mäkelä
              thiru Thirunarayanan Balathandayuthapani
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.