Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17813

Crash in instant ALTER TABLE due to purge concurrently emptying table

Details

    Description

      The following deterministic test case reproduces a problem that was reported in MDEV-17721:

      --source include/have_innodb.inc
      --source include/have_debug.inc
      --source include/have_debug_sync.inc
       
      SET @saved_frequency = @@GLOBAL.innodb_purge_rseg_truncate_frequency;
      SET GLOBAL innodb_purge_rseg_truncate_frequency = 1;
       
      CREATE TABLE t1 (f2 INT) ENGINE=InnoDB;
      INSERT INTO t1 SET f2=1;
      ALTER TABLE t1 ADD COLUMN f1 INT;
       
      connect (purge_control,localhost,root);
      START TRANSACTION WITH CONSISTENT SNAPSHOT;
       
      connection default;
      DELETE FROM t1;
       
      SET DEBUG_SYNC='innodb_commit_inplace_alter_table_enter SIGNAL go WAIT_FOR do';
      send ALTER TABLE t1 ADD COLUMN f3 INT;
       
      connection purge_control;
      SET DEBUG_SYNC='now WAIT_FOR go';
      COMMIT;
      --source include/wait_all_purged.inc
      SET DEBUG_SYNC='now SIGNAL do';
      disconnect purge_control;
       
      connection default;
      reap;
      SET DEBUG_SYNC=RESET;
      DROP TABLE t1;
      SET GLOBAL innodb_purge_rseg_truncate_frequency = @saved_frequency;
      

      This does not crash in 10.3, so MDEV-11369 does not seem to be affected.
      The original test did not do ALTER TABLE t1 ADD COLUMN f3 INT but the following, which would only be instantaneous after MDEV-15562:

      send ALTER TABLE t1 DROP f2;
      

      The crash looks like this:

      10.4 46a411088c5abbacda4e52e89f4dbcf52a451f7a

      mysqld: /mariadb/10.4/storage/innobase/handler/handler0alter.cc:338: void dict_index_t::instant_add_field(const dict_index_t&): Assertion `instant.n_core_fields == n_core_fields' failed.
      ...
      #4  0x0000555556186653 in dict_index_t::instant_add_field (
          this=0x7fffa80238d8, instant=...)
          at /mariadb/10.4/storage/innobase/handler/handler0alter.cc:338
      #5  0x0000555556187a6d in dict_table_t::instant_column (this=0x7fffa8020348, 
          table=..., col_map=0x7fffa8035de8)
          at /mariadb/10.4/storage/innobase/handler/handler0alter.cc:487
      #6  0x0000555556170908 in innobase_instant_try (ha_alter_info=0x7ffff41f2b60, 
          ctx=0x7fffa80176b0, altered_table=0x7fffa8036bd8, table=0x7fffa8025e78, 
          trx=0x7ffff4cec300)
          at /mariadb/10.4/storage/innobase/handler/handler0alter.cc:5294
      

      In order to fix this, we must rebuild the ctx->instant_table metadata in innobase_instant_try() if ctx->old_table->is_instant() != ctx->old_was_instant. And while doing that, we must already be holding an exclusive latch on the leftmost leaf page, so that purge cannot empty the table. This implies that the following code has to be moved earlier in the function:

      	mtr_t mtr;
      	mtr.start();
      	index->set_modified(mtr);
      	btr_pcur_t pcur;
      	btr_pcur_open_at_index_side(true, index, BTR_MODIFY_TREE, &pcur, true,
      				    0, &mtr);
      

      Furthermore, there appears to be a possible race condition earlier during the ALTER TABLE. Both during instant_alter_column_possible() and ctx->prepare_instant(), a purge thread could invoke dict_index_t::clear_instant_add() when the table becomes empty. That should not be a problem in MDEV-11369 in 10.3, but it is serious in 10.4, because index fields in the current dict_table_t object can be removed or permuted. It ought to be possible to prevent this by holding index->lock in S mode, because that should prevent modifications of the root page, which is the page that needs to be modified when the table becomes empty.

      Because of partitioned tables, I think that each inplace_alter function in ha_innobase must acquire and release index->lock independently. I do not see any way to avoid rebuilding the the ctx->instant_table metadata.

      Attachments

        Issue Links

          Activity

            Purge is not holding index->lock here, so I think that we must acquire the root page latch in order to protect index->n_core_fields and index->n_fields during instant ALTER TABLE:

            #8  0x0000555a2d95e651 in dict_index_t::clear_instant_alter (this=0x7f9de4062298) at /mariadb/10.4/storage/innobase/include/dict0mem.h:2149
            #9  0x0000555a2db68069 in btr_cur_optimistic_delete_func (cursor=0x555a2ff12810, flags=0, mtr=0x7f9e1e7081b0) at /mariadb/10.4/storage/innobase/btr/btr0cur.cc:5629
            #10 0x0000555a2da9385a in row_purge_remove_clust_if_poss_low (node=0x555a2ff12778, mode=2) at /mariadb/10.4/storage/innobase/row/row0purge.cc:169
            

            marko Marko Mäkelä added a comment - Purge is not holding index->lock here, so I think that we must acquire the root page latch in order to protect index->n_core_fields and index->n_fields during instant ALTER TABLE : #8 0x0000555a2d95e651 in dict_index_t::clear_instant_alter (this=0x7f9de4062298) at /mariadb/10.4/storage/innobase/include/dict0mem.h:2149 #9 0x0000555a2db68069 in btr_cur_optimistic_delete_func (cursor=0x555a2ff12810, flags=0, mtr=0x7f9e1e7081b0) at /mariadb/10.4/storage/innobase/btr/btr0cur.cc:5629 #10 0x0000555a2da9385a in row_purge_remove_clust_if_poss_low (node=0x555a2ff12778, mode=2) at /mariadb/10.4/storage/innobase/row/row0purge.cc:169

            Purge should never reduce index->n_fields or touch index->fields, because it should only empty the root page when !index->table->instant. I will add a debug assertion to row_purge_remove_clust_if_poss_low() to ensure this. The assertion did not fail when running tests in mysql-test-run.

            Therefore, the race condition should be limited to index->n_core_fields and index->n_core_null_bytes. It should suffice to copy these values from ctx->old_table to ctx->instant_table in innobase_instant_try() after acquiring the leftmost leaf page latch.

            There still is the question whether dict_index_t::clear_instant_alter() could clear index->table->instant in the rollback of a recovered instant ALTER TABLE transaction, while a new instant ALTER TABLE is executing on the same table. The answer is no, because InnoDB only allows at most one DDL transaction to be active at a time, and it would roll back any incomplete DDL transaction before letting the server accept any client connections. If this were not the case, we should acquire and release a shared or exclusive InnoDB table lock before invoking ctx->prepare_instant(), to ensure that no conflicting transactions exist.

            marko Marko Mäkelä added a comment - Purge should never reduce index->n_fields or touch index->fields , because it should only empty the root page when !index->table->instant . I will add a debug assertion to row_purge_remove_clust_if_poss_low() to ensure this. The assertion did not fail when running tests in mysql-test-run . Therefore, the race condition should be limited to index->n_core_fields and index->n_core_null_bytes . It should suffice to copy these values from ctx->old_table to ctx->instant_table in innobase_instant_try() after acquiring the leftmost leaf page latch. There still is the question whether dict_index_t::clear_instant_alter() could clear index->table->instant in the rollback of a recovered instant ALTER TABLE transaction, while a new instant ALTER TABLE is executing on the same table. The answer is no, because InnoDB only allows at most one DDL transaction to be active at a time, and it would roll back any incomplete DDL transaction before letting the server accept any client connections. If this were not the case, we should acquire and release a shared or exclusive InnoDB table lock before invoking ctx->prepare_instant() , to ensure that no conflicting transactions exist.

            At the start of innobase_instant_try() we are updating the InnoDB internal data dictionary. At this time, it is not possible to hold any latch in the user table, so index->n_core_fields would necessarily be unprotected. But, we can acquire the page latch before updating the InnoDB dictionary and commit the mini-transaction before updating the data dictionary tables.

            Is there any possibility of race condition during dict_table_t::rollback_instant(), that is, when an instant ALTER TABLE operation is rolled back? Purge would remove the metadata record unless it was for the MDEV-11369 instant ADD COLUMN. In that case, dict_index_t::clear_instant_add() called by dict_index_t::clear_instant_alter() would reset index->n_core_fields and index->n_core_null_bytes. These fields are not touched by dict_table_t::rollback_instant(). However, dict_index_t::clear_instant_add() would also invoke fields[i].col->clear_instant() on all index columns. This is something that is missing from dict_table_t::rollback_instant(), and race conditions must be prevented by holding a latch on the leftmost leaf page.

            marko Marko Mäkelä added a comment - At the start of innobase_instant_try() we are updating the InnoDB internal data dictionary. At this time, it is not possible to hold any latch in the user table, so index->n_core_fields would necessarily be unprotected. But, we can acquire the page latch before updating the InnoDB dictionary and commit the mini-transaction before updating the data dictionary tables. Is there any possibility of race condition during dict_table_t::rollback_instant() , that is, when an instant ALTER TABLE operation is rolled back? Purge would remove the metadata record unless it was for the MDEV-11369 instant ADD COLUMN . In that case, dict_index_t::clear_instant_add() called by dict_index_t::clear_instant_alter() would reset index->n_core_fields and index->n_core_null_bytes . These fields are not touched by dict_table_t::rollback_instant() . However, dict_index_t::clear_instant_add() would also invoke fields[i].col->clear_instant() on all index columns. This is something that is missing from dict_table_t::rollback_instant() , and race conditions must be prevented by holding a latch on the leftmost leaf page.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.