Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14511

Use fewer transactions for updating InnoDB persistent statistics

Details

    Description

      The function dict_stats_save() is problematic. If it ever goes to 'goto err' it will attempt to free an active transaction object, without trying to roll it back first.

      Furthermore, the first call to dict_stats_exec_sql() will use a different transaction object, that is, the update of table statistics is not a single atomic transaction with the update of index statistics.

      Attachments

        Issue Links

          Activity

            When porting the fix from 10.2 to 10.0, I noticed that more work is needed so that the test would pass.

            diff --git a/mysql-test/suite/innodb/t/innodb_stats_debug.test b/mysql-test/suite/innodb/t/innodb_stats_debug.test
            new file mode 100644
            index 00000000000..c0b5aff15ea
            --- /dev/null
            +++ b/mysql-test/suite/innodb/t/innodb_stats_debug.test
            @@ -0,0 +1,11 @@
            +--source include/have_innodb.inc
            +--source include/have_debug.inc
            +
            +CREATE TABLE t1 (a INT, KEY(a)) ENGINE=INNODB STATS_PERSISTENT=1;
            +SET @save_debug= @@SESSION.debug_dbug;
            +SET debug_dbug= '+d,stats_index_error';
            +ANALYZE TABLE t1;
            +SET debug_dbug= @save_debug;
            +ANALYZE TABLE t1;
            +
            +DROP TABLE t1;
            diff --git a/storage/innobase/dict/dict0stats.cc b/storage/innobase/dict/dict0stats.cc
            index 0063b6b9ed4..b05a371fe06 100644
            --- a/storage/innobase/dict/dict0stats.cc
            +++ b/storage/innobase/dict/dict0stats.cc
            @@ -2410,8 +2410,9 @@ dict_stats_save(
             	dict_fs2utf8(table->name, db_utf8, sizeof(db_utf8),
             		     table_utf8, sizeof(table_utf8));
             
            -	rw_lock_x_lock(&dict_operation_lock);
            -	mutex_enter(&dict_sys->mutex);
            +	trx_t*	trx = trx_allocate_for_background();
            +
            +	row_mysql_lock_data_dictionary(trx);
             
             	/* MySQL's timestamp is 4 byte, so we use
             	pars_info_add_int4_literal() which takes a lint arg, so "now" is
            @@ -2429,6 +2430,8 @@ dict_stats_save(
             	pars_info_add_ull_literal(pinfo, "sum_of_other_index_sizes",
             		table->stat_sum_of_other_index_sizes);
             
            +	trx_start_if_not_started(trx);
            +
             	ret = dict_stats_exec_sql(
             		pinfo,
             		"PROCEDURE TABLE_STATS_SAVE () IS\n"
            @@ -2449,31 +2452,14 @@ dict_stats_save(
             		":clustered_index_size,\n"
             		":sum_of_other_index_sizes\n"
             		");\n"
            -		"END;", NULL);
            -
            -	if (ret != DB_SUCCESS) {
            -		char	buf[MAX_FULL_NAME_LEN];
            -		ut_print_timestamp(stderr);
            -		fprintf(stderr,
            -			" InnoDB: Cannot save table statistics for table "
            -			"%s: %s\n",
            -			ut_format_name(table->name, TRUE, buf, sizeof(buf)),
            -			ut_strerr(ret));
            -
            -		mutex_exit(&dict_sys->mutex);
            -		rw_lock_x_unlock(&dict_operation_lock);
            +		"END;", trx);
             
            -		dict_stats_snapshot_free(table);
            +	index_map_t	indexes;
             
            -		return(ret);
            +	if (ret != DB_SUCCESS) {
            +		goto end;
             	}
             
            -	trx_t*	trx = trx_allocate_for_background();
            -	trx_start_if_not_started(trx);
            -
            -	dict_index_t*	index;
            -	index_map_t	indexes;
            -
             	/* Below we do all the modifications in innodb_index_stats in a single
             	transaction for performance reasons. Modifying more than one row in a
             	single transaction may deadlock with other transactions if they
            @@ -2486,18 +2472,17 @@ dict_stats_save(
             	stat_name). This is why below we sort the indexes by name and then
             	for each index, do the mods ordered by stat_name. */
             
            -	for (index = dict_table_get_first_index(table);
            +	for (dict_index_t* index = dict_table_get_first_index(table);
             	     index != NULL;
             	     index = dict_table_get_next_index(index)) {
             
             		indexes[index->name] = index;
             	}
             
            -	index_map_t::const_iterator	it;
            -
            -	for (it = indexes.begin(); it != indexes.end(); ++it) {
            +	for (index_map_t::const_iterator it = indexes.begin();
            +	     it != indexes.end(); ++it) {
             
            -		index = it->second;
            +		dict_index_t* index = it->second;
             
             		if (only_for_index != NULL && index->id != *only_for_index) {
             			continue;
            @@ -2562,14 +2547,20 @@ dict_stats_save(
             		}
             	}
             
            -	trx_commit_for_mysql(trx);
            -
            +	if (ret != DB_SUCCESS) {
            +		char	buf[MAX_FULL_NAME_LEN];
             end:
            +		ib_logf(IB_LOG_LEVEL_ERROR,
            +			"Cannot save table statistics for table %s: %s\n",
            +			ut_format_name(table->name, TRUE, buf, sizeof(buf)),
            +			ut_strerr(ret));
            +		trx_rollback_to_savepoint(trx, NULL);
            +	} else {
            +		trx_commit_for_mysql(trx);
            +	}
            +	row_mysql_unlock_data_dictionary(trx);
             	trx_free_for_background(trx);
             
            -	mutex_exit(&dict_sys->mutex);
            -	rw_lock_x_unlock(&dict_operation_lock);
            -
             	dict_stats_snapshot_free(table);
             
             	return(ret);
            

            The function dict_stats_exec_sql() must be rewritten so that it always takes a trx_t* parameter, and never calls commit or rollback. The caller should invoke row_mysql_lock_data_dictionary(trx) and row_mysql_unlock_data_dictionary(trx) in addition to deciding whether to commit or rollback.

            marko Marko Mäkelä added a comment - When porting the fix from 10.2 to 10.0, I noticed that more work is needed so that the test would pass. diff --git a/mysql-test/suite/innodb/t/innodb_stats_debug.test b/mysql-test/suite/innodb/t/innodb_stats_debug.test new file mode 100644 index 00000000000..c0b5aff15ea --- /dev/null +++ b/mysql-test/suite/innodb/t/innodb_stats_debug.test @@ -0,0 +1,11 @@ +--source include/have_innodb.inc +--source include/have_debug.inc + +CREATE TABLE t1 (a INT, KEY(a)) ENGINE=INNODB STATS_PERSISTENT=1; +SET @save_debug= @@SESSION.debug_dbug; +SET debug_dbug= '+d,stats_index_error'; +ANALYZE TABLE t1; +SET debug_dbug= @save_debug; +ANALYZE TABLE t1; + +DROP TABLE t1; diff --git a/storage/innobase/dict/dict0stats.cc b/storage/innobase/dict/dict0stats.cc index 0063b6b9ed4..b05a371fe06 100644 --- a/storage/innobase/dict/dict0stats.cc +++ b/storage/innobase/dict/dict0stats.cc @@ -2410,8 +2410,9 @@ dict_stats_save( dict_fs2utf8(table->name, db_utf8, sizeof(db_utf8), table_utf8, sizeof(table_utf8)); - rw_lock_x_lock(&dict_operation_lock); - mutex_enter(&dict_sys->mutex); + trx_t* trx = trx_allocate_for_background(); + + row_mysql_lock_data_dictionary(trx); /* MySQL's timestamp is 4 byte, so we use pars_info_add_int4_literal() which takes a lint arg, so "now" is @@ -2429,6 +2430,8 @@ dict_stats_save( pars_info_add_ull_literal(pinfo, "sum_of_other_index_sizes", table->stat_sum_of_other_index_sizes); + trx_start_if_not_started(trx); + ret = dict_stats_exec_sql( pinfo, "PROCEDURE TABLE_STATS_SAVE () IS\n" @@ -2449,31 +2452,14 @@ dict_stats_save( ":clustered_index_size,\n" ":sum_of_other_index_sizes\n" ");\n" - "END;", NULL); - - if (ret != DB_SUCCESS) { - char buf[MAX_FULL_NAME_LEN]; - ut_print_timestamp(stderr); - fprintf(stderr, - " InnoDB: Cannot save table statistics for table " - "%s: %s\n", - ut_format_name(table->name, TRUE, buf, sizeof(buf)), - ut_strerr(ret)); - - mutex_exit(&dict_sys->mutex); - rw_lock_x_unlock(&dict_operation_lock); + "END;", trx); - dict_stats_snapshot_free(table); + index_map_t indexes; - return(ret); + if (ret != DB_SUCCESS) { + goto end; } - trx_t* trx = trx_allocate_for_background(); - trx_start_if_not_started(trx); - - dict_index_t* index; - index_map_t indexes; - /* Below we do all the modifications in innodb_index_stats in a single transaction for performance reasons. Modifying more than one row in a single transaction may deadlock with other transactions if they @@ -2486,18 +2472,17 @@ dict_stats_save( stat_name). This is why below we sort the indexes by name and then for each index, do the mods ordered by stat_name. */ - for (index = dict_table_get_first_index(table); + for (dict_index_t* index = dict_table_get_first_index(table); index != NULL; index = dict_table_get_next_index(index)) { indexes[index->name] = index; } - index_map_t::const_iterator it; - - for (it = indexes.begin(); it != indexes.end(); ++it) { + for (index_map_t::const_iterator it = indexes.begin(); + it != indexes.end(); ++it) { - index = it->second; + dict_index_t* index = it->second; if (only_for_index != NULL && index->id != *only_for_index) { continue; @@ -2562,14 +2547,20 @@ dict_stats_save( } } - trx_commit_for_mysql(trx); - + if (ret != DB_SUCCESS) { + char buf[MAX_FULL_NAME_LEN]; end: + ib_logf(IB_LOG_LEVEL_ERROR, + "Cannot save table statistics for table %s: %s\n", + ut_format_name(table->name, TRUE, buf, sizeof(buf)), + ut_strerr(ret)); + trx_rollback_to_savepoint(trx, NULL); + } else { + trx_commit_for_mysql(trx); + } + row_mysql_unlock_data_dictionary(trx); trx_free_for_background(trx); - mutex_exit(&dict_sys->mutex); - rw_lock_x_unlock(&dict_operation_lock); - dict_stats_snapshot_free(table); return(ret); The function dict_stats_exec_sql() must be rewritten so that it always takes a trx_t* parameter, and never calls commit or rollback. The caller should invoke row_mysql_lock_data_dictionary(trx) and row_mysql_unlock_data_dictionary(trx) in addition to deciding whether to commit or rollback.

            The debug instrumentation test case alone is not enough to trigger a crash in the unmodified 10.0 or 10.1. Therefore, it is not necessary to perform this refactoring in those versions.

            That said, I prepared a 10.0 patch 0001-MDEV-14511-Use-fewer-transactions-for-updating-InnoD.patch that seems to work fine in local tests (bb-10.0-marko). I plan to port it to 10.3 where the clean-up can be applied without risking stability of GA versions.

            marko Marko Mäkelä added a comment - The debug instrumentation test case alone is not enough to trigger a crash in the unmodified 10.0 or 10.1. Therefore, it is not necessary to perform this refactoring in those versions. That said, I prepared a 10.0 patch 0001-MDEV-14511-Use-fewer-transactions-for-updating-InnoD.patch that seems to work fine in local tests ( bb-10.0-marko ). I plan to port it to 10.3 where the clean-up can be applied without risking stability of GA versions.

            Because the 10.2 and 10.3 code bases are very similar in this respect, and because MDEV-13201 would depend on this change, I decided to push this fix into 10.2 already.

            marko Marko Mäkelä added a comment - Because the 10.2 and 10.3 code bases are very similar in this respect, and because MDEV-13201 would depend on this change, I decided to push this fix into 10.2 already.

            Due to the regression MDEV-14941, this change will be reverted in MariaDB 10.2.14 and 10.3.5.

            marko Marko Mäkelä added a comment - Due to the regression MDEV-14941 , this change will be reverted in MariaDB 10.2.14 and 10.3.5.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.