[MDEV-24094] Race condition and hang in INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Not a Bug
Affects Version/s: 10.5.7
Fix Version/s: N/A
Component/s: Information Schema, Storage Engine - InnoDB
Labels:
- regression

Description

The reduction of fil_system.mutex contention introduced a race condition that can hang the server when the view INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION is accessed. If space->is_stopping() becomes true right after the check, we would attempt to acquire fil_system.mutex while already holding it.

The fix is simple (and also clarifies what this is about):

diff --git a/storage/innobase/handler/i_s.cc b/storage/innobase/handler/i_s.cc
index ecc0905de56..4c9098f9a86 100644
--- a/storage/innobase/handler/i_s.cc
+++ b/storage/innobase/handler/i_s.cc
@@ -7055,8 +7055,7 @@ i_s_tablespaces_encryption_fill_table(
 	for (fil_space_t* space = UT_LIST_GET_FIRST(fil_system.space_list);
 	     space; space = UT_LIST_GET_NEXT(space_list, space)) {
 		if (space->purpose == FIL_TYPE_TABLESPACE
-		    && !space->is_stopping()) {
-			space->reacquire();
+		    && space->acquire_if_not_stopped(true)) {
 			mutex_exit(&fil_system.mutex);
 			if (int err = i_s_dict_fill_tablespaces_encryption(
 				    thd, space, tables->table)) {

To avoid the race condition, we must atomically check the is_stopping() flag while incrementing the reference count.
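
The idea of checking a stop flag and incrementing a reference count in one atomic step can be sketched as follows. This is only an illustration under stated assumptions, not the InnoDB implementation: the class RefCounted, the method try_acquire(), and the choice of packing the flag into the top bit of the counter are all hypothetical names and design choices made for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a reference-counted object that can be
// marked as "stopping". This is NOT MariaDB's fil_space_t.
class RefCounted {
  // Assumption: the top bit is the "stopping" flag and the low
  // 31 bits hold the reference count, so both live in one word.
  static constexpr uint32_t STOPPING = 1U << 31;
  std::atomic<uint32_t> state{0};

public:
  // Take a reference unless the object is stopping. The flag test and
  // the increment are committed by a single compare-and-swap, so a
  // stop request cannot slip in between the check and the increment.
  bool try_acquire() {
    uint32_t old = state.load(std::memory_order_relaxed);
    do {
      if (old & STOPPING)
        return false;  // stopping: do not take a reference
      // On failure, compare_exchange_weak reloads `old` and we re-check.
    } while (!state.compare_exchange_weak(old, old + 1,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed));
    return true;
  }

  // Drop a reference previously obtained with try_acquire().
  void release() {
    uint32_t old = state.fetch_sub(1, std::memory_order_release);
    assert((old & ~STOPPING) != 0);  // must have held a reference
    (void)old;
  }

  // Mark the object as stopping; existing references stay valid.
  void set_stopping() { state.fetch_or(STOPPING, std::memory_order_acq_rel); }

  bool is_stopping() const {
    return (state.load(std::memory_order_acquire) & STOPPING) != 0;
  }

  uint32_t references() const {
    return state.load(std::memory_order_acquire) & ~STOPPING;
  }
};
```

With a separate is_stopping() check followed by an unconditional increment, another thread could set the flag between the two steps. Keeping both in one word and updating it with a compare-and-swap closes that window.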

Activity

Marko Mäkelä added a comment - 2020-11-02 10:11

This is not a bug after all. I encountered this while rebasing MDEV-21452 to the latest 10.5. While resolving conflicts, I replaced space->reacquire() above with space->acquire(), which would attempt to open the file if it was closed. It was not even a race condition: space->is_open() is a separate flag from space->is_stopping().

This bug only existed in my local repository and was never pushed anywhere. I am sorry for the noise.

People

Assignee: Marko Mäkelä
Reporter: Marko Mäkelä
Votes: 0
Watchers: 1

Dates

Created: 2020-11-02 09:00
Updated: 2020-11-02 10:11
Resolved: 2020-11-02 10:11

