Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25975

Turning on innodb_disallow_writes will cause mysqladmin shutdown to hang

Details

    Description

      SET GLOBAL innodb_disallow_writes=ON;
      # Exit the client, and execute: mysqladmin shutdown
      

      (No other SQL is required and innodb_fast_shutdown=1)

      Will cause mysqladmin shutdown to hang, on debug and optimized builds alike.

      Attachments

        Issue Links

          Activity

            Roel Roel Van de Paar added a comment - - edited

            While mysqladmin shutdown will hang, a SHUTDOWN command issued at the client will not hang (expected as SHUTDOWN does not wait for shutdown to finalized), but the PID will still be live for an undefined amount of time.

            This is an additional concern as an automatic restart will run into issues.

            SET GLOBAL innodb_disallow_writes=ON;
            SHUTDOWN;
            

            Leads to:

            10.3.30 75a65d3201a4486af96cf3277b6c5a4ba460eef7 (Optimized)

            2021-06-21 20:37:30 0 [ERROR] mysqld: Can't lock aria control file '/test/MD140621-mariadb-10.3.30-linux-x86_64-opt/data/aria_log_control' for exclusive use, error: 11. Will retry for 30 seconds
            2021-06-21 20:38:01 0 [ERROR] mysqld: Got error 'Could not get an exclusive lock; file is probably in use by another process' when trying to use aria control file '/test/MD140621-mariadb-10.3.30-linux-x86_64-opt/data/aria_log_control'
            2021-06-21 20:38:01 0 [ERROR] Plugin 'Aria' init function returned error.
            2021-06-21 20:38:01 0 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
            2021-06-21 20:38:01 0 [Note] InnoDB: Using Linux native AIO
            2021-06-21 20:38:01 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            2021-06-21 20:38:01 0 [Note] InnoDB: Uses event mutexes
            2021-06-21 20:38:01 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
            2021-06-21 20:38:01 0 [Note] InnoDB: Number of pools: 1
            2021-06-21 20:38:01 0 [Note] InnoDB: Using SSE2 crc32 instructions
            2021-06-21 20:38:01 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
            2021-06-21 20:38:01 0 [Note] InnoDB: Completed initialization of buffer pool
            2021-06-21 20:38:01 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
            2021-06-21 20:38:01 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
            2021-06-21 20:38:01 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
            2021-06-21 20:38:01 0 [Note] InnoDB: Retrying to lock the first data file
            2021-06-21 20:38:02 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
            2021-06-21 20:38:02 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
            2021-06-21 20:38:03 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
            2021-06-21 20:38:03 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
            2021-06-21 20:38:04 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
            2021-06-21 20:38:04 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
            ...
            2021-06-21 20:39:41 0 [Note] InnoDB: Unable to open the first data file
            2021-06-21 20:39:41 0 [ERROR] InnoDB: Operating system error number 11 in a file operation.
            2021-06-21 20:39:41 0 [ERROR] InnoDB: Error number 11 means 'Resource temporarily unavailable'
            2021-06-21 20:39:41 0 [Note] InnoDB: Some operating system error numbers are described at https://mariadb.com/kb/en/library/operating-system-error-codes/
            2021-06-21 20:39:41 0 [ERROR] InnoDB: Cannot open datafile './ibdata1'
            2021-06-21 20:39:41 0 [ERROR] InnoDB: Could not open or create the system tablespace. If you tried to add new data files to the system tablespace, and it failed here, you should now edit innodb_data_file_path in my.cnf back to what it was, and remove the new ibdata files InnoDB created in this failed attempt. InnoDB only wrote those files full of zeros, but did not yet use them in any way. But be careful: do not remove old data files which contain your precious data!
            2021-06-21 20:39:41 0 [ERROR] InnoDB: Plugin initialization aborted with error Cannot open a file
            2021-06-21 20:39:41 0 [Note] InnoDB: Starting shutdown...
            2021-06-21 20:39:43 0 [ERROR] Plugin 'InnoDB' init function returned error.
            2021-06-21 20:39:43 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
            2021-06-21 20:39:43 0 [Note] Plugin 'FEEDBACK' is disabled.
            2021-06-21 20:39:43 0 [ERROR] Unknown/unsupported storage engine: InnoDB
            2021-06-21 20:39:43 0 [ERROR] Aborting
            

            Roel Roel Van de Paar added a comment - - edited While mysqladmin shutdown will hang, a SHUTDOWN command issued at the client will not hang (expected as SHUTDOWN does not wait for shutdown to finalized), but the PID will still be live for an undefined amount of time. This is an additional concern as an automatic restart will run into issues. SET GLOBAL innodb_disallow_writes=ON; SHUTDOWN; Leads to: 10.3.30 75a65d3201a4486af96cf3277b6c5a4ba460eef7 (Optimized) 2021-06-21 20:37:30 0 [ERROR] mysqld: Can't lock aria control file '/test/MD140621-mariadb-10.3.30-linux-x86_64-opt/data/aria_log_control' for exclusive use, error: 11. Will retry for 30 seconds 2021-06-21 20:38:01 0 [ERROR] mysqld: Got error 'Could not get an exclusive lock; file is probably in use by another process' when trying to use aria control file '/test/MD140621-mariadb-10.3.30-linux-x86_64-opt/data/aria_log_control' 2021-06-21 20:38:01 0 [ERROR] Plugin 'Aria' init function returned error. 2021-06-21 20:38:01 0 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed. 2021-06-21 20:38:01 0 [Note] InnoDB: Using Linux native AIO 2021-06-21 20:38:01 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2021-06-21 20:38:01 0 [Note] InnoDB: Uses event mutexes 2021-06-21 20:38:01 0 [Note] InnoDB: Compressed tables use zlib 1.2.11 2021-06-21 20:38:01 0 [Note] InnoDB: Number of pools: 1 2021-06-21 20:38:01 0 [Note] InnoDB: Using SSE2 crc32 instructions 2021-06-21 20:38:01 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M 2021-06-21 20:38:01 0 [Note] InnoDB: Completed initialization of buffer pool 2021-06-21 20:38:01 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority(). 2021-06-21 20:38:01 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11 2021-06-21 20:38:01 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files. 2021-06-21 20:38:01 0 [Note] InnoDB: Retrying to lock the first data file 2021-06-21 20:38:02 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11 2021-06-21 20:38:02 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files. 2021-06-21 20:38:03 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11 2021-06-21 20:38:03 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files. 2021-06-21 20:38:04 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11 2021-06-21 20:38:04 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files. ... 2021-06-21 20:39:41 0 [Note] InnoDB: Unable to open the first data file 2021-06-21 20:39:41 0 [ERROR] InnoDB: Operating system error number 11 in a file operation. 2021-06-21 20:39:41 0 [ERROR] InnoDB: Error number 11 means 'Resource temporarily unavailable' 2021-06-21 20:39:41 0 [Note] InnoDB: Some operating system error numbers are described at https://mariadb.com/kb/en/library/operating-system-error-codes/ 2021-06-21 20:39:41 0 [ERROR] InnoDB: Cannot open datafile './ibdata1' 2021-06-21 20:39:41 0 [ERROR] InnoDB: Could not open or create the system tablespace. If you tried to add new data files to the system tablespace, and it failed here, you should now edit innodb_data_file_path in my.cnf back to what it was, and remove the new ibdata files InnoDB created in this failed attempt. InnoDB only wrote those files full of zeros, but did not yet use them in any way. But be careful: do not remove old data files which contain your precious data! 2021-06-21 20:39:41 0 [ERROR] InnoDB: Plugin initialization aborted with error Cannot open a file 2021-06-21 20:39:41 0 [Note] InnoDB: Starting shutdown... 2021-06-21 20:39:43 0 [ERROR] Plugin 'InnoDB' init function returned error. 2021-06-21 20:39:43 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed. 2021-06-21 20:39:43 0 [Note] Plugin 'FEEDBACK' is disabled. 2021-06-21 20:39:43 0 [ERROR] Unknown/unsupported storage engine: InnoDB 2021-06-21 20:39:43 0 [ERROR] Aborting

            I think that the parameter is implemented incorrectly. The correct implementation would be to block anything that would write something, at the highest possible level. Already in log_free_check() it is too late, because that function is not allowed to return any error, only delay the execution.

            A proper design should be something like this:

            1. Set a flag that will cause any future writes to be aborted with an error (or be blocked until the connection is killed).
            2. Any background operations that perform writes, such as purge, encryption threads, statistics and defragmentation threads, must be stopped.
            3. Wait for any pending writes to complete. This should include waiting for the rollback of any transactions. And it must include waiting for the page cleaner to be idle.
            4. Return the control to the caller of SET GLOBAL.
            marko Marko Mäkelä added a comment - I think that the parameter is implemented incorrectly. The correct implementation would be to block anything that would write something, at the highest possible level. Already in log_free_check() it is too late, because that function is not allowed to return any error, only delay the execution. A proper design should be something like this: Set a flag that will cause any future writes to be aborted with an error (or be blocked until the connection is killed). Any background operations that perform writes, such as purge, encryption threads, statistics and defragmentation threads, must be stopped. Wait for any pending writes to complete. This should include waiting for the rollback of any transactions. And it must include waiting for the page cleaner to be idle. Return the control to the caller of SET GLOBAL .
            Roel Roel Van de Paar added a comment - - edited

            Additionally, using the test case in my previous comment above, there is a 'Memory not freed' issue in 10.3;

            10.3.30 75a65d3201a4486af96cf3277b6c5a4ba460eef7 (Optimized)

            Warning: Memory not freed: 520
            

            10.4.20 and 10.5.11 show the Aria SE issue also, but not the 'Memory not freed' issue. Same result for InnnoDB.
            10.6.2 does not show the Aria issue, and does not produce the 'Memory not freed' issue. Same result for InnnoDB.

            10.6.2 6c39eaeb126328e7813b146ecf652d51e4508981 (Debug)

            [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
            ...
            2021-06-21 20:44:21 0 [ERROR] Unknown/unsupported storage engine: InnoDB
            2021-06-21 20:44:21 0 [ERROR] Aborting
            

            10.2 Does not show the Aria issue nor the 'Memory not freed' issue but initiates a crash recovery:

            10.2.39 (Optimized)

            2021-06-21 21:40:30 23452867323840 [Note] Recovering after a crash using tc.log
            2021-06-21 21:40:30 23452867323840 [Note] Starting crash recovery...
            

            Roel Roel Van de Paar added a comment - - edited Additionally, using the test case in my previous comment above, there is a 'Memory not freed' issue in 10.3; 10.3.30 75a65d3201a4486af96cf3277b6c5a4ba460eef7 (Optimized) Warning: Memory not freed: 520 10.4.20 and 10.5.11 show the Aria SE issue also, but not the 'Memory not freed' issue. Same result for InnnoDB. 10.6.2 does not show the Aria issue, and does not produce the 'Memory not freed' issue. Same result for InnnoDB. 10.6.2 6c39eaeb126328e7813b146ecf652d51e4508981 (Debug) [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11 ... 2021-06-21 20:44:21 0 [ERROR] Unknown/unsupported storage engine: InnoDB 2021-06-21 20:44:21 0 [ERROR] Aborting 10.2 Does not show the Aria issue nor the 'Memory not freed' issue but initiates a crash recovery: 10.2.39 (Optimized) 2021-06-21 21:40:30 23452867323840 [Note] Recovering after a crash using tc.log 2021-06-21 21:40:30 23452867323840 [Note] Starting crash recovery...
            Roel Roel Van de Paar added a comment - - edited

            I am also seeing this:

            10.7.1 12eb8ad7b98b03a6a7659fce7b75bdc8696ccaf6 (Debug)

            safe_mutex: Trying to lock mutex at /test/10.7_dbg/storage/innobase/handler/ha_innodb.cc, line 19471, when the mutex was already locked at /test/10.7_dbg/sql/sys_vars_shared.h, line 56 in thread T@52
            

            And was able to reduce the testcase for "ha_innodb.cc, line 19471, when the mutex was already locked at /test/10.7_dbg/sql/sys_vars_shared.h, line 56" in the error log to the following (not all statements may be required, the issue may be is highly sporadic, and a connector/C client may be is not required; CLI replay suffices):

            # mysqld options required for replay: --log-bin 
            SET GLOBAL innodb_disallow_writes=ON;
            SET GLOBAL expire_logs_days=100;
            SET GLOBAL binlog_checksum=NONE;
            

            Attempts to reproduce the mutex note using CLI have failed thus far now succeeded, though and the server also customary hangs on shutdown as described above.

            Roel Roel Van de Paar added a comment - - edited I am also seeing this: 10.7.1 12eb8ad7b98b03a6a7659fce7b75bdc8696ccaf6 (Debug) safe_mutex: Trying to lock mutex at /test/10.7_dbg/storage/innobase/handler/ha_innodb.cc, line 19471, when the mutex was already locked at /test/10.7_dbg/sql/sys_vars_shared.h, line 56 in thread T@52 And was able to reduce the testcase for "ha_innodb.cc, line 19471, when the mutex was already locked at /test/10.7_dbg/sql/sys_vars_shared.h, line 56" in the error log to the following (not all statements may be required, the issue may be is highly sporadic, and a connector/C client may be is not required; CLI replay suffices ): # mysqld options required for replay: --log-bin SET GLOBAL innodb_disallow_writes= ON ; SET GLOBAL expire_logs_days=100; SET GLOBAL binlog_checksum=NONE; Attempts to reproduce the mutex note using CLI have failed thus far now succeeded , though and the server also customary hangs on shutdown as described above.
            marko Marko Mäkelä added a comment - - edited

            Please test and review the following:
            bb-10.3-MDEV-25795
            bb-10.4-MDEV-25795
            bb-10.5-MDEV-25795
            bb-10.6-MDEV-25795

            This differs from what you submitted for my review in MDEV-24845 in that instead of introducing locking to some InnoDB internal threads and tasks, for the duration of the Galera snapshot transfer we will disable those threads and tasks. Only the fts_optimize_task must keep running, and we will avoid writes, except when a server shutdown is initiated during snapshot transfer.

            In 10.3 and 10.4, we will also keep the dict_stats_thread running, but disable it by checking a global variable. That is not properly locked (just like the check in fts_optimize_task or fts_optimize_thread is not), but hopefully during the time spent on log_make_checkpoint() the writes will be paused.

            marko Marko Mäkelä added a comment - - edited Please test and review the following: bb-10.3-MDEV-25795 bb-10.4-MDEV-25795 bb-10.5-MDEV-25795 bb-10.6-MDEV-25795 This differs from what you submitted for my review in MDEV-24845 in that instead of introducing locking to some InnoDB internal threads and tasks, for the duration of the Galera snapshot transfer we will disable those threads and tasks. Only the fts_optimize_task must keep running, and we will avoid writes, except when a server shutdown is initiated during snapshot transfer. In 10.3 and 10.4, we will also keep the dict_stats_thread running, but disable it by checking a global variable. That is not properly locked (just like the check in fts_optimize_task or fts_optimize_thread is not), but hopefully during the time spent on log_make_checkpoint() the writes will be paused.

            The tree
            origin/bb-10.6-MDEV-25975 a469d8501f2028324425fabe94ff7536a418a8bc 2022-04-01T11:50:49+03:00
            performed well in RQG testing(InnoDB standard test battery).
            

            mleich Matthias Leich added a comment - The tree origin/bb-10.6-MDEV-25975 a469d8501f2028324425fabe94ff7536a418a8bc 2022-04-01T11:50:49+03:00 performed well in RQG testing(InnoDB standard test battery).

            Ok to push

            jplindst Jan Lindström (Inactive) added a comment - Ok to push

            The variable innodb_disallow_writes was removed.

            marko Marko Mäkelä added a comment - The variable innodb_disallow_writes was removed.

            People

              marko Marko Mäkelä
              Roel Roel Van de Paar
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.