[MDEV-31568] innodb protection against dual processes accessing data insufficient Created: 2023-06-28  Updated: 2023-07-21  Resolved: 2023-07-05

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5.21, 10.6.14
Fix Version/s: 10.5.22, 10.6.15, 10.9.8, 10.10.6, 10.11.5, 11.0.3, 11.1.2, 11.2.1

Type: Bug Priority: Critical
Reporter: Daniel Black Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: not-10.4

Issue Links:
Duplicate
is duplicated by MDEV-31758 Missing FILE_CHECKPOINT by successful... Closed
Relates
relates to MDEV-28495 Corruption due to attempting to start... Closed
relates to MDEV-24393 table_open_cache creates same number ... Closed
relates to MDEV-28495 Corruption due to attempting to start... Closed
relates to MDEV-28976 InnoDB: Missing FILE_CHECKPOINT Closed

 Description   

MDEV-28495 protection insufficient

steps:

1. start
2. insert some data
3. start new instance on same datadir and it attempts crash recovery:

second instance

$  sql/mysqld --no-defaults --skip-networking --datadir=/tmp/build-mariadb-server-10.6-datadir --socket=/tmp/build-mariadb-server-10.6.sock --plugin-dir=/home/dan/repos/build-mariadb-server-10.6/mysql-test/var/plugins/ --verbose 
2023-06-28 17:56:59 0 [Note] Starting MariaDB 10.6.15-MariaDB source revision 694ce0d08e64238e1e2228a05a19c1cd7a20e39c as process 305981
2023-06-28 17:56:59 0 [ERROR] mysqld: Can't lock aria control file '/tmp/build-mariadb-server-10.6-datadir/aria_log_control' for exclusive use, error: 11. Will retry for 30 seconds
2023-06-28 17:57:29 0 [ERROR] mysqld: Got error 'Could not get an exclusive lock; file is probably in use by another process' when trying to use aria control file '/tmp/build-mariadb-server-10.6-datadir/aria_log_control'
2023-06-28 17:57:29 0 [ERROR] Plugin 'Aria' init function returned error.
2023-06-28 17:57:29 0 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
2023-06-28 17:57:29 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
2023-06-28 17:57:29 0 [Note] InnoDB: Number of pools: 1
2023-06-28 17:57:29 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2023-06-28 17:57:29 0 [Note] InnoDB: Using liburing
2023-06-28 17:57:29 0 [Note] InnoDB: Initializing buffer pool, total size = 134217728, chunk size = 134217728
2023-06-28 17:57:29 0 [Note] InnoDB: Completed initialization of buffer pool
2023-06-28 17:57:29 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed
2023-06-28 17:57:29 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=41302,41302
2023-06-28 17:57:29 0 [Note] InnoDB: To recover: 6 pages
2023-06-28 17:57:29 0 [Note] InnoDB: 128 rollback segments are active.
2023-06-28 17:57:29 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2023-06-28 17:57:29 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2023-06-28 17:57:29 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2023-06-28 17:57:29 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2023-06-28 17:57:29 0 [Note] InnoDB: 10.6.15 started; log sequence number 54816; transaction id 23
2023-06-28 17:57:29 0 [Note] InnoDB: Loading buffer pool(s) from /tmp/build-mariadb-server-10.6-datadir/ib_buffer_pool
2023-06-28 17:57:29 0 [Note] Plugin 'FEEDBACK' is disabled.
2023-06-28 17:57:29 0 [ERROR] Could not open mysql.plugin table: "Unknown storage engine 'Aria'". Some plugins may be not loaded
2023-06-28 17:57:29 0 [ERROR] Failed to initialize plugins.
2023-06-28 17:57:29 0 [ERROR] Aborting

Not 10.4

$ sql/mysqld --no-defaults --skip-networking --datadir=/tmp/build-mariadb-server-10.4-datadir --socket=/tmp/build-mariadb-server-10.4.sock --plugin-dir=/home/dan/repos/build-mariadb-server-10.4/mysql-test/var/plugins/ --verbose
2023-06-28 17:59:14 0 [Note] Starting MariaDB 10.4.31-MariaDB source revision f5dceafd0ba211405e04be4c778d69e8a7533d8d as process 307636
2023-06-28 17:59:14 0 [ERROR] mysqld: Can't lock aria control file '/tmp/build-mariadb-server-10.4-datadir/aria_log_control' for exclusive use, error: 11. Will retry for 30 seconds
2023-06-28 17:59:45 0 [ERROR] mysqld: Got error 'Could not get an exclusive lock; file is probably in use by another process' when trying to use aria control file '/tmp/build-mariadb-server-10.4-datadir/aria_log_control'
2023-06-28 17:59:45 0 [ERROR] Plugin 'Aria' init function returned error.
2023-06-28 17:59:45 0 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
2023-06-28 17:59:45 0 [Note] InnoDB: Using Linux native AIO
2023-06-28 17:59:45 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2023-06-28 17:59:45 0 [Note] InnoDB: Uses event mutexes
2023-06-28 17:59:45 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
2023-06-28 17:59:45 0 [Note] InnoDB: Number of pools: 1
2023-06-28 17:59:45 0 [Note] InnoDB: Using SSE2 crc32 instructions
2023-06-28 17:59:45 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2023-06-28 17:59:45 0 [Note] InnoDB: Completed initialization of buffer pool
2023-06-28 17:59:45 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2023-06-28 17:59:45 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
2023-06-28 17:59:45 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2023-06-28 17:59:45 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2023-06-28 17:59:46 0 [Note] InnoDB: Starting shutdown...
2023-06-28 17:59:46 0 [ERROR] Plugin 'InnoDB' init function returned error.
2023-06-28 17:59:46 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2023-06-28 17:59:46 0 [Note] Plugin 'FEEDBACK' is disabled.
2023-06-28 17:59:46 0 [ERROR] Could not open mysql.plugin table. Some plugins may be not loaded
2023-06-28 17:59:46 0 [ERROR] Failed to initialize plugins.
2023-06-28 17:59:46 0 [ERROR] Aborting

affects 10.5

$ sql/mysqld --no-defaults --skip-networking --datadir=/tmp/build-mariadb-server-10.5-datadir --socket=/tmp/build-mariadb-server-10.5.sock --plugin-dir=/home/dan/repos/build-mariadb-server-10.5/mysql-test/var/plugins/ --verbose
2023-06-28 18:04:53 0 [Note] Starting MariaDB 10.5.22-MariaDB source revision 24faa5de16c980fa5575cfc4a89618e48e9a3305 as process 315186
2023-06-28 18:04:53 0 [ERROR] mysqld: Can't lock aria control file '/tmp/build-mariadb-server-10.5-datadir/aria_log_control' for exclusive use, error: 11. Will retry for 30 seconds
2023-06-28 18:05:23 0 [ERROR] mysqld: Got error 'Could not get an exclusive lock; file is probably in use by another process' when trying to use aria control file '/tmp/build-mariadb-server-10.5-datadir/aria_log_control'
2023-06-28 18:05:23 0 [ERROR] Plugin 'Aria' init function returned error.
2023-06-28 18:05:23 0 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
2023-06-28 18:05:23 0 [Note] InnoDB: Uses event mutexes
2023-06-28 18:05:23 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
2023-06-28 18:05:23 0 [Note] InnoDB: Number of pools: 1
2023-06-28 18:05:23 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2023-06-28 18:05:23 0 [Note] InnoDB: Using Linux native AIO
2023-06-28 18:05:23 0 [Note] InnoDB: Initializing buffer pool, total size = 134217728, chunk size = 134217728
2023-06-28 18:05:23 0 [Note] InnoDB: Completed initialization of buffer pool
2023-06-28 18:05:23 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=45103,45103
2023-06-28 18:05:23 0 [Note] InnoDB: 128 rollback segments are active.
2023-06-28 18:05:23 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2023-06-28 18:05:23 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2023-06-28 18:05:23 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2023-06-28 18:05:23 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2023-06-28 18:05:23 0 [Note] InnoDB: 10.5.22 started; log sequence number 45115; transaction id 20
2023-06-28 18:05:23 0 [Note] Plugin 'FEEDBACK' is disabled.
2023-06-28 18:05:23 0 [Note] InnoDB: Loading buffer pool(s) from /tmp/build-mariadb-server-10.5-datadir/ib_buffer_pool
2023-06-28 18:05:23 0 [ERROR] Could not open mysql.plugin table: "Unknown storage engine 'Aria'". Some plugins may be not loaded
2023-06-28 18:05:23 0 [ERROR] Failed to initialize plugins.
2023-06-28 18:05:23 0 [ERROR] Aborting



 Comments   
Comment by Daniel Black [ 2023-07-05 ]

Its at this point that ibdata1 is opened without a lock allowing second instance to start:

diff --git a/storage/innobase/srv/srv0start.cc b/storage/innobase/srv/srv0start.cc
index a507b29ffa2..36957ddf490 100644
--- a/storage/innobase/srv/srv0start.cc
+++ b/storage/innobase/srv/srv0start.cc
@@ -1378,7 +1378,7 @@ dberr_t srv_start(bool create_new_db)
         them open until database shutdown */
        ut_d(fil_system.sys_space->recv_size = srv_sys_space_size_debug);
 
*       err = fil_system.sys_space->open(create_new_db)

Unsuccessful thought:

$ git diff
diff --git a/storage/innobase/fsp/fsp0sysspace.cc b/storage/innobase/fsp/fsp0sysspace.cc
index 07a8295a94e..cf3fd1a6b12 100644
--- a/storage/innobase/fsp/fsp0sysspace.cc
+++ b/storage/innobase/fsp/fsp0sysspace.cc
@@ -548,6 +548,20 @@ SysTablespace::open_file(
        return(err);
 }
 
+dberr_t
+SysTablespace::open_all()
+{
+       dberr_t err= DB_SUCCESS;
+       files_t::iterator       begin = m_files.begin();
+       files_t::iterator       end = m_files.end();
+
+       ut_ad(begin->order() == 0);
+
+       for (files_t::iterator it = begin; err == DB_SUCCESS && it != end; ++it)
+               err = open_file(*it);
+       return err;
+}
+
 /** Check the tablespace header for this tablespace.
 @param[out]    flushed_lsn     the value of FIL_PAGE_FILE_FLUSH_LSN
 @return DB_SUCCESS or error code */
diff --git a/storage/innobase/include/fsp0sysspace.h b/storage/innobase/include/fsp0sysspace.h
index 2e0a395f71c..3c2cfacb52b 100644
--- a/storage/innobase/include/fsp0sysspace.h
+++ b/storage/innobase/include/fsp0sysspace.h
@@ -159,6 +159,8 @@ class SysTablespace : public Tablespace
                lsn_t*  flush_lsn)
                MY_ATTRIBUTE((warn_unused_result));
 
+       dberr_t open_all();
+
 private:
        /** Check the tablespace header for this tablespace.
        @param[out]     flushed_lsn     the value of FIL_PAGE_FILE_FLUSH_LSN
@@ -202,6 +204,7 @@ class SysTablespace : public Tablespace
        @return DB_SUCCESS or error code */
        dberr_t open_file(Datafile& file);
 
+
        /** Set the size of the file.
        @param[in,out]  file    data file object
        @return DB_SUCCESS or error code */
diff --git a/storage/innobase/srv/srv0start.cc b/storage/innobase/srv/srv0start.cc
index a507b29ffa2..36957ddf490 100644
--- a/storage/innobase/srv/srv0start.cc
+++ b/storage/innobase/srv/srv0start.cc
@@ -1378,7 +1378,7 @@ dberr_t srv_start(bool create_new_db)
         them open until database shutdown */
        ut_d(fil_system.sys_space->recv_size = srv_sys_space_size_debug);
 
-       err = fil_system.sys_space->open(create_new_db)
+       err = srv_sys_space.open_all()
                ? srv_undo_tablespaces_init(create_new_db)
                : DB_ERROR;
 

Comment by Marko Mäkelä [ 2023-07-05 ]

The problem is that after the initial unconditional advisory locking of the system tablespace in SysTablespace::open_file(), SysTablespace::open_or_create() would close the file(s) and therefore release the advisory lock(s). It needs to be redone later down the path when the system tablespace file(s) are being opened for good:

diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc
index c75144413ac..725345957ee 100644
--- a/storage/innobase/fil/fil0fil.cc
+++ b/storage/innobase/fil/fil0fil.cc
@@ -391,8 +391,16 @@ static bool fil_node_open_file_low(fil_node_t *node)
                                  : OS_FILE_OPEN | OS_FILE_ON_ERROR_NO_EXIT,
                                  OS_FILE_AIO, type,
                                  srv_read_only_mode, &success);
-    if (success)
+    if (node->is_open())
+    {
+      ut_ad(success);
+#ifndef _WIN32
+      if (!node->space->id && !srv_read_only_mode && my_disable_locking &&
+          os_file_lock(node->handle, node->name))
+        goto fail;
+#endif
       break;
+    }
 
     /* The following call prints an error message */
     if (os_file_get_last_error(true) == EMFILE + 100 &&
@@ -406,6 +414,9 @@ static bool fil_node_open_file_low(fil_node_t *node)
   if (node->size);
   else if (!node->read_page0() || !fil_comp_algo_validate(node->space))
   {
+#ifndef _WIN32
+  fail:
+#endif
     os_file_close(node->handle);
     node->handle= OS_FILE_CLOSED;
     return false;

With this patch, it works for me. Tested as follows:

rm -fr /dev/shm/data
mkdir /dev/shm/data
sql/mariadbd --bootstrap --datadir /dev/shm/data &
sleep 1
sql/mariadbd --bootstrap --datadir /dev/shm/data
kill %1
fg

The attempt to start a second server will now properly fail:

10.5 5f2a77cef1cced322d3a6e6a48f4f4e5480283dc with patch

2023-07-05 15:05:52 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
2023-07-05 15:05:52 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2023-07-05 15:05:52 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1295] with error Generic error
2023-07-05 15:05:52 0 [Note] InnoDB: Starting shutdown...
2023-07-05 15:05:52 0 [ERROR] Plugin 'InnoDB' init function returned error.
2023-07-05 15:05:52 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2023-07-05 15:05:52 0 [ERROR] mariadbd: Can't lock aria control file '/dev/shm/data/aria_log_control' for exclusive use, error: 11. Will retry for 30 seconds
2023-07-05 15:06:23 0 [ERROR] mariadbd: Got error 'Could not get an exclusive lock; file is probably in use by another process' when trying to use aria control file '/dev/shm/data/aria_log_control'
2023-07-05 15:06:23 0 [ERROR] Plugin 'Aria' init function returned error.
2023-07-05 15:06:23 0 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
2023-07-05 15:06:23 0 [Note] Plugin 'FEEDBACK' is disabled.
2023-07-05 15:06:23 0 [ERROR] Failed to initialize plugins.
2023-07-05 15:06:23 0 [ERROR] Aborting

On Linux, Error 11 is EAGAIN, or Resource temporarily unavailable.

Comment by Daniel Black [ 2023-07-05 ]

Tested some more as well. Looked at strace of shutdown paths to ensure both innodb and aria stop writing before the locked files are closed and that's true.

Did some of the race conditions where process 1 gets the aria lock, process 2 gets the innodb lock, and both fail to start. But it occurs without corruption.

Generated at Thu Feb 08 10:24:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.