Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.6, 10.11, 11.4, 12.2, 12.3, 11.8
-
Related to performance
-
The locking in the audit plugin was simplified in order to remove some scalability bottlenecks.
-
Q4/2025 Server Development
Description
The implementation of the audit plugin apparently follows obsolete 1990 version of ISO/IEC 9899, a.k.a. C90. Atomic memory operations were not introduced to the language until 2011. Therefore, a combination of mutexes and volatile variables is being used. This is prone to race conditions that could explain some failures such as MDEV-34074, because no well-defined consistency model is being followed. Elsewhere in the code base, we are making use of features that were introduced in ISO/IEC 14882:2011, such as std::atomic. There is some rather convoluted logic around locking, which makes it hard to reason about:
static void update_mode(MYSQL_THD thd __attribute__((unused)), |
struct st_mysql_sys_var *var __attribute__((unused)), |
void *var_ptr __attribute__((unused)), const void *save) |
{
|
unsigned int new_mode= *(unsigned int *) save; |
if (mode_readonly || new_mode == mode) |
return; |
|
|
ADD_ATOMIC(internal_stop_logging, 1);
|
if (!maria_55_started || !debug_server_started) |
mysql_prlock_wrlock(&lock_operations);
|
mark_always_logged(thd);
|
error_header();
|
fprintf(stderr, "Logging mode was changed from %d to %d.\n", mode, new_mode); |
mode= new_mode;
|
if (!maria_55_started || !debug_server_started) |
mysql_prlock_unlock(&lock_operations);
|
ADD_ATOMIC(internal_stop_logging, -1);
|
}
|
The purpose of the variable internal_stop_logging is unclear here. Possibly it would be needed in the dead code path where the exclusive lock on lock_operations would be avoided. A simplified implementation based on C++11 and a simpler rw-lock (MDEV-24167) could look as follows:
static void update_mode(MYSQL_THD thd, st_mysql_sys_var *, void *, |
const void *save) |
{
|
unsigned new_mode= *static_cast<const unsigned*>(save); |
lock_operations.wr_lock();
|
unsigned old_mode= mode;
|
if (new_mode != old_mode) |
{
|
mode= new_mode;
|
mark_always_logged(thd);
|
error_header();
|
fprintf(stderr, "Logging mode was changed from %u to %u.\n", |
old_mode, new_mode);
|
}
|
lock_operations.wr_unlock();
|
}
|
Why could the original code skip the lock_operations acquisition? Apparently, at some point, the implementation was tested to be binary compatible with older versions of MySQL or MariaDB. Currently, the plugin is being distributed and compiled together with the MariaDB release that it is compatible with. Therefore, we had better remove the dead code. For what it is worth, the release of MySQL 5.7.44 (the last one in the 5.7 series) took place in August 2023. The release of MariaDB 5.5.68 (the last one in the 5.5 series) took place in May 2020.
The variable internal_stop_logging feels hacky, but I was not able to completely remove it. What can be done is transform it to an Atomic_counter<int>, which will use std::memory_order_relaxed semantics. Previously, the variable was declared as volatile int. Reads were prone to race condition, and writes would involve acquiring and releasing a mutex, which risks avoidable context switches to the operating system kernel.
I would also remove any PERFORMANCE_SCHEMA instrumentation from the synchronization primitives of the audit plugin. No significant amount of contention is expected, and the instrumentation did not help us to identify the scalability bottlenecks inside mysql_prlock_rdlock(&lock_operations) and mysql_prlock_unlock(&lock_operations). (Analysis with perf record and other tools did.) Removing the instrumentation will allow us to use the KERNEL32.DLL native SRWLOCK on Microsoft Windows.
Attachments
Issue Links
- is blocked by
-
MDEV-34680 Asynchronous and Buffered Logging for Audit Plugin
-
- Closed
-
- relates to
-
MDEV-21923 LSN allocation is a bottleneck
-
- Closed
-
-
MDEV-34074 plugins.thread_pool_server_audit crashes in get_loc_info() with null pointer dereference
-
- Open
-