Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
12.3
-
None
-
None
Description
Current Problem
=================
The current background encryption key rotation architecture relies on multiple "encryption threads"
(fil_crypt_thread) acting as producers to perform dummy writes and dirty data pages within the
buffer pool. However, this design introduces significant inefficiencies and architectural bottlenecks:
- The actual CPU-intensive operations, buf_page_encrypt() and CRC-32C computation, are bottlenecked
inside a single consumer thread: buf_flush_page_cleaner(). Having multiple threads dirtying
pages provides no apparent throughput benefit while heavily taxing the buffer pool.
- Multiple fil_crypt_thread instances concurrently increase the size of buf_pool.flush_list
and invoke buf_flush_list_space(). This repeatedly invalidates the hazard pointer
buf_pool.flush_hp, disrupting normal page flushing activity.
- When pages must be read into the buffer pool for encryption, the current implementation
invokes synchronous buf_page_read() via buf_page_get_gen(), blocking progress.
Proposed Solution
=================
Tightly couple the encryption/rotation process directly
into the buf_flush_page_cleaner() loop:
- Eliminate the independent page-dirtying fil_crypt_thread loops.
- Let buf_flush_page_cleaner() manage key rotation natively during its regular buffer pool scans.
- Embed an "encryption step" processing missed pages from buf_flush_list_holding_mutex() and buf_flush_LRU().
This allows key rotation to naturally respect the innodb_io_capacity budget and scale
back during high application workloads (e.g., when threads are blocked in buf_flush_wait()).
- Replace synchronous page reads with asynchronous reads (keeping future MDEV-11378 compliance in mind).
The read completion callback will write the dummy log record to queue the page for re-encryption.
- Repurpose innodb_encrypt_threads to act as parallel worker tasks managed by buf_flush_page_cleaner().
Split buf_page_t::flush() so the single cleaner thread handles the initial phase, offloading the
heavy encryption/CRC-32C computation tasks to these workers via a task queue.
- Move relevant elements of fil_space_rotate_state_t and key_state_t into fil_space_t or fil_space_crypt_t.
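The cleaner-plus-workers split proposed above can be illustrated with a small, language-neutral model. The following is only a sketch under stated assumptions, not the InnoDB implementation: page_cleaner(), worker() and encrypt_and_checksum() are hypothetical names, the XOR step stands in for buf_page_encrypt(), and zlib.crc32 (plain CRC-32) stands in for CRC-32C. It shows one coordinating thread queueing pages while a configurable number of workers perform the CPU-heavy phase.

```python
import queue
import threading
import zlib

PAGE_SIZE = 16384  # assumed page size for this sketch


def encrypt_and_checksum(page_id, data, key):
    """Placeholder for the CPU-heavy phase.

    XOR is NOT real encryption and zlib.crc32 is CRC-32, not CRC-32C;
    both are stand-ins so the sketch stays self-contained.
    """
    encrypted = bytes(b ^ key for b in data)
    return page_id, encrypted, zlib.crc32(encrypted)


def worker(tasks, results):
    """Hypothetical encryption worker: drain tasks until a None sentinel."""
    while True:
        task = tasks.get()
        if task is None:
            break
        results.put(encrypt_and_checksum(*task))


def page_cleaner(pages, n_workers, key=0x5A):
    """Single coordinating loop: queue pages, let the workers do the heavy part."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for page_id, data in pages.items():
        tasks.put((page_id, data, key))   # cheap "initial phase" of the flush
    for _ in threads:
        tasks.put(None)                   # one stop sentinel per worker
    for t in threads:
        t.join()
    out = {}
    while not results.empty():            # safe: all workers have exited
        page_id, encrypted, crc = results.get()
        out[page_id] = (encrypted, crc)
    return out


pages = {i: bytes([i]) * PAGE_SIZE for i in range(8)}
flushed = page_cleaner(pages, n_workers=4)
print(sorted(flushed))
```

In this model n_workers plays the role that the repurposed innodb_encrypt_threads would play: it changes how many tasks run in parallel without adding extra producers that dirty pages.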
A dedicated testing track must focus entirely on isolating the behavior of the innodb_encrypt_threads
parameter. Because the current architecture splits the workflow between multiple page-dirtying
producer threads and a single page-cleaning consumer thread, tests should explicitly scale
innodb_encrypt_threads from 1 up to higher parallel counts (e.g., 4, 8, and 16)
under identical hardware configurations.
Updated Encryption Performance Testing Matrix
=============================================
To accurately evaluate the system under continuous stress, all tests must be executed with innodb_encrypt_tables
set to always encrypt / force re-encryption. This ensures that key rotation activity never goes idle,
keeping the fil_crypt_thread loops continuously active and constantly forcing page-dirtying behavior throughout
the entire duration of the benchmark. The revised testing suite eliminates standard read-only
profiles—which mask true engine contention—and instead relies on four distinct write-driven configurations
mapped across varying thread scales (1, 4, 8, and 16 threads).
Test TP-01 (Read-Heavy / Light-Write): This configuration pairs a small innodb_buffer_pool_size
(forcing high page churn from disk) with a large innodb_log_file_size (minimizing log boundary pressure)
to specifically isolate synchronous read interference. It measures how much synchronous buf_page_read calls
from the encryption threads delay legitimate application reads when the buffer pool is under
constant replacement pressure.
Test TP-02 (Balanced Read-Write): Utilizing a standard OLTP workload, this setup leverages a large buffer pool
(ensuring an in-memory operational fit) and a large log file size to evaluate hazard pointer contention.
It directly tracks the rate at which buf_pool.flush_hp is invalidated by concurrent buf_flush_list_space()
calls coming from the active encryption threads against normal background transactional flushes.
Test TP-03 (Write-Heavy): This profile uses bulk inserts/updates alongside a large
buffer pool and a small log file size to evaluate I/O capacity and back-off budgeting. It determines
whether the background rotation loops correctly scale back under aggressive, immediate flushing
pressure, or if they blindly consume the innodb_io_capacity budget while user threads starve for log space.
Test TP-04 (High-Saturation Stress): This profile operates under severe simultaneous pressure from both a small buffer
pool and a small log file. By forcing extreme flushing contention from both user transactions and key
rotation threads simultaneously, it creates the precise boundary conditions needed to expose and verify
the fix for the thread lockup vulnerability.
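The four profiles crossed with the thread scales give 16 benchmark runs. As a rough sketch of how a driver script might enumerate them (PROFILES, ENCRYPT_THREADS and build_matrix() are hypothetical names, and "small"/"large" are left as labels because no concrete sizes are specified here):

```python
from itertools import product

# Relative sizing per profile, as described for TP-01..TP-04.
# Concrete values for innodb_buffer_pool_size and innodb_log_file_size
# are deliberately not filled in; they are not given in this ticket.
PROFILES = {
    "TP-01": {"workload": "read-heavy / light-write",
              "buffer_pool": "small", "log_file": "large"},
    "TP-02": {"workload": "balanced read-write",
              "buffer_pool": "large", "log_file": "large"},
    "TP-03": {"workload": "write-heavy",
              "buffer_pool": "large", "log_file": "small"},
    "TP-04": {"workload": "high-saturation stress",
              "buffer_pool": "small", "log_file": "small"},
}
ENCRYPT_THREADS = (1, 4, 8, 16)


def build_matrix():
    """Return one run description per (profile, innodb_encrypt_threads) pair."""
    runs = []
    for (name, profile), threads in product(PROFILES.items(), ENCRYPT_THREADS):
        runs.append({"test": name, "innodb_encrypt_threads": threads, **profile})
    return runs


matrix = build_matrix()
for run in matrix:
    print(run["test"], run["innodb_encrypt_threads"],
          run["buffer_pool"], run["log_file"])
```

Keeping the matrix in one place makes it easier to confirm that every profile is run at every thread count under otherwise identical settings.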
Measure TPS and also essential metrics for buffer pool contention:
SHOW STATUS LIKE 'Innodb_buffer_pool_read_requests';
SHOW STATUS LIKE 'Innodb_buffer_pool_reads';
SHOW STATUS LIKE 'Innodb_buffer_pool_wait_free';
From these variables, we can derive the following:
Hit_rate = (read_requests - reads) / read_requests * 100%
SHOW STATUS LIKE 'Innodb_buffer_pool_reads'; -- store as reads_before
... 10 sec ...
SHOW STATUS LIKE 'Innodb_buffer_pool_reads'; -- store as reads_now
Physical_rd_per_sec = (reads_now - reads_before) / time_interval
Flush_list_growth_rate = (flush_list_now - flush_list_before) / time_interval
The expectation is that more encryption threads lead to greater TPS degradation.
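The derived metrics above are simple enough to compute in a post-processing script. The following is a minimal sketch: hit_rate() and rate_per_sec() are hypothetical helper names, the sample numbers are invented purely for illustration, and how the flush list length is sampled is not specified here (Innodb_buffer_pool_pages_dirty is one possible proxy, which is an assumption).

```python
def hit_rate(read_requests, reads):
    """Buffer pool hit rate in percent: (read_requests - reads) / read_requests * 100."""
    if read_requests == 0:
        return 0.0  # no logical reads yet; avoid division by zero
    return (read_requests - reads) / read_requests * 100.0


def rate_per_sec(before, now, interval_sec):
    """Per-second change of a counter sampled twice, interval_sec seconds apart."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    return (now - before) / interval_sec


# Invented sample values, for illustration only.
sample_hit_rate = hit_rate(read_requests=1_000_000, reads=25_000)
physical_rd_per_sec = rate_per_sec(before=25_000, now=26_500, interval_sec=10)
flush_list_growth_rate = rate_per_sec(before=4_000, now=4_800, interval_sec=10)

print(sample_hit_rate, physical_rd_per_sec, flush_list_growth_rate)
```

The same rate_per_sec() helper covers both Physical_rd_per_sec and the flush list growth rate, since each is a difference between two samples divided by the sampling interval.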