[MDEV-17757] MariaDB 10.1 encryption threads consume CPU 100% Created: 2018-11-17  Updated: 2018-12-29

Status: Open
Project: MariaDB Server
Component/s: Encryption, Storage Engine - InnoDB
Affects Version/s: 10.1.34
Fix Version/s: 10.1

Type: Bug Priority: Major
Reporter: Stefan Assignee: Thirunarayanan Balathandayuthapani
Resolution: Unresolved Votes: 0
Labels: None
Environment:

CentOS 6.10


Issue Links:
Relates
relates to MDEV-11738 Mariadb uses 100% of several of my 8 ... Closed

 Description   

Hi,

We activated the encryption and the replication on our servers.

For some reasons during the restart the MariaDB encryption threads consumes 100% CPU without terminating when traffic is present.

For example if no traffic happens on Master, during the mysql restart (service mysql restart) the CPU will be consumed for some minutes (20 minutes or so) and then will go back to normal.
We observed on slave that if the slave was restarted during the traffic on Master then the Slave will recover the transactions from binlogs BUT the 4 encryption threads configured remain present in the CPU for hours consuming all 4 core from the total of 8 CPU cores. Even if the Master was shutdown and only slave remained active. The threads looks to be stuck.

What we observed is that the encryption threads might get stuck if the server is restarted on a loading coming.

So is not that the threads would consume some resources but might get the server stuck in a infinite loop.

I tried to run a compiled version with debug but I'm not sure what should I do in order to see what happens on that threads ?

top -H

Cpu(s): 37.3%us, 0.4%sy, 0.0%ni, 62.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16329900k total, 8526724k used, 7803176k free, 323848k buffers
Swap: 2047996k total, 0k used, 2047996k free, 6857584k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13315 mysql 20 0 5169m 512m 10m R 79.7 3.2 0:26.42 mysqld
13316 mysql 20 0 5169m 512m 10m R 79.0 3.2 0:26.20 mysqld
13317 mysql 20 0 5169m 512m 10m R 71.7 3.2 0:25.80 mysqld
13314 mysql 20 0 5169m 512m 10m S 71.0 3.2 0:25.89 mysqld

gdb -p 3840 /usr/sbin/mysqld
set height 0
set logging file /tmp/mysqld.log
set logging on
thread apply all backtrace full

This is an output from the thread:

Thread 1 (process 3840):
#0 0x0000000000d47055 in ut_delay(unsigned long) ()
No symbol table info available.
#1 0x0000000000d112ac in mutex_spin_wait(void*, bool, char const*, unsigned long) ()
No symbol table info available.
#2 0x0000000000e04664 in mutex_enter_func ()
No symbol table info available.
#3 0x0000000000e04722 in pfs_mutex_enter_func ()
No symbol table info available.
#4 0x0000000000e1322a in fil_space_next(fil_space_t*) ()
No symbol table info available.
#5 0x0000000000e1965f in fil_crypt_find_space_to_rotate(key_state_t*, rotate_thread_t*, bool*) ()
No symbol table info available.
#6 0x0000000000e1acb7 in fil_crypt_thread ()
No symbol table info available.
#7 0x0000003bf1607aa1 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#8 0x0000003bf12e8aad in clone () from /lib64/libc.so.6
No symbol table info available.

Configuration is:

innodb-file-format = Antelope
innodb-file-format-max = Antelope

interactive_timeout = 3700
wait_timeout = 3700
connect_timeout = 10

server-id = 1
log-bin = /var/lib/mysql/binarylog/binarylog-bin
relay-log = /var/lib/mysql/relaylog/relaylog-bin
auto-increment-increment = 2
auto-increment-offset = 1
max_binlog_size = 1G
expire_logs_days = 14
master-retry-count = 0
slave-net-timeout = 60
gtid_domain_id = 1

back_log = 3000
max_connections = 2000
max_user_connections = 2000

performance_schema = 0
thread_handling = pool-of-threads
thread_pool_size = 8
thread_pool_max_threads = 3000
thread_cache_size = 0
innodb_stats_on_metadata = 0
innodb_stats_auto_recalc = 0
innodb_stats_persistent = 0
innodb_use_native_aio = 1
innodb_use_fallocate = 0
innodb_adaptive_hash_index = 0
innodb_use_mtflush = 0
innodb_mtflush_threads = 8
innodb_sync_array_size = 1
innodb_adaptive_flushing = 1
innodb_checksums = 1
innodb_checksum_algorithm = crc32
innodb_doublewrite = 0
innodb_compression_algorithm = 0
sync_binlog = 0
secure_auth = 0

binlog_format=STATEMENT
default_storage_engine=InnoDB
transaction-isolation=REPEATABLE-READ

max-allowed-packet = 4M
max-connect-errors = 1000000

query_cache_type = 0
query-cache-size = 0
query-cache-limit = 2M
table-open-cache = 8000
open-files-limit = 65535

key_buffer_size = 16M
read_buffer_size = 262144
read_rnd_buffer_size = 524288
thread_stack = 262144

innodb-flush-method = O_DIRECT

innodb-log-files-in-group = 3
innodb-log-file-size = 256M
innodb-flush-log-at-trx-commit = 2
innodb-file-per-table = 1
innodb-buffer-pool-size = 4G
innodb_additional_mem_pool_size = 8388608

tmpdir = /mysql_database/mysql/
innodb_data_home_dir = /mysql_database/mysql/data
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /mysql_database/mysql/logs
innodb_buffer_pool_instances = 4

innodb_io_capacity = 200
innodb_io_capacity_max=1000
innodb_write_io_threads = 4
innodb_read_io_threads = 8
innodb_log_buffer_size = 8M
innodb_thread_concurrency = 6
innodb_lock_wait_timeout = 50
innodb_thread_sleep_delay = 20000
innodb_concurrency_tickets = 5000
open_files_limit=65535
datadir = /mysql_database/mysql
slow-query-log = 1
long_query_time = 2
[safe_mysqld]
open_files_limit=65535

[mysqlhotcopy]
interactive-timeout/log

[mysqld_safe]
open-files-limit=65535

[mariadb]

plugin-load-add=file_key_management.so
file-key-management
file_key_management_filename=/opt/keys/key.key
file_key_management_encryption_algorithm=AES_CTR
innodb_default_encryption_key_id=1
innodb-tablespaces-encryption
aria_encrypt_tables=1
encrypt-tmp-disk-tables=1
encrypt-tmp-files=1
encrypt-binlog=1
innodb-encrypt-tables=ON
innodb-encrypt-log=ON
#must be half then CPU number otherwise encryption threads can consume 100% the CPU
innodb-encryption-threads=4
#When innodb_encryption_rotate_key_age=0 is set, server won't encrypt ibdata tablespaces and innodb system tables
#innodb-encryption-rotate-key-age=0
innodb-encryption-rotate-key-age=1



 Comments   
Comment by Marko Mäkelä [ 2018-11-20 ]

stefan2017, do you see this on an otherwise idle server? Could you try to create and upload a reproducible test case?

Comment by Stefan [ 2018-11-21 ]

No, if I restart the server while is idle (NO traffic done on it and no app is connected), the threads runs a while and stops, usually 5 to 10 minutes.

If for example I start the loading on Master node and after a while I restart the Slave node once still traffic on Master, the threads continue to run forever, even after some minutes after the Slave restart, the loading tests ends on Master.

I can reproduce this behavior. All I need is what commands to run and what to provide to you.

A separate question, I tried to recompile MariaDB with more debug options. How can I add symbols to this process ?

wget .... sources
tar zxvf ....
cd mariadb-10.1.37
cmake -DRPM=centos6 . -DCMAKE_BUILD_TYPE=Debug -DWITH_VALGRIND=ON
make -j4
make package

Comment by Thirunarayanan Balathandayuthapani [ 2018-11-21 ]

innodb_encryption_key_rotate_age is 1 could lead to re-encryption of tablespace frequently. May be, Increasing the variable will avoid re-encryption.

Comment by Stefan [ 2018-11-21 ]

Can you please let me know how I should change this values to repeat the test?
I don't plan to rotate keys so which are the correct values ?

innodb_default_encryption_key_id=1 - what value should I set here
innodb-encryption-rotate-key-age=1 - what value whould I set here

According to the documentation: innodb-encryption-rotate-key-age = Re-encrypt in background any page having a key older than this number of key versions.
Maybe I need to set my default encryption key to something higher like

innodb_default_encryption_key_id=10

and keep

innodb-encryption-rotate-key-age=1 ?

Generated at Thu Feb 08 08:38:53 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.