[MDEV-29652] data-at-rest enabled high CPU use for long times Created: 2022-09-28  Updated: 2022-10-11  Resolved: 2022-10-11

Status: Closed
Project: MariaDB Server
Component/s: Encryption, Storage Engine - InnoDB
Affects Version/s: 10.3.36
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Jim Dutton Assignee: Marko Mäkelä
Resolution: Won't Fix Votes: 0
Labels: None
Environment:

Oracle Linux Server release 7.9
CentOS Linux release 7.9.2009


Issue Links:
Relates
relates to MDEV-14180 Automatically disable key rotation ch... Closed

 Description   

Either when creating a new database to have data-at-rest enabled, or adding new tables to database with other data-at-rest tables, or database startup with data-at-rest employed - there are times when all of the data-at-rest background tasks are consuming 80-90% of multiple relatively fast CPU's anywhere from 20 minutes to 6 hours on systems with 32GB RAM.

There doesn't appear to be any consistent and determinable cause for the overly long utlizations.



 Comments   
Comment by Marko Mäkelä [ 2022-09-28 ]

DBrunner, can you please provide a some more details? Which storage engine and which encryption plugin are you using, and with which configuration parameters?

If this is about innodb_encrypt_tables=ON, the so-called "encryption threads" are actually only marking pages dirty, so that when the pages will eventually be written back from the buffer pool to the data file, they will be re-encrypted.

There have been several improvements to this area in MariaDB Server 10.5 and 10.6. The innodb_log_scrub code was removed in MDEV-21870; I do not think it ever worked correctly. Similarly, most of the page scrubbing code was removed in MDEV-8139, and that scrubbing will take place as part of normal page write activity.

Something was fixed also in older versions a while back; see MDEV-14180.

Comment by Jim Dutton [ 2022-09-29 ]

my.cnf
---------------------------------------
innodb_log_file_size=16M
encrypt_binlog = ON
encrypt_tmp_disk_tables = ON
encrypt_tmp_files = ON
innodb_adaptive_hash_index = Off
innodb_encrypt_log = ON
innodb_encryption_threads= 4
innodb_encrypt_tables=FORCE
innodb_encryption_rotate_key_age = 1
innodb_default_encryption_key_id=1
loose_file_key_management_encryption_algorithm = AES_CTR

Storage Engine: InnoDB
Different encryption key per database

/var/lib/mysql
-------------------------
rw-rw---. 1 mysql mysql 1635 Sep 28 03:07 ib_buffer_pool
rw-rw---. 1 mysql mysql 79691776 Sep 28 03:07 ibdata1
rw-rw---. 1 mysql mysql 16777216 Sep 28 03:07 ib_logfile0
rw-rw---. 1 mysql mysql 16777216 Sep 28 03:04 ib_logfile1

/var/lib/mysql/mysql
-----------------------------
rw-rw---. 1 mysql mysql 5404 Sep 23 01:32 innodb_index_stats.frm
rw-rw---. 1 mysql mysql 4194304 Sep 27 05:04 innodb_index_stats.ibd
rw-rw---. 1 mysql mysql 1909 Sep 23 01:32 innodb_table_stats.frm
rw-rw---. 1 mysql mysql 98304 Sep 27 05:04 innodb_table_stats.ibd

Comment by Marko Mäkelä [ 2022-09-30 ]

DBrunner, which exact version of MariaDB Server are you using? If it is older than 10.3.31, please update to a newer version, to get the fix of MDEV-14180.

If you are running something newer than 10.3.31, I would like to get more details for reproducing this. How many InnoDB tables do you have?

Comment by Marko Mäkelä [ 2022-09-30 ]

Sorry, I was looking at the "Affects Version" field. Since 10.3.36 should contain the fix of MDEV-14180, we need enough information to reproduce this issue.

Comment by Jim Dutton [ 2022-09-30 ]

Sometimes when MariaDB is started, the encryption threads "thrash around" for only 10-20 minutes.

Some other times, Mariadb will start but issue an error message: "[ERROR] InnoDB: Table `mysql`.`innodb_table_stats` not found.", but the file does exist, though it may not show changes as of the current date

rw-rw---. 1 mysql mysql 1909 Sep 27 13:59 /var/lib/mysql/mysql/innodb_table_stats.frm
rw-rw---. 1 mysql mysql 98304 Sep 27 13:59 /var/lib/mysql/mysql/innodb_table_stats.ibd

One fix that I have used to "correct" this error is to delete stop MariaDB, delete the "affected files", re-start Mariadb, and apply the "mysql_upgrade --force" application to rebuild them. No more "stats not found".

Comment by Jim Dutton [ 2022-09-30 ]

One MariaDB has 3 databases with only 1 as the primary application DB, 9 tables each, 2 with one BLOB column. Some VARCHARS are 128, 256, 384, or 512 sizes.

A second MariaDB has 2 databases with only 1 as a primary application DB, 3 tables each, 2 columns are VARCHAR(512).

Comment by Jim Dutton [ 2022-09-30 ]

Systems are Intel NUC10i7FNH with 32GB DDR4 2667 MHz RAM, 6 core/12 threads 4.7Mhz CPU (shown as "@1.10Ghz" by lshw), 1TB Samsung NVMe.

Comment by Marko Mäkelä [ 2022-09-30 ]

The InnoDB data dictionary is stored in an internal table SYS_TABLES. You could check if the metadata exists at all:

SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE NAME='mysql/innodb_table_stats';

For reproducing this problem, we would need something like a mysqldump of a database and the configuration parameters. The data can be artificial, but it needs to reproduce the problem.

Are there any messages being written to the server error log? I have a feeling that this might be similar to MDEV-27530.

Comment by Jim Dutton [ 2022-09-30 ]

In the instance of one DB server, with few tables and vew entries, where DB log shows,

ERROR] InnoDB: Table `mysql`.`innodb_table_stats` not found.

the "SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE NAME='mysql/innodb_table_stats';" results are

Empty set (0.008 sec)

and the /var/lib/mysq/mysql/innodb* files show no changes for several days:

rw-rw---. 1 mysql mysql 5404 Sep 27 13:59 innodb_index_stats.frm
rw-rw---. 1 mysql mysql 98304 Sep 27 13:59 innodb_index_stats.ibd
rw-rw---. 1 mysql mysql 1909 Sep 27 13:59 innodb_table_stats.frm
rw-rw---. 1 mysql mysql 98304 Sep 27 13:59 innodb_table_stats.ibd

--------------------------------------------

On the other DB server, which has two separate DBs for separate projects with only one being used at a time, there are no DB log errors but there are encryption threads "thrashing"

"SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE NAME='mysql/innodb_table_stats';" results are

TABLE_ID: 32
NAME: mysql/innodb_table_stats
FLAG: 33
N_COLS: 9
SPACE: 17
ROW_FORMAT: Dynamic
ZIP_PAGE_SIZE: 0
SPACE_TYPE: Single

Innodb.stats files don't yet show current date, but have had some "update" only a few days ago

rw-rw---. 1 mysql mysql 5404 Sep 23 01:32 /var/lib/mysql_CSRmanage/mysql/innodb_index_stats.frm
rw-rw---. 1 mysql mysql 4194304 Sep 27 05:04 /var/lib/mysql_CSRmanage/mysql/innodb_index_stats.ibd
rw-rw---. 1 mysql mysql 1909 Sep 23 01:32 /var/lib/mysql_CSRmanage/mysql/innodb_table_stats.frm
rw-rw---. 1 mysql mysql 98304 Sep 27 05:04 /var/lib/mysql_CSRmanage/mysql/innodb_table_stats.ibd

Comment by Jim Dutton [ 2022-10-01 ]

For what it is worth, I am also in the process of copying DBs to two new test servers with newer OS's, one running Mdb-10.5.x and the other Mdb-10.9.3.

Comment by Jim Dutton [ 2022-10-01 ]

My largest databse is now also running under MariaDB-10.9.3 with data-at-rest encryption. No encryption-thread thrashing noticed (so far).

Comment by Sergei Golubchik [ 2022-10-01 ]

DBrunner, you can use https://mariadb.com/kb/en/meta/mariadb-ftp-server/ to upload files that you don't want to attach. If you do that, please, add a comment here specifying the file name.

Comment by Jim Dutton [ 2022-10-03 ]

Have uploaded "MDEV-29652_sample_bundle.tar" which consists of 3 files.

Comment by Jim Dutton [ 2022-10-08 ]

The observed MariaDB encryption thread "thrashing" occured under CentOS-7.9 and Oracle EL-7.9.

The previous notes about 10.5/10.9 were on Oracle EL-9, Alama-9, and/or RHEL-8.6/9.

The Centos-7.9 DB server has just had its MariaDB instance changed from 10.3.x to 10.5.x and it is not exhibiting the previously noted excessive encryption thread "thrashing". Three databases (one populated, others empty) each with different encryption KeyIDs.

Comment by Marko Mäkelä [ 2022-10-10 ]

DBrunner, thank you. The InnoDB I/O was heavily refactored by MDEV-15053, MDEV-15058, MDEV-23399 and MDEV-23855 and a few follow-up tickets in MariaDB Server 10.5.

What you are observing in MariaDB Server 10.3 might be something similar to MDEV-27466, which was observed in MariaDB Server 10.4. I am reluctant to touch the I/O layer in MariaDB Server before version 10.5 or 10.6. MariaDB Server 10.6 included some more performance improvements, such as simpler buffer page latches (MDEV-24142).

Comment by Jim Dutton [ 2022-10-10 ]

I have rebuilt my databases on two servers using MariaDB-10.5 without any problems. I have observed system and MariaDB startups and have not noticed any long running encryption threads. I have monitored the database servers as I worked through my application. There were "normal" increases in database activity but at most only a 0.7 percent increase in CPU utilization and for only a few seconds. The two database servers are now running with MariaDB-10.5.17.

I tested one database server on MariaDB-10.9 and that also had no encryption-thread-thrashing issue.

Since my application is now running successfully using MariaDB-10.5, I do not intend to (re-) install 10.3 again. Unfortunately for me - I now have the task of upgrading all of my application documentation to reflect the DB changes. Oh well.

As far as I am concerned, MariaDB-10.5 has relieved my systems and application of the excessive long-running CPU utilization issues. In this case, I consider my issue and this problem solved. (now if I could just disable the various keyboard shorcuts various applications have implemented so as to stop popping up things as I type ....)

You may close this problem out but I would suggest you do it with a stipulation that a MariaDB-10.3 user/site that exhibits the problems noted here should be "encouraged" to migrate to MariaDB >=10.5.

Comment by Marko Mäkelä [ 2022-10-11 ]

DBrunner, thank you. It is a good idea to upgrade from 10.3 anyway, because it will reach its End of Life (EOL) in May 2023, 5 years after the initial 10.3.7 generally available (GA) release.

Personally, I would recommend 10.6.10 or later. The MariaDB Server 10.6 release series is the first one where most DDL operations are crash-safe (MDEV-17567).

Generated at Thu Feb 08 10:10:15 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.