Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17757

MariaDB 10.1 encryption threads consume CPU 100%

Details

    Description

      Hi,

      We activated the encryption and the replication on our servers.

      For some reasons during the restart the MariaDB encryption threads consumes 100% CPU without terminating when traffic is present.

      For example if no traffic happens on Master, during the mysql restart (service mysql restart) the CPU will be consumed for some minutes (20 minutes or so) and then will go back to normal.
      We observed on slave that if the slave was restarted during the traffic on Master then the Slave will recover the transactions from binlogs BUT the 4 encryption threads configured remain present in the CPU for hours consuming all 4 core from the total of 8 CPU cores. Even if the Master was shutdown and only slave remained active. The threads looks to be stuck.

      What we observed is that the encryption threads might get stuck if the server is restarted on a loading coming.

      So is not that the threads would consume some resources but might get the server stuck in a infinite loop.

      I tried to run a compiled version with debug but I'm not sure what should I do in order to see what happens on that threads ?

      top -H

      Cpu(s): 37.3%us, 0.4%sy, 0.0%ni, 62.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
      Mem: 16329900k total, 8526724k used, 7803176k free, 323848k buffers
      Swap: 2047996k total, 0k used, 2047996k free, 6857584k cached

      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      13315 mysql 20 0 5169m 512m 10m R 79.7 3.2 0:26.42 mysqld
      13316 mysql 20 0 5169m 512m 10m R 79.0 3.2 0:26.20 mysqld
      13317 mysql 20 0 5169m 512m 10m R 71.7 3.2 0:25.80 mysqld
      13314 mysql 20 0 5169m 512m 10m S 71.0 3.2 0:25.89 mysqld

      gdb -p 3840 /usr/sbin/mysqld
      set height 0
      set logging file /tmp/mysqld.log
      set logging on
      thread apply all backtrace full

      This is an output from the thread:

      Thread 1 (process 3840):
      #0 0x0000000000d47055 in ut_delay(unsigned long) ()
      No symbol table info available.
      #1 0x0000000000d112ac in mutex_spin_wait(void*, bool, char const*, unsigned long) ()
      No symbol table info available.
      #2 0x0000000000e04664 in mutex_enter_func ()
      No symbol table info available.
      #3 0x0000000000e04722 in pfs_mutex_enter_func ()
      No symbol table info available.
      #4 0x0000000000e1322a in fil_space_next(fil_space_t*) ()
      No symbol table info available.
      #5 0x0000000000e1965f in fil_crypt_find_space_to_rotate(key_state_t*, rotate_thread_t*, bool*) ()
      No symbol table info available.
      #6 0x0000000000e1acb7 in fil_crypt_thread ()
      No symbol table info available.
      #7 0x0000003bf1607aa1 in start_thread () from /lib64/libpthread.so.0
      No symbol table info available.
      #8 0x0000003bf12e8aad in clone () from /lib64/libc.so.6
      No symbol table info available.

      Configuration is:

      innodb-file-format = Antelope
      innodb-file-format-max = Antelope

      interactive_timeout = 3700
      wait_timeout = 3700
      connect_timeout = 10

      server-id = 1
      log-bin = /var/lib/mysql/binarylog/binarylog-bin
      relay-log = /var/lib/mysql/relaylog/relaylog-bin
      auto-increment-increment = 2
      auto-increment-offset = 1
      max_binlog_size = 1G
      expire_logs_days = 14
      master-retry-count = 0
      slave-net-timeout = 60
      gtid_domain_id = 1

      back_log = 3000
      max_connections = 2000
      max_user_connections = 2000

      performance_schema = 0
      thread_handling = pool-of-threads
      thread_pool_size = 8
      thread_pool_max_threads = 3000
      thread_cache_size = 0
      innodb_stats_on_metadata = 0
      innodb_stats_auto_recalc = 0
      innodb_stats_persistent = 0
      innodb_use_native_aio = 1
      innodb_use_fallocate = 0
      innodb_adaptive_hash_index = 0
      innodb_use_mtflush = 0
      innodb_mtflush_threads = 8
      innodb_sync_array_size = 1
      innodb_adaptive_flushing = 1
      innodb_checksums = 1
      innodb_checksum_algorithm = crc32
      innodb_doublewrite = 0
      innodb_compression_algorithm = 0
      sync_binlog = 0
      secure_auth = 0

      binlog_format=STATEMENT
      default_storage_engine=InnoDB
      transaction-isolation=REPEATABLE-READ

      max-allowed-packet = 4M
      max-connect-errors = 1000000

      query_cache_type = 0
      query-cache-size = 0
      query-cache-limit = 2M
      table-open-cache = 8000
      open-files-limit = 65535

      key_buffer_size = 16M
      read_buffer_size = 262144
      read_rnd_buffer_size = 524288
      thread_stack = 262144

      innodb-flush-method = O_DIRECT

      innodb-log-files-in-group = 3
      innodb-log-file-size = 256M
      innodb-flush-log-at-trx-commit = 2
      innodb-file-per-table = 1
      innodb-buffer-pool-size = 4G
      innodb_additional_mem_pool_size = 8388608

      tmpdir = /mysql_database/mysql/
      innodb_data_home_dir = /mysql_database/mysql/data
      innodb_data_file_path = ibdata1:10M:autoextend
      innodb_log_group_home_dir = /mysql_database/mysql/logs
      innodb_buffer_pool_instances = 4

      innodb_io_capacity = 200
      innodb_io_capacity_max=1000
      innodb_write_io_threads = 4
      innodb_read_io_threads = 8
      innodb_log_buffer_size = 8M
      innodb_thread_concurrency = 6
      innodb_lock_wait_timeout = 50
      innodb_thread_sleep_delay = 20000
      innodb_concurrency_tickets = 5000
      open_files_limit=65535
      datadir = /mysql_database/mysql
      slow-query-log = 1
      long_query_time = 2
      [safe_mysqld]
      open_files_limit=65535

      [mysqlhotcopy]
      interactive-timeout/log

      [mysqld_safe]
      open-files-limit=65535

      [mariadb]

      plugin-load-add=file_key_management.so
      file-key-management
      file_key_management_filename=/opt/keys/key.key
      file_key_management_encryption_algorithm=AES_CTR
      innodb_default_encryption_key_id=1
      innodb-tablespaces-encryption
      aria_encrypt_tables=1
      encrypt-tmp-disk-tables=1
      encrypt-tmp-files=1
      encrypt-binlog=1
      innodb-encrypt-tables=ON
      innodb-encrypt-log=ON
      #must be half then CPU number otherwise encryption threads can consume 100% the CPU
      innodb-encryption-threads=4
      #When innodb_encryption_rotate_key_age=0 is set, server won't encrypt ibdata tablespaces and innodb system tables
      #innodb-encryption-rotate-key-age=0
      innodb-encryption-rotate-key-age=1

      Attachments

        Issue Links

          Activity

            stefan2017, do you see this on an otherwise idle server? Could you try to create and upload a reproducible test case?

            marko Marko Mäkelä added a comment - stefan2017 , do you see this on an otherwise idle server? Could you try to create and upload a reproducible test case?
            stefan2017 Stefan added a comment - - edited

            No, if I restart the server while is idle (NO traffic done on it and no app is connected), the threads runs a while and stops, usually 5 to 10 minutes.

            If for example I start the loading on Master node and after a while I restart the Slave node once still traffic on Master, the threads continue to run forever, even after some minutes after the Slave restart, the loading tests ends on Master.

            I can reproduce this behavior. All I need is what commands to run and what to provide to you.

            A separate question, I tried to recompile MariaDB with more debug options. How can I add symbols to this process ?

            wget .... sources
            tar zxvf ....
            cd mariadb-10.1.37
            cmake -DRPM=centos6 . -DCMAKE_BUILD_TYPE=Debug -DWITH_VALGRIND=ON
            make -j4
            make package

            stefan2017 Stefan added a comment - - edited No, if I restart the server while is idle (NO traffic done on it and no app is connected), the threads runs a while and stops, usually 5 to 10 minutes. If for example I start the loading on Master node and after a while I restart the Slave node once still traffic on Master, the threads continue to run forever, even after some minutes after the Slave restart, the loading tests ends on Master. I can reproduce this behavior. All I need is what commands to run and what to provide to you. A separate question, I tried to recompile MariaDB with more debug options. How can I add symbols to this process ? wget .... sources tar zxvf .... cd mariadb-10.1.37 cmake -DRPM=centos6 . -DCMAKE_BUILD_TYPE=Debug -DWITH_VALGRIND=ON make -j4 make package

            innodb_encryption_key_rotate_age is 1 could lead to re-encryption of tablespace frequently. May be, Increasing the variable will avoid re-encryption.

            thiru Thirunarayanan Balathandayuthapani added a comment - innodb_encryption_key_rotate_age is 1 could lead to re-encryption of tablespace frequently. May be, Increasing the variable will avoid re-encryption.
            stefan2017 Stefan added a comment - - edited

            Can you please let me know how I should change this values to repeat the test?
            I don't plan to rotate keys so which are the correct values ?

            innodb_default_encryption_key_id=1 - what value should I set here
            innodb-encryption-rotate-key-age=1 - what value whould I set here

            According to the documentation: innodb-encryption-rotate-key-age = Re-encrypt in background any page having a key older than this number of key versions.
            Maybe I need to set my default encryption key to something higher like

            innodb_default_encryption_key_id=10

            and keep

            innodb-encryption-rotate-key-age=1 ?

            stefan2017 Stefan added a comment - - edited Can you please let me know how I should change this values to repeat the test? I don't plan to rotate keys so which are the correct values ? innodb_default_encryption_key_id=1 - what value should I set here innodb-encryption-rotate-key-age=1 - what value whould I set here According to the documentation: innodb-encryption-rotate-key-age = Re-encrypt in background any page having a key older than this number of key versions. Maybe I need to set my default encryption key to something higher like innodb_default_encryption_key_id=10 and keep innodb-encryption-rotate-key-age=1 ?

            In MDEV-14180, InnoDB disables key rotation checks for file_key_management_plugin.
            Basically, InnoDB creates the default_encrypt_list which has the list of tablespace to be encrypted/decrypted based on innodb_encrypt_tables value.
            If the tables in the list is encrypted/decrypted succesfully then list will be empty. So there won't be any un-necessary I/O from
            background encryption threads. We can close this bug as a duplicate of MDEV-14180.

            thiru Thirunarayanan Balathandayuthapani added a comment - In MDEV-14180 , InnoDB disables key rotation checks for file_key_management_plugin. Basically, InnoDB creates the default_encrypt_list which has the list of tablespace to be encrypted/decrypted based on innodb_encrypt_tables value. If the tables in the list is encrypted/decrypted succesfully then list will be empty. So there won't be any un-necessary I/O from background encryption threads. We can close this bug as a duplicate of MDEV-14180 .

            People

              thiru Thirunarayanan Balathandayuthapani
              stefan2017 Stefan
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.