[MDEV-6824] MariaDB 10.0.14 constant 100% CPU load ! Created: 2014-10-01  Updated: 2016-08-13  Resolved: 2016-04-21

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - XtraDB
Affects Version/s: 10.0.13, 10.0, 10.1
Fix Version/s: 10.0.25

Type: Bug Priority: Major
Reporter: Marek Zakrzewski Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

CentOS 6.5 64-bit, Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz, 64 GB DDR3 1333 MHz, 2x 240 GB SSD in S/W RAID1


Attachments: File lowmem.cnf     Text File mysql.log     File mysql.svg     Text File mysql_global.log     Text File mysqld.log    
Sprint: 10.0.25

 Description   

Hello,

I found out that my MariaDB server is using 100% of a CPU core non-stop!

It's completely idle:

MariaDB [(none)]> show processlist;
+--------+-----------------+-----------------+-----------------+---------+------+-------+------------------+----------+
| Id     | User            | Host            | db              | Command | Time | State | Info             | Progress |
+--------+-----------------+-----------------+-----------------+---------+------+-------+------------------+----------+
|     12 | powerdns        | localhost:24535 | powerdns        | Sleep   |    9 |       | NULL             |    0.000 |
|     13 | powerdns        | localhost:24537 | powerdns        | Sleep   |    9 |       | NULL             |    0.000 |
|     14 | powerdns        | localhost:24538 | powerdns        | Sleep   |   10 |       | NULL             |    0.000 |
|     39 | powerdns        | localhost:24594 | powerdns        | Sleep   |   16 |       | NULL             |    0.000 |
| 594049 | msbkm_superocen | localhost       | msbkm_superocen | Sleep   |  125 |       | NULL             |    0.000 |
| 594252 | root            | localhost       | NULL            | Query   |    0 | init  | show processlist |    0.000 |
+--------+-----------------+-----------------+-----------------+---------+------+-------+------------------+----------+
6 rows in set (0.00 sec)

My config is:

#
# This group is read both by the client and the server
# use it for options that affect everything
#
[client-server]
 
#
# include all files from the config directory
#
!includedir /etc/my.cnf.d
 
[root@static ~]# cat /etc/my.cnf.d/server.cnf
[mysqld]
#skip-networking
#bind-address = 0.0.0.0
#server-id=1
#log-bin = mysql-bin
#skip-name-resolve
#skip-grant-tables
datadir=/var/lib/mysql
tmpdir=/dev/shm
socket=/var/lib/mysql/mysql.sock
user=mysql
#log_warnings=2
skip-external-locking
query_cache_limit=8M
query_cache_size=8M
query_cache_type=1
max_connections=1000
max_user_connections=50
max_connect_errors=5000
open_files_limit = 1024000
thread_handling = pool-of-threads
thread_cache_size=8
key_cache_segments=8
key_buffer=8M
innodb-buffer-pool-instances=12
innodb_buffer_pool_size = 8M
innodb_thread_concurrency=12
innodb_log_file_size=64M
innodb_log_buffer_size = 8M
innodb_lock_wait_timeout = 100
innodb_flush_log_at_trx_commit = 2
innodb_support_xa=0
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_max_dirty_pages_pct = 90
innodb_file_per_table=1
innodb_use_native_aio=1
innodb_doublewrite=1
innodb_use_fallocate=1
innodb_use_atomic_writes=0
innodb_flush_method = O_DIRECT
innodb_file_format=barracuda
innodb_lru_scan_depth=2000
innodb_io_capacity=400
innodb_io_capacity_max=1000
sync_binlog=0
join_buffer_size=1M
max_allowed_packet=32M
table_open_cache=1024
table_cache=32k
sort_buffer_size=1M
read_buffer_size=1M
read_rnd_buffer_size=1M
tmp_table_size=2M
max_heap_table_size=1M
long_query_time=2
thread_concurrency=12
myisam_sort_buffer_size=1M
max_write_lock_count = 1
default-storage-engine=MyISAM
 
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
plugin-load=archive=ha_archive.so;blackhole=ha_blackhole.so
open_files_limit = 1024000
 
[mysqldump]
quick
max_allowed_packet=16M

top - 16:10:32 up 26 days, 15:28,  1 user,  load average: 1.13, 1.19, 1.24
Tasks: 302 total,   1 running, 297 sleeping,   0 stopped,   4 zombie
Cpu(s): 10.5%us,  0.2%sy,  0.0%ni, 89.2%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  65580984k total, 64516740k used,  1064244k free,  4138816k buffers
Swap:  2097084k total,    44924k used,  2052160k free, 51518372k cached
 
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
19647 mysql     20   0 2050m 943m  10m S 101.4  1.5   2558:12 mysqld
 
[root@static ~]# mysql -V
mysql  Ver 15.1 Distrib 10.0.14-MariaDB, for Linux (x86_64) using readline 5.1



 Comments   
Comment by Elena Stepanova [ 2014-10-01 ]

Hi Marek,

1) You specified 10.0.13 in the 'Affects versions', and 10.0.14 in the summary – which is it?

2) Has it started happening recently, after an upgrade?

3) Did you check that you don't have a disk space problem?

4) Please attach the error log file;

5) Please run set global userstat=1, wait for a while, and then run

select * from information_schema.user_statistics;
select * from information_schema.client_statistics;

and check whether the CPU_TIME fields show any spikes.

Comment by Patryk Pomykalski [ 2014-10-01 ]

You can also try to use `perf top` to see what functions are being executed.

Comment by Marek Zakrzewski [ 2014-10-01 ]

1. Both versions. When I typed 10.0.14 it showed me an error "Version does not exist".

2. It started when we moved all customers from this machine to another machine, but the high CPU load persisted. Restarting the MariaDB server does not help.

3. No problems with disk space:

[root@static ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/md2         15G  9.2G  4.7G  67% /
tmpfs            32G  742M   31G   3% /dev/shm
/dev/md1        239M   27M  196M  12% /boot
/dev/md3        203G  140G   53G  73% /var/lib/mysql
/dev/md4        2.7T  1.7T  1.1T  63% /home

4. Only some minor errors, caused by my attempts to kill the process:

141001 17:09:24 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
141001 17:09:24 [Note] InnoDB: Using mutexes to ref count buffer pool pages
141001 17:09:24 [Note] InnoDB: The InnoDB memory heap is disabled
141001 17:09:24 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
141001 17:09:24 [Note] InnoDB: Memory barrier is not used
141001 17:09:24 [Note] InnoDB: Compressed tables use zlib 1.2.3
141001 17:09:24 [Note] InnoDB: Using Linux native AIO
141001 17:09:24 [Note] InnoDB: Using CPU crc32 instructions
141001 17:09:24 [Note] InnoDB: Initializing buffer pool, size = 8.0M
141001 17:09:24 [Note] InnoDB: Completed initialization of buffer pool
141001 17:09:24 [Note] InnoDB: Highest supported file format is Barracuda.
141001 17:09:24 [Note] InnoDB: 128 rollback segment(s) are active.
141001 17:09:24 [Note] InnoDB: Waiting for purge to start
141001 17:09:24 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.20-68.0 started; log sequence number 265772500845
141001 17:09:24 [Note] Plugin 'FEEDBACK' is disabled.
141001 17:09:24 [Note] Server socket created on IP: '::'.
141001 17:09:24 [Note] Event Scheduler: Loaded 0 events
141001 17:09:24 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.0.14-MariaDB'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server
141001 17:09:46 [ERROR] mysqld: Table './monia_ma_12/komentarze_obrazki' is marked as crashed and should be repaired
141001 17:09:46 [Warning] Checking table:   './monia_ma_12/komentarze_obrazki'
141001 17:11:18 [ERROR] mysqld: Table './smogo_nalewki/wp_options' is marked as crashed and should be repaired
141001 17:11:18 [Warning] Checking table:   './smogo_nalewki/wp_options'
141001 17:11:53 [ERROR] mysqld: Table './zbyn77_home/wp_options' is marked as crashed and should be repaired
141001 17:11:53 [Warning] Checking table:   './zbyn77_home/wp_options'
141001 17:12:17 [ERROR] mysqld: Table './drobiazg_games/e79213370a65d95b4b5d44f02003dfe1' is marked as crashed and should be repaired
141001 17:12:17 [Warning] Checking table:   './drobiazg_games/e79213370a65d95b4b5d44f02003dfe1'
141001 17:12:27 [ERROR] mysqld: Table './drobiazg_games/01bdbbad1d37de2a9f5b38333f210866' is marked as crashed and should be repaired
141001 17:12:27 [Warning] Checking table:   './drobiazg_games/01bdbbad1d37de2a9f5b38333f210866'

5. Please see:

http://pastebin.com/s7sG50vh

and:

MariaDB [(none)]> select * from information_schema.client_statistics;
+-----------+-------------------+------------------------+----------------+-----------------+--------------------+----------------+------------+----------------------+-----------+-----------+--------------+---------------+--------------+-----------------+-----------------+----------------+---------------------+-----------------------+--------------------+------------------+---------------+---------------+
| CLIENT    | TOTAL_CONNECTIONS | CONCURRENT_CONNECTIONS | CONNECTED_TIME | BUSY_TIME       | CPU_TIME           | BYTES_RECEIVED | BYTES_SENT | BINLOG_BYTES_WRITTEN | ROWS_READ | ROWS_SENT | ROWS_DELETED | ROWS_INSERTED | ROWS_UPDATED | SELECT_COMMANDS | UPDATE_COMMANDS | OTHER_COMMANDS | COMMIT_TRANSACTIONS | ROLLBACK_TRANSACTIONS | DENIED_CONNECTIONS | LOST_CONNECTIONS | ACCESS_DENIED | EMPTY_QUERIES |
+-----------+-------------------+------------------------+----------------+-----------------+--------------------+----------------+------------+----------------------+-----------+-----------+--------------+---------------+--------------+-----------------+-----------------+----------------+---------------------+-----------------------+--------------------+------------------+---------------+---------------+
| localhost |               380 |                      0 |           2098 | 4.0122479999997 | 3.8422957999999348 |        1873474 |   33940612 |                    0 |   2981840 |    428357 |          178 |           592 |          286 |           14622 |            1313 |            301 |                  94 |                     0 |                  0 |                0 |             0 |          1380 |
+-----------+-------------------+------------------------+----------------+-----------------+--------------------+----------------+------------+----------------------+-----------+-----------+--------------+---------------+--------------+-----------------+-----------------+----------------+---------------------+-----------------------+--------------------+------------------+---------------+---------------+
1 row in set (0.00 sec)

Comment by Elena Stepanova [ 2014-10-06 ]

Hi Marek,

Thank you.
Since you said that restarting the server does not help, could you please also restart it with performance_schema enabled (--performance-schema) and run the queries below? The set is almost certainly excessive, but since we have no idea what causes the high load, I've included all instruments and pretty much all consumers.
Please also see Patryk's suggestion above.

 
# Restart server with --performance-schema
# Run
update performance_schema.setup_consumers set enabled='yes';
update performance_schema.setup_instruments set enabled='yes', timed='yes';
update performance_schema.setup_objects set enabled = 'yes', timed = 'yes' where object_schema in ( 'mysql', 'information_schema');

Let it work a bit to collect more relevant data (assuming it's still on 100% CPU), then run

select event_name, count_star, sum_timer_wait from performance_schema.events_statements_summary_global_by_event_name order by sum_timer_wait desc limit 50;
select event_name, count_star, sum_timer_wait from performance_schema.events_waits_summary_global_by_event_name order by sum_timer_wait desc limit 50;
select event_name, count_star, sum_timer_wait from performance_schema.events_stages_summary_global_by_event_name order by sum_timer_wait desc limit 50;
select event_name, count_star, sum_timer_wait from performance_schema.file_summary_by_event_name order by sum_timer_wait desc limit 50;
select file_name, event_name, count_star, sum_timer_wait from performance_schema.file_summary_by_instance order by sum_timer_wait desc limit 50;
select object_type, object_schema, object_name, count_star, sum_timer_wait from performance_schema.objects_summary_global_by_type order by sum_timer_wait desc limit 50;
select event_name, count_star, sum_timer_wait from performance_schema.socket_summary_by_event_name order by sum_timer_wait desc limit 50;
select object_type, object_schema, object_name, count_star, sum_timer_wait from performance_schema.table_io_waits_summary_by_table order by sum_timer_wait desc limit 50;
select object_type, object_schema, object_name, count_star, sum_timer_wait from performance_schema.table_lock_waits_summary_by_table order by sum_timer_wait desc limit 50;
 
select * from performance_schema.events_stages_current order by timer_wait desc;
select thread_id, event_name, source, timer_wait, lock_time, digest_text from performance_schema.events_statements_current order by timer_wait desc;
select thread_id, event_name, source, timer_wait from performance_schema.events_waits_current order by timer_wait desc;
 
select * from performance_schema.events_stages_history order by timer_wait desc;
select thread_id, event_name, source, timer_wait, lock_time, digest_text from performance_schema.events_statements_history order by timer_wait desc;
select thread_id, event_name, source, timer_wait from performance_schema.events_waits_history order by timer_wait desc;

Comment by Marek Zakrzewski [ 2014-10-06 ]

Not possible now; I've already cancelled my old dedicated server.

All data were moved to a new machine. I don't see this bug anymore.

Comment by Elena Stepanova [ 2014-10-14 ]

Is the new machine running the same version of MariaDB server?
If so, and without more information, we'll have to assume it was an environmental problem of some sort. Of course, it would be very useful to find out what caused it, but since the environment is gone now, it'll probably remain a mystery.

Comment by Elena Stepanova [ 2014-10-31 ]

As said above, closing for now as "Cannot reproduce". If you encounter the problem again, or if you get new information, please comment to re-open the issue.

Comment by Markus [ 2015-10-16 ]

The mysql_global.log shows only the queries in question.
The mysql.log is the output of those queries.

The server was just started but mysqld sits on 100% cpu.

Comment by Markus [ 2015-10-16 ]

I rebuilt the server with debug symbols and attached a debugger to the thread at 100%; the backtrace is always in "buf_flush_lru_manager_thread".

It seems to be "innodb_buffer_pool_size" in combination with "innodb_log_file_size". I had both at 5M; changing "innodb_buffer_pool_size" to 10M made the problem disappear.
(The original poster had similarly low settings.)

Perhaps add some sanity checks?
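
Based on the observation above, a minimal workaround sketch; the exact values are illustrative (10M is merely the threshold reported in this comment, not a tuned recommendation):

```ini
[mysqld]
# Workaround reported above: with innodb_buffer_pool_size at 5M the LRU
# manager thread (buf_flush_lru_manager_thread) spins at 100% CPU.
# Raising the buffer pool to 10M or more reportedly made the problem
# disappear; any realistic production value would be far larger anyway.
innodb_buffer_pool_size = 10M
innodb_log_file_size    = 64M
```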

Comment by José JORGE [ 2015-10-27 ]

Thank you, I have just hit the same bug. Growing "innodb_buffer_pool_size" from 4M to 10M stopped the 100% CPU bug.
I suppose the bug report should be reopened, as it is reproducible.

Comment by Elena Stepanova [ 2015-11-26 ]

Thanks all.

Reproducible on recent 10.0 (2553f143fdeb) , 10.1 (edf6354bd6389ce), only with XtraDB.

To reproduce,

  • create a simple InnoDB table, e.g.
    create table t1 (pk int auto_increment primary key, i int) engine=InnoDB;
  • insert ~1M rows;
  • restart the server with innodb-buffer-pool-size=5M, otherwise defaults;
  • run select count(*) from t1;
  • wait till the query finishes;
  • observe further CPU usage – it stays at 100% and does not go down.
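
The steps above can be sketched as a SQL session. The table definition and row count are from the comment; the doubling-insert row generator is one possible way to reach ~1M rows (the original only says "insert ~1M rows"):

```sql
-- Reproduction sketch. Start the server with innodb-buffer-pool-size=5M
-- (otherwise defaults) before running this; affects XtraDB builds only.
CREATE TABLE t1 (pk INT AUTO_INCREMENT PRIMARY KEY, i INT) ENGINE=InnoDB;

-- One possible generator: repeated self-insertion doubles the row count,
-- so 20 doublings from a single row give 2^20 = 1048576 rows.
INSERT INTO t1 (i) VALUES (1);
INSERT INTO t1 (i) SELECT i FROM t1;  -- repeat this statement 20 times

SELECT COUNT(*) FROM t1;
-- After the query finishes, observe that CPU usage stays at 100%.
```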

If the same is done with the InnoDB plugin, it works fine, which makes me think the constant load might not be strictly necessary even with such a low innodb_buffer_pool_size, and can be fixed.
However, if it's impossible to fix in XtraDB, the minimum value for innodb_buffer_pool_size should probably be increased. That cannot be done in a post-GA version, but for 10.2 it's okay, while in 10.0 and 10.1 we could issue a warning upon startup.

Comment by Laurynas Biveinis [ 2015-11-26 ]

We have at least a partial fix in review for this.

Comment by Jan Lindström (Inactive) [ 2016-04-21 ]

Based on my testing this issue is already fixed.

Comment by Ragnar Rova [ 2016-08-13 ]

This reappeared on 10.1.16, with Debian jessie packages, also with very aggressive memory settings: lowmem.cnf.
CPU flamegraph: mysql.svg. Most of the time is spent in buf_flush_LRU_tail within buf_flush_lru_manager_thread.
mysqld log: mysqld.log

I have no exact steps to reproduce yet. I'm using the mariadb:10.1.16 docker image and the attached lowmem.cnf as an additional config in /etc/mysql/conf.d. The application using the db was WordPress, very light usage, on a very low-memory machine (AWS t2.nano).

Can try to provide gdb backtrace when I reproduce it again.

Will increase innodb-buffer-pool-size to 20M and see if it reappears.

Generated at Thu Feb 08 07:14:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.