[MDEV-24299] mariadb 100% memory consumption crashes mysqld Created: 2020-11-27  Updated: 2020-12-28

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.4.17
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Thies Meincke Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Environment:

SLES 12SP5 with Mariadb repo https://yum.mariadb.org/10.4/sles/12/x86_64


Attachments: File gstack.out.gz     File gstack.out2.gz     File perftop.out.gz    

 Description   

Dumps on a SLES-mariadb-slave server use all available Memory+Swap. OS finally kills mysqld:
Out of memory: Kill process 50603 (mysqld) score 978 or sacrifice child
[553689.183105] Killed process 50603 (mysqld) total-vm:36304280kB, anon-rss:32185504kB, file-rss:0kB, shmem-rss:8kB

OS: SLES 12.5 with 32 GB RAM + 2 GB swap

After restart a dump of a 60 GB database with InnoDb-Tables claims all memory within minutes. It is not freed after the dump has finished. The next dump of a database (500MB) with MyIsam and 20.000 tables crashes mysqld with a good chance.
It's just a mariadb slave server. no further user interaction.

Some configuration parameters:

innodb_buffer_pool_size = 28G
innodb_log_file_size = 7G
read_buffer_size = 256k
read_rnd_buffer_size = 256k
max_heap_table_size=16M
tmp_table_size = 16M
join_buffer_size = 256k
key_buffer_size = 256M
max_connect_errors = 1000
open_files_limit = 32000
table_definition_cache = 10000
table_open_cache = 5400

Status data around crash time:
Dump of database mentioned above (MyIsam, 20.000 tables):

free -th
             total       used       free     shared    buffers     cached
Mem:           31G        31G       245M        81M        47M       394M
-/+ buffers/cache:        30G       686M
Swap:         2.0G       2.0G        35M
Total:         33G        33G       281M

Variable_name   Value
Memory_used     2439645840
Memory_used_initial     1213447200
Qcache_free_memory      536852776
Variable_name   Value
Binlog_snapshot_file    mysql-bin.000211
Com_show_profile        0
Com_show_profiles       0
Created_tmp_files       5
Innodb_num_open_files   2440
Innodb_scrub_background_page_split_failures_out_of_filespace    0
Open_files      1351
Opened_files    892245
Performance_schema_file_classes_lost    0
Performance_schema_file_handles_lost    0
Performance_schema_file_instances_lost  0
Variable_name   Value
Acl_table_grants        2
Com_alter_table 50
Com_alter_tablespace    0
Com_create_table        132
Com_create_temporary_table      42928
Com_drop_table  6
Com_drop_temporary_table        42865
Com_lock_tables 121
Com_rename_table        57
Com_show_create_table   71397
Com_show_open_tables    0
Com_show_table_status   0
Com_show_tables 133
Com_unlock_tables       133
Created_tmp_disk_tables 133670
Created_tmp_tables      271943
Open_table_definitions  10000
Open_tables     5400
Opened_table_definitions        142861
Opened_tables   329535
Performance_schema_table_handles_lost   0
Performance_schema_table_instances_lost 0
Slave_open_temp_tables  1
Table_locks_immediate   944290
Table_locks_waited      4
Table_open_cache_active_instances       1
Table_open_cache_hits   23994122
Table_open_cache_misses 286655
Table_open_cache_overflows      91586

real    1m26.419s
user    0m0.089s
sys     0m0.058s

ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/DB/mysql.sock' (111)
crash here!



 Comments   
Comment by Marko Mäkelä [ 2020-11-27 ]

Can you provide any more information on what could have consumed the memory? If this is repeatable, then I would suggest running http://poormansprofiler.org/ or sudo perf top while the server is in the process of consuming the memory, to see what is causing the consumption.

If it is a slower memory leak, then it might be useful to try tcmalloc and https://gperftools.github.io/gperftools/heapprofile.html (or similar tools in jemalloc or the libc of your choice).

The stack traces could also confirm whether this could in any way be related to MDEV-24188.

Comment by Daniel Black [ 2020-11-27 ]

Is the binlog_format, statement, row or mixed? 28G of buffer pool on a 32G machine is a touch tight. What are your innodb buffer pool stats for non-mysqldump operations? Did it not run out of memory in a previous mariadb version with the same configuration?

Comment by Thies Meincke [ 2020-11-27 ]

binlog_format mixed
After upgrading to 10.4 (from 10.2) we experienced sporadic out-of-memory-issues concerning creation of new threads and high memory consumption both on master and slave servers . so the RAM was enlarged from 16 to 32 GB and the configuration reviewed and changed. Since then we have no problems on the master server. The problems on the slave server however remained and have increased. This however that may be due to to a strongly growing size of some databases. On 10.2 we hat no such problems, however, the databases where smaller and the number of databases has also increased.
innodb buffer pool stats:
Master server (no dumps):

MariaDB [(none)]> show global status like '%innodb%buffer%pool%';
+---------------------------------------+--------------------------------------------------+
| Variable_name                         | Value                                            |
+---------------------------------------+--------------------------------------------------+
| Innodb_buffer_pool_dump_status        |                                                  |
| Innodb_buffer_pool_load_status        | Buffer pool(s) load completed at 201121 11:11:44 |
| Innodb_buffer_pool_resize_status      |                                                  |
| Innodb_buffer_pool_load_incomplete    | OFF                                              |
| Innodb_buffer_pool_pages_data         | 1754838                                          |
| Innodb_buffer_pool_bytes_data         | 28096004096                                      |
| Innodb_buffer_pool_pages_dirty        | 19559                                            |
| Innodb_buffer_pool_bytes_dirty        | 318046208                                        |
| Innodb_buffer_pool_pages_flushed      | 10212979                                         |
| Innodb_buffer_pool_pages_free         | 8197                                             |
| Innodb_buffer_pool_pages_misc         | 35013                                            |
| Innodb_buffer_pool_pages_total        | 1798048                                          |
| Innodb_buffer_pool_read_ahead_rnd     | 0                                                |
| Innodb_buffer_pool_read_ahead         | 1362440                                          |
| Innodb_buffer_pool_read_ahead_evicted | 1                                                |
| Innodb_buffer_pool_read_requests      | 30451465654                                      |
| Innodb_buffer_pool_reads              | 1369897                                          |
| Innodb_buffer_pool_wait_free          | 0                                                |
| Innodb_buffer_pool_write_requests     | 809335875                                        |
+----------------------------

Slave Server after Restart and dump of the largest database (60 GB):

MariaDB [(none)]> show global status like '%innodb%buffer%pool%';
+---------------------------------------+--------------------------------------------------+
| Variable_name                         | Value                                            |
+---------------------------------------+--------------------------------------------------+
| Innodb_buffer_pool_dump_status        |                                                  |
| Innodb_buffer_pool_load_status        | Buffer pool(s) load completed at 201127 14:10:27 |
| Innodb_buffer_pool_resize_status      |                                                  |
| Innodb_buffer_pool_load_incomplete    | OFF                                              |
| Innodb_buffer_pool_pages_data         | 1787658                                          |
| Innodb_buffer_pool_bytes_data         | 29323870208                                      |
| Innodb_buffer_pool_pages_dirty        | 4352                                             |
| Innodb_buffer_pool_bytes_dirty        | 69877760                                         |
| Innodb_buffer_pool_pages_flushed      | 19237                                            |
| Innodb_buffer_pool_pages_free         | 8191                                             |
| Innodb_buffer_pool_pages_misc         | 2199                                             |
| Innodb_buffer_pool_pages_total        | 1798048                                          |
| Innodb_buffer_pool_read_ahead_rnd     | 0                                                |
| Innodb_buffer_pool_read_ahead         | 1820872                                          |
| Innodb_buffer_pool_read_ahead_evicted | 38010                                            |
| Innodb_buffer_pool_read_requests      | 106728264                                        |
| Innodb_buffer_pool_reads              | 777306                                           |
| Innodb_buffer_pool_wait_free          | 0                                                |
| Innodb_buffer_pool_write_requests     | 2281372                                          |
+---------------------------------------+--------------------------------------------------+

http://poormansprofiler.org/ or jemalloc or different libc are no choice currently. mysqld is from the binary package for SLES 12 and not compiled locally with debug symbols.

2 gstack outputs and perftop output are attached. gstack.out.gz gstack.out2.gz perftop.out.gz

Generated at Thu Feb 08 09:28:56 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.