[MDEV-10499] Extreme memory use when rebuilding a large table Created: 2016-08-05  Updated: 2016-10-07  Resolved: 2016-10-06

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Affects Version/s: 10.1.16
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Mark Wadham Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: need_feedback


 Description   

When rebuilding a very large table as a result of executing DDL, e.g.:

alter table <table> add column boris int(11) null;

the rebuild reaches this point with very little memory use:

2016-08-04 13:17:46 139771367865088 [Note] InnoDB: Online DDL : Start building index `PRIMARY` (1 / 4), estimated cost : 22.1238

At this point it begins to consume all the available RAM and then all the available swap, and if there isn't enough, the process gets OOM-killed by the kernel.

We have run this test many times with a 160GB table on nodes that have 64GB of RAM and no swap; each run exhausted all memory and was OOM-killed by the kernel. We succeeded once after lowering the InnoDB buffer pool from 50GB to 16GB and adding a 16GB swap volume, but attempts to repeat that failed. The rebuild does succeed if we add a very large swap volume, for example one the size of the table's data length, which is around ~90GB.

It therefore seems as if rebuilds of the primary index (which includes all of the table data) are being done purely in-memory rather than on-disk.
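To observe the behaviour described above, one can poll mysqld's resident set size and flag the moment it overshoots the configured buffer pool. This is a minimal bash sketch, assuming Linux (/proc) and the 49155M buffer pool from the cnf below; the function and file names are illustrative, not part of any MariaDB tooling.

```shell
#!/bin/bash
# Sketch: watch mysqld RSS during an online DDL rebuild and log when it
# exceeds the configured InnoDB buffer pool size (an approximation of
# "normal" InnoDB memory use; mysqld legitimately uses somewhat more).

rss_exceeds_pool() {
  # $1 = RSS in KiB (as reported by /proc/<pid>/status VmRSS)
  # $2 = innodb_buffer_pool_size in MiB
  # Returns success (0) when RSS is larger than the pool.
  local rss_kib=$1 pool_mib=$2
  [ "$rss_kib" -gt $((pool_mib * 1024)) ]
}

watch_mysqld() {
  # $1 = mysqld pid, $2 = buffer pool size in MiB (e.g. 49155 from the cnf)
  local pid=$1 pool_mib=$2 rss
  while kill -0 "$pid" 2>/dev/null; do
    rss=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status")
    if rss_exceeds_pool "$rss" "$pool_mib"; then
      echo "$(date -u +%FT%TZ) mysqld RSS ${rss} KiB exceeds pool ${pool_mib} MiB"
    fi
    sleep 10
  done
}

# Usage (hypothetical): watch_mysqld "$(pidof mysqld)" 49155
```

Capturing a log like this alongside the ALTER would show whether memory growth tracks the rebuild phase reported in the error log.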



 Comments   
Comment by Elena Stepanova [ 2016-08-07 ]

Could you please paste the output of SHOW CREATE TABLE and attach your cnf file(s)?

I tried to reproduce on a smaller table (ibd file is ~11 GB) with an 8 GB buffer pool, but I'm not seeing memory growth above the buffer pool size on an arbitrary table with the default configuration (apart from the increased buffer pool).

Comment by Mark Wadham [ 2016-08-09 ]

CREATE TABLE `a` (
  `a_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `b_id` int(10) unsigned NOT NULL,
  `c_id` int(10) unsigned NOT NULL,
  `d_id` int(10) unsigned DEFAULT NULL,
  `e` varchar(20) NOT NULL,
  `f` tinyint(3) unsigned NOT NULL,
  `g` tinyint(3) unsigned NOT NULL,
  `h` int(10) unsigned NOT NULL,
  `i` char(1) NOT NULL,
  `j_id` int(10) unsigned DEFAULT NULL,
  `k` int(10) DEFAULT NULL,
  `l` datetime DEFAULT NULL,
  PRIMARY KEY (`a_id`),
  KEY `a_FKIndex1` (`d_id`),
  KEY `a_FKIndex2` (`c_id`),
  KEY `a_FKIndex3` (`b_id`),
  CONSTRAINT `a_ibfk_1` FOREIGN KEY (`d_id`) REFERENCES `a` (`a_id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
  CONSTRAINT `a_ibfk_2` FOREIGN KEY (`c_id`) REFERENCES `c` (`c_id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
  CONSTRAINT `a_ibfk_3` FOREIGN KEY (`b_id`) REFERENCES `b` (`b_id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=1907542996 DEFAULT CHARSET=latin1;

Comment by Mark Wadham [ 2016-08-09 ]

[mysqld]
#
# MySQL directories
# -----------------
datadir=/mysql/data
socket=/mysql/mysql.sock
tmpdir = /mysql/tmp/
 
skip-slave-start
 
# MyISAM
# ------
key_buffer_size = 32M
 
# Core server config
# ------------------
user = mysql
default-storage-engine=InnoDB
max_allowed_packet = 16M
max_connections=768
 
lc_messages     = en_GB
skip-external-locking
 
 
# Cache/Buffer config
# -------------------
table_open_cache = 2048           # Added for 5.6 was table_cache = 2048
thread_cache_size = 16
query_cache_size = 64M
query_cache_limit = 256K
sort_buffer_size = 32M
read_buffer_size = 16M
read_rnd_buffer_size = 64M
 
 
# Replication
# -----------
log-bin = /mysql/logs/galera0-bin
log_bin_index = /mysql/logs/galera0-bin.index
log_slave_updates = 1
expire_logs_days = 3
binlog_format=ROW
relay-log=/mysql/logs/galera0-relay-bin
server-id = 1020144245
 
# Logging
# -------
log-queries-not-using-indexes = 0
slow_query_log = 1
slow-query_log_file = /mysql/logs/galera0-slow.log
# slow_query_log_timestamp_precision=microsecond
long_query_time = 60 # We should use this log to cherry pick queries for optimisation and drive down this value
 
 
 
# InnoDB
# ------
innodb_file_format = Barracuda
innodb_data_file_path = ibdata1:5M;ibdata2:5M:autoextend
innodb_log_group_home_dir = /mysql/innodb
innodb_file_per_table = 1
innodb_buffer_pool_size = 49155M
innodb_log_file_size = 1024M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency=16
innodb_flush_method = O_DIRECT
innodb_lock_wait_timeout = 10
innodb_io_capacity      = 20000
innodb_flush_neighbors = 0
innodb_autoinc_lock_mode = 2
innodb_doublewrite       = 1
 
 
#######################################################################################
#
[mysqldump]
#
# Dump every row of a table to disk rather than buffering it all up
quick
#
# Allow up to 16MB of network traffic
max_allowed_packet = 16M
socket=/var/lib/mysql/mysql.sock
#
#
[myisamchk]
#
# These are repair options for MyISAM
key_buffer_size = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M
 
[mysql]
#
# MySQL directories
# -----------------
# socket=/var/lib/mysql/mysql.sock
 
[mysqladmin]
socket=/var/lib/mysql/mysql.sock
 
 
!includedir /etc/my.cnf.d

Comment by Elena Stepanova [ 2016-09-06 ]

No matter what I do, I'm not getting the described memory consumption. The only way I can make the server eat up memory on this operation is if I map the datadir where the temporary file is created to RAM, but I assume that if that were the case here, it would have been mentioned.

Which environment are you working in (Linux distribution/version, etc.)?

Your cnf file includes /etc/my.cnf.d directory, is there anything in there?

Names of your binary logs mention galera, but there is no other Galera-related configuration – is the server a Galera node, by any chance?

When the problem happens, does top show the corresponding figures for mysqld? Could you please paste the output of top for the process at a point where it has clearly overgrown innodb_buffer_pool_size but hasn't been killed by the OOM watchdog yet? Can you also take a stack trace from the running server at that time? gdb --batch --eval-command="thread apply all bt" <mysqld path> <pid>
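The two diagnostics requested above can be captured together in one pass. This is a hedged sketch: the mysqld binary path, the use of pidof, and the /tmp output files are assumptions to adapt to the actual install.

```shell
#!/bin/bash
# Sketch: capture a top snapshot and a full-thread backtrace of mysqld,
# per the gdb invocation quoted in the comment above.

bt_cmd() {
  # Compose the gdb command line as a single string: $1 = mysqld binary
  # path, $2 = pid. Kept separate so it can be inspected before running.
  printf 'gdb --batch --eval-command="thread apply all bt" %s %s' "$1" "$2"
}

capture_diagnostics() {
  local pid
  pid=$(pidof mysqld)                          # assumes a single mysqld instance
  top -b -n 1 -p "$pid" > /tmp/mysqld-top.txt  # one batch-mode top snapshot
  eval "$(bt_cmd /usr/sbin/mysqld "$pid")" > /tmp/mysqld-backtrace.txt 2>&1
}

# Usage (hypothetical): run capture_diagnostics while RSS is clearly above
# the buffer pool, then attach both /tmp files to the issue.
```

Note that attaching gdb briefly pauses the server, so this is best run on a test node rather than production.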

Comment by Elena Stepanova [ 2016-10-06 ]

Please comment to re-open the issue if you have more information.

Comment by Mark Wadham [ 2016-10-07 ]

Hi Elena,

Sorry for my slow response.

It is a Galera node. Unfortunately I don't have time to do any further testing, and the test environment has been destroyed now anyway. We have worked around the problem with operational procedures.

The Galera cluster was running on AWS; I don't know why that would affect it in this way, but I guess it's possible.

Thanks,
Mark

Generated at Thu Feb 08 07:42:40 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.