[MDEV-13446] fts_create_doc_id() unnecessarily allocates 8 bytes for every inserted row Created: 2017-08-04  Updated: 2017-10-09  Resolved: 2017-10-09

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Storage Engine - XtraDB
Affects Version/s: 10.0, 10.1, 10.1.23, 10.2
Fix Version/s: 10.0.33, 10.1.29, 10.2.10, 10.3.3

Type: Bug Priority: Critical
Reporter: Armin Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: upstream
Environment:

Debian 9, Server version: 10.1.23-MariaDB-9+deb9u1


Attachments: File crashtest_db.sql     File crashtest_script.php     File mysql_catcreate.log     File server.cnf    
Sprint: 10.2.10

 Description   

After a lot of activity (SELECTs and INSERTS for hours) with 100% CPU load of mysqld, the memory consumption grows and grows.
Then mysqld crashes and error.log says:

2017-08-03 19:51:53 139829468153600 [ERROR] InnoDB:  InnoDB: Unable to allocate memory of size 8104.
 
2017-08-03 19:51:53 7f2c95c74700  InnoDB: Assertion failure in thread 139829468153600 in file ha_innodb.cc line 21990
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
170803 19:51:53 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.1.23-MariaDB-9+deb9u1
key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=9
max_threads=153
thread_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 78257 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f2bf46cc008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f2c95c73cb8 thread_stack 0x30000
(my_addr_resolve failure: fork)
/usr/sbin/mysqld(my_print_stacktrace+0x2e) [0x555d95eb39ce]
/usr/sbin/mysqld(handle_fatal_signal+0x3bd) [0x555d959fbd1d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x110c0) [0x7f2c959ab0c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcf) [0x7f2c942a5fcf]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) [0x7f2c942a73fa]
/usr/sbin/mysqld(+0x82009b) [0x555d95c8209b]
/usr/sbin/mysqld(+0x876fb7) [0x555d95cd8fb7]
/usr/sbin/mysqld(+0x9e2eed) [0x555d95e44eed]
/usr/sbin/mysqld(+0x9f07c6) [0x555d95e527c6]
/usr/sbin/mysqld(+0x8cb313) [0x555d95d2d313]
/usr/sbin/mysqld(+0x8228b5) [0x555d95c848b5]
/usr/sbin/mysqld(handler::ha_write_row(unsigned char*)+0x37f) [0x555d95a05c5f]
/usr/sbin/mysqld(write_record(THD*, TABLE*, st_copy_info*)+0x238) [0x555d95851b38]
/usr/sbin/mysqld(select_insert::send_data(List<Item>&)+0xf1) [0x555d958523f1]
/usr/sbin/mysqld(+0x456813) [0x555d958b8813]
/usr/sbin/mysqld(+0x43a096) [0x555d9589c096]
/usr/sbin/mysqld(sub_select(JOIN*, st_join_table*, bool)+0x12d) [0x555d958a51fd]
/usr/sbin/mysqld(+0x43a096) [0x555d9589c096]
/usr/sbin/mysqld(sub_select(JOIN*, st_join_table*, bool)+0x15b) [0x555d958a522b]
/usr/sbin/mysqld(+0x453ebd) [0x555d958b5ebd]
/usr/sbin/mysqld(JOIN::exec_inner()+0xbe2) [0x555d958c8912]
/usr/sbin/mysqld(JOIN::exec()+0x54) [0x555d958ca5b4]
/usr/sbin/mysqld(mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*)+0xe7) [0x555d958c6f47]
/usr/sbin/mysqld(handle_select(THD*, LEX*, select_result*, unsigned long)+0x13f) [0x555d958c791f]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0x860b) [0x555d9587511b]
/usr/sbin/mysqld(mysql_parse(THD*, char*, unsigned int, Parser_state*)+0x311) [0x555d95875a41]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x2463) [0x555d95878dc3]
/usr/sbin/mysqld(do_command(THD*)+0x155) [0x555d958794d5]
/usr/sbin/mysqld(do_handle_one_connection(THD*)+0x1d2) [0x555d959435d2]
/usr/sbin/mysqld(handle_one_connection+0x40) [0x555d95943730]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7494) [0x7f2c959a1494]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f2c9435baff]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f2bf4020020): REPLACE INTO isys_search_idx (isys_search_idx__version, isys_search_idx__type, isys_search_idx__key, isys_search_idx__value, isys_search_idx__reference) SELECT 1, 'cmdb', CONCAT(isys_obj__isys_obj_type__id, '.custom_fields.', isys_catg_custom_fields_list__data__id, '.', REPLACE(isysgui_catg_custom__title, '.', '_'), '.', isysgui_catg_custom__id, '.', isys_catg_custom_fields_list__field_key), isys_catg_custom_fields_list__field_content, isys_obj__id FROM isys_catg_custom_fields_list INNER JOIN isys_obj ON isys_catg_custom_fields_list__isys_obj__id = isys_obj__id INNER JOIN isysgui_catg_custom ON isys_catg_custom_fields_list__isysgui_catg_custom__id = isysgui_catg_custom__id WHERE isys_catg_custom_fields_list__field_type IN ('f_text', 'f_textarea', 'f_link') AND isys_catg_custom_fields_list__isysgui_catg_custom__id = 9
Connection ID (thread ID): 11
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
2017-08-03 19:52:01 140411564941888 [Warning] option 'innodb-sort-buffer-size': unsigned value 134217728 adjusted to 67108864
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: The InnoDB memory heap is disabled
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: Compressed tables use zlib 1.2.8
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: Using Linux native AIO
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: Using SSE crc32 instructions
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: Initializing buffer pool, size = 2.0G
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: Completed initialization of buffer pool
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: Highest supported file format is Barracuda.
2017-08-03 19:52:01 140411564941888 [Note] InnoDB: Starting crash recovery from checkpoint LSN=269528798739
2017-08-03 19:52:02 140411564941888 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...



 Comments   
Comment by Elena Stepanova [ 2017-08-04 ]

Please attach your cnf file(s).

From the description, it seems that you are observing this problem on a regular basis. Is it production? Would you be able to enable the general log for one cycle of restart - memory growth - crash, and then provide it, ideally along with the schema?

Do you happen to have the coredump from the crash, and if not, can you get one next time when the problem occurs?

Comment by Armin [ 2017-08-07 ]

I have attached a part of the general log file. This part is one cycle (of more than thousand) which create the error.
When I execute just the first "REPLACE INTO" of that log (line 241) for many times, I can observe mysqld using more and more memory. But the memory consumption stops at about 90% of host memory without the crash.
The crash seems to occur with the full actions from that log only.
The full log would be too big, but it includes the part of attached mysql_catcreate.log only (again and again).

Yes, it a production system and I can recreate the crash with the mentioned queries each time.
I will try to clone the production system and reduce it to what is needed to reproduce it. Then I will attach the infos.

It was also possible to trigger the crash using
mysqld Ver 5.6.31-0ubuntu0.15.10.1 for debian-linux-gnu on x86_64 ((Ubuntu))

Comment by Elena Stepanova [ 2017-08-07 ]

Thanks for this. Could you please also attach your cnf file(s) and paste the output of SHOW CREATE TABLE for the involved tables? (It might be easier to run mysqldump with for the schema only, without data). When we have this, we can set up a test running the same workflow on artificial data, and hopefully reproduce the problem.

Comment by Armin [ 2017-08-19 ]

I have attached the cnf file I use on a Debian 9 base installation. Also attached is a mysqldump and a php script to reproduce the error.
I was able to strip down the database contents and the sql queries to a minimum.
When a database is filled with the crashtest_db.sql contents and you run the php script, after some hours the crash occurs. mysqld takes a lot of memory, which is no problem at the beginning. But after about cycle# 13000 it is a problem and either mysqld or any other process does not get any memory. Most of the time mysql crashes then.

Comment by Alice Sherepa [ 2017-09-18 ]

Memory leak, reproducible on 10.0, 10.1, 10.2
test case

CREATE TABLE `t5` (
  `t5_id` int  NOT NULL AUTO_INCREMENT  PRIMARY KEY,
  `t5_type_id` int 
) ENGINE=InnoDB AUTO_INCREMENT=217236 ;
 
INSERT INTO `t5` VALUES (9,53),(23,53);
 
CREATE TABLE `t6` (
  `t6_version` int  NOT NULL DEFAULT '1',
  `t6_type` varchar(100)  NOT NULL DEFAULT '',
  `t6_key` varchar(100)  NOT NULL DEFAULT '',
  `t6_value` text ,
  `t6_reference` int  DEFAULT NULL,
  PRIMARY KEY (`t6_version`,`t6_key`),
  FULLTEXT KEY `fulltext_value` (`t6_value`),
  FULLTEXT KEY `fulltext_key` (`t6_key`)
) ENGINE=InnoDB ;
 
CREATE TABLE `t7` (
  `t7_id` int  NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `t7_title` varchar(255)
) ENGINE=InnoDB AUTO_INCREMENT=10 ;
 
INSERT INTO `t7` VALUES (9,'Repository Access List');
 
CREATE TABLE `t1` (
  `t1_id` int(10)  NOT NULL AUTO_INCREMENT  PRIMARY KEY,
  `t1_t5_id` int(10)  NOT NULL,
  `t1_t7_id` int(10)  NOT NULL,
  `t1_field_type` varchar(255) NOT NULL,
  `t1_status` int(10) NOT NULL,
  `t1_data_id` int(10)  DEFAULT '1',
  CONSTRAINT `t1_ibfk_2` FOREIGN KEY (`t1_t7_id`) REFERENCES `t7` (`t7_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=164942;
 
DROP PROCEDURE IF EXISTS pr1;
 
DELIMITER $$
CREATE PROCEDURE pr1(p1 INT)
BEGIN
  l1: LOOP
    SET p1 = p1 + 1;
    IF p1 < 100000 THEN
 
    INSERT INTO t1 SET t1_t5_id = 23, t1_t7_id = 9,
        t1_field_type = 'f_popup',t1_status = 2, t1_data_id = p1;
 
    REPLACE INTO t6 ( t6_version, t6_type, t6_key, t6_value, t6_reference) 
    SELECT  1, 'cmdb', CONCAT( t5_type_id, t1_data_id,t7_title, t7_id), '163589', t5_id 
    FROM t1 
    JOIN t5 ON t1_t5_id = t5_id 
    JOIN t7 ON t1_t7_id = t7_id 
    WHERE t1_field_type IN ('f_text', 'f_textarea', 'f_link','f_popup');
 
      ITERATE l1;
    END IF;
  LEAVE l1;
END LOOP l1;
END;
$$
DELIMITER ;
 
CALL pr1(0);

from top:

 4917 alice     20   0 1179852 435556  23932 S  50,2  2,7   0:13.18 /10.1/bin/mysqld --no-defaults  
 4917 alice     20   0 1179852 436332  23932 S  55,3  2,7   0:14.84 /10.1/bin/mysqld --no-defaults  
 4917 alice     20   0 1179852 437124  23932 S  52,2  2,7   0:16.41 /10.1/bin/mysqld --no-defaults  
 4917 alice     20   0 1179852 438148  23932 S  54,8  2,7   0:18.06 /10.1/bin/mysqld --no-defaults  
 4917 alice     20   0 1179852 438940  23932 S  56,5  2,7   0:19.76 /10.1/bin/mysqld --no-defaults  
 4917 alice     20   0 1179852 439700  23932 S  57,8  2,7   0:21.50 /10.1/bin/mysqld --no-defaults  
 4917 alice     20   0 1183948 440492  23932 S  57,3  2,7   0:23.22 /10.1/bin/mysqld --no-defaults  
 4917 alice     20   0 1183948 441268  23932 S  55,1  2,7   0:24.88 /10.1/bin/mysqld --no-defaults  

Comment by Marko Mäkelä [ 2017-10-04 ]

alice, I cannot repeat this on 10.0. There were some syntax errors in your SQL snippet, which I may of course have fixed incorrectly. Here is my version of the test:

--source include/have_innodb.inc
 
CREATE TABLE `t5` (
  `t5_id` int  NOT NULL AUTO_INCREMENT  PRIMARY KEY,
  `t5_type_id` int 
) ENGINE=InnoDB AUTO_INCREMENT=217236 ;
 
INSERT INTO `t5` VALUES (9,53),(23,53);
 
CREATE TABLE `t6` (
  `t6_version` int  NOT NULL DEFAULT '1',
  `t6_type` varchar(100)  NOT NULL DEFAULT '',
  `t6_key` varchar(100)  NOT NULL DEFAULT '',
  `t6_value` text ,
  `t6_reference` int  DEFAULT NULL,
  PRIMARY KEY (`t6_version`,`t6_key`),
  FULLTEXT KEY `fulltext_value` (`t6_value`),
  FULLTEXT KEY `fulltext_key` (`t6_key`)
) ENGINE=InnoDB ;
 
CREATE TABLE `t7` (
  `t7_id` int  NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `t7_title` varchar(255)
) ENGINE=InnoDB AUTO_INCREMENT=10 ;
 
INSERT INTO `t7` VALUES (9,'Repository Access List');
 
CREATE TABLE `t1` (
  `t1_id` int(10)  NOT NULL AUTO_INCREMENT  PRIMARY KEY,
  `t1_t5_id` int(10)  NOT NULL,
  `t1_t7_id` int(10)  NOT NULL,
  `t1_field_type` varchar(255) NOT NULL,
  `t1_status` int(10) NOT NULL,
  `t1_data_id` int(10)  DEFAULT '1',
  CONSTRAINT `t1_ibfk_2` FOREIGN KEY (`t1_t7_id`) REFERENCES `t7` (`t7_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=164942;
 
DELIMITER $$;
CREATE PROCEDURE pr1(p1 INT)
BEGIN
  l1: LOOP
    SET p1 = p1 + 1;
    IF p1 < 100000 THEN
 
    INSERT INTO t1 SET t1_t5_id = 23, t1_t7_id = 9,
        t1_field_type = 'f_popup',t1_status = 2, t1_data_id = p1;
 
    REPLACE INTO t6 ( t6_version, t6_type, t6_key, t6_value, t6_reference) 
    SELECT  1, 'cmdb', CONCAT( t5_type_id, t1_data_id,t7_title, t7_id), '163589', t5_id 
    FROM t1 
    JOIN t5 ON t1_t5_id = t5_id 
    JOIN t7 ON t1_t7_id = t7_id 
    WHERE t1_field_type IN ('f_text', 'f_textarea', 'f_link','f_popup');
 
      ITERATE l1;
    END IF;
  LEAVE l1;
END LOOP l1;
END;
$$
DELIMITER ;$$
 
CALL pr1(0);

For me, for both xtradb and innodb_plugin (10.0 debug build), the resident set size is growing slowly, but it is around 120 megabytes.

After some 2 minutes of CPU time, the resident set size of xtradb and innodb_plugin (10.1 non-debug build) exceeded 160 megabytes.

To rule out a memory leak, I built the executable -DWITH_ASAN and shut down the server (killall -QUIT mysqld) while the test was running. The initial RSS is larger (over 300 megabytes instead of some 100 megabytes) due to ASAN overhead. At 1 minute and 36 seconds of CPU usage it exceeds 400 megabytes. No leaks or other errors were reported by AddressSanitizer.

I think that this is a memory fragmentation problem. alice, can you please narrow it down? If the FULLTEXT INDEX are omitted, will it improve?

Comment by Alice Sherepa [ 2017-10-06 ]

Marko Mäkelä, test was correct, but not mtr-friendly, you need semicolon after DELIMITER $$ in mtr, but not if you just use client, as I did.
I checked once more and it looks like it is memory leak there, memory grows slowly, but if I leave it for a night, it grows to GB;
i tried with 10.0.33-MariaDB-debug, commit dbeffabc83ed0111
her is results of top:

14128 alice     20   0  760984 363704  19948 S  68,8  2,2   0:49.91 sql//mysqld --defaults-file=/home/alice+
14128 alice     20   0  760984 364496  19948 S  63,8  2,2   0:51.83 sql//mysqld --defaults-file=/home/alice+
14128 alice     20   0  760984 365812  19948 S  64,1  2,2   0:53.76 sql//mysqld --defaults-file=/home/alice+
14128 alice     20   0  760984 366868  19948 S  65,0  2,2   0:55.71 sql//mysqld --defaults-file=/home/alice+
14128 alice     20   0  760984 367924  19948 S  62,1  2,3   0:57.58 sql//mysqld --defaults-file=/home/alice+

after 15 min

14128 alice     20   0  814232 515024  20008 S  67,0  3,2  15:54.16 sql//mysqld --defaults-file=/home/alice+
14128 alice     20   0  814232 515288  20008 S  81,1  3,2  15:56.60 sql//mysqld --defaults-file=/home/alice+
14128 alice     20   0  814232 515552  20008 S  78,7  3,2  15:58.97 sql//mysqld --defaults-file=/home/alice+
14128 alice     20   0  814232 515552  20008 S  74,8  3,2  16:01.22 sql//mysqld --defaults-file=/home/alice+
14128 alice     20   0  814232 515812  20008 S  76,1  3,2  16:03.51 sql//mysqld --defaults-file=/home/alice+
14128 alice     20   0  814232 515812  20008 S  77,4  3,2  16:05.84 sql//mysqld --defaults-file=/home/alice+

If there is no FULLTEXT INDEX there, then memory is not growing.

Comment by Marko Mäkelä [ 2017-10-06 ]

alice, thanks! I did observe a slow growth as well, maybe a few dozen megabytes per minute. But no memory leak.
The problem seems to be memory fragmentation in the InnoDB fulltext index implementation. That is, freed memory blocks are scattered and new allocation requests will not reuse these 'holes' in the address space.
It is more tricky than a simple memory leak, because the fix would involve deep analysis of the allocation patterns and potentially large code refactoring (replacing malloc()/free() with allocations from the stack or from a mem_heap_t, which is the InnoDB equivalent of MEM_ROOT).

Comment by Marko Mäkelä [ 2017-10-09 ]

With pprof from Google pertools, I determined that the culprit is the call to fts_create_doc_id(). It is unnecessarily allocating a new 8-byte buffer for the value of the hidden FTS_DOC_ID column. This buffer could be allocated just once, at the end of row_prebuilt_t::ins_upd_rec_buff.

This is not actually memory fragmentation. This is unnecessary allocation of memory when a previous allocation in the same mem_heap_t could be reused.

Generated at Thu Feb 08 08:05:40 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.