[MDEV-17904] Server crashes in fts_is_sync_needed upon timed out FLUSH TABLES after unsuccessful ADD FOREIGN KEY Created: 2018-12-05  Updated: 2018-12-07  Resolved: 2018-12-07

Status: Closed
Project: MariaDB Server
Component/s: Full-text Search, Storage Engine - InnoDB
Affects Version/s: 10.2
Fix Version/s: 10.4.1, 10.1.38, 10.0.38, 10.2.20, 10.3.12

Type: Bug Priority: Critical
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: affects-tests


 Description   

Note: I can't say whether the problem is new, because until recently the test case failed with MDEV-17432 (and it still does on 10.3+).

Run the test case with --repeat=N. It usually fails for me on the 2nd attempt. Possibly when the reason is known, it can be converted into a deterministic test case. It would also be good to get rid of the long lock_wait_timeout.

--source include/have_innodb.inc
 
CREATE TABLE t1 ( 
  pk INT AUTO_INCREMENT,
  c CHAR(105) NOT NULL DEFAULT '',
  PRIMARY KEY(pk),
  FULLTEXT KEY(c)
) ENGINE=InnoDB;
 
--error ER_CANT_CREATE_TABLE
ALTER TABLE t1 ADD FOREIGN KEY (c) REFERENCES x(x);
LOCK TABLE t1 READ;
 
--connect (con1,localhost,root,,test)
 
SET lock_wait_timeout= 10;
--error ER_LOCK_WAIT_TIMEOUT
FLUSH TABLES;
 
# Cleanup
--disconnect con1
--connection default
UNLOCK TABLES;
DROP TABLE t1;

10.2 5ec9b88e1

#3  <signal handler called>
#4  0x000055bb3f86e67b in fts_is_sync_needed (tables=0x55bb424f0508) at /data/src/10.2-bug/storage/innobase/fts/fts0opt.cc:2805
#5  0x000055bb3f86e9cc in fts_optimize_thread (arg=0x55bb424f02a0) at /data/src/10.2-bug/storage/innobase/fts/fts0opt.cc:2901
#6  0x00007f7d723be494 in start_thread (arg=0x7f7d577fe700) at pthread_create.c:333
#7  0x00007f7d709bf93f in clone () from /lib/x86_64-linux-gnu/libc.so.6



 Comments   
Comment by Elena Stepanova [ 2018-12-06 ]

The following assertion failure has also occurred in the tests:

10.3 b6f203984bc519a31ac695cbcb6de7f1f638d321

mysqld: /home/travis/src/storage/innobase/fts/fts0opt.cc:2658: ulint fts_optimize_new_table(ib_vector_t*, dict_table_t*): Assertion `slot->table_id == table->id' failed.
181205 14:59:57 [ERROR] mysqld got signal 6 ;
 
#7  0x00007fe99914fc82 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x00005587951bbcb7 in fts_optimize_new_table (tables=0x558798865d08, table=0x7fe99413df58) at /home/travis/src/storage/innobase/fts/fts0opt.cc:2658
#9  0x00005587951bc566 in fts_optimize_thread (arg=0x558798861140) at /home/travis/src/storage/innobase/fts/fts0opt.cc:2915
#10 0x00007fe999d946ba in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00007fe99922941d in clone () from /lib/x86_64-linux-gnu/libc.so.6

I suppose it's a variation of the same problem. I don't have a separate test case for the assertion failure.

Comment by Marko Mäkelä [ 2018-12-07 ]

Both variations should be equivalent. The cause is that slot->table is pointing to freed memory. AddressSanitizer confirms it:

10.2 5ec9b88e11118c798ff2381771a72f76b2b72f9e

==10651==ERROR: AddressSanitizer: heap-use-after-free on address 0x61700006ab98 at pc 0x5558fc2c9857 bp 0x7f00e9e43b60 sp 0x7f00e9e43b58
READ of size 8 at 0x61700006ab98 thread T15
    #0 0x5558fc2c9856 in fts_is_sync_needed /mariadb/10.2/storage/innobase/fts/fts0opt.cc:2805
    #1 0x5558fc2c9e2b in fts_optimize_thread /mariadb/10.2/storage/innobase/fts/fts0opt.cc:2901
    #2 0x7f00f5d57fa2 in start_thread /build/glibc-CEX3HA/glibc-2.28/nptl/pthread_create.c:486
    #3 0x7f00f4e1688e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xf988e)
 
0x61700006ab98 is located 664 bytes inside of 760-byte region [0x61700006a900,0x61700006abf8)
freed by thread T27 here:
    #0 0x7f00f5e59b50 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe8b50)
    #1 0x5558fbcb233c in mem_heap_block_free(mem_block_info_t*, mem_block_info_t*) /mariadb/10.2/storage/innobase/mem/mem0mem.cc:440
    #2 0x5558fc1becca in mem_heap_free /mariadb/10.2/storage/innobase/include/mem0mem.ic:527
    #3 0x5558fc1c11f2 in dict_mem_table_free(dict_table_t*) /mariadb/10.2/storage/innobase/dict/dict0mem.cc:220
    #4 0x5558fc1755e3 in dict_table_remove_from_cache_low(dict_table_t*, unsigned long) /mariadb/10.2/storage/innobase/dict/dict0dict.cc:2104
    #5 0x5558fc1756bc in dict_table_remove_from_cache(dict_table_t*) /mariadb/10.2/storage/innobase/dict/dict0dict.cc:2114
    #6 0x5558fbe1631a in row_drop_table_from_cache /mariadb/10.2/storage/innobase/row/row0mysql.cc:3198
    #7 0x5558fbe19626 in row_drop_table_for_mysql(char const*, trx_t*, enum_sql_command, bool, bool) /mariadb/10.2/storage/innobase/row/row0mysql.cc:3720
    #8 0x5558fbb62456 in ha_innobase::create(char const*, TABLE*, HA_CREATE_INFO*, bool, trx_t*) (/dev/shm/10.2/sql/mysqld+0x19b6456)

Comment by Marko Mäkelä [ 2018-12-07 ]

The problem is that a function was called prematurely, before the table was successfully created. With the following change, I cannot repeat this particular crash:

diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index cadb2969c7f..3eaac0e7878 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -11254,10 +11254,6 @@ create_table_info_t::create_table_def()
 			 : ER_TABLESPACE_EXISTS, MYF(0), display_name);
 	}
 
-	if (err == DB_SUCCESS && (m_flags2 & DICT_TF2_FTS)) {
-		fts_optimize_add_table(table);
-	}
-
 error_ret:
 	DBUG_RETURN(convert_error_code_to_mysql(err, m_flags, m_thd));
 }
@@ -12697,6 +12693,10 @@ int create_table_info_t::create_table(bool create_fk)
 		m_table_name, TRUE, FALSE, DICT_ERR_IGNORE_NONE);
 
 	if (innobase_table != NULL) {
+		if (m_flags2 & DICT_TF2_FTS) {
+			fts_optimize_add_table(innobase_table);
+		}
+
 		dict_table_close(innobase_table, TRUE, FALSE);
 	}
 

But, this code feels fragile. It seems that normal DROP TABLE and ALTER TABLE are covered by the following:

void
dict_mem_table_free(
/*================*/
	dict_table_t*	table)		/*!< in: table */
{
	ut_ad(table);
	ut_ad(table->magic_n == DICT_TABLE_MAGIC_N);
	ut_d(table->cached = FALSE);
 
	if (dict_table_has_fts_index(table)
	    || DICT_TF2_FLAG_IS_SET(table, DICT_TF2_FTS_HAS_DOC_ID)
	    || DICT_TF2_FLAG_IS_SET(table, DICT_TF2_FTS_ADD_DOC_ID)) {
		if (table->fts) {
			fts_optimize_remove_table(table);
 
			fts_free(table);
		}
	}

In the submitted test case, the table was being rebuilt due to ADD FOREIGN KEY, but an error was detected before the indexes were added to the being-created new table stub.

It looks like this bug is also possible in MariaDB 10.0 and 10.1: fts_optimize_add_table() is being invoked already in create_table_def(), before the table has been successfully created. The fix would be a little different, because create_table_info_t was introduced in 10.2 only.

Generated at Thu Feb 08 08:40:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.