[MDEV-21563] FTS thread aborts during shutdown Created: 2020-01-24  Updated: 2020-08-25  Resolved: 2020-02-06

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.1, 10.2, 10.3, 10.4
Fix Version/s: 10.1.45, 10.2.32, 10.3.23, 10.4.13

Type: Bug Priority: Major
Reporter: David Hill (Inactive) Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-12266 Reduce the number of InnoDB tablespac... Closed
relates to MDEV-16198 [Draft] Assertion `index->page != 0xF... Closed

 Description   

This was original opened against Columnstore using build 1.2.5. CS Eng reported it was a crash in InnoDB. May need to be an MDEV ticket. So opening this ticket from MCOL-3740.

Customer reporting that everytime they do a systemShutdown, the mysqld_safe and mysql still show to be running after shutdown is completed.

I worked with them to just try running the below commands to see what happens when nothing else of going on, and it still crashed

/usr/local/mariadb/columnstore/mysql/mysql-Columnstore start
/usr/local/mariadb/columnstore/mysql/mysql-Columnstore stop
/usr/local/mariadb/columnstore/mysql/mysql-Columnstore status

Log files always shows the same thing,

Event Scheduler: Purging the queue. 21 events
200123 13:48:41 [ERROR] mysqld got signal 11 ;
 
/usr/local/mariadb/columnstore/mysql//bin/mysqld (initiated by: unknown): Normal shutdown
2020-01-23 13:48:41 0 [Note] Event Scheduler: Killing the scheduler thread, thread id 6
2020-01-23 13:48:41 0 [Note] Event Scheduler: Waiting for the scheduler thread to reply
2020-01-23 13:48:41 0 [Note] Event Scheduler: Stopped
2020-01-23 13:48:41 0 [Note] Event Scheduler: Purging the queue. 21 events
200123 13:48:41 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Server version: 10.3.16-MariaDB-log
key_buffer_size=536870912
read_buffer_size=4194304
max_used_connections=0
max_threads=65537
thread_count=7
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 537414652 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x80000
/usr/local/mariadb/columnstore/mysql//bin/mysqld(my_print_stacktrace+0x29)[0x55b20cbe6279]
mysys/stacktrace.c:270(my_print_stacktrace)[0x55b20c71eb6f]
sigaction.c:0(__restore_rt)[0x7f9c220485d0]
/usr/local/mariadb/columnstore/mysql//bin/mysqld(+0xa1632a)[0x55b20c9eb32a]
btr/btr0cur.cc:1375(btr_cur_search_to_nth_level_func(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_cur_t*, rw_lock_t*, char const*, unsigned int, mtr_t*, unsigned long))[0x55b20c9631ae]
include/btr0pcur.ic:511(btr_pcur_open_with_no_init_func)[0x55b20c9696fd]
row/row0sel.cc:1681(row_sel(sel_node_t*, que_thr_t*))[0x55b20c96b17b]
row/row0sel.cc:2347(row_sel_step(que_thr_t*))[0x55b20c92007b]
que/que0que.cc:1018(que_thr_step)[0x55b20caa5695]
fts/fts0sql.cc:221(fts_eval_sql(trx_t*, que_fork_t*))[0x55b20ca8f5f4]
fts/fts0fts.cc:2684(fts_cmp_set_sync_doc_id(dict_table_t const*, unsigned long, unsigned long, unsigned long*))[0x55b20ca8f8dc]
fts/fts0fts.cc:4193(fts_sync_commit(fts_sync_t*))[0x55b20ca92fca]
fts/fts0fts.cc:4391(fts_sync(fts_sync_t*, bool, bool, bool) [clone .part.105])[0x55b20ca96942]
fts/fts0opt.cc:2782(fts_optimize_sync_table(unsigned long))[0x55b20ca9c785]
pthread_create.c:0(start_thread)[0x7f9c22040dd5]
/lib64/libc.so.6(clone+0x6d)[0x7f9c1ffecead]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /usr/local/mariadb/columnstore/mysql/db
Resource Limits:
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 982518 982518 processes
Max open files 65535 65535 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 982518 982518 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
Core pattern: core

------------------------
From customer

There are 21 events in the mysql.event table that are enabled

i tried turning off event scheduler but that didn't help still crashed on shutdown



 Comments   
Comment by Marko Mäkelä [ 2020-02-04 ]

The source code line that causes the SIGSEGV in btr_cur_search_to_nth_level_func() was modified in MDEV-12266:

	const page_size_t	page_size(index->table->space->flags);

A possible explanation might be that ALTER TABLE…DISCARD TABLESPACE had been executed on the table. Would we remove the table from the work queue of the fulltext index optimizer in that case? Or in other cases where dict_table_t::space can be assigned to be the null pointer?

Comment by Thirunarayanan Balathandayuthapani [ 2020-02-06 ]

The following test case does crash during shutdown :

--source include/have_innodb.inc
let $MYSQLD_TMPDIR = `SELECT @@tmpdir`;
let $MYSQLD_DATADIR = `SELECT @@datadir`;
 
create table t1(f1 int not null, f2 varchar(100),
                fulltext idx(f2))engine=innodb;
set debug_dbug="+d,fts_instrument_sync_request";
insert into t1 values(1, "mariadb");
alter table t1 discard tablespace;
--source include/restart_mysqld.inc
drop table t1; 

The following patch solves the issue:

diff --git a/storage/innobase/row/row0mysql.cc b/storage/innobase/row/row0mysql.cc
index c9cab91cb1f..22628acd166 100644
--- a/storage/innobase/row/row0mysql.cc
+++ b/storage/innobase/row/row0mysql.cc
@@ -3140,6 +3140,16 @@ row_discard_tablespace_for_mysql(
                err = DB_ERROR;
 
        } else {
+               bool    fts_exist =(dict_table_has_fts_index(table)
+                                   || DICT_TF2_FLAG_IS_SET(
+                                       table, DICT_TF2_FTS_HAS_DOC_ID));
+
+               if (fts_exist) {
+                       row_mysql_unlock_data_dictionary(trx);
+                       fts_optimize_remove_table(table);
+                       row_mysql_lock_data_dictionary(trx);
+               }
+
                ut_ad(!table->n_foreign_key_checks_running);
 
                /* Do foreign key constraint checks. */
@@ -3149,6 +3159,10 @@ row_discard_tablespace_for_mysql(
                if (err == DB_SUCCESS) {
                        err = row_discard_tablespace(trx, table);
                }
+
+               if (fts_exist && err != DB_SUCCESS) {
+                       fts_optimize_add_table(table);
+               }
        }
 
        return(row_discard_tablespace_end(trx, table, err));

Problem is that, while discarding the fts tablespace, InnoDB fails to remove the table from fts_optimize_wq. This patch does remove the table from fts_optimize_wq.

Generated at Thu Feb 08 09:08:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.