[MDEV-14154] Failing assertion: slot->last_run <= current_time in fts0opt.cc - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Blocker
Resolution: Fixed
Affects Version/s: 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
Fix Version/s: 10.2.26, 10.1.41, 10.3.17, 10.4.7
Component/s: Full-text Search, Storage Engine - InnoDB, Storage Engine - XtraDB, Tests
Labels:
- fulltext
Environment:
https://travis-ci.org/elenst/travis-tests/jobs/394371314

Description

Note: The bug report was initially filed due to a buildbot failure on 10.0 tree. According to buildbot cross-reference, the test stopped failing some time at the end of 2017 – maybe the test has changed, or something else did. I'm keeping it here for the record though, the second part of the description relates to one of those old test failures.

The first part is the same assertion failure which happened on the current 10.3 (as of 2018-06-20) during non-buildbot concurrent tests. It's not reproducible so far, but for it we have a full stack trace, coredump, binary, datadir and logs.

Failure on current 10.3

10.3 956b296248acbe35aa87f056ccf99dbb999eea02
2018-06-20 04:01:30 0x7f9d927fc700 InnoDB: Assertion failure in file /home/travis/src/storage/innobase/fts/fts0opt.cc line 2844
InnoDB: Failing assertion: slot->last_run <= current_time

Thread 1 (Thread 0x7f9d927fc700 (LWP 23704)):
#0 0x00007f9dc9dd4611 in pthread_kill () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x000055e3c07716a3 in my_write_core (sig=6) at /home/travis/src/mysys/stacktrace.c:481
#2 0x000055e3bffc735f in handle_fatal_signal (sig=6) at /home/travis/src/sql/signal_handler.cc:305
#3 <signal handler called>
#4 0x00007f9dc9214c37 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x00007f9dc9218028 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x000055e3c0409095 in ut_dbg_assertion_failed (expr=0x55e3c0b185f8 "slot->last_run <= current_time", file=0x55e3c0b17a48 "/home/travis/src/storage/innobase/fts/fts0opt.cc", line=2844) at /home/travis/src/storage/innobase/ut/ut0dbg.cc:61
#7 0x000055e3c054ff0f in fts_optimize_how_many (tables=0x55e3c2e16c98) at /home/travis/src/storage/innobase/fts/fts0opt.cc:2844
#8 0x000055e3c0550522 in fts_optimize_thread (arg=0x55e3c2e109c0) at /home/travis/src/storage/innobase/fts/fts0opt.cc:3101
#9 0x00007f9dc9dcf184 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#10 0x00007f9dc92dbffd in clone () from /lib/x86_64-linux-gnu/libc.so.6

All threads' stack traces are attached as threads_full.
Coredump, datadir etc. are available separately.

Old innodb_fts.innodb-fts-fic failure on 10.0

innodb_fts.innodb-fts-fic 'xtradb'       w1 [ fail ]

        Test ended at 2017-08-28 20:34:08

CURRENT_TEST: innodb_fts.innodb-fts-fic

mysqltest: At line 181: query 'CREATE FULLTEXT INDEX idx2 ON articles (body)' failed: 2013: Lost connection to MySQL server during query

The result from queries just before the failure was:

< snip >

1	aaaa	DBMS stands for DataBase ...

drop table articles;

CREATE TABLE articles (

FTS_DOC_ID BIGINT UNSIGNED NOT NULL ,

title VARCHAR(200),

body TEXT

) ENGINE=InnoDB;

CREATE FULLTEXT INDEX idx on articles (title);

INSERT INTO articles  VALUES (9, 'MySQL Tutorial','DBMS stands for DataBase ...');

UPDATE articles set title = 'bbbb'  WHERE MATCH(title) AGAINST ('tutorial' IN NATURAL LANGUAGE MODE);

ERROR HY000: Invalid InnoDB FTS Doc ID

UPDATE articles set title = 'bbbb', FTS_DOC_ID=8  WHERE MATCH(title) AGAINST ('tutorial' IN NATURAL LANGUAGE MODE);

ERROR HY000: Invalid InnoDB FTS Doc ID

UPDATE articles set title = 'bbbb', FTS_DOC_ID=10  WHERE MATCH(title) AGAINST ('tutorial' IN NATURAL LANGUAGE MODE);

SELECT * FROM articles WHERE MATCH (title) AGAINST ('bbbb' IN NATURAL LANGUAGE MODE);

FTS_DOC_ID	title	body

10	bbbb	DBMS stands for DataBase ...

SELECT * FROM articles WHERE MATCH (title) AGAINST ('tutorial' IN NATURAL LANGUAGE MODE);

FTS_DOC_ID	title	body

CREATE FULLTEXT INDEX idx2 ON articles (body);

More results from queries before failure can be found in /dev/shm/var/1/log/innodb-fts-fic.log

Server [mysqld.1 - pid: 26345, winpid: 26345, exit: 256] failed during test run

Server log from this test:

----------SERVER LOG START-----------

2017-08-28 20:33:54 aa7f9b40 InnoDB: FTS Optimize Removing table test/articles

2017-08-28 20:33:55 aa7f9b40 InnoDB: FTS Optimize Removing table test/articles

2017-08-28 20:33:55 aa7f9b40 InnoDB: FTS Optimize Removing table test/articles

2017-08-28 20:33:55 aa7f9b40 InnoDB: FTS Optimize Removing table test/articles

2017-08-28 20:33:55 aa7f9b40 InnoDB: FTS Optimize Removing table test/articles

2017-08-28 20:33:54 b17c3b40 InnoDB: A new Doc ID must be supplied while updating FTS indexed columns.

InnoDB: FTS Doc ID must be larger than 9 for table`test`.`articles`

2017-08-28 20:33:54 aa7f9b40  InnoDB: Assertion failure in thread 2860489536 in file fts0opt.cc line 2856

InnoDB: Failing assertion: slot->last_run <= current_time

InnoDB: We intentionally generate a memory trap.

InnoDB: Submit a detailed bug report to http://bugs.mysql.com.

InnoDB: If you get repeated assertion failures or crashes, even

InnoDB: immediately after the mysqld startup, there may be

InnoDB: corruption in the InnoDB tablespace. Please refer to

InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html

InnoDB: about forcing recovery.

"170828 20:33:54 [ERROR] mysqld got signal 6 ;

This could be because you hit a bug. It is also possible that this binary

or one of the libraries it was linked against is corrupt, improperly built,

or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help

diagnose the problem, but since we have already crashed,

something is definitely wrong and this may fail.

Server version: 10.0.33-MariaDB-1~xenial

key_buffer_size=1048576

read_buffer_size=131072

max_used_connections=1

max_threads=153

thread_count=1

It is possible that mysqld could use up to

key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 61689 K  bytes of memory

Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

stack_bottom = 0x0 thread_stack 0x48000

/usr/sbin/mysqld(my_print_stacktrace+0x36)[0x87dc6a6]

/usr/sbin/mysqld(handle_fatal_signal+0x3b3)[0x83639b3]

[0xb7769c14]

[0xb7769c31]

/lib/i386-linux-gnu/libc.so.6(gsignal+0x39)[0xb70b2e89]

/lib/i386-linux-gnu/libc.so.6(abort+0x157)[0xb70b43e7]

/usr/sbin/mysqld[0x8647ecc]

/usr/sbin/mysqld[0x864c1d8]

/lib/i386-linux-gnu/libpthread.so.0(+0x62b5)[0xb76832b5]

/lib/i386-linux-gnu/libc.so.6(clone+0x6e)[0xb716e16e]

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains

information that should help you find out what is causing the crash.

Writing a core file

----------SERVER LOG END-------------

Logs are not available.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

threads_full
2018-06-20 14:18
242 kB
Elena Stepanova

Issue Links

blocks

MDEV-20133 Merge new release of InnoDB 5.7.27 to 10.2

Closed

relates to

MDEV-20126 Semaphore timeout due to large fulltext indexes

Open

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2019-07-22 06:10

The problem is limited to FULLTEXT INDEX in InnoDB tables. The assertion expression seems to assume that ut_time() will be returning monotonically increasing values. Maybe this is not always the case?

Marko Mäkelä added a comment - 2019-07-22 06:10 The problem is limited to FULLTEXT INDEX in InnoDB tables. The assertion expression seems to assume that ut_time() will be returning monotonically increasing values. Maybe this is not always the case?

Marko Mäkelä added a comment - 2019-07-22 06:29

Could it be that the assertion can fail whenever the system time is moved backwards? For example, if the system clock was drifting (moving too fast), and it was subsequently moved backwards, for example by NTP? If we can provoke a crash by moving the system time backwards, then the logic must be rewritten to be more stable.

There currently are about 60 calls to ut_time() in InnoDB. Also, btr_defragment_thread is the only user of a function pointer ut_timer_now. Maybe we should just use a 64-bit my_timer_cycles everywhere where we assume monotonically increasing time.

Marko Mäkelä added a comment - 2019-07-22 06:29 Could it be that the assertion can fail whenever the system time is moved backwards? For example, if the system clock was drifting (moving too fast), and it was subsequently moved backwards, for example by NTP? If we can provoke a crash by moving the system time backwards, then the logic must be rewritten to be more stable. There currently are about 60 calls to ut_time() in InnoDB. Also, btr_defragment_thread is the only user of a function pointer ut_timer_now . Maybe we should just use a 64-bit my_timer_cycles everywhere where we assume monotonically increasing time.

Marko Mäkelä added a comment - 2019-07-23 10:52

Another problematic call is ut_time_ms(). Percona fixed one piece of code by introducing ut_monotonic_time_ms(), a wrapper for clock_gettime(CLOCK_MONOTONIC, &timespec).

Marko Mäkelä added a comment - 2019-07-23 10:52 Another problematic call is ut_time_ms() . Percona fixed one piece of code by introducing ut_monotonic_time_ms() , a wrapper for clock_gettime(CLOCK_MONOTONIC, &timespec) .

Marko Mäkelä added a comment - 2019-07-23 12:08

I think that the portability wrapper my_interval_timer() is the way to go. I did not yet touch ut_time():

diff --git a/storage/xtradb/ut/ut0ut.cc b/storage/xtradb/ut/ut0ut.cc

index 7b76d968e28..eda3921c5de 100644

--- a/storage/xtradb/ut/ut0ut.cc

+++ b/storage/xtradb/ut/ut0ut.cc

@@ -1,6 +1,7 @@

 /*****************************************************************************

 Copyright (c) 1994, 2017, Oracle and/or its affiliates. All Rights Reserved.

+Copyright (c) 2019, MariaDB Corporation.

 This program is free software; you can redistribute it and/or modify it under

 the terms of the GNU General Public License as published by the Free Software

@@ -189,12 +190,7 @@ ut_time_us(

 /*=======*/

 	ullint*	tloc)	/*!< out: us since epoch, if non-NULL */

-	struct timeval	tv;

-	ullint		us;

-	ut_gettimeofday(&tv, NULL);

-	us = (ullint) tv.tv_sec * 1000000 + tv.tv_usec;

+	ullint us = my_interval_timer() / 1000;

 	if (tloc != NULL) {

 		*tloc = us;

@@ -213,11 +209,7 @@ ulint

 ut_time_ms(void)

 /*============*/

-	struct timeval	tv;

-	ut_gettimeofday(&tv, NULL);

-	return((ulint) tv.tv_sec * 1000 + tv.tv_usec / 1000);

+	return static_cast<ulint>(my_interval_timer() / 1000000);

 #endif /* !UNIV_HOTBACKUP */

Marko Mäkelä added a comment - 2019-07-23 12:08 I think that the portability wrapper my_interval_timer() is the way to go. I did not yet touch ut_time() : diff --git a/storage/xtradb/ut/ut0ut.cc b/storage/xtradb/ut/ut0ut.cc index 7b76d968e28..eda3921c5de 100644 --- a/storage/xtradb/ut/ut0ut.cc +++ b/storage/xtradb/ut/ut0ut.cc @@ -1,6 +1,7 @@ /***************************************************************************** Copyright (c) 1994, 2017, Oracle and/or its affiliates. All Rights Reserved. +Copyright (c) 2019, MariaDB Corporation. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software @@ -189,12 +190,7 @@ ut_time_us( /*=======*/ ullint* tloc) /*!< out: us since epoch, if non-NULL */ { - struct timeval tv; - ullint us; - - ut_gettimeofday(&tv, NULL); - - us = (ullint) tv.tv_sec * 1000000 + tv.tv_usec; + ullint us = my_interval_timer() / 1000; if (tloc != NULL) { *tloc = us; @@ -213,11 +209,7 @@ ulint ut_time_ms(void) /*============*/ { - struct timeval tv; - - ut_gettimeofday(&tv, NULL); - - return((ulint) tv.tv_sec * 1000 + tv.tv_usec / 1000); + return static_cast<ulint>(my_interval_timer() / 1000000); } #endif /* !UNIV_HOTBACKUP */

Marko Mäkelä added a comment - 2019-07-23 16:46

It looks like MySQL 5.7.27 might fix this. I believe that we must fix this differently. I already replaced ut_time_us() and ut_time_ms() with the use of the monotonic clock in my_interval_timer(). I believe that we must do the same for most or all use of ut_time().

Marko Mäkelä added a comment - 2019-07-23 16:46 It looks like MySQL 5.7.27 might fix this. I believe that we must fix this differently. I already replaced ut_time_us() and ut_time_ms() with the use of the monotonic clock in my_interval_timer() . I believe that we must do the same for most or all use of ut_time() .

Marko Mäkelä added a comment - 2019-07-24 19:08

The high-precision timers may incur significantly more overhead than time(NULL). I will not introduce new calls to my_interval_timer() or equivalent in normal operation, because it could reduce performance.

What we can and will do is remove some redundant calls to time(NULL). The failing assertion is simply too strict. I will change that logic so that if the system clock was moved to the past, the time margins will expire immediately.

Marko Mäkelä added a comment - 2019-07-24 19:08 The high-precision timers may incur significantly more overhead than time(NULL) . I will not introduce new calls to my_interval_timer() or equivalent in normal operation, because it could reduce performance. What we can and will do is remove some redundant calls to time(NULL) . The failing assertion is simply too strict. I will change that logic so that if the system clock was moved to the past, the time margins will expire immediately.

Marko Mäkelä added a comment - 2019-07-25 12:54

As related cleanup, I also removed ut_timer_now(), which was only used by innodb_defragment. The call to ut_init_timer() was missing in 10.2, so it is possible that the innodb_defragment_frequency had been broken in MariaDB 10.2.2.

Marko Mäkelä added a comment - 2019-07-25 12:54 As related cleanup, I also removed ut_timer_now() , which was only used by innodb_defragment . The call to ut_init_timer() was missing in 10.2, so it is possible that the innodb_defragment_frequency had been broken in MariaDB 10.2.2.

People

Assignee:: Marko Mäkelä

Reporter:: Elena Stepanova

Votes:: 1 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2017-10-26 21:33

Updated:: 2024-07-08 00:37

Resolved:: 2019-07-25 12:54

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Failure on current 10.3

Old innodb_fts.innodb-fts-fic failure on 10.0

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration