[MDEV-14154] Failing assertion: slot->last_run <= current_time in fts0opt.cc Created: 2017-10-26 Updated: 2023-11-27 Resolved: 2019-07-25 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Full-text Search, Storage Engine - InnoDB, Storage Engine - XtraDB, Tests |
| Affects Version/s: | 10.0, 10.1, 10.2, 10.3, 10.4, 10.5 |
| Fix Version/s: | 10.2.26, 10.1.41, 10.3.17, 10.4.7 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Elena Stepanova | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | fulltext | ||
| Environment: | |||
| Attachments: |
|
| Issue Links: |
|
| Description |
|
Note: The bug report was initially filed due to a buildbot failure on the 10.0 tree. According to the buildbot cross-reference, the test stopped failing some time at the end of 2017; maybe the test has changed, or something else did. I'm keeping it here for the record, though: the second part of the description relates to one of those old test failures. The first part is the same assertion failure, which happened on the current 10.3 (as of 2018-06-20) during non-buildbot concurrent tests. It is not reproducible so far, but for it we have a full stack trace, coredump, binary, datadir and logs.

Failure on current 10.3

All threads' stack traces are attached as threads_full.

Old innodb_fts.innodb-fts-fic failure on 10.0
Logs are not available. |
| Comments |
| Comment by Marko Mäkelä [ 2019-07-22 ] | |||||||||||||||||||||||||||||||||||||||
|
The problem is limited to FULLTEXT INDEX in InnoDB tables. The assertion expression seems to assume that ut_time() will be returning monotonically increasing values. Maybe this is not always the case? | |||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-07-22 ] | |||||||||||||||||||||||||||||||||||||||
|
Could it be that the assertion can fail whenever the system time is moved backwards? For example, if the system clock was drifting (moving too fast), and it was subsequently moved backwards, for example by NTP? If we can provoke a crash by moving the system time backwards, then the logic must be rewritten to be more stable. There currently are about 60 calls to ut_time() in InnoDB. Also, btr_defragment_thread is the only user of a function pointer ut_timer_now. Maybe we should just use a 64-bit my_timer_cycles everywhere where we assume monotonically increasing time. | |||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-07-23 ] | |||||||||||||||||||||||||||||||||||||||
|
Another problematic call is ut_time_ms(). Percona fixed one piece of code by introducing ut_monotonic_time_ms(), a wrapper for clock_gettime(CLOCK_MONOTONIC, &timespec). | |||||||||||||||||||||||||||||||||||||||
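For illustration, a wrapper of the kind described above could look roughly like the following. This is only a sketch, assuming a POSIX system where CLOCK_MONOTONIC is available; the name monotonic_time_ms() is hypothetical and this is not the actual Percona or InnoDB code.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>  // clock_gettime(), CLOCK_MONOTONIC (POSIX)

// Hypothetical helper, not the actual ut_monotonic_time_ms():
// returns milliseconds from a clock that is not affected by
// adjustments of the system (wall-clock) time.
static uint64_t monotonic_time_ms()
{
  struct timespec ts;
  const int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
  assert(rc == 0);  // assumption: CLOCK_MONOTONIC is supported here
  (void) rc;        // silence the unused-variable warning under NDEBUG
  return static_cast<uint64_t>(ts.tv_sec) * 1000
       + static_cast<uint64_t>(ts.tv_nsec) / 1000000;
}
```

Two successive calls should never go backwards, which is the property that the failing assertion relied on. The absolute value has no calendar meaning and is only useful for measuring intervals.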
| Comment by Marko Mäkelä [ 2019-07-23 ] | |||||||||||||||||||||||||||||||||||||||
|
I think that the portability wrapper my_interval_timer() is the way to go. I did not yet touch ut_time():
| |||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-07-23 ] | |||||||||||||||||||||||||||||||||||||||
|
It looks like MySQL 5.7.27 might fix this. I believe that we must fix this differently. I already replaced ut_time_us() and ut_time_ms() with the use of the monotonic clock in my_interval_timer(). I believe that we must do the same for most or all use of ut_time(). | |||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-07-24 ] | |||||||||||||||||||||||||||||||||||||||
|
The high-precision timers may incur significantly more overhead than time(NULL). I will not introduce new calls to my_interval_timer() or equivalent in normal operation, because it could reduce performance. What we can and will do is remove some redundant calls to time(NULL). The failing assertion is simply too strict. I will change that logic so that if the system clock was moved to the past, the time margins will expire immediately. | |||||||||||||||||||||||||||||||||||||||
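The relaxed logic described above might be sketched as follows. This is a minimal illustration and not the committed fix; the function name interval_expired() and its parameters are hypothetical, and the timestamps are assumed to be plain time(NULL) values in seconds.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical helper: returns true if at least interval_sec seconds
// have passed since last_run. If the clock appears to have moved
// backwards (now < last_run), the interval is treated as expired
// immediately instead of asserting last_run <= now.
static bool interval_expired(time_t last_run, time_t now,
                             time_t interval_sec)
{
  assert(interval_sec >= 0);
  if (now < last_run) {
    return true;  // system time was moved to the past
  }
  return now - last_run >= interval_sec;
}
```

With this shape, a backwards clock step causes at most one early run of the periodic work rather than a crash, and no extra high-precision timer calls are needed.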
| Comment by Marko Mäkelä [ 2019-07-25 ] | |||||||||||||||||||||||||||||||||||||||
|
As related cleanup, I also removed ut_timer_now(), which was only used by innodb_defragment. The call to ut_init_timer() was missing in 10.2, so it is possible that the innodb_defragment_frequency had been broken in MariaDB 10.2.2. |