[MDEV-20126] Semaphore timeout due to large fulltext indexes Created: 2019-07-23  Updated: 2023-04-27

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Storage Engine - XtraDB
Affects Version/s: 10.0, 10.1, 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Thirunarayanan Balathandayuthapani
Resolution: Unresolved Votes: 0
Labels: crash, fulltext, hang, upstream

Issue Links:
Blocks
Relates
relates to MDEV-14154 Failing assertion: slot->last_run <= ... Closed
relates to MDEV-16264 Implement a common work queue for Inn... Closed
relates to MDEV-20127 Merge new release of InnoDB 5.6.45 to... Closed

 Description   

MySQL 5.6.45 contains a change that refers to Oracle Bug #25289359 DML/DDL ON LARGE FULLTEXT TABLES CAUSE SEMAPHORE TIMEOUTS AND ASSERTION/SUICIDE.

There is no test case, but there is debug instrumentation. The change continues the bogus assumption that time(NULL) is a monotonically increasing sequence. That assumption was shown to be false in MDEV-14154. The change might also assume fair scheduling among the threads, which might not hold on a heavily loaded system.

I think that we should study what the problem is and whether it affects MariaDB, and then come up with a better fix.



 Comments   
Comment by Marko Mäkelä [ 2019-07-25 ]

The main idea of the Oracle fix is to limit the atomicity of fts_sync_write_words().
This part of the fix wrongly assumes that the system clock is monotonic (never moving backwards):

ulint cache_lock_time = ut_time() - sync_start_time;
if (cache_lock_time > lock_threshold) {

Similar to our MDEV-14154 changes, in particular the one that removed bogus assertions, we should use something like this:

ulint interval = ulint(time(NULL) - start_time);
if (lint(interval) < 0 || interval > time_limit) {

That is, we will time out if the time moved backwards.
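
The check above can be sketched as a small standalone function. This is only an illustration under stated assumptions, not InnoDB code: the helper name wait_timed_out and its parameters are hypothetical, and fixed-width standard types stand in for InnoDB's ulint and lint. The current time is passed in as an argument so that a backward clock jump can be simulated.

```cpp
#include <cstdint>
#include <ctime>

// Hypothetical helper (not part of InnoDB): decide whether a wait that began
// at start_time should be treated as timed out at time `now`.
// time_limit is the allowed duration in seconds.
static bool wait_timed_out(std::time_t start_time, std::time_t now,
                           std::uint64_t time_limit)
{
    // Signed difference: negative if the wall clock moved backwards.
    const std::int64_t interval =
        static_cast<std::int64_t>(now) - static_cast<std::int64_t>(start_time);

    // Treat a backward jump as a timeout instead of asserting or
    // waiting for the clock to catch up.
    return interval < 0 || static_cast<std::uint64_t>(interval) > time_limit;
}
```

A caller would typically pass time(NULL) as `now`. With a limit of 10 seconds and a start time of 100, the function reports a timeout at 111 and at 90 (clock moved backwards), but not at 105 or 110.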

Because time(NULL) may have a much lower overhead than my_interval_timer() or other monotonic clock sources and because the precision of one second suffices here, I think that we should stick to time(NULL).

Anyway, the main idea of the Oracle change is to extend the innodb_fatal_semaphore_wait_threshold (srv_fatal_semaphore_wait_threshold) if an fts_sync_table() operation running outside the optimizer thread is taking a long time. That is, it suppresses the InnoDB built-in watchdog instead of actually fixing the root cause of the problem.

Comment by Thirunarayanan Balathandayuthapani [ 2019-10-03 ]

To fix this issue, InnoDB should have multiple fts_optimize_threads to process the messages from the queue.
By using multiple fts_optimize_threads, InnoDB can reduce the cache size significantly and shorten
the time that DDL/dict_table_mem_free() spends waiting for fts_optimize_remove_table().

Generated at Thu Feb 08 08:57:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.