Details
- Type: Bug
- Status: Closed
- Priority: Blocker
- Resolution: Fixed
- Affects Version/s: 10.5, 10.6, 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
- Environment: Ubuntu 18.04 on AMD64; Ubuntu 20.04 on AMD64
Description
After implementing MDEV-32757, we are seeing a performance anomaly with innodb_undo_log_truncate=ON. The server is not actually hung or deadlocked (it will eventually recover), but buf_pool.mutex is held for an extremely long time (several minutes).
- trx_purge_truncate_history() writes the message InnoDB: Truncating and is about to truncate an undo log tablespace.
- trx_purge_truncate_history() is busy-looping in a scan of buf_pool.flush_list because one of the pages belonging to the undo tablespace is write-fixed.
- While trx_purge_truncate_history() repeatedly releases and re-acquires buf_pool.flush_list_mutex, buf_flush_page_cleaner (which is holding buf_pool.mutex in buf_do_flush_list_batch()) cannot acquire it with this Ubuntu 18.04 version of GNU libc and Linux kernel (4.15.0-112-generic). This could be similar to MDEV-31343 and MDEV-30180, which could only be reproduced in the same particular environment.
- Most threads are blocked because the buf_flush_page_cleaner thread is holding buf_pool.mutex.
There is some indication that buf_flush_list_batch() may be making some progress (writing out some pages), but it would be extremely slow.
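The interaction described above can be illustrated with a minimal, self-contained C++ sketch. This is not InnoDB code: ToyPool, truncate_scan(), complete_write() and run_once() are hypothetical names invented for illustration. One std::mutex stands in for buf_pool.flush_list_mutex and a boolean stands in for the write-fixed undo page. The scanning thread releases and re-acquires the mutex on every pass. The other thread needs that same mutex to clear the flag. With a mutex that does not guarantee fairness, the scanner can win the re-acquisition many times in a row unless it backs off.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-ins; none of these are real InnoDB symbols.
struct ToyPool {
  std::mutex flush_list_mutex;   // models buf_pool.flush_list_mutex
  bool page_write_fixed = true;  // models the write-fixed undo page (guarded by the mutex)
};

// Models the truncating thread: rescan until the page is no longer
// write-fixed, dropping and retaking the mutex on every pass.
// Returns the number of rescans that were needed.
std::uint64_t truncate_scan(ToyPool& pool, bool back_off) {
  std::uint64_t rescans = 0;
  std::unique_lock<std::mutex> lk(pool.flush_list_mutex);
  while (pool.page_write_fixed) {
    ++rescans;
    lk.unlock();
    if (back_off) {
      // Give the waiting thread a realistic chance to take the mutex.
      std::this_thread::sleep_for(std::chrono::microseconds(50));
    }
    lk.lock();  // without back-off, this may win the race again and again
  }
  return rescans;
}

// Models the page cleaner: it must take the same mutex to finish the write.
void complete_write(ToyPool& pool) {
  std::lock_guard<std::mutex> guard(pool.flush_list_mutex);
  pool.page_write_fixed = false;
}

// Runs one scanner and one cleaner concurrently; returns the rescan count.
std::uint64_t run_once(ToyPool& pool, bool back_off) {
  std::uint64_t rescans = 0;
  std::thread scanner([&] { rescans = truncate_scan(pool, back_off); });
  std::thread cleaner([&] { complete_write(pool); });
  scanner.join();
  cleaner.join();
  assert(!pool.page_write_fixed);
  return rescans;
}
```

Calling run_once(pool, true) should finish after few rescans. Calling it with back_off set to false may produce far more rescans, depending on the libc, kernel and scheduler in use, which is consistent with the anomaly only showing up in one particular environment. The sketch does not model buf_pool.mutex, so it does not reproduce the secondary effect of other threads queuing behind the page cleaner.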
Issue Links
- relates to
  - MDEV-26733 assert on shutdown lock->lock_word == X_LOCK_DECR in test (Open)
  - MDEV-33062 innodb_undo_log_truncate=ON prevents fast shutdown (Closed)
  - MDEV-33112 innodb_undo_log_truncate=ON is blocking page writes (Closed)
  - MDEV-33213 History list is not shrunk unless there is a pause in the workload (Closed)
  - MDEV-30180 Server hang with innodb_undo_log_truncate=ON (Closed)
  - MDEV-31343 Another server hang with innodb_undo_log_truncate=ON (Closed)
  - MDEV-32757 innodb_undo_log_truncate=ON is not crash safe (Closed)
Here is a summary plot of the performance and behavior of various 10.5 and 10.6 commits for the community server. 4 commits are shown:
- MDEV-33009 branch (less aggressive version)
- MDEV-33009 branch (aggressive version)
Attachment: 24x5_high_threads.pdf
The tests with data set sizes 12x5 (12 threads) and 24x5 (24 threads) did not make the undo logs grow and thus triggered no truncate operation.
It seems the pink line gives the best (but not good) result.