[MDEV-13039] innodb_fast_shutdown=0 may fail to purge all undo logs
Created: 2017-06-08  Updated: 2017-10-05  Resolved: 2017-06-09

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 5.5, 10.0, 10.1, 10.2.3, 10.2.4, 10.2.5, 10.3.0, 10.2.6
Fix Version/s: 10.1.25, 10.0.32, 10.2.7, 10.3.1

Type: Bug
Priority: Major
Reporter: Marko Mäkelä
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: shutdown

Issue Links:
Problem/Incident
is caused by MDEV-5800 indexes on virtual (not materialized)... Closed
is caused by MDEV-12057 Embedded server shutdown hangs in InnoDB Closed
Relates
relates to MDEV-12091 Shutdown fails to wait for rollback o... Closed
relates to MDEV-12994 innodb_fast_shutdown=0 skips change b... Closed
relates to MDEV-13059 XtraDB hangs on Windows due to failin... Closed
relates to MDEV-13472 rpl.rpl_semi_sync_wait_point crashes ... Closed
relates to MDEV-13603 innodb_fast_shutdown=0 may fail to pu... Closed

 Description   

When a slow shutdown is performed soon after creating some activity for fts_optimize_thread(), the optimize thread may start new transactions after the purge has finished. This violates the specification of innodb_fast_shutdown=0, namely that the purge must be completed. (None of the history of the new transactions would be purged.)

The proper fix would seem to be to declare a flag that indicates whether undo-log-generating background threads are active, and to clear it once those threads have been shut down:

/** Shut down background threads that can generate undo log. */
void
srv_shutdown_bg_undo_sources()
{
	if (srv_bg_undo_sources) {
		ut_ad(!srv_read_only_mode);
		fts_optimize_shutdown();
		dict_stats_shutdown();
		srv_bg_undo_sources = false;
	}
}

As long as this flag is set, srv_purge_should_exit() must return false.
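
The rule above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the InnoDB source: the struct PurgeExitModel and its members shutdown_requested, fast_shutdown and history_length are hypothetical stand-ins, and only the names bg_undo_sources, shutdown_bg_undo_sources() and purge_should_exit() echo identifiers from this report.

```cpp
#include <cassert>

// Simplified, hypothetical model of the purge exit decision described above.
// Names echo the report, but none of this is the actual InnoDB code.
struct PurgeExitModel {
    bool shutdown_requested = false;   // assumption: server shutdown has started
    unsigned fast_shutdown = 0;        // models innodb_fast_shutdown (0 = slow shutdown)
    bool bg_undo_sources = true;       // models the srv_bg_undo_sources flag
    unsigned long history_length = 0;  // assumption: undo log records not yet purged

    // Models srv_shutdown_bg_undo_sources(): the real function first stops
    // the background threads; here we only clear the flag.
    void shutdown_bg_undo_sources() { bg_undo_sources = false; }

    // Models srv_purge_should_exit().
    bool purge_should_exit() const {
        if (bg_undo_sources) return false;      // new undo log may still be generated
        if (!shutdown_requested) return false;  // normal operation: keep purging
        if (fast_shutdown != 0) return true;    // purge need not be completed
        return history_length == 0;             // slow shutdown: purge everything first
    }
};
```

In this model, an empty history alone does not allow the purge to exit during slow shutdown. The flag must be cleared first, which closes the window in which a background thread could start a transaction after the purge has already finished.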



 Comments   
Comment by Marko Mäkelä [ 2017-06-08 ]

bb-10.2-marko
I plan to backport this back to 10.0, along with some MDEV-13015 compatibility tests and fixes.

Comment by Jan Lindström (Inactive) [ 2017-06-08 ]

The patch is OK to push, but these shutdown changes are hard to get right because so many timing-dependent things can happen. Buildbot has not yet tested this thoroughly; please wait for that first.

Comment by Marko Mäkelä [ 2017-06-08 ]

This is also related to MDEV-12091, because in slow shutdown, the purge threads may exit prematurely while the rollback of recovered incomplete transactions is in progress. Thus, the slow shutdown could fail to purge any undo log that would become purgeable after the rollback finishes.

Comment by Marko Mäkelä [ 2017-06-08 ]

Revised fix (with test) on bb-10.2-marko

Comment by Jan Lindström (Inactive) [ 2017-06-09 ]

The changes otherwise look good, except that, based on bb, checkpoint writing on startup should be restored.

Comment by Marko Mäkelä [ 2017-06-09 ]

Thanks. I filed a follow-up task: MDEV-13044 Do not create a redo log checkpoint at startup

Comment by Marko Mäkelä [ 2017-06-09 ]

In all InnoDB versions, the background DROP TABLE queue (which blatantly breaks ACID properties) can potentially interfere with the purge at slow shutdown.

Starting with MariaDB 10.0, the dict_stats_thread() and fts_optimize_thread() could create undo log records that would fail to be purged at slow shutdown.
