Details
Bug
Status: Closed
Critical
Resolution: Fixed
10.3(EOL)
None
bb-10.3-monty
Description
rpl.rpl_semi_sync_wait_point crashes because thd_destructor_proxy kills the InnoDB
service threads before all slave threads have ended.
What happens is that the proxy detects that no transactions are active and starts
srv_shutdown_bg_undo_sources(), but fails to take into account that new transactions
can still start, especially from slave threads but also from other threads. In addition, there is no
mutex protecting the check for active transactions, so the check is not safe.
The suggestion is to mark the InnoDB server threads and, in close_connection, first shut down all other threads, including events, and only then inform the destructor proxy and the other InnoDB threads that they can safely be shut down.
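The ordering suggested above can be sketched in isolation. This is only a minimal model, not server code: `ShutdownGate`, `begin_trx`, `stop_accepting`, `stop_background` and `simulate_ordered_shutdown` are hypothetical names invented for illustration. It assumes that the "are transactions active?" check and the decision to stop the background threads happen under the same mutex that guards starting a transaction.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical model of the proposed shutdown ordering (not MariaDB code).
struct ShutdownGate {
    std::mutex mtx;
    bool accepting = true;           // may new transactions start?
    int active = 0;                  // number of running transactions
    bool background_running = true;  // stands in for the undo sources / purge

    // A user or slave thread asks to start a transaction.
    bool begin_trx() {
        std::lock_guard<std::mutex> lock(mtx);
        if (!accepting || !background_running) return false;
        ++active;
        return true;
    }

    void end_trx() {
        std::lock_guard<std::mutex> lock(mtx);
        assert(active > 0);
        --active;
    }

    // Step 1: refuse new transactions.
    void stop_accepting() {
        std::lock_guard<std::mutex> lock(mtx);
        accepting = false;
    }

    // Step 2: stop background threads, but only if nothing can still need
    // them. The check and the state change are one critical section.
    bool stop_background() {
        std::lock_guard<std::mutex> lock(mtx);
        if (accepting || active != 0) return false;
        background_running = false;
        return true;
    }
};

// Runs n_workers threads that start and end transactions until refused,
// then performs the ordered shutdown. Returns whether step 2 succeeded.
bool simulate_ordered_shutdown(int n_workers) {
    ShutdownGate gate;
    std::vector<std::thread> workers;
    for (int i = 0; i < n_workers; ++i) {
        workers.emplace_back([&gate] {
            while (gate.begin_trx()) gate.end_trx();
        });
    }
    gate.stop_accepting();               // first: no new transactions
    for (auto& t : workers) t.join();    // then: wait for all other threads
    return gate.stop_background();       // last: background threads may stop
}
```

In this model, calling `stop_background()` before `stop_accepting()` is simply refused, which is the property the unprotected check described in this report lacks.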
Issue Links
- relates to
  - MDEV-5800 indexes on virtual (not materialized) columns (Closed)
  - MDEV-13039 innodb_fast_shutdown=0 may fail to purge all undo logs (Closed)
  - MDEV-14080 InnoDB shutdown sometimes hangs (Closed)
There is a much simpler solution: relax the failing InnoDB debug assertion that I made too strict.
diff --git a/storage/innobase/trx/trx0purge.cc b/storage/innobase/trx/trx0purge.cc
index c046c8b7b52..0f7b36266bc 100644
--- a/storage/innobase/trx/trx0purge.cc
+++ b/storage/innobase/trx/trx0purge.cc
@@ -293,14 +293,16 @@ trx_purge_add_update_undo_to_history(
After the purge thread has been given permission to exit,
in fast shutdown, we may roll back transactions (trx->undo_no==0)
- in THD::cleanup() invoked from unlink_thd(). */
+ in THD::cleanup() invoked from unlink_thd(), and we may also
+ continue to execute user transactions. */
ut_ad(srv_undo_sources
|| ((srv_startup_is_before_trx_rollback_phase
|| trx_rollback_or_clean_is_active)
&& purge_sys->state == PURGE_STATE_INIT)
|| (srv_force_recovery >= SRV_FORCE_NO_BACKGROUND
&& purge_sys->state == PURGE_STATE_DISABLED)
- || (trx->undo_no == 0 && srv_fast_shutdown));
+ || ((trx->undo_no == 0 || trx->in_mysql_trx_list)
+ && srv_fast_shutdown));
/* Add the log as the first in the history list */
I am sorry that this did not occur to me until now. It takes time to ‘populate the cache’ of my brain after a long vacation.
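To make the effect of the relaxed assertion concrete, here is a small standalone model of the condition before and after the patch. It is a sketch only: the `State` struct, the reduced `PurgeState` enum and the constant `FORCE_NO_BACKGROUND` are hypothetical stand-ins for the real InnoDB globals and `trx_t` fields, and the numeric value of the constant is an assumption made for illustration.

```cpp
#include <cassert>

// Simplified stand-ins for the InnoDB state read by the assertion.
enum PurgeState { PURGE_STATE_INIT, PURGE_STATE_RUN, PURGE_STATE_DISABLED };

// Placeholder for SRV_FORCE_NO_BACKGROUND; the value is assumed here.
constexpr int FORCE_NO_BACKGROUND = 2;

struct State {
    bool undo_sources;               // srv_undo_sources
    bool before_trx_rollback_phase;  // srv_startup_is_before_trx_rollback_phase
    bool rollback_or_clean_active;   // trx_rollback_or_clean_is_active
    PurgeState purge_state;          // purge_sys->state
    int force_recovery;              // srv_force_recovery
    unsigned long long undo_no;      // trx->undo_no
    bool in_mysql_trx_list;          // trx->in_mysql_trx_list
    unsigned fast_shutdown;          // srv_fast_shutdown
};

// The first three alternatives are the same before and after the patch.
static bool common_part(const State& s) {
    return s.undo_sources
        || ((s.before_trx_rollback_phase || s.rollback_or_clean_active)
            && s.purge_state == PURGE_STATE_INIT)
        || (s.force_recovery >= FORCE_NO_BACKGROUND
            && s.purge_state == PURGE_STATE_DISABLED);
}

// Condition before the patch: only empty transactions pass in fast shutdown.
bool old_condition(const State& s) {
    return common_part(s) || (s.undo_no == 0 && s.fast_shutdown);
}

// Condition after the patch: user transactions also pass in fast shutdown.
bool new_condition(const State& s) {
    return common_part(s)
        || ((s.undo_no == 0 || s.in_mysql_trx_list) && s.fast_shutdown);
}
```

In this model, a user transaction that has written undo records (`undo_no > 0`, `in_mysql_trx_list` set) after the undo sources were stopped during a fast shutdown fails `old_condition` but passes `new_condition`. A transaction outside the MySQL transaction list with `undo_no > 0` still fails both.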