Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.0(EOL), 10.1(EOL)
-
None
Description
During server shutdown slave SQL thread broadcasts "stopped" too early:
mysql_cond_broadcast(&rli->stop_cond);
|
DBUG_EXECUTE_IF("simulate_slave_delay_at_terminate_bug38694", sleep(5););
|
mysql_mutex_unlock(&rli->run_lock); // tell the world we are done
|
|
/*
|
Deactivate the parallel replication thread pool, if there are now no more
|
SQL threads running. Do this here, when we have released all locks, but
|
while our THD (and current_thd) is still valid.
|
*/
|
mysql_mutex_lock(&LOCK_active_mi);
|
if (opt_slave_parallel_threads > 0 &&
|
!master_info_index->any_slave_sql_running())
|
rpl_parallel_inactivate_pool(&global_rpl_thread_pool);
|
mysql_mutex_unlock(&LOCK_active_mi);
|
|
mysql_mutex_lock(&LOCK_thread_count);
|
THD_CHECK_SENTRY(thd);
|
delete thd;
|
mysql_mutex_unlock(&LOCK_thread_count);
|
This causes enormous number of sporadic replication test failures with crashes, hangs and warnings like "Plugin '...' has ref_count=1 after shutdown".
To reproduce this failure reliably, just add sleep(1) immediately after condition is signaled and run tests as usual:
diff --git a/sql/slave.cc b/sql/slave.cc
|
index 4129c4c..efc75f3 100644
|
--- a/sql/slave.cc
|
+++ b/sql/slave.cc
|
@@ -4837,6 +4837,7 @@ log '%s' at position %s, relay log '%s' position: %s%s", RPL_LOG_NAME,
|
mysql_cond_broadcast(&rli->stop_cond);
|
DBUG_EXECUTE_IF("simulate_slave_delay_at_terminate_bug38694", sleep(5););
|
mysql_mutex_unlock(&rli->run_lock); // tell the world we are done
|
+ sleep(1);
|
|
/*
|
Deactivate the parallel replication thread pool, if there are now no more
|
This seem to be appeared after:
commit 682ed005c57dabb4615052a79789e44e5c93525c
|
Author: Kristian Nielsen <knielsen@knielsen-hq.org>
|
Date: Wed Jun 10 11:57:42 2015 +0200
|
|
MDEV-8294: Inconsistent behavior of slave parallel threads at runtime
|
|
There were some cases where the slave SQL thread could stop without
|
the pool of parallel replication worker threads being correctly
|
de-activated.
|
Attachments
Issue Links
- relates to
-
MDEV-7884 rpl_binlog_errors fails sporadically in buildbot with "Plugin 'binlog' has ref_count=1 after shutdown"
- Closed
svoj,
Is it a 10.x-specific problem, as specified in Affected versions, or 5.5 can be affected as well?
Apart from
MDEV-7884, we've recently had another flavor of a similar failure in rpl.rpl_binlog_errors on 5.5:http://buildbot.askmonty.org/buildbot/builders/p8-trusty-bintar-debug/builds/672/steps/test/logs/stdio
rpl.rpl_binlog_errors 'mix,xtradb' w1 [ fail ] Found warnings/errors in server log file!
Test ended at 2015-09-25 00:11:51
line
150925 0:10:46 [ERROR] mysqld: Table './mysql/plugin' is marked as crashed and should be repaired
150925 0:10:46 [Warning] Checking table: './mysql/plugin'
150925 0:10:46 [ERROR] mysql.plugin: 1 client is using or hasn't closed the table properly
150925 0:10:46 [ERROR] mysqld: Table './mysql/event' is marked as crashed and should be repaired
150925 0:10:46 [Warning] Checking table: './mysql/event'
150925 0:10:46 [ERROR] mysql.event: 1 client is using or hasn't closed the table properly
Which in fact looks like this in the slave's error log:
150925 0:10:13 [Note] /var/lib/buildbot/maria-slave/power8-vlp04-bintar-debug/build/sql/mysqld: Normal shutdown
150925 0:10:43 [Note] /var/lib/buildbot/maria-slave/power8-vlp04-bintar-debug/build/sql/mysqld (mysqld 5.5.45-MariaDB-debug-log) starti
ng as process 7173 ...
So, obviously the slave had hung on shutdown.
Can it be the same problem? (and
MDEV-7884as well).