Details
-
Bug
-
Status: Closed (View Workflow)
-
Minor
-
Resolution: Fixed
-
10.0(EOL)
Description
Happened once so far (according to the cross-reference report):
http://buildbot.askmonty.org/buildbot/builders/bld-dan-release/builds/1867/steps/test/logs/stdio
rpl.rpl_gtid_basic 'innodb_plugin,stmt' w3 [ fail ]
|
Test ended at 2014-11-28 19:50:11
|
|
CURRENT_TEST: rpl.rpl_gtid_basic
|
mysqltest: In included file "./include/wait_for_slave_param.inc":
|
included from ./include/wait_for_slave_sql_error.inc at line 41:
|
included from /opt/buildbot-slave/mariadb/dan_demeter2/build/mysql-test/suite/rpl/t/rpl_gtid_basic.test at line 471:
|
At line 115: Timeout in include/wait_for_slave_param.inc
|
|
The result from queries just before the failure was:
|
< snip >
|
master-bin.000001 2984 Query 1 3073 use `test`; INSERT INTO t1 VALUES (17)
|
master-bin.000001 3073 Xid 1 3100 COMMIT /* xid=399 */
|
master-bin.000001 3100 Gtid 1 3138 BEGIN GTID 2-1-1031
|
master-bin.000001 3138 Query 1 3227 use `test`; INSERT INTO t1 VALUES (18)
|
master-bin.000001 3227 Xid 1 3254 COMMIT /* xid=400 */
|
master-bin.000001 3254 Gtid 1 3292 BEGIN GTID 2-1-1032
|
master-bin.000001 3292 Query 1 3381 use `test`; INSERT INTO t1 VALUES (19)
|
master-bin.000001 3381 Xid 1 3408 COMMIT /* xid=401 */
|
master-bin.000001 3408 Gtid 1 3446 GTID 3-1-100
|
master-bin.000001 3446 Query 1 3544 use `test`; CREATE TABLE t2 (a INT PRIMARY KEY)
|
master-bin.000001 3544 Gtid 1 3582 GTID 3-1-101
|
master-bin.000001 3582 Query 1 3686 use `test`; DROP TABLE `t2` /* generated by server */
|
master-bin.000001 3686 Gtid 1 3724 BEGIN GTID 2-1-1040
|
master-bin.000001 3724 Query 1 3813 use `test`; INSERT INTO t1 VALUES (20)
|
master-bin.000001 3813 Xid 1 3840 COMMIT /* xid=421 */
|
|
**** SHOW RELAYLOG EVENTS on server_1 ****
|
relaylog_name = 'No such row'
|
SHOW RELAYLOG EVENTS IN 'No such row';
|
Log_name Pos Event_type Server_id End_log_pos Info
|
Attachments
Issue Links
- relates to
-
MDEV-7069 Fix buildbot failures in main server trees
-
- Closed
-
So I checked the error log from the failing test.
What we see is that START SLAVE starts the SQL thread, but the I/O thread is
not started. This causes the wait to time out, as no events arrive from the
master.
I notice that just before this point, the test uses START SLAVE UNTIL, then
waits for the SQL thread to stop. But there is no wait for the IO thread to
stop. It seems plausible that the IO thread is still shutting down due to
UNTIL when the test reaches START SLAVE. Thus, START SLAVE does not start the
IO thread (it is not running), and a split-second after, the IO thread then
stops due to the UNTIL condition. This would leave to a failure like what is
seen.
EDIT: confirmed, I was able to reproduce with a sleep just before the I/O
thread marks itself as stopped.