The failure is caused by a simple race condition in the test. But it's not enough to fix it, the test itself doesn't really test the bugfix.
The test flow, if it runs normally without any delays, is this:
- replication is turned on;
- master executes a fast event e1 with timestamp e1_start_time;
- master executes a slow (2-second long) event e2 with timestamp e2_start_time;
- as soon as the second even is started executing on the master, the test starts checking Seconds_Behind_Master on slave;
- as soon as Seconds_Behind_Master value is above zero, the test saves it and stops checking;
- the test expects that the saved value would be 1.
The problem is that if the test runs without delays, since the event e1 is fast, we will normally have e2_start_time == e1_start_time.
When the slave executes event e1, it's fast, so Seconds_Behind_Master remains 0.
Then it starts executing e2. The bug MDEV-5114 was that while the slave executes e2, Seconds_Behind_Master was calculated as an offset from e1_start_time, while it should be from e2_start_time.
But in this test, e1_start_time == e2_start_time, so the test would pass even without MDEV-5114 fix.
The test is very similar to the draft suggested in MDEV-5114 for debugging purposes, but it misses the most important component – a delay between e1 and e2 which would ensure that e1_start_time != e2_start_time, and the difference which the fix makes is visible.
Now, the race condition which makes the test fail as pasted in the description is simply that on a slow server it might happen that the first show slave status which the test uses to monitor Seconds_Behind_Master will be executed when the slave is already over 1 second within executing the slow event (sleep), the value is correct, it's just not expected.
Newer link: http://buildbot.askmonty.org/buildbot/builders/kvm-deb-utopic-x86/builds/762/steps/test_4/logs/stdio