[MDEV-7189] main.processlist fails sporadically in buildbot Created: 2014-11-24  Updated: 2015-01-08  Resolved: 2015-01-06

Status: Closed
Project: MariaDB Server
Component/s: Tests
Affects Version/s: 5.5, 10.0, 10.1
Fix Version/s: 5.5.42, 10.0.16, 10.1.3

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Kristian Nielsen
Resolution: Fixed Votes: 0
Labels: buildbot, tests

Issue Links:
Blocks
blocks MDEV-7069 Fix buildbot failures in main server ... Stalled
blocks MDEV-7172 Fix buildbot failures in 10.1 tree Closed

 Description   

http://buildbot.askmonty.org/buildbot/builders/win32-debug2/builds/6568/steps/test/logs/stdio

main.processlist                         [ fail ]
        Test ended at 2014-11-24 05:14:33
 
CURRENT_TEST: main.processlist
--- C:/bzr/bb-win32/win32-debug2/build/mysql-test/r/processlist.result	2014-10-29 01:18:28.187152000 -0700
+++ C:\bzr\bb-win32\win32-debug2\build\mysql-test\r\processlist.reject	2014-11-24 05:14:30.816286800 -0800
@@ -12,5 +12,5 @@
 0
 select command, time < 5 from information_schema.processlist where id != connection_id();
 command	time < 5
-Sleep	1
+Query	0
 set debug_sync='reset';
 
mysqltest: Result content mismatch

Variation with ps-protocol:
http://buildbot.askmonty.org/buildbot/builders/kvm-dgcov-jaunty-i386/builds/4530/steps/test_3/logs/stdio

main.processlist                         w2 [ fail ]
        Test ended at 2014-12-22 01:07:55
 
CURRENT_TEST: main.processlist
--- /home/buildbot/build/mysql-test/r/processlist.result	2014-12-21 20:37:16.794532000 +0200
+++ /home/buildbot/build/mysql-test/r/processlist.reject	2014-12-22 01:07:55.454312832 +0200
@@ -16,5 +16,5 @@
 SET DEBUG_SYNC = 'now WAIT_FOR query_done';
 select command, time < 5 from information_schema.processlist where id != connection_id();
 command	time < 5
-Sleep	1
+Execute	0
 set debug_sync='reset';
 
mysqltest: Result length mismatch

Another variation:
http://buildbot.askmonty.org/buildbot/builders/work-amd64-valgrind/builds/6424

main.processlist                         w1 [ fail ]
        Test ended at 2014-12-02 05:34:58
 
CURRENT_TEST: main.processlist
--- /var/lib/buildbot/maria-slave/work-opensuse-amd64/build/mysql-test/r/processlist.result	2014-12-01 23:07:01.000000000 +0200
+++ /var/lib/buildbot/maria-slave/work-opensuse-amd64/build/mysql-test/r/processlist.reject	2014-12-02 05:34:58.000000000 +0200
@@ -6,7 +6,6 @@
 1
 SET DEBUG_SYNC = 'now SIGNAL fill_schema_proceed';
 INFO	TIME	TIME_MS
-NULL	0	0.000
 select sleep(5);
 sleep(5)
 0
 
mysqltest: Result length mismatch



 Comments   
Comment by Elena Stepanova [ 2014-11-24 ]

Updated affected versions according to the cross-reference report

Comment by Elena Stepanova [ 2014-12-23 ]

knielsen,

I see you already made some attempts to fix the test, but apparently there are still some flaws.

The first two variations in the description seem to be the same problem that you were fixing; but the 3rd one is puzzling, looking at the test I can't see how it can possibly happen.

Comment by Kristian Nielsen [ 2015-01-06 ]

I think the third one is already fixed, in revid:knielsen@knielsen-hq.org-20141031114817-art3oxugmfgfd6ca

This fix appeared in 5.5 around 2014-10-31 and in 10.0 around 2014-11-03, and there are no failures of that kind seen in cross-reference after those dates.

However, I do not understand why it still fails (even in 5.5) with the first two variants.

Comment by Kristian Nielsen [ 2015-01-06 ]

Ok, I found it.

The test fails only in --ps-protocol (variation 2). Cross-reference shows some
failures without --ps-protocol (variation 1), but these are in trees where the
existing fix in 5.5 was not yet merged at the time of the failure. Same for
variation 3, also only in trees in which the fix is not yet merged from 5.5.

The problem is that the test case sets up a DEBUG_SYNC that is meant to
trigger at the end of the execution of the SELECT SLEEP(5). But in
--ps-protocol, a query happens in two steps, Prepare followed by Execute. And
the DEBUG_SYNC ends up triggering at the end of the Prepare step, which is too
early, allowing a race to cause the test to fail sporadically.

Comment by Kristian Nielsen [ 2015-01-06 ]

Pushed to 5.5.42:

http://lists.askmonty.org/pipermail/commits/2015-January/007230.html

Note that the test failures might be visible in Buildbot for some time yet,
until the fix becomes merged from 5.5 into each tree.

Generated at Thu Feb 08 07:17:41 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.