Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7189

main.processlist fails sporadically in buildbot

Details

    Description

      http://buildbot.askmonty.org/buildbot/builders/win32-debug2/builds/6568/steps/test/logs/stdio

      main.processlist                         [ fail ]
              Test ended at 2014-11-24 05:14:33
       
      CURRENT_TEST: main.processlist
      --- C:/bzr/bb-win32/win32-debug2/build/mysql-test/r/processlist.result	2014-10-29 01:18:28.187152000 -0700
      +++ C:\bzr\bb-win32\win32-debug2\build\mysql-test\r\processlist.reject	2014-11-24 05:14:30.816286800 -0800
      @@ -12,5 +12,5 @@
       0
       select command, time < 5 from information_schema.processlist where id != connection_id();
       command	time < 5
      -Sleep	1
      +Query	0
       set debug_sync='reset';
       
      mysqltest: Result content mismatch

      Variation with ps-protocol:
      http://buildbot.askmonty.org/buildbot/builders/kvm-dgcov-jaunty-i386/builds/4530/steps/test_3/logs/stdio

      main.processlist                         w2 [ fail ]
              Test ended at 2014-12-22 01:07:55
       
      CURRENT_TEST: main.processlist
      --- /home/buildbot/build/mysql-test/r/processlist.result	2014-12-21 20:37:16.794532000 +0200
      +++ /home/buildbot/build/mysql-test/r/processlist.reject	2014-12-22 01:07:55.454312832 +0200
      @@ -16,5 +16,5 @@
       SET DEBUG_SYNC = 'now WAIT_FOR query_done';
       select command, time < 5 from information_schema.processlist where id != connection_id();
       command	time < 5
      -Sleep	1
      +Execute	0
       set debug_sync='reset';
       
      mysqltest: Result length mismatch

      Another variation:
      http://buildbot.askmonty.org/buildbot/builders/work-amd64-valgrind/builds/6424

      main.processlist                         w1 [ fail ]
              Test ended at 2014-12-02 05:34:58
       
      CURRENT_TEST: main.processlist
      --- /var/lib/buildbot/maria-slave/work-opensuse-amd64/build/mysql-test/r/processlist.result	2014-12-01 23:07:01.000000000 +0200
      +++ /var/lib/buildbot/maria-slave/work-opensuse-amd64/build/mysql-test/r/processlist.reject	2014-12-02 05:34:58.000000000 +0200
      @@ -6,7 +6,6 @@
       1
       SET DEBUG_SYNC = 'now SIGNAL fill_schema_proceed';
       INFO	TIME	TIME_MS
      -NULL	0	0.000
       select sleep(5);
       sleep(5)
       0
       
      mysqltest: Result length mismatch

      Attachments

        Issue Links

          Activity

            Updated affected versions according to the cross-reference report

            elenst Elena Stepanova added a comment - Updated affected versions according to the cross-reference report

            knielsen,

            I see you already made some attempts to fix the test, but apparently there are still some flaws.

            The first two variations in the description seem to be the same problem that you were fixing; but the 3rd one is puzzling, looking at the test I can't see how it can possibly happen.

            elenst Elena Stepanova added a comment - knielsen , I see you already made some attempts to fix the test, but apparently there are still some flaws. The first two variations in the description seem to be the same problem that you were fixing; but the 3rd one is puzzling, looking at the test I can't see how it can possibly happen.

            I think the third one is already fixed, in revid:knielsen@knielsen-hq.org-20141031114817-art3oxugmfgfd6ca

            This fix appeared in 5.5 around 2014-10-31 and in 10.0 around 2014-11-03, and there are no failures of that kind seen in cross-reference after those dates.

            However, I do not understand why it still fails (even in 5.5) with the first two variants.

            knielsen Kristian Nielsen added a comment - I think the third one is already fixed, in revid:knielsen@knielsen-hq.org-20141031114817-art3oxugmfgfd6ca This fix appeared in 5.5 around 2014-10-31 and in 10.0 around 2014-11-03, and there are no failures of that kind seen in cross-reference after those dates. However, I do not understand why it still fails (even in 5.5) with the first two variants.

            Ok, I found it.

            The test fails only in --ps-protocol (variation 2). Cross-reference shows some
            failures without --ps-protocol (variation 1), but these are in trees where the
            existing fix in 5.5 was not yet merged at the time of the failure. Same for
            variation 3, also only in trees in which the fix is not yet merged from 5.5.

            The problem is that the test case sets up a DEBUG_SYNC that is meant to
            trigger at the end of the execution of the SELECT SLEEP(5). But in
            --ps-protocol, a query happens in two steps, Prepare followed by Execute. And
            the DEBUG_SYNC ends up triggering at the end of the Prepare step, which is too
            early, allowing a race to cause the test to fail sporadically.

            knielsen Kristian Nielsen added a comment - Ok, I found it. The test fails only in --ps-protocol (variation 2). Cross-reference shows some failures without --ps-protocol (variation 1), but these are in trees where the existing fix in 5.5 was not yet merged at the time of the failure. Same for variation 3, also only in trees in which the fix is not yet merged from 5.5. The problem is that the test case sets up a DEBUG_SYNC that is meant to trigger at the end of the execution of the SELECT SLEEP(5). But in --ps-protocol, a query happens in two steps, Prepare followed by Execute. And the DEBUG_SYNC ends up triggering at the end of the Prepare step, which is too early, allowing a race to cause the test to fail sporadically.

            Pushed to 5.5.42:

            http://lists.askmonty.org/pipermail/commits/2015-January/007230.html

            Note that the test failures might be visible in Buildbot for some time yet,
            until the fix becomes merged from 5.5 into each tree.

            knielsen Kristian Nielsen added a comment - Pushed to 5.5.42: http://lists.askmonty.org/pipermail/commits/2015-January/007230.html Note that the test failures might be visible in Buildbot for some time yet, until the fix becomes merged from 5.5 into each tree.

            People

              knielsen Kristian Nielsen
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.