MariaDB Server / MDEV-7430

rpl.rpl_gtid_crash still fails in buildbot

Details

    Description

      http://buildbot.askmonty.org/buildbot/builders/p8-trusty-bintar-debug/builds/97/steps/test/logs/stdio

      rpl.rpl_gtid_crash 'mix,xtradb'          w4 [ fail ]
              Test ended at 2015-01-06 09:50:42
       
      CURRENT_TEST: rpl.rpl_gtid_crash
      mysqltest: In included file "./include/sync_with_master_gtid.inc": 
      included from /var/lib/buildbot/maria-slave/power8-vlp04-bintar-debug/build/mysql-test/suite/rpl/t/rpl_gtid_crash.test at line 78:
      At line 44: Failed to sync with master
       
      The result from queries just before the failure was:
      < snip >
      call mtr.add_suppression("InnoDB: Warning: database page corruption or a failed");
      flush tables;
      ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
      CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=InnoDB;
      INSERT INTO t1 VALUES (1, 0);
      include/stop_slave.inc
      CHANGE MASTER TO master_host = '127.0.0.1', master_port = MASTER_PORT,
      MASTER_USE_GTID=CURRENT_POS;
      INSERT INTO t1 VALUES (2,1);
      INSERT INTO t1 VALUES (3,1);
      include/start_slave.inc
      include/save_master_gtid.inc
      SET SESSION debug_dbug="+d,crash_dispatch_command_before";
      SELECT 1;
      Got one of the listed errors
      include/sync_with_master_gtid.inc
      INSERT INTO t1 VALUES (1000, 3);
      include/save_master_gtid.inc
      include/sync_with_master_gtid.inc
      Timeout in master_gtid_wait('0-1-211', 120), current slave GTID position is: 0-1-210.

      The failure above is on P8. I'm not sure yet whether it is specific to P8 or not.
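To make the timeout message easier to read: a MariaDB GTID has the form domain-server-sequence, so `0-1-211` versus `0-1-210` means the slave stopped one sequence number short of the position the test waited for. A minimal sketch of that comparison follows; the helper names `parse_gtid` and `seq_gap` are made up for illustration and are not part of mysql-test-run, and the sketch assumes both positions are in the same replication domain.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    """One MariaDB GTID, written as domain_id-server_id-seq_no."""
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Parse a single GTID such as '0-1-211' (hypothetical helper)."""
    domain, server, seq = (int(part) for part in text.strip().split("-"))
    return Gtid(domain, server, seq)


def seq_gap(waited_for: str, current: str) -> int:
    """Return how many sequence numbers `current` is behind `waited_for`.

    Assumption: both GTIDs belong to the same replication domain;
    positions from different domains are not comparable this way.
    """
    target, pos = parse_gtid(waited_for), parse_gtid(current)
    if target.domain_id != pos.domain_id:
        raise ValueError("GTIDs are in different replication domains")
    return max(0, target.seq_no - pos.seq_no)


# Values taken from the timeout message in the log above.
print(seq_gap("0-1-211", "0-1-210"))  # prints 1
```

A gap of one matches the log: the only statement issued after the master restart, `INSERT INTO t1 VALUES (1000, 3)`, never reached the slave.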

          Activity

            I think it might not be P8 specific, as this one on fulltest2 (which is x86/amd64) looks identical:

            http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest2/builds/3037/steps/test_6/logs/stdio

            The failure does seem quite rare, though.

            knielsen Kristian Nielsen added the comment above.

            Failure can be reproduced with this sleep in the code:

            === modified file 'sql/mysqld.cc'
            --- sql/mysqld.cc	2014-11-18 21:25:47 +0000
            +++ sql/mysqld.cc	2015-01-15 14:32:20 +0000
            @@ -5212,6 +5212,7 @@ int mysqld_main(int argc, char **argv)
               }
             #endif
             
            +fprintf(stderr, "XXX2 delay startup...\n"); my_sleep(11000000);
               orig_argc= argc;
               orig_argv= argv;
               my_getopt_use_args_separator= TRUE;

            The problem seems to be just that mysql-test-run.pl configures the slave to
            give up reconnecting after just 9 seconds (10 attempts with 1 second sleep
            in-between). That is apparently too short in rare cases in our buildbot setup.

            (The logs from the failures in buildbot confirm that the slave IO thread exits
            after 9 seconds).
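The arithmetic behind the 9 seconds can be sketched as follows. This is only a simplified model, not the server's reconnect code: the function and parameter names are hypothetical, it ignores the time each connection attempt itself takes, and it assumes `my_sleep()` takes microseconds, so the injected `my_sleep(11000000)` delays startup by about 11 seconds.

```python
def reconnect_window_seconds(attempts: int, sleep_between: float) -> float:
    """Time from the first to the last connection attempt.

    With N attempts there are N - 1 sleeps in between, so 10 attempts
    with a 1 second sleep span 9 seconds. (Simplified model; the time
    spent inside each attempt is ignored.)
    """
    if attempts < 1:
        raise ValueError("need at least one attempt")
    return (attempts - 1) * sleep_between


def slave_gives_up(master_startup_delay: float,
                   attempts: int,
                   sleep_between: float) -> bool:
    """True if the master is still down when the last attempt is made."""
    return master_startup_delay > reconnect_window_seconds(attempts, sleep_between)


# Values from the comment above: 10 attempts, 1 second apart,
# versus the roughly 11 second startup delay injected by the patch.
print(reconnect_window_seconds(10, 1.0))  # prints 9.0
print(slave_gives_up(11.0, 10, 1.0))      # prints True
```

Under this model any master restart that takes longer than 9 seconds leaves the slave IO thread stopped, which is consistent with the `master_gtid_wait` timeout seen in the test.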

            knielsen Kristian Nielsen added the comment above.
            knielsen Kristian Nielsen added a comment - Pushed to 10.0.16: http://lists.askmonty.org/pipermail/commits/2015-January/007270.html

            People

              knielsen Kristian Nielsen
              elenst Elena Stepanova