Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7139

Sporadic failure in innodb.innodb_corrupt_bit on P8

Details

    Description

      http://buildbot.askmonty.org/buildbot/builders/bintar-rhel7-p8-debug/builds/12/steps/test/logs/stdio

      innodb.innodb_corrupt_bit 'innodb_plugin' w1 [ fail ]
              Test ended at 2014-11-19 05:33:19
       
      CURRENT_TEST: innodb.innodb_corrupt_bit
      mysqltest: At line 74: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxē'
       
      The result from queries just before the failure was:
      < snip >
      @@unique_checks
      0
      select @@innodb_change_buffering_debug;
      @@innodb_change_buffering_debug
      1
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1,z+1 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+10,z+10 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+20,z+20 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+50,z+50 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+100,z+100 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+200,z+200 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+400,z+400 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+800,z+800 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1600,z+1600 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+4000,z+4000 FROM corrupt_bit_test_ā;
      select count(*) from corrupt_bit_test_ā;
      count(*)
      1024
      CREATE INDEX idx3 ON corrupt_bit_test_ā(b, c);
      INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1);
       
      More results from queries before failure can be found in /home/buildbot/maria-slave/power8-vlp03-bintar-debug/build/mysql-test/var/1/log/innodb_corrupt_bit.log

      So far it happened twice, one time above on 10.0 / bintar-rhel7-p8-debug, and another one on 5.5 / bintar-trusty-p8-debug. Apparently there are no logs from the other one, but it can be seen in Cross Reference report:

      [Build Details]	5.5	4339	bintar-trusty-p8-debug	2014-10-31 15:00:29	172	nm	Normal run, no --ps-protocol	innodb.innodb_corrupt_bit	xtradb	
      innodb.innodb_corrupt_bit 'xtradb'       w4 [ fail ]
              Test ended at 2014-10-31 09:47:07
       
      CURRENT_TEST: innodb.innodb_corrupt_bit
      mysqltest: At line 72: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxā'
       
      The result from queries just before the failure was:
      < snip >
      @@unique_checks
      0
      select @@innodb_change_buffering_debug;
      @@innodb_change_buffering_debug
      1
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1,z+1 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+10,z+10 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+20,z+20 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+50,z+50 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+100,z+100 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+200,z+200 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+400,z+400 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+800,z+800 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1600,z+1600 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+4000,z+4000 FROM corrupt_bit_test_ā;
      select count(*) from corrupt_bit_test_ā;
      count(*)
      1024
      CREATE INDEX idx3 ON corrupt_bit_test_ā(b, c);
      INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1);
       
      More results from queries before failure can be found in /var/lib/buildbot/maria-slave/power8-vlp04-debug/build/mysql-test/var/4/log/innodb_corrupt_bit.log

      I find it highly suspicious that both times the failure happened on P8. It's a plain test, there should be no race conditions involved, so there must be some other reason (build problems? environmental?). In any case, it's worth investigating.

      Attachments

        Issue Links

          Activity

            svoj Sergey Vojtovich added a comment - - edited

            The problem is reproducible on x86 using 5.5/r4376. But it takes hours to fail:

            while [ $? -eq 0 ]; do ./mtr --parallel=2 innodb.innodb_corrupt_bit --repeat=100 --mem; done

            On PPC64 the problem is a bit easier to reproduce. 5.5/r4421 fails within an hour with the following simplified test:

            --source include/have_innodb.inc
            --source include/have_debug.inc
             
            SET GLOBAL innodb_change_buffering_debug=1;
            SET unique_checks=0;
             
            CREATE TABLE t1(
                   a INT AUTO_INCREMENT PRIMARY KEY,
                   b CHAR(100),
                   c INT,
                   z INT,
                   UNIQUE(z, b))
            ENGINE=InnoDB;
             
            INSERT INTO t1 VALUES(0,'x',1, 1);
            INSERT INTO t1 SELECT 0,b,c+1,z+1 FROM t1;
            INSERT INTO t1 SELECT 0,b,c+10,z+10 FROM t1;
            INSERT INTO t1 SELECT 0,b,c+20,z+20 FROM t1;
            INSERT INTO t1 SELECT 0,b,c+50,z+50 FROM t1;
            INSERT INTO t1 SELECT 0,b,c+100,z+100 FROM t1;
            INSERT INTO t1 SELECT 0,b,c+200,z+200 FROM t1;
            INSERT INTO t1 SELECT 0,b,c+400,z+400 FROM t1;
            INSERT INTO t1 SELECT 0,b,c+800,z+800 FROM t1;
            INSERT INTO t1 SELECT 0,b,c+1600,z+1600 FROM t1;
            INSERT INTO t1 SELECT 0,b,c+4000,z+4000 FROM t1;
            INSERT INTO t1 VALUES(13000,'x',1,1);
             
            DROP TABLE t1;
            SET GLOBAL innodb_change_buffering_debug=0;

            The problem seem to be in guts of InnoDB btree implementation.

            svoj Sergey Vojtovich added a comment - - edited The problem is reproducible on x86 using 5.5/r4376. But it takes hours to fail: while [ $? -eq 0 ]; do ./mtr --parallel=2 innodb.innodb_corrupt_bit --repeat=100 --mem; done On PPC64 the problem is a bit easier to reproduce. 5.5/r4421 fails within an hour with the following simplified test: --source include/have_innodb.inc --source include/have_debug.inc   SET GLOBAL innodb_change_buffering_debug=1; SET unique_checks=0;   CREATE TABLE t1( a INT AUTO_INCREMENT PRIMARY KEY, b CHAR(100), c INT, z INT, UNIQUE(z, b)) ENGINE=InnoDB;   INSERT INTO t1 VALUES(0,'x',1, 1); INSERT INTO t1 SELECT 0,b,c+1,z+1 FROM t1; INSERT INTO t1 SELECT 0,b,c+10,z+10 FROM t1; INSERT INTO t1 SELECT 0,b,c+20,z+20 FROM t1; INSERT INTO t1 SELECT 0,b,c+50,z+50 FROM t1; INSERT INTO t1 SELECT 0,b,c+100,z+100 FROM t1; INSERT INTO t1 SELECT 0,b,c+200,z+200 FROM t1; INSERT INTO t1 SELECT 0,b,c+400,z+400 FROM t1; INSERT INTO t1 SELECT 0,b,c+800,z+800 FROM t1; INSERT INTO t1 SELECT 0,b,c+1600,z+1600 FROM t1; INSERT INTO t1 SELECT 0,b,c+4000,z+4000 FROM t1; INSERT INTO t1 VALUES(13000,'x',1,1);   DROP TABLE t1; SET GLOBAL innodb_change_buffering_debug=0; The problem seem to be in guts of InnoDB btree implementation.

            Reproducible with 5.5 up to revision 4423 using my laptop:

            CURRENT_TEST: innodb.innodb_corrupt_bit
            mysqltest: At line 83: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxē'

            jplindst Jan Lindström (Inactive) added a comment - Reproducible with 5.5 up to revision 4423 using my laptop: CURRENT_TEST: innodb.innodb_corrupt_bit mysqltest: At line 83: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxē'

            I can repeat this only if I use debug build and innodb_change_buffering_debug=1; Test case intentionally tryes to insert duplicate key entry to index and still continue, there is some bug on insert buffering, not sure if that is visible only when this debug option is set.

            jplindst Jan Lindström (Inactive) added a comment - I can repeat this only if I use debug build and innodb_change_buffering_debug=1; Test case intentionally tryes to insert duplicate key entry to index and still continue, there is some bug on insert buffering, not sure if that is visible only when this debug option is set.

            revno: 4424
            committer: Jan Lindström <jplindst@mariadb.org>
            branch nick: 5.5
            timestamp: Mon 2015-02-09 16:14:27 +0200
            message:
            MDEV-7139: Sporadic failure in innodb.innodb_corrupt_bit on P8

            The testcase fails randomly due to ibuf merge happening in the background.
            Fix not to do any merges with ibuf_debug enabled.

            jplindst Jan Lindström (Inactive) added a comment - revno: 4424 committer: Jan Lindström <jplindst@mariadb.org> branch nick: 5.5 timestamp: Mon 2015-02-09 16:14:27 +0200 message: MDEV-7139 : Sporadic failure in innodb.innodb_corrupt_bit on P8 The testcase fails randomly due to ibuf merge happening in the background. Fix not to do any merges with ibuf_debug enabled.

            Ok, after increasing the number of pad tables on test I can repeat this on my laptop. But then, how critical this really is as it is only debug build and instrumented failure case ?

            jplindst Jan Lindström (Inactive) added a comment - Ok, after increasing the number of pad tables on test I can repeat this on my laptop. But then, how critical this really is as it is only debug build and instrumented failure case ?

            Still repeatable.

            jplindst Jan Lindström (Inactive) added a comment - Still repeatable.

            I will fix this using different kind of instrumentation than ibuf_debug, that way this test is a lot more repeatable and consistent.

            jplindst Jan Lindström (Inactive) added a comment - I will fix this using different kind of instrumentation than ibuf_debug, that way this test is a lot more repeatable and consistent.

            commit 0f252702b15dfb5d90446fd49c9e28aa93424eb1
            Author: Jan Lindström <jan.lindstrom@mariadb.com>
            Date: Wed Jun 8 08:40:10 2016 +0300

            MDEV-7139: Sporadic failure in innodb.innodb_corrupt_bit on P8

            Use direct persistent index corruption set on InnoDB dictionary
            for this test. Do not allow creating new indexes if one of the
            existing indexes is already marked as corrupted.

            jplindst Jan Lindström (Inactive) added a comment - commit 0f252702b15dfb5d90446fd49c9e28aa93424eb1 Author: Jan Lindström <jan.lindstrom@mariadb.com> Date: Wed Jun 8 08:40:10 2016 +0300 MDEV-7139 : Sporadic failure in innodb.innodb_corrupt_bit on P8 Use direct persistent index corruption set on InnoDB dictionary for this test. Do not allow creating new indexes if one of the existing indexes is already marked as corrupted.

            People

              jplindst Jan Lindström (Inactive)
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.