[MDEV-7139] Sporadic failure in innodb.innodb_corrupt_bit on P8 Created: 2014-11-19  Updated: 2016-06-08  Resolved: 2016-06-08

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Storage Engine - XtraDB, Tests
Affects Version/s: 5.5, 10.0
Fix Version/s: 5.5.50

Type: Bug Priority: Minor
Reporter: Elena Stepanova Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: buildbot, tests

Issue Links:
Relates
relates to MDEV-7069 Fix buildbot failures in main server ... Stalled
relates to MDEV-9463 Sporadic failure in innodb.innodb_cor... Closed

 Description   

http://buildbot.askmonty.org/buildbot/builders/bintar-rhel7-p8-debug/builds/12/steps/test/logs/stdio

innodb.innodb_corrupt_bit 'innodb_plugin' w1 [ fail ]
        Test ended at 2014-11-19 05:33:19
 
CURRENT_TEST: innodb.innodb_corrupt_bit
mysqltest: At line 74: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxē'
 
The result from queries just before the failure was:
< snip >
@@unique_checks
0
select @@innodb_change_buffering_debug;
@@innodb_change_buffering_debug
1
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1,z+1 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+10,z+10 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+20,z+20 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+50,z+50 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+100,z+100 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+200,z+200 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+400,z+400 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+800,z+800 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1600,z+1600 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+4000,z+4000 FROM corrupt_bit_test_ā;
select count(*) from corrupt_bit_test_ā;
count(*)
1024
CREATE INDEX idx3 ON corrupt_bit_test_ā(b, c);
INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1);
 
More results from queries before failure can be found in /home/buildbot/maria-slave/power8-vlp03-bintar-debug/build/mysql-test/var/1/log/innodb_corrupt_bit.log

So far it happened twice, one time above on 10.0 / bintar-rhel7-p8-debug, and another one on 5.5 / bintar-trusty-p8-debug. Apparently there are no logs from the other one, but it can be seen in Cross Reference report:

[Build Details]	5.5	4339	bintar-trusty-p8-debug	2014-10-31 15:00:29	172	nm	Normal run, no --ps-protocol	innodb.innodb_corrupt_bit	xtradb	
innodb.innodb_corrupt_bit 'xtradb'       w4 [ fail ]
        Test ended at 2014-10-31 09:47:07
 
CURRENT_TEST: innodb.innodb_corrupt_bit
mysqltest: At line 72: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxā'
 
The result from queries just before the failure was:
< snip >
@@unique_checks
0
select @@innodb_change_buffering_debug;
@@innodb_change_buffering_debug
1
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1,z+1 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+10,z+10 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+20,z+20 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+50,z+50 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+100,z+100 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+200,z+200 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+400,z+400 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+800,z+800 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1600,z+1600 FROM corrupt_bit_test_ā;
INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+4000,z+4000 FROM corrupt_bit_test_ā;
select count(*) from corrupt_bit_test_ā;
count(*)
1024
CREATE INDEX idx3 ON corrupt_bit_test_ā(b, c);
INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1);
 
More results from queries before failure can be found in /var/lib/buildbot/maria-slave/power8-vlp04-debug/build/mysql-test/var/4/log/innodb_corrupt_bit.log

I find it highly suspicious that both times the failure happened on P8. It's a plain test, there should be no race conditions involved, so there must be some other reason (build problems? environmental?). In any case, it's worth investigating.



 Comments   
Comment by Sergey Vojtovich [ 2015-02-06 ]

The problem is reproducible on x86 using 5.5/r4376. But it takes hours to fail:

while [ $? -eq 0 ]; do ./mtr --parallel=2 innodb.innodb_corrupt_bit --repeat=100 --mem; done

On PPC64 the problem is a bit easier to reproduce. 5.5/r4421 fails within an hour with the following simplified test:

--source include/have_innodb.inc
--source include/have_debug.inc
 
SET GLOBAL innodb_change_buffering_debug=1;
SET unique_checks=0;
 
CREATE TABLE t1(
       a INT AUTO_INCREMENT PRIMARY KEY,
       b CHAR(100),
       c INT,
       z INT,
       UNIQUE(z, b))
ENGINE=InnoDB;
 
INSERT INTO t1 VALUES(0,'x',1, 1);
INSERT INTO t1 SELECT 0,b,c+1,z+1 FROM t1;
INSERT INTO t1 SELECT 0,b,c+10,z+10 FROM t1;
INSERT INTO t1 SELECT 0,b,c+20,z+20 FROM t1;
INSERT INTO t1 SELECT 0,b,c+50,z+50 FROM t1;
INSERT INTO t1 SELECT 0,b,c+100,z+100 FROM t1;
INSERT INTO t1 SELECT 0,b,c+200,z+200 FROM t1;
INSERT INTO t1 SELECT 0,b,c+400,z+400 FROM t1;
INSERT INTO t1 SELECT 0,b,c+800,z+800 FROM t1;
INSERT INTO t1 SELECT 0,b,c+1600,z+1600 FROM t1;
INSERT INTO t1 SELECT 0,b,c+4000,z+4000 FROM t1;
INSERT INTO t1 VALUES(13000,'x',1,1);
 
DROP TABLE t1;
SET GLOBAL innodb_change_buffering_debug=0;

The problem seem to be in guts of InnoDB btree implementation.

Comment by Jan Lindström (Inactive) [ 2015-02-09 ]

Reproducible with 5.5 up to revision 4423 using my laptop:

CURRENT_TEST: innodb.innodb_corrupt_bit
mysqltest: At line 83: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxē'

Comment by Jan Lindström (Inactive) [ 2015-02-09 ]

I can repeat this only if I use debug build and innodb_change_buffering_debug=1; Test case intentionally tryes to insert duplicate key entry to index and still continue, there is some bug on insert buffering, not sure if that is visible only when this debug option is set.

Comment by Jan Lindström (Inactive) [ 2015-02-09 ]

revno: 4424
committer: Jan Lindström <jplindst@mariadb.org>
branch nick: 5.5
timestamp: Mon 2015-02-09 16:14:27 +0200
message:
MDEV-7139: Sporadic failure in innodb.innodb_corrupt_bit on P8

The testcase fails randomly due to ibuf merge happening in the background.
Fix not to do any merges with ibuf_debug enabled.

Comment by Jan Lindström (Inactive) [ 2016-02-04 ]

Ok, after increasing the number of pad tables on test I can repeat this on my laptop. But then, how critical this really is as it is only debug build and instrumented failure case ?

Comment by Jan Lindström (Inactive) [ 2016-02-04 ]

Still repeatable.

Comment by Jan Lindström (Inactive) [ 2016-06-08 ]

I will fix this using different kind of instrumentation than ibuf_debug, that way this test is a lot more repeatable and consistent.

Comment by Jan Lindström (Inactive) [ 2016-06-08 ]

commit 0f252702b15dfb5d90446fd49c9e28aa93424eb1
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Wed Jun 8 08:40:10 2016 +0300

MDEV-7139: Sporadic failure in innodb.innodb_corrupt_bit on P8

Use direct persistent index corruption set on InnoDB dictionary
for this test. Do not allow creating new indexes if one of the
existing indexes is already marked as corrupted.

Generated at Thu Feb 08 07:17:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.