[MDEV-678] LP:881383 - Corrupt ARCHIVE tables failing to repair Created: 2011-10-25  Updated: 2022-09-08

Status: Open
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.1.67, 5.2.14, 5.3.12, 5.5.36, 10.0.9
Fix Version/s: 5.5

Type: Bug Priority: Minor
Reporter: Dreas van Donselaar (Inactive) Assignee: Michael Widenius
Resolution: Unresolved Votes: 1
Labels: Launchpad, upstream

Attachments: XML File LPexportBug881383.xml     File mdev678.test    

 Description   

We regularly have corrupt ARCHIVE engine tables that fail to repair:
================================

MariaDB [(none)]> database;
Database changed
MariaDB [database]> repair table table;
+-----------------------------+--------+----------+----------+
| Table                       | Op     | Msg_type | Msg_text |
+-----------------------------+--------+----------+----------+
| database.table | repair | error    | Corrupt  |
+-----------------------------+--------+----------+----------+
1 row in set (3.92 sec)
 
MariaDB [database]>

================================

I'm unsure what causes the corruption, but a repair shouldn't be failing I presume?

I'll upload a copy of an example table to FTP.



 Comments   
Comment by Sergei Golubchik [ 2012-03-09 ]

Re: Corrupt ARCHIVE tables failing to repair
on irc:
<dreas> Basically if you kill MariaDB whilst it's writing to an archive table, you can throw it away afterwards

Comment by Elena Stepanova [ 2012-03-29 ]

Re: Corrupt ARCHIVE tables failing to repair
Hi Dreas,

There seems to be a workaround which allows to fix the corrupted tables: run REPAIR TABLE <tablename> EXTENDED twice on the table. See the output below (`t` it's pretty much a copy of one of your tables).

Both 'EXTENDED' and 'twice' is important. The second attempt of REPAIR .. EXTENDED takes much longer, and after that the table starts being usable again. I've seen the same effect on other tables you uploaded. Please try it out.

MariaDB [test]> check table t;
-----------------------------+

Table Op Msg_type Msg_text

-----------------------------+

test.t check error Corrupt

-----------------------------+
1 row in set (11.52 sec)

MariaDB [test]> repair table t;
------------------------------+

Table Op Msg_type Msg_text

------------------------------+

test.t repair error Corrupt

------------------------------+
1 row in set (10.51 sec)

MariaDB [test]> check table t;
-----------------------------+

Table Op Msg_type Msg_text

-----------------------------+

test.t check error Corrupt

-----------------------------+
1 row in set (1.65 sec)

MariaDB [test]> insert into t select * from corrupt_archive_tables.t1;
ERROR 1034 (HY000): Incorrect key file for table 't'; try to repair it

MariaDB [test]> repair table t extended;
------------------------------+

Table Op Msg_type Msg_text

------------------------------+

test.t repair status OK

------------------------------+
1 row in set (9.88 sec)

MariaDB [test]> check table t;
-----------------------------+

Table Op Msg_type Msg_text

-----------------------------+

test.t check error Corrupt

-----------------------------+
1 row in set (6.59 sec)

MariaDB [test]> repair table t extended;
------------------------------+

Table Op Msg_type Msg_text

------------------------------+

test.t repair status OK

------------------------------+
1 row in set (40.35 sec)

MariaDB [test]> check table t;
-----------------------------+

Table Op Msg_type Msg_text

-----------------------------+

test.t check status OK

-----------------------------+
1 row in set (6.60 sec)

MariaDB [test]> insert into t select * from corrupt_archive_tables.t1;
Query OK, 700000 rows affected (23.28 sec)
Records: 700000 Duplicates: 0 Warnings: 0

Comment by Elena Stepanova [ 2012-03-30 ]

Re: Corrupt ARCHIVE tables failing to repair
Hi Monty,

from IRC:
montywi: if you could just do a quick test of trying to repair an archive table that is 'too short' and check what happens...
montywi: ie, does it abort repair or does it get back most of the rows

One observation that seems to be an error rather than a limitation (I already mentioned it in the previous comment, will put in more details now, and with test SQL to reproduce instead of tables uploaded to our FTP).

After an archive table gets corrupted due to server being killed while writing into the table, REPAIR TABLE does not help, it leaves the table in the same corrupted state, no matter how many times we try. However, REPAIR TABLE .. EXTENDED does help, but only if it's executed twice.
I don't know why REPAIR does not work but EXTENDED does, much less why it requires two iterations, but it doesn't look right.
Noticeably, the second REPAIR..EXTENDED iteration takes much longer, so something definitely goes on there.

In the manual test below steps are described as comments, and actual SQL as commands, so you can copy-paste it into your client in two chunks (before and after server restart).

I tried MariaDB 5.2.10 and MariaDB 5.5.21, got the same results on both.
On MySQL 5.5.21, however, the first REPAIR (normal, not EXTENDED) returns OK, and all consequent commands too.

  1. Scenario to reproduce:

# start server, no specific parameters needed;
# create and populate a MyISAM table which we'll be reading from, as below: 
 
# BEFORE server restart
 
DROP TABLE IF EXISTS t1, t2;
CREATE TABLE t1 ( f1 CHAR(16) ) ENGINE=MyISAM;
INSERT INTO t1 VALUES ('abc'),('def'),('ghi'),('jkl');
 
# create and start populating an archive table, as below:
CREATE TABLE t2 ( f1 CHAR(16) ) ENGINE=ARCHIVE;
INSERT INTO t2 SELECT a.* FROM t1 a, t1 b, t1 c, t1 d, t1 e, t1 f, t1 g, t1 h, t1 i, t1 j, t1 k, t1 l, t1 m;
 
# End of "BEFORE server restart"

##########################
# while the insert is running, kill the server with kill -9
# (we insert ~65M rows, so there will be time)
# start the server again (no complaints in error log about archive tables)
##########################

# AFTER server restart:
 
CHECK TABLE t2;
# It says the table is corrupted.
# You can run it several times if you wish, it's still corrupted.
 
REPAIR TABLE t2;
# Again, it says the table is corrupted, 
# and you can run it several times, it's still corrupted.
 
OPTIMIZE TABLE t2;
# It says "Unknown - internal error 145 during operation"
 
# 145 is "table marked as crashed", so it's not wrong. 
# Running it several times doesn't help, either.
 
REPAIR TABLE t2 EXTENDED;
# It says OK, but if you run CHECK TABLE now, it's still corrupted:
 
CHECK TABLE t2;
# (says "corrupt")
 
# Run repair .. .extended again (takes much longer):
 
REPAIR TABLE t2 EXTENDED;
# (says "OK")
 
# And now CHECK TABLE suddenly says it's OK, too:
 
CHECK TABLE t2;
# (says OK)
 
# And it's usable again:
 
SELECT COUNT(*) FROM t2;
# Returns some count>0;
 
SELECT * FROM t2 LIMIT 1;
# Returns a row

Comment by Rasmus Johansson (Inactive) [ 2012-03-30 ]

Launchpad bug id: 881383

Comment by Elena Stepanova [ 2014-03-26 ]

Also attached an MTR testcase mdev678.test, but depending on the version, sleep_time value (the first line) should be tuned. It's the time between issuing INSERT and killing the server, it needs to be adjusted so that some rows get already inserted into the table, but INSERT is not finished yet.

Could also reproduce the problem on MySQL 5.1/5.5, but not on MySQL 5.6 – it just loses whatever was inserted, but technically REPAIR works.

Generated at Thu Feb 08 06:30:32 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.