[MDEV-9242] Innodb reports Assertion failure in file buf0dblwr.cc line 579 Created: 2015-12-07  Updated: 2016-04-29  Resolved: 2016-04-29

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Storage Engine - XtraDB
Affects Version/s: 10.1.10
Fix Version/s: 10.1.14

Type: Bug Priority: Major
Reporter: chen yuanyuan Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

Centos 6.7 x86_64


Attachments: File mysqld.log    
Sprint: 10.1.14

 Description   

I have installed MariaDB 10.1.9 in VirtualBox with some tables compressed using "ENGINE=Innodb,COLLATE=utf8mb4_bin,page_compressed=1;" in the CREATE statements.
After I powered off the machine while MariaDB was running, its data was, of course, corrupted. But what matters is that it reports the following assertion failure during recovery:

2015-12-07 09:33:52 7f6ecb1f1760  InnoDB: Assertion failure in thread 140113830942560 in file buf0dblwr.cc line 579 
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
151207  9:33:52 [ERROR] mysqld got signal 6 ; 
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware

Can MariaDB automatically recover data in such situations?
PS: The full error log is in the attachment.



 Comments   
Comment by Elena Stepanova [ 2015-12-26 ]

Generally yes, InnoDB can often recover data in such situations. It can do so automatically if the damage is not big, or upon request, controlled by the innodb_force_recovery option. See the MariaDB KB for a short reference on available values, or the MySQL manual for a more detailed discussion.

However, there are two things that make your case more complicated.
First, according to your full log, InnoDB considers this particular corruption really bad:

InnoDB: Also the page in the doublewrite buffer is corrupt.
InnoDB: Cannot continue operation.
InnoDB: You can try to recover the database with the my.cnf
InnoDB: option:
InnoDB: innodb_force_recovery=6

innodb_force_recovery=6 is a really desperate measure, and if InnoDB correctly detects that it's the only option (and is not just being pessimistic), the chances of binary recovery are nearly non-existent; the best you can hope for is to be able to dump the data and load it into a clean server.

Secondly, there is a known upstream issue with high values of innodb_force_recovery: MDEV-8963 / MDEV-9121. It is said to be fixed in InnoDB 5.6.27, but if you are using XtraDB (which is the default), it is not there yet.

Nevertheless, you can try different innodb_force_recovery values to see if you have any luck with any of them.
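
One way to organize that trial and error is sketched below in Python. This is only an illustration: `server_starts` is a hypothetical callback standing in for "set this value in my.cnf, restart mysqld, and see whether it comes up", which in practice is a manual step. The sketch assumes the usual advice of starting from the lowest value and increasing it only as needed, since higher values are more drastic.

```python
from typing import Callable, Optional

# innodb_force_recovery accepts 1 (mildest) through 6 (most drastic).
FORCE_RECOVERY_LEVELS = range(1, 7)


def my_cnf_fragment(level: int) -> str:
    """Render the my.cnf lines for one innodb_force_recovery value."""
    if level not in FORCE_RECOVERY_LEVELS:
        raise ValueError("innodb_force_recovery must be between 1 and 6")
    return f"[mysqld]\ninnodb_force_recovery={level}\n"


def first_working_level(server_starts: Callable[[int], bool]) -> Optional[int]:
    """Try levels in ascending order; return the first one that works.

    `server_starts` is a hypothetical callback supplied by the caller.
    Returns None if the server does not start at any level.
    """
    for level in FORCE_RECOVERY_LEVELS:
        if server_starts(level):
            return level
    return None
```

If a level does let the server start, the next step is to dump the data right away and reload it into a clean server rather than keep running in that mode.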

Comment by chen yuanyuan [ 2015-12-27 ]

Thanks for your explanation.
But first, because the server reports an assertion failure, I think this should be considered a bug. Maybe it should somehow be fixed to report a more detailed error message instead of this failure.
Secondly, I only encounter this problem with page compressed tables. So maybe the related code should be reviewed to avoid the host crash situation? Dumping data and reloading it is almost impossible on a large system.

Comment by Elena Stepanova [ 2015-12-29 ]

For the first point, it's the general nasty habit of InnoDB to handle errors by going down with assertion failures. I agree it is ugly and hard to deal with, but that's how it is now, and I doubt it is realistic to fix it everywhere in reasonable time.
However, I can't quite agree with the part about a more detailed message. InnoDB is almost as detailed as it gets – look at the log; it contains, in order of appearance:

  • a general error message ("page corrupted");
  • page dump;
  • all kinds of technical information about the page;
  • some explanation for this technical information;
  • another page dump;
  • again technical information about the page;
  • again general error message ("this page is also corrupted");
  • surrender ("cannot continue operation");
  • specific advice (use innodb_force_recovery=6);
  • stack trace.
How much more detailed can it be? It just looks scary, but the more detailed it gets, the scarier it will look.

For the second point, my guess (and it's really a guess) is that the reason why only compressed tables are affected is the same as for everything compressed – it's far more prone to irreparable damage, for more or less obvious reasons. I don't quite understand what you mean by "avoiding the host crash" – you said before that the machine was powered off in the middle of operation, so how can it be avoided? Or, if you mean the crash upon recovery, there is only so much damage the system can take – if the data is really, really damaged, the crash is not only unavoidable, but is also the only way to go, because trying to work with damaged data is far more dangerous.

In any case, this is just my opinion; I am re-assigning it to jplindst, the InnoDB expert, for further consideration.

Comment by Jan Lindström (Inactive) [ 2015-12-30 ]

Hi,

This looks like a bug. Can you share the database or some instructions on how to repeat it?

R: Jan

Comment by Jan Lindström (Inactive) [ 2016-04-28 ]

Problem repeated using a customer database.

Comment by Jan Lindström (Inactive) [ 2016-04-29 ]

commit 037b78e5ec2e28d0d4573605f7dc8d5e2b36a66f
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Fri Apr 29 12:32:35 2016 +0300

MDEV-9242: Innodb reports Assertion failure in file buf0dblwr.cc line 579

Analysis: When pages in doublewrite buffer are analyzed compressed
pages do not have correct checksum.

Fix: Decompress page before checksum is compared. If decompression
fails we still check checksum and corrupted pages are found.
If decompression succeeds, page now contains the original
checksum.
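
The analysis and fix above can be illustrated with a toy model. The following Python sketch is not the InnoDB code: the page layout (a 4-byte CRC32 followed by the payload), the `PCMP` marker, the use of zlib, and all function names are assumptions made for illustration. It only shows why comparing the checksum on a still-compressed image rejects an intact page, and how decompressing first, with a fallback to the plain checksum comparison, behaves.

```python
import struct
import zlib

MAGIC = b"PCMP"  # hypothetical marker for a page-compressed image


def make_page(payload: bytes) -> bytes:
    """Toy uncompressed page: 4-byte CRC32 of the payload, then the payload."""
    return struct.pack(">I", zlib.crc32(payload)) + payload


def compress_page(page: bytes) -> bytes:
    """Toy on-disk form of a compressed page: marker plus zlib stream."""
    return MAGIC + zlib.compress(page)


def checksum_ok(page: bytes) -> bool:
    """Compare the stored checksum with one computed over the rest."""
    if len(page) < 4:
        return False
    (stored,) = struct.unpack(">I", page[:4])
    return stored == zlib.crc32(page[4:])


def validate_before_fix(image: bytes) -> bool:
    """Checksum compared on the image exactly as stored."""
    return checksum_ok(image)


def validate_after_fix(image: bytes) -> bool:
    """Decompress first when possible, then compare the checksum."""
    if image.startswith(MAGIC):
        try:
            image = zlib.decompress(image[len(MAGIC):])
        except zlib.error:
            # Decompression failed: still check the image as stored,
            # so a corrupted page is reported as corrupted.
            pass
    return checksum_ok(image)
```

Under these assumptions, an intact compressed page fails `validate_before_fix` but passes `validate_after_fix`, while a truncated image fails both, which mirrors the behaviour the commit message describes.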

Generated at Thu Feb 08 07:33:12 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.