[MDEV-4867] Incomplete Binlog writes on disk-full error, also breaking slave replication Created: 2013-08-09  Updated: 2022-09-08

Status: Stalled
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.3, 5.5.32, 5.3.12
Fix Version/s: 5.5

Type: Bug Priority: Minor
Reporter: Arjen Lentz Assignee: Michael Widenius
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Linux, probably any



 Description   

It appears the binlog and replication code still has write calls that aren't checked for successful completion. On disk full, the binlog gets truncated.

As a direct consequence and example, a slave will report this error:
Got fatal error 1236 from master when reading data from binary log: 'binlog truncated in the middle of event'

Obviously it's not just a problem with replication, but also for having correct binlogs for point-in-time recovery.

Suggested solution: writes should check for success, and in case of a disk-full error, report and retry correctly. The "wait until space has been freed up" approach that's used on other disk-full conditions, would be fine here also.



 Comments   
Comment by Elena Stepanova [ 2013-09-01 ]

Hi,

I'm getting "wait until" when the binary log cannot be written (on 5.5 and on 10.0, didn't check 5.3):

130902 0:25:22 [Warning] mysqld: Disk is full writing '/data/smalldisk/master-bin.000001' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
130902 0:25:22 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs

I do the following:

  • start server with default options and with log-bin=<small partition>/master-bin (the datadir is on a different partition)
  • set binlog_format=row (for faster growing)
  • create a table;
  • run some INSERTs until the next one hangs and the message above is printed into the error log.

Is there anything special in the use case you described that I've missed?

Comment by Martijn Otto [ 2013-09-02 ]

We recently ran into exactly this problem too. The server does indeed continue to write to the binlog after free space has been created again, however that does leave it with a broken binlog.

Say you are trying to write 1024 bytes to the binlog, but only 512 bytes are available. 512 bytes will be written, after which the disk is full and the message is written to the error log.

After creating free space, the remaining 512 bytes are not written to the file. Instead, the next binlog entry is written. Any slaves replicating this server will notice this and stop replicating. The only way to get them to work again is to guess at which position the binlog continues and set that as the new master_log_pos.

Comment by Elena Stepanova [ 2013-09-02 ]

I see, thanks for the info, I will check it out.

Comment by Elena Stepanova [ 2013-09-16 ]

In my manual tests on the current 5.5 tree the rest of the event (or the event group) was written to the binary log after the space was freed.
That's how it was:

  • there is a little space left;
  • statement/transaction gets committed;
  • binary log grows a bit, using up all remaining disk space;
    => the message in the error log appears, the statement/commit execution doesn't finish, the slave doesn't replicate the event, the binary log is unreadable;
  • disk space is freed
    => the binlog grows further and becomes readable again, the statement/transaction execution finishes, the event is successfully replicated on the slave.

I tried it with RBR (atomic transactions) and with SBR (atomic and multi-statement transactions).

Possibly the initially described erroneous behavior somehow depends on the event type(s) and their contents, or on server startup parameters, or on the file system, or it's caused by a specific combination of free/required disk space, or it is only possible in the concurrent scenario.

I'm assigning this to Kristian, maybe he'll be able to see the reason of the problem by code inspection (when he has a chance to inspect it).

Comment by Michael Widenius [ 2014-03-23 ]

Elena did all the rights steps to try to reproduce this, so there is not much more I can do regarding testing.

I took a look at all code in log.cc and related code but could not find anything wrong.

The logic for file full is quite simply and is the same for any event:
In the lowest file write call, mysys/my_write.c:my_write() we detect if we on write() get
an ENOSPC or EDQUOT error. If yes, we give a warning and wait 60 seconds and try again by writing the data we did not manage to write before.

I am reasonable confident in the my_write() code, so the error is very likely somewhere else. I would however need more data to be able to find out where.

If you could upload the last 100K of the binary log where things went wrong to the private folder at ftp://ftp.askmonty.org, then we would have more to go on to find and fix this.

Comment by Michael Widenius [ 2014-03-23 ]

I have changed the priority of this to "minor" until we find a way to repeat it or more information of what could be different from our test and the system where this happens.

Comment by Arjen Lentz (Inactive) [ 2014-03-23 ]

Thanks Monty
Unfortunately I don't think we had a copy of the log available from the last occurrence. 
We haven't spotted this in recent times, obviously we try to avoid disk full situations for clients.
We'll endeavour to provide a log should we spot it again. Or perhaps another user will encounter the issue...

Regards
Arjen. 

-------- Original message --------
From: "Michael Widenius (JIRA)" <jira@mariadb.atlassian.net>
Date:24/03/2014 04:13 (GMT+10:00)
To: arjen@openquery.com
Subject: [JIRA] (MDEV-4867) Incomplete Binlog writes on disk-full error,
also breaking slave replication

    [ https://mariadb.atlassian.net/browse/MDEV-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=44497#comment-44497 ]

Michael Widenius commented on MDEV-4867:
----------------------------------------

Elena did all the rights steps to try to reproduce this, so there is not much more I can do regarding testing.

I took a look at all code in log.cc and related code but could not find anything wrong.

The logic for file full is quite simply and is the same for any event:
In the lowest file write call, mysys/my_write.c:my_write() we detect if we on write() get
an ENOSPC or EDQUOT error.   If yes, we give a warning and wait 60 seconds and try again by writing the data we did not manage to write before.

I am reasonable confident in the my_write() code, so the error is very likely somewhere else. I would however need more data to be able to find out where.

If you could upload the last 100K of the binary log where things went wrong to the private folder at ftp://ftp.askmonty.org, then we would have more to go on to find and fix this.


This message was sent by Atlassian JIRA
(v6.2-OD-10-004-WN#6253)

Generated at Thu Feb 08 06:59:48 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.