[MDEV-14014] Multi-Slave Replication Fail: bogus data in log event Created: 2017-10-06 Updated: 2020-12-08 Resolved: 2018-07-01 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Affects Version/s: | 10.1.20 |
| Fix Version/s: | 10.1.35, 10.2.17, 10.3.8 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Nilnandan Joshi | Assignee: | Sergei Golubchik |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | upstream | ||
| Sprint: | 10.2.14, 10.1.32 | ||||||||
| Description |
|
After upgrading to 10.1.20, replication now intermittently stops (randomly and not all slaves of a master at the same time) with the following errors:
We can recover by issuing START SLAVE, after which everything works without issue, but the replication failures keep recurring. It looks like the upstream bug: https://bugs.mysql.com/bug.php?id=84752 With slave_compressed_protocol=1 set and sync_binlog switched off, replication works fine. The problem is not easily reproducible. As per the comment in the upstream bug, it can be reproduced only with high load on the master server and multiple slaves (5 or 6). |
| Comments |
| Comment by Andrei Elkin [ 2018-03-19 ] | |||
|
@serg: Hello. I have a patch to serg4:07 PM | |||
| Comment by Andrei Elkin [ 2018-04-11 ] | |||
|
Sergei, I need to speak with you again about You might have perceived the issue as a mere partial read by the Dump thread of an event group that is being appended to the binlog by a concurrent writer (user) thread. Actually, I did not counter that by clearly pointing at a possible dirty read. 'Dirty' here stands for data that may not exist at all. E.g.:
The dirty read, I believe, can be explained by the docs and POSIX. I dug out some relevant references. I also have something to say about your tail --follow counter-example. My understanding is as follows. Suppose we have concurrent Reader (R) and Writer (W) threads. R may access data being written by W.
Here the bi_written subrange reflects the partially written segment of data. The rest of
are bytes that the reader may (my claim) access. And if this possibility is real, it answers
from our slack chat.
EOF being set ahead of the bytes being recorded does explain the above customer error: the Dump thread saw EOF advanced while the data had not yet been written, and that dirty data included the length of the event. The following link could give you some thoughts:
and rather confirms my (very early) guess 'visibility of the writes As for your 'tail -f' use case, the puzzling "negative" to my theory, notice that according to Could you please consider whether these arguments are reasonable? Cheers, Andrei | |||
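To make the hypothesis in the comment above concrete, here is a minimal Python sketch (not MariaDB code). It assumes a made-up record format with a 4-byte length prefix, not the real binlog event layout, and it uses ftruncate to force the state the theory describes, where EOF has advanced but the bytes have not been written yet. Whether a kernel can ever expose that state to a concurrent reader is exactly what is in question in this thread; the names read_record and demo are illustrative only.

```python
import os
import struct
import tempfile

# Hypothetical framing: 4-byte little-endian length, then the payload.
HEADER = struct.Struct("<I")


def read_record(fd, offset):
    """Try to read one length-prefixed record at `offset`.

    Returns (status, payload) where status is "ok", "incomplete" or "bogus".
    """
    size = os.fstat(fd).st_size
    if size - offset < HEADER.size:
        return "incomplete", b""      # header not fully visible yet
    (length,) = HEADER.unpack(os.pread(fd, HEADER.size, offset))
    if length == 0:
        return "bogus", b""           # zero length is never valid in this format
    if size - offset - HEADER.size < length:
        return "incomplete", b""      # payload not fully visible yet
    return "ok", os.pread(fd, length, offset + HEADER.size)


def demo():
    payload = b"event-data"
    record = HEADER.pack(len(payload)) + payload
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "log.bin")
        wfd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        rfd = os.open(path, os.O_RDONLY)
        try:
            # Assumption: simulate "EOF advanced, data not yet written"
            # by extending the file first; the new bytes read as zeros.
            os.ftruncate(wfd, len(record))
            before = read_record(rfd, 0)
            # Now the writer fills in the real bytes.
            os.pwrite(wfd, record, 0)
            after = read_record(rfd, 0)
        finally:
            os.close(wfd)
            os.close(rfd)
    return before, after
```

On a POSIX system, demo() is expected to report the first read as bogus (the length field reads as zero) and the second as a complete record. A reader that trusts the file size alone cannot tell the two states apart, which is the core of the dirty read concern.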
| Comment by Andrei Elkin [ 2018-04-11 ] | |||
|
Sergei, please read my clarification to the patch. I think we can suspect a dirty read and | |||
| Comment by Sergei Golubchik [ 2018-04-11 ] | |||
|
1. I think you misinterpret the standard. What it says is that "the file offset shall be set to the end of the file prior to each write", that is, before every write it does (in a sense) lseek(fd, 0, SEEK_END). It does not say that prior to each write the file length should be set to what it will be once the write has completed.
2. "Torn writes" mean exactly that one can read partial data: a writer writing "123456789" and a reader reading only "1234<EOF>". So yes, writes are not (necessarily) atomic.
3. inotify is much newer than tail. Even your linked Wikipedia article says that tail used to probe the file, repeatedly reading till EOF. | |||
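Points 1 and 2 can be tried out with a small Python sketch, assuming a POSIX system. The file name and function name are made up for illustration, and the writer and reader steps are interleaved by hand in a single thread so the outcome is repeatable, which is a simplification of real concurrency.

```python
import os
import tempfile


def demo_append_and_partial_read():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "append.log")
        wfd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        rfd = os.open(path, os.O_RDONLY)
        try:
            # Point 2: the writer has emitted only part of "123456789".
            os.write(wfd, b"1234")
            first = os.read(rfd, 64)       # reader sees the partial data
            at_eof = os.read(rfd, 64)      # then hits EOF (empty read)

            # Point 1: with O_APPEND the offset is moved to the end of the
            # file before each write, even after an explicit seek to 0.
            os.lseek(wfd, 0, os.SEEK_SET)
            os.write(wfd, b"56789")

            rest = os.read(rfd, 64)        # reader picks up the remainder
            size = os.fstat(rfd).st_size   # nothing was overwritten
        finally:
            os.close(wfd)
            os.close(rfd)
    return first, at_eof, rest, size
```

Calling demo_append_and_partial_read() should return (b"1234", b"", b"56789", 9): the reader observes a torn write as a short read followed by EOF, and the second write lands at the end of the file rather than at offset 0.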
| Comment by Andrei Elkin [ 2018-04-11 ] | |||
|
> 1. I think you misinterpret the standard.
Sorry, I must admit that. We cannot use it as the standard's confirmation of the dirty read possibility.
1234[non-written-byte]EOF
I could not find any guarantee that 'non-written-byte' cannot be seen by the reader. Yes, I was content with the current implementation of tail -f, not to neglect what was before. In particular, it could use some other methods/APIs to achieve atomicity of EOF and data-written. | |||
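The polling style of tail discussed in this thread (read until EOF, wait, try again) can be sketched in Python as follows. This is an illustration under stated assumptions, not how any particular tail implementation or the Dump thread is written; follow, demo, the file name and the timing values are all hypothetical.

```python
import os
import tempfile
import threading
import time


def follow(path, stop_after, interval=0.01, max_polls=1000):
    """Poll `path` until at least `stop_after` bytes have been collected."""
    collected = bytearray()
    fd = os.open(path, os.O_RDONLY)
    try:
        for _ in range(max_polls):
            while True:
                chunk = os.read(fd, 4096)
                if not chunk:          # EOF for now, more may arrive later
                    break
                collected += chunk
            if len(collected) >= stop_after:
                break
            time.sleep(interval)       # wait before probing the file again
    finally:
        os.close(fd)
    return bytes(collected)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "growing.log")
        open(path, "wb").close()       # create the empty file up front

        def writer():
            # Unbuffered append, in two pieces with a pause in between.
            with open(path, "ab", buffering=0) as out:
                out.write(b"1234")
                time.sleep(0.05)
                out.write(b"56789")

        thread = threading.Thread(target=writer)
        thread.start()
        data = follow(path, stop_after=9)
        thread.join()
    return data
```

The reader may see "1234" followed by EOF on one poll and the remainder on a later one, yet the bytes it collects are only ever bytes the writer actually wrote. The sketch cannot show whether a reader could also observe a byte that was never written.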
| Comment by Andrei Elkin [ 2018-06-12 ] | |||
|
Sergei, could you please check out [Commits] c7a2d42c30b: Cheers, Andrei |