Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14014

Multi-Slave Replication Fail: bogus data in log event

Details

    • 10.2.14, 10.1.32

    Description

      After upgrading to 10.1.20, replication now intermittently stops (randomly and not all slaves of a master at the same time) with the following errors:

      2017-10-01 20:38:20 140079966804736 [ERROR] Master 'trades': Error reading packet from server: bogus data in log event; the first event 'mysql-bin.038184' at 156161357, the last event read from 'mysql-bin.038185' at 60018003, the last byte read from 'mysql-bin.038185' at 60018022. (server_errno=1236)
      2017-10-01 20:38:20 140079966804736 [ERROR] Master 'trades': Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'bogus data in log event; the first event 'mysql-bin.038184' at 156161357, the last event read from 'mysql-bin.038185' at 60018003, the last byte read from 'mysql-bin.038185' at 60018022.', Internal MariaDB error code: 1236
      

      We can fix this by issuing a START SLAVE and everything works without issue but the regular replication failures occurs.

      It looks like upstream bug : https://bugs.mysql.com/bug.php?id=84752

      With setting slave_compressed_protocol=1 and change the sync_binlog value to off, it's working fine. This is not easily reproducible. As per the comment in upstream bug, able to reproduce only with high load on master server and multiple slaves (like 5 or 6)

      Attachments

        Issue Links

          Activity

            Elkin Andrei Elkin added a comment -

            @serg: Hello. I have a patch to MDEV-14014 which adds a feature to IO_CACHE which fixes the bug. I would be good someone outside repl to look at. I am to publish it yet.

            serg4:07 PM
            sure, I can do it

            Elkin Andrei Elkin added a comment - @serg: Hello. I have a patch to MDEV-14014 which adds a feature to IO_CACHE which fixes the bug. I would be good someone outside repl to look at. I am to publish it yet. serg4:07 PM sure, I can do it
            Elkin Andrei Elkin added a comment - - edited

            Sergei,

            I need to speak with you again about MDEV-14014 as I have made my mind about the case.

            You might have perceived the issue as mere partial read by Dump thread of a event group that is being append to the binlog by a concurrent writer (user) thread. Actually I did not confront that with clear pointing at possible dirty read. 'Dirty' here stands for data that may not exist at all. E.g:

            2017-10-02 10:48:35 [ERROR] Error in Log_event::read_log_event(): 'Event truncated', data_len: 201326604, event_type: 1
            

            The dirty read I believe can be explained by the docs and Posix. I dug out some relevant references. I also have something to say about your tail--follow counter-example.

            My understanding is as follows.

            Suppose we have concurrent Reader (R) and Writer threads (W). R may access data being written by W.
            Let the following be a sequence of bytes being appended to the file:

               b1_written b2 ... b_last_written | b1_dirty... EOF
            

            Here bi_written subrange reflects the partially written segment of data. The rest of

               b1_dirty till EOF
            

            are bytes that the reader may (my claim) access. And if this possibility is real it answers
            your remark

            I don't understand why reading till EOF is wrong...

            from our slack chat.
            And the standard http://www.opengroup.org/onlinepubs/009695399/functions/pwrite.html seems to confirm it:

            If the O_APPEND flag of the file status flags is set, the file offset shall be set to the end of the file prior to each write and no intervening file modification operation shall occur between changing the file offset and the write operation.

            The EOF set affront of bytes recording does explain the above customer error: the Dump thread caught EOF advanced but data had not been yet written, those dirty data included the length of the event.

            The following link could give you some thoughts:
            https://stackoverflow.com/questions/1154446/is-file-append-atomic-in-unix

            However concurrent reads to atomic appends may see torn writes
            depending on OS, filing system, and what flags you opened the file
            with - the increment of the maximum file extent is atomic, but the
            visibility of the writes with respect to reads may or may not be
            atomic.

            and rather confirms my (very early) guess 'visibility of the writes
            with respect to reads may or may not be atomic'. The question naturally
            goes: atomic to what. And I read that as atomic to the file offset
            setting.

            To your puzzling "negative" to my theory 'tail -f' use case notice that according to
            https://en.wikipedia.org/wiki/Tail_(Unix) it actually relies on inotify before attempting
            to read. It would be reasonable to expect inotify respects atomicity (of EOF setting and last bytes written).

            Could you please consider whether these arguments are reasonable.

            Cheers,

            Andrei

            Elkin Andrei Elkin added a comment - - edited Sergei, I need to speak with you again about MDEV-14014 as I have made my mind about the case. You might have perceived the issue as mere partial read by Dump thread of a event group that is being append to the binlog by a concurrent writer (user) thread. Actually I did not confront that with clear pointing at possible dirty read. 'Dirty' here stands for data that may not exist at all. E.g: 2017-10-02 10:48:35 [ERROR] Error in Log_event::read_log_event(): 'Event truncated', data_len: 201326604, event_type: 1 The dirty read I believe can be explained by the docs and Posix. I dug out some relevant references. I also have something to say about your tail--follow counter-example. My understanding is as follows. Suppose we have concurrent Reader (R) and Writer threads (W). R may access data being written by W. Let the following be a sequence of bytes being appended to the file: b1_written b2 ... b_last_written | b1_dirty... EOF Here bi_written subrange reflects the partially written segment of data. The rest of b1_dirty till EOF are bytes that the reader may (my claim) access. And if this possibility is real it answers your remark I don't understand why reading till EOF is wrong... from our slack chat. And the standard http://www.opengroup.org/onlinepubs/009695399/functions/pwrite.html seems to confirm it: If the O_APPEND flag of the file status flags is set, the file offset shall be set to the end of the file prior to each write and no intervening file modification operation shall occur between changing the file offset and the write operation. The EOF set affront of bytes recording does explain the above customer error: the Dump thread caught EOF advanced but data had not been yet written, those dirty data included the length of the event. The following link could give you some thoughts: https://stackoverflow.com/questions/1154446/is-file-append-atomic-in-unix However concurrent reads to atomic appends may see torn writes depending on OS, filing system, and what flags you opened the file with - the increment of the maximum file extent is atomic, but the visibility of the writes with respect to reads may or may not be atomic. and rather confirms my (very early) guess 'visibility of the writes with respect to reads may or may not be atomic'. The question naturally goes: atomic to what. And I read that as atomic to the file offset setting. To your puzzling "negative" to my theory 'tail -f' use case notice that according to https://en.wikipedia.org/wiki/Tail_(Unix ) it actually relies on inotify before attempting to read. It would be reasonable to expect inotify respects atomicity (of EOF setting and last bytes written). Could you please consider whether these arguments are reasonable. Cheers, Andrei
            Elkin Andrei Elkin added a comment -

            Sergei, please read my clarification to the patch. I think we can suspect dirty read and
            explain its mechanism via docs which I refer. Thank you. Andrei

            Elkin Andrei Elkin added a comment - Sergei, please read my clarification to the patch. I think we can suspect dirty read and explain its mechanism via docs which I refer. Thank you. Andrei

            1. I think you misinterpret the standard. What it says is that "the file offset shall be set to the end of the file prior to each write" that is, before every write it does (in a sense) lseek(fd, 0, SET_END). It does not say that prior to each write the file length should be set to what it will be when the write will be completed.

            2. "torn writes" mean, exactly, that one can read partial data. Like, a writer writing "123456789" and a reader reading only "1234<EOF>", so, yes, writes are not (necessarily) atomic.

            3. inotify is much newer than tail. Even your linked wikipedia article says that tail used to probe the file, repeatedly reading till eof.

            serg Sergei Golubchik added a comment - 1. I think you misinterpret the standard. What it says is that "the file offset shall be set to the end of the file prior to each write" that is, before every write it does (in a sense) lseek(fd, 0, SET_END) . It does not say that prior to each write the file length should be set to what it will be when the write will be completed. 2. "torn writes" mean, exactly, that one can read partial data. Like, a writer writing "123456789" and a reader reading only "1234<EOF>", so, yes, writes are not (necessarily) atomic. 3. inotify is much newer than tail. Even your linked wikipedia article says that tail used to probe the file, repeatedly reading till eof.
            Elkin Andrei Elkin added a comment -

            >1. I think you misinterpret the standard.

            Sorry, I must admit that. We can not use it as the standard's confirmation for dirty read possibility.
            I think it's fair to find an answer to why the reader can't see the data state like

            1234[non-written-byte]EOF

            I could not find any guarantee that 'non-written-byte' can not be found by the reader.
            But the bug case all but exposes it as incorrect value of 'data_len'.

            Yes, I was content with the current implementation of tail-f not to neglect what was before. In particular it could use some other methods/api:s to achieve atomicity of EOF and data-written.

            Elkin Andrei Elkin added a comment - >1. I think you misinterpret the standard. Sorry, I must admit that. We can not use it as the standard's confirmation for dirty read possibility. I think it's fair to find an answer to why the reader can't see the data state like 1234 [non-written-byte] EOF I could not find any guarantee that 'non-written-byte' can not be found by the reader. But the bug case all but exposes it as incorrect value of 'data_len'. Yes, I was content with the current implementation of tail-f not to neglect what was before. In particular it could use some other methods/api:s to achieve atomicity of EOF and data-written.
            Elkin Andrei Elkin added a comment -

            Sergei, could you please check out [Commits] c7a2d42c30b: MDEV-14014
            which implements your simplification.

            Cheers,

            Andrei

            Elkin Andrei Elkin added a comment - Sergei, could you please check out [Commits] c7a2d42c30b: MDEV-14014 which implements your simplification. Cheers, Andrei

            People

              serg Sergei Golubchik
              niljoshi Nilnandan Joshi
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.